Testing of Streaming Data Clustering Algorithm Effectiveness

Anis Fuatovich Galimyanov
Nurgayaz Farhatovich Garifyanov
Chulpan Bakievna Minnegalieva


This article addresses the task of clustering streaming data. Stream processing is becoming increasingly important as the number of devices that produce and process new data grows. Such devices generate unbounded data streams at high speed. The article gives examples of such streams and explains why they need to be processed. Stream clustering algorithms differ from classical clustering algorithms because of the limited RAM of the computing device. Both artificial data sets and experimental observations were used to test the stream algorithms; the observations comprised chemical gas sensor data and records of network connections in a local network. The WEKA and Massive Online Analysis software packages were selected as the tools for comparing the algorithms, and the article describes how to work with them. Data preprocessing is demonstrated in WEKA. Several algorithms that work with data streams were tested. Clustering results were evaluated using an external quality measure, and the article concludes with graphs of how this measure changes during stream clustering.
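
As an illustration of how an external quality measure can be tracked over the course of a stream, the following minimal Python sketch computes purity over consecutive windows. The abstract does not name the measure the authors used, so purity is only an assumed example here; the function names `purity` and `windowed_purity` and the window size are likewise hypothetical and are not taken from WEKA or Massive Online Analysis.

```python
from collections import Counter, defaultdict


def purity(true_labels, cluster_ids):
    """Share of points that belong to the majority true class of their cluster.

    Purity is used here as an assumed example of an external measure;
    the source does not state which measure was applied.
    """
    if len(true_labels) != len(cluster_ids):
        raise ValueError("label sequences must have equal length")
    if not true_labels:
        raise ValueError("cannot score an empty window")

    # Count how often each true class occurs inside each cluster.
    by_cluster = defaultdict(Counter)
    for truth, cluster in zip(true_labels, cluster_ids):
        by_cluster[cluster][truth] += 1

    # Each cluster contributes the size of its largest true class.
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(true_labels)


def windowed_purity(true_labels, cluster_ids, window):
    """Purity for each consecutive, non-overlapping window of the stream."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        purity(true_labels[i:i + window], cluster_ids[i:i + window])
        for i in range(0, len(true_labels), window)
    ]


if __name__ == "__main__":
    # Hypothetical toy stream: true classes and the clusters assigned to them.
    truth = ["a", "a", "b", "b", "a", "b", "b", "b"]
    clusters = [0, 0, 1, 1, 0, 0, 1, 1]
    print(windowed_purity(truth, clusters, window=4))
```

Plotting the returned list against the window index gives a curve of the kind the abstract describes: one value of the measure per segment of the stream.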

How to Cite
Anis Fuatovich Galimyanov, Nurgayaz Farhatovich Garifyanov, & Chulpan Bakievna Minnegalieva. (2020). Testing of Streaming Data Clustering Algorithm Effectiveness. Helix, 10(05), 124-128. Retrieved from https://helixscientific.pub/index.php/home/article/view/235