Fat node leading tree for data stream clustering with density peaks
Xu, Ji1,3,4; Wang, Guoyin2; Li, Tianrui1; Deng, Weihui3; Gou, Guanglei1
2017-03-15
Abstract

Detecting clusters of arbitrary shape and constantly delivering the results for newly arrived items are two critical challenges in the study of data stream clustering. However, the existing clustering methods could not deal with these two problems simultaneously. In this paper, we employ the density peaks based clustering (DPClust) algorithm to construct a leading tree (LT) and further transform it into a fat node leading tree (FNLT) in a granular computing way. FNLT is a novel interpretable synopsis of the current state of data stream for clustering. New incoming data is blended into the evolving FNLT structure quickly, and thus the clustering result of the incoming data can be delivered on the fly. During the interval between the delivery of the clustering results and the arrival of new data, the FNLT with blended data is granulated as a new FNLT with a constant number of fat nodes. The FNLT of the current data stream is maintained in a real-time fashion by the Blending-Granulating-Fading mechanism. At the same time, the change points are detected using the partial order relation between each pair of the cluster centers and the martingale theory. Compared to several state-of-the-art clustering methods, the presented model shows promising accuracy and efficiency. (C) 2016 Elsevier B.V. All rights reserved.
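The leading tree that the abstract builds on can be illustrated with a minimal sketch of the standard density-peaks construction: each point is linked to its nearest neighbor of higher local density, and cutting the tree below the nodes with the largest density-times-distance score yields the clusters. This is not the paper's FNLT model; it covers only the basic leading-tree step and omits fat nodes, blending, granulating, fading, and change-point detection. The function names, the Gaussian density kernel, the cutoff `dc`, and the cluster count `k` are assumptions made for illustration.

```python
import math

def leading_tree(points, dc):
    """Build a density-peaks leading tree; returns (rho, delta, parent, order)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density with a Gaussian kernel (an assumed choice; dc is the cutoff scale).
    rho = [sum(math.exp(-(dist[i][j] / dc) ** 2) for j in range(n) if j != i)
           for i in range(n)]
    # Visit points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    parent = [-1] * n          # -1 marks the root (the densest point)
    delta = [0.0] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])   # convention: root gets its largest distance
            continue
        # Parent = nearest point among those ranked as denser.
        j = min(order[:rank], key=lambda m: dist[i][m])
        parent[i] = j
        delta[i] = dist[i][j]
    return rho, delta, parent, order

def cut_into_clusters(rho, delta, parent, order, k):
    """Split the tree into k subtrees; returns one label per point."""
    root = order[0]
    gamma = {i: rho[i] * delta[i] for i in range(len(rho)) if i != root}
    # The root is always a center; the k-1 other centers have the largest gamma.
    centers = [root] + sorted(gamma, key=lambda i: -gamma[i])[:k - 1]
    label = [-1] * len(rho)
    for c, i in enumerate(centers):
        label[i] = c
    # Parents are always visited before children in density order.
    for i in order:
        if label[i] == -1:
            label[i] = label[parent[i]]
    return label

# Two well-separated toy blobs (made-up data).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
rho, delta, parent, order = leading_tree(points, dc=1.0)
labels = cut_into_clusters(rho, delta, parent, order, k=2)
```

Because every non-center point inherits the label of its parent, a single pass in decreasing density order is enough to label the whole tree.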

Keywords: Data Stream Clustering; Density Peaks; Fat Node Leading Tree; Change Point
DOI: 10.1016/j.knosys.2016.12.025
Journal: KNOWLEDGE-BASED SYSTEMS
ISSN: 0950-7051
Volume: 120; Pages: 99-117
Indexed in: SCI
WOS Accession Number: WOS:000395213300009
Language: English