KMS Chongqing Institute of Green and Intelligent Technology, CAS
Title | Nonnegative Latent Factor Analysis-Incorporated and Feature-Weighted Fuzzy Double $c$-Means Clustering for Incomplete Data |
Authors | Song, Yan1; Li, Ming1; Zhu, Zhengyu1; Yang, Guisong1; Luo, Xin2,3 |
Publication date | 2022-10-01 |
Abstract | Fuzzy c-means (FCM) clustering is a promising method for handling uncertainty in data clustering. However, traditional FCM and most of its variants cannot handle incomplete inputs. To address this issue, a novel fuzzy clustering framework is put forward to perform highly accurate clustering on incomplete data. It adopts two ideas: 1) utilizing a nonnegative latent factor (NLF) model to prefill the missing data in the inputs by rigidly extracting the latent features of the involved entities, where the principle of a minibatch gradient descent algorithm is incorporated into a single latent factor-dependent, nonnegative and multiplicative update algorithm to accelerate convergence; and 2) integrating the distribution of inputs and the weights of local features into the objective function through sparse self-representation and weighting allocation to focus on crucial features. In this way, an NLF analysis-incorporated and feature-weighted fuzzy double c-means clustering ((NFD)-D-2) method is achieved, in which the data distribution and instance correlation are considered simultaneously. Experiments on 12 real-world datasets, including both data and images, with different missing rates show that the proposed (NFD)-D-2 method significantly outperforms state-of-the-art fuzzy clustering methods. |
Keywords | Big data; clustering; fuzzy double c-means; incomplete data; latent factor analysis; local feature weights |
DOI | 10.1109/TFUZZ.2022.3144489 |
Journal | IEEE TRANSACTIONS ON FUZZY SYSTEMS |
ISSN | 1063-6706 |
Volume | 30, Issue: 10, Pages: 4165-4176 |
Corresponding author | Luo, Xin (lu-oxin21@cigit.ac.cn) |
Indexed in | SCI |
WOS accession number | WOS:000864186200015 |
Language | English |
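
The two-stage idea in the abstract (prefill missing entries with a nonnegative latent factor model, then run fuzzy clustering on the completed data) can be illustrated with a minimal sketch. This is not the authors' method: the prefill step below uses a plain batch masked multiplicative update rather than the single latent factor-dependent, minibatch-accelerated update the abstract describes, and the clustering step is standard fuzzy c-means without feature weights or sparse self-representation. The function names `nlf_prefill` and `fuzzy_c_means`, all parameter values, and the toy data are assumptions made for illustration.

```python
import numpy as np

def nlf_prefill(X, mask, rank=2, iters=200, eps=1e-9, seed=0):
    """Fill missing entries of a nonnegative matrix X (hypothetical helper).

    mask[i, j] is True where X[i, j] is observed. Factors P and Q stay
    nonnegative because every update multiplies by a nonnegative ratio.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    P = rng.random((n, rank)) + 0.1      # nonnegative row factors
    Q = rng.random((m, rank)) + 0.1      # nonnegative column factors
    M = mask.astype(float)
    Xo = np.where(mask, X, 0.0)          # observed values, zeros elsewhere
    for _ in range(iters):
        R = M * (P @ Q.T)                # reconstruction on observed cells only
        P *= (Xo @ Q) / (R @ Q + eps)
        R = M * (P @ Q.T)
        Q *= (Xo.T @ P) / (R.T @ P + eps)
    # Keep observed values, use the factor model only for missing cells.
    return np.where(mask, X, P @ Q.T)

def fuzzy_c_means(X, c=2, m=2.0, iters=100, eps=1e-9, seed=0):
    """Standard fuzzy c-means; returns memberships U (n x c) and centers V (c x d)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)    # each row of memberships sums to 1
    for _ in range(iters):
        W = U ** m
        V = (W.T @ X) / W.sum(axis=0)[:, None]                       # weighted centers
        D = ((X[:, None, :] - V[None, :, :]) ** 2).sum(axis=2) + eps  # squared distances
        inv = D ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U, V

# Toy data (assumed): two groups of rows, with two entries missing.
X = np.array([[1.0, 1.2, 0.9],
              [1.1, np.nan, 1.0],
              [5.0, 5.2, 4.9],
              [np.nan, 5.1, 5.0]])
mask = ~np.isnan(X)
X_filled = nlf_prefill(X, mask)
U, V = fuzzy_c_means(X_filled, c=2)
```

Prefilling before clustering keeps the clustering step unchanged, at the cost of treating imputed values as if they were observed; the abstract's feature weighting is one way to reduce the influence of less reliable features.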