heterogeneous information network, data mining, clustering, nonnegative matrix tri-factorization
Heterogeneous Information Networks (HINs) contain multiple types of nodes and edges and can therefore preserve both semantic and structural information. Cluster analysis performed directly on an HIN has clear advantages over first transforming it into a homogeneous information network, and it can improve the clustering results for the different types of nodes. In our study, we apply Nonnegative Matrix Tri-Factorization (NMTF) to the cluster analysis of multiple metapaths in an HIN. Unlike the probability-distribution parameter estimation methods used in previous studies, NMTF obtains several dependent latent variables simultaneously, and each latent variable in NMTF is associated with the cluster of the corresponding node in the HIN. Because NMTF is employed here to carry out multiple nonnegative matrix factorizations simultaneously, the method is well suited to co-clustering that leverages multiple metapaths in an HIN. Experimental results on a real dataset demonstrate the validity and correctness of our method and show that its clustering results are better than those of the existing similar clustering algorithm.
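As a rough illustration of the factorization named in the abstract, the sketch below factors a single nonnegative relation matrix X into F S G^T with standard multiplicative updates and reads row and column clusters off F and G. It is not the authors' algorithm: the paper factorizes several metapath matrices jointly under a star network schema, whereas this example handles one matrix only. The function name `nmtf`, its parameters, and the toy matrix are assumptions introduced for illustration.

```python
import numpy as np

def nmtf(X, k, l, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix X (n x m) as F @ S @ G.T.

    F (n x k) holds row-cluster memberships, G (m x l) holds
    column-cluster memberships, and S (k x l) links the two.
    Uses unconstrained multiplicative updates for the squared
    Frobenius error (a generic choice, assumed for this sketch).
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Strictly positive random initialization keeps updates well defined.
    F = rng.random((n, k)) + eps
    S = rng.random((k, l)) + eps
    G = rng.random((m, l)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a ratio of nonnegative terms,
        # so the factors stay nonnegative throughout.
        F *= (X @ G @ S.T) / (F @ S @ G.T @ G @ S.T + eps)
        G *= (X.T @ F @ S) / (G @ S.T @ F.T @ F @ S + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
    return F, S, G

# Hypothetical toy relation matrix: two row groups, two column groups.
X = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)

F, S, G = nmtf(X, k=2, l=2)
row_clusters = F.argmax(axis=1)  # cluster index per row node
col_clusters = G.argmax(axis=1)  # cluster index per column node
error = np.linalg.norm(X - F @ S @ G.T)
```

Taking the arg-max over each row of F and G is one simple way to turn the latent factors into hard cluster labels; other assignment rules are possible.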
Hu, Juncheng; Xing, Yongheng; Han, Mo; Wang, Feng; Zhao, Kuo; and Che, Xilong
"Nonnegative Matrix Tri-Factorization Based Clustering in a Heterogeneous Information Network with Star Network Schema,"
Tsinghua Science and Technology: Vol. 27: Iss. 2, Article 14.
Available at: https://dc.tsinghuajournals.com/tsinghua-science-and-technology/vol27/iss2/14