Tsinghua Science and Technology


link pattern, labeling, partitioning, scalability evaluation


Link patterns are consensus practices that characterize how different types of objects are typically interlinked in linked data. Mining link patterns in large-scale linked data has been inefficient because of the computational complexity of mining algorithms and memory limitations. Partitioning strategies for pattern mining have been proposed to improve scalability, but the efficiency of these strategies and the completeness of their mining results are still under discussion. In this paper, we propose a novel partitioning strategy for mining link patterns in large-scale linked data, in which linked data is partitioned according to edge-labeling rules. Edges are first grouped into a primary multi-partition according to their labels. A feedback mechanism then produces a secondary bi-partition based on a quick mining process. Link patterns discovered locally in the partitions are finally merged into global patterns. Experiments show that our partitioning strategy is feasible and efficient.
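
The partition-then-merge idea described above can be illustrated with a minimal sketch. It is not the paper's algorithm: the edge representation (triples of source type, edge label, target type), the function names, the `min_support` threshold, and the use of single-edge frequency counting as a stand-in for link pattern mining are all assumptions made for illustration, and the feedback-driven secondary bi-partition is omitted.

```python
from collections import Counter, defaultdict

# Hypothetical edge representation: (source_type, edge_label, target_type).

def partition_by_label(edges):
    """Group edges into one partition per edge label (primary multi-partition)."""
    parts = defaultdict(list)
    for src_type, label, dst_type in edges:
        parts[label].append((src_type, label, dst_type))
    return dict(parts)

def mine_local(partition):
    """Stand-in for a real miner: count each single-edge pattern in a partition."""
    return Counter(partition)

def merge_patterns(local_results, min_support=1):
    """Sum local pattern counts and keep patterns meeting the support threshold."""
    total = Counter()
    for counts in local_results:
        total.update(counts)
    return {pattern: n for pattern, n in total.items() if n >= min_support}

# Toy data, invented for this example.
edges = [
    ("Person", "authorOf", "Paper"),
    ("Person", "authorOf", "Paper"),
    ("Paper", "publishedIn", "Journal"),
    ("Person", "memberOf", "Organization"),
]

parts = partition_by_label(edges)
local_results = [mine_local(p) for p in parts.values()]
global_patterns = merge_patterns(local_results, min_support=2)
```

Because each partition in this toy version holds edges with a single label, single-edge patterns never span partitions and merging reduces to summing counts. Patterns that combine several edge labels would cross partition boundaries, which is where the completeness question raised in the abstract arises.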


Tsinghua University Press