Big Data Mining and Analytics


benchmark, symmetric protein


Symmetric proteins play important roles in many biological processes, such as signal transduction and molecular transport. Determining the symmetric oligomeric structure formed by their subunits is therefore crucial for investigating the molecular mechanisms of these processes. Because many experimental methods are costly and technically difficult, computational approaches such as molecular docking play an important complementary role in determining symmetric complex structures, for which a benchmark dataset is urgently needed. In the present work, we develop a comprehensive and non-redundant benchmark for symmetric protein docking based on structures in the Protein Data Bank (PDB). The diverse dataset consists of 251 targets: 212 cases with cyclic group symmetry, 35 with dihedral group symmetry, 3 with cubic group symmetry, and 1 with helical symmetry. According to the conformational changes at the interface between bound and unbound structures, the 251 targets are classified into three groups: 176 "easy", 37 "medium", and 38 "difficult" cases. A preliminary docking test with M-ZDOCK on the targets with cyclic group symmetry indicated that symmetric multimer docking remains challenging. The benchmark will be beneficial for the development of symmetric protein docking algorithms. The benchmark dataset is available for download at http://huanglab.phys.hust.edu.cn/SDBenchmark/.


Tsinghua University Press