Journal of Shanghai University (Natural Science Edition) ›› 2024, Vol. 30 ›› Issue (3): 545-558. doi: 10.12066/j.issn.1007-2861.2462

An optimization method based on support vector machine for Ramachandran plot in protein structures annotation

WANG Bo1, SU Tianhao2, XU Yanting1, GAO Heng1, GUO Cong1, LI Yongle1, WU Wei1

  1. International Centre for Quantum and Molecular Structures, College of Sciences, Shanghai University, Shanghai 200444, China; 2. Materials Genome Institute, Shanghai University, Shanghai 200444, China
  • Online: 2024-06-30 Published: 2024-07-09

Abstract: The Ramachandran plot is one of the central tools for validating the conformation of protein structures and therefore plays an important role in structural biology. However, the favored regions defined in the traditional Ramachandran plot are too broad and include inaccurate structures. To address these deficiencies, a method based on a support vector machine (SVM) and Bayesian optimization (SVM-Rama) is proposed to optimize and subdivide the definition of the favored regions of the Ramachandran plot. This study aims to improve the accuracy of the favored regions for specific types of protein secondary structure, and thereby to validate and annotate protein secondary structures simply and accurately. The results show that the optimized plot achieves an accuracy in secondary structure annotation comparable to the best performance of traditional methods, while requiring lower training and computational costs than those methods.

Key words: Ramachandran plot, support vector machine, structure annotation of proteins
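
To illustrate the general idea summarized in the abstract (an SVM decision boundary over backbone dihedral angles, with a tuned hyperparameter), the following is a minimal sketch and not the SVM-Rama implementation. Everything in it is an assumption made for illustration: the data are synthetic clusters placed near typical helix-like and sheet-like (phi, psi) values, the angles are encoded as sines and cosines to respect periodicity, the classifier is a linear SVM trained with a Pegasos-style sub-gradient method, and a plain random search over the regularization strength stands in for Bayesian optimization. All function names are hypothetical.

```python
import math
import random


def features(phi_deg, psi_deg):
    """Encode periodic dihedral angles so that -180 and 180 degrees coincide."""
    phi, psi = math.radians(phi_deg), math.radians(psi_deg)
    # The trailing 1.0 acts as a bias term (it is regularized too, for simplicity).
    return [math.cos(phi), math.sin(phi), math.cos(psi), math.sin(psi), 1.0]


def make_synthetic(rng, n_per_class=200, spread=15.0):
    """Toy data (assumption): helix-like (+1) near (-60, -45),
    sheet-like (-1) near (-120, 130)."""
    data = []
    for label, (phi0, psi0) in ((+1, (-60.0, -45.0)), (-1, (-120.0, 130.0))):
        for _ in range(n_per_class):
            phi = rng.gauss(phi0, spread)
            psi = rng.gauss(psi0, spread)
            data.append((features(phi, psi), label))
    rng.shuffle(data)
    return data


def score(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train_linear_svm(data, lam, epochs, rng):
    """Pegasos-style stochastic sub-gradient descent on the hinge loss."""
    w = [0.0] * len(data[0][0])
    t = 0
    for _ in range(epochs):
        for x, y in rng.sample(data, len(data)):  # one shuffled pass
            t += 1
            eta = 1.0 / (lam * t)                 # decaying step size
            margin = y * score(w, x)
            w = [(1.0 - eta * lam) * wi for wi in w]  # shrink (regularization)
            if margin < 1.0:                          # hinge loss is active
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def accuracy(w, data):
    hits = sum(1 for x, y in data if (1 if score(w, x) >= 0 else -1) == y)
    return hits / len(data)


def predict(w, phi_deg, psi_deg):
    return 1 if score(w, features(phi_deg, psi_deg)) >= 0 else -1


def search_lambda(train, valid, rng, trials=8):
    """Random search over log10(lambda); a simple stand-in for
    Bayesian optimization, which would choose trial points adaptively."""
    best = None
    for _ in range(trials):
        lam = 10 ** rng.uniform(-4, 0)
        w = train_linear_svm(train, lam, epochs=5, rng=rng)
        acc = accuracy(w, valid)
        if best is None or acc > best[0]:
            best = (acc, lam, w)
    return best


rng = random.Random(0)
data = make_synthetic(rng)
split = int(0.8 * len(data))
train, valid = data[:split], data[split:]
best_acc, best_lam, w = search_lambda(train, valid, rng)
print(f"validation accuracy: {best_acc:.3f} at lambda = {best_lam:.4g}")
```

Calling `predict(w, phi, psi)` on a grid of angles would trace out the learned region boundary on the (phi, psi) plane. A nonlinear kernel and real annotated structures would be needed to approximate the subdivided favored regions the abstract describes.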