Talk Title: Finding Remote Homologous Proteins: Alignment-Based, Alignment-Free and Cross-Modal Methods
Abstract: Proteins function in living organisms as enzymes, antibodies, sensors, and transporters, among myriad other roles. Understanding protein functions has major implications for the biological and medical sciences. It is widely accepted that protein functions are largely determined by protein structures, and that proteins with similar sequences tend to fold into similar structures. Moreover, protein structures are more conserved than protein sequences over the course of evolution. Therefore, finding remote homologous proteins, which share conserved structural similarity but only limited sequence similarity, is a fundamental yet challenging problem in computational biology, and an indispensable step towards understanding protein functions.
Here, three novel methods are presented for finding remote homologous proteins, each with a different goal: (a) the PROtein STructure Alignment (PROSTA) methods, which automatically determine and align homologous structures of protein pockets and interaction interfaces; (b) the ContactLib method, which scans tens of thousands of protein structures for homologous structures in seconds; and (c) the CMsearch method, which simultaneously explores the sequence space and the structure space to perform cross-modal search for homologous proteins. Experiments show that our methods improve not only the accuracy of finding homologous proteins, but also the accuracy of predicting protein structures. Moreover, case studies are presented in which our methods discover, for the first time, structural similarities between pairs of functionally related protein-DNA complexes.
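Biography: Dr. Xuefeng Cui (崔学峰) received bachelor's, master's, and Ph.D. degrees in computer science from the University of Waterloo, Canada. After completing the Ph.D. in 2014, Dr. Cui spent two years as a postdoctoral researcher at King Abdullah University of Science and Technology (KAUST) in Saudi Arabia. Since returning to China in 2016, Dr. Cui has been a tenure-track assistant professor at the Institute for Interdisciplinary Information Sciences, Tsinghua University. Dr. Cui's main research interests are the application of machine learning and parallel computing methods to protein structural bioinformatics and systems bioinformatics. Dr. Cui has published more than ten papers in top bioinformatics journals, including Bioinformatics, Nucleic Acids Research (NAR), and ACM Synthetic Biology, and at the top bioinformatics conference Intelligent Systems for Molecular Biology (ISMB).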