|Title||Optimal protein structure alignments by multiple linkage clustering: application to distantly related proteins.|
|Publication Type||Journal Article|
|Year of Publication||1995|
|Authors||Boutonnet, N. S., Rooman M. J., Ochagavia M. E., Richelle J., and Wodak S. J.|
|Date Published||1995 Jul|
|Keywords||Algorithms, Amino Acid Sequence, Consensus Sequence, Models, Chemical, Molecular Sequence Data, Protein Conformation, Sequence Alignment|
A fully automatic procedure for aligning two protein structures is presented. It uses as sole structural similarity measure the root mean square (r.m.s.) deviation of superimposed backbone atoms (N, C alpha, C and O) and is designed to yield optimal solutions with respect to this measure. In a first step, the procedure identifies protein segments with similar conformations in both proteins. In a second step, a novel multiple linkage clustering algorithm is used to identify segment combinations which yield optimal global structure alignments. Several structure alignments can usually be obtained for a given pair of proteins, which are exploited here to define automatically the common structural core of a protein family. Furthermore, an automatic analysis of the clustering trees is described which enables detection of rigid-body movements between structure elements. To illustrate the performance of our procedure, we apply it to families of distantly related proteins. One groups the three alpha + beta proteins ubiquitin, ferredoxin and the B1-domain of protein G. Their common structure motif consists of four beta-strands and the only alpha-helix, with one strand and the helix being displaced as a rigid body relative to the remaining three beta-strands. The other family consists of beta-proteins from the Greek key group, in particular actinoxanthin, the immunoglobulin variable domain and plastocyanin. Their consensus motif, composed of five beta-strands and a turn, is identified, mostly intact, in all Greek key proteins except the trypsins, and interestingly also in three other beta-protein families, the lipocalins, the neuraminidases and the lectins. This result provides new insights into the evolutionary relationships in the very diverse group of all beta-proteins.
|Alternate Journal||Protein Eng.|
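
The similarity measure named in the abstract, the r.m.s. deviation of superimposed backbone atoms, and the first step (finding segments with similar conformations in both proteins) can be illustrated with a minimal sketch. Several points here are assumptions and are not taken from the paper: the superposition uses the standard Kabsch (SVD) method, since the abstract does not say how the superposition is computed; the fixed window `length`, the `cutoff` value, and the function names `kabsch_rmsd` and `similar_segments` are hypothetical; and coordinates are assumed to be stored as a flat array with four atoms per residue in the order N, C alpha, C, O. The multiple linkage clustering step is not reproduced, because the abstract does not describe it in enough detail.

```python
import numpy as np


def kabsch_rmsd(a, b):
    """R.m.s. deviation of two (n, 3) coordinate sets after optimal rigid superposition."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape or a.ndim != 2 or a.shape[1] != 3:
        raise ValueError("expected two arrays of identical shape (n, 3)")
    # Remove the translation by centring both sets on their centroids.
    a0 = a - a.mean(axis=0)
    b0 = b - b.mean(axis=0)
    # The singular values of the covariance matrix give the best achievable overlap.
    u, s, vt = np.linalg.svd(a0.T @ b0)
    # If the best orthogonal map is a reflection, negate the smallest
    # singular value so that only proper rotations are considered.
    if np.linalg.det(u @ vt) < 0:
        s[-1] = -s[-1]
    msd = (np.sum(a0 ** 2) + np.sum(b0 ** 2) - 2.0 * s.sum()) / len(a)
    return float(np.sqrt(max(msd, 0.0)))  # clamp tiny negative rounding errors


def similar_segments(backbone_a, backbone_b, length=8, cutoff=1.0, atoms_per_residue=4):
    """List (i, j, rmsd) for residue windows starting at i and j with rmsd <= cutoff.

    `length` (residues) and `cutoff` (same unit as the coordinates) are
    illustrative defaults, not values from the paper.
    """
    a = np.asarray(backbone_a, dtype=float)
    b = np.asarray(backbone_b, dtype=float)
    k = atoms_per_residue
    span = length * k
    hits = []
    for i in range(0, len(a) - span + 1, k):
        for j in range(0, len(b) - span + 1, k):
            rmsd = kabsch_rmsd(a[i:i + span], b[j:j + span])
            if rmsd <= cutoff:
                hits.append((i // k, j // k, rmsd))
    return hits
```

Called on two backbone arrays, `similar_segments` returns every pair of equal-length windows whose superimposed r.m.s. deviation falls under the cutoff. A procedure like the one in the abstract would then have to decide which of these segment pairs can be combined into a single global alignment; the exhaustive double loop used here is the simplest possible search and scales with the product of the two chain lengths.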