PSI Structural Biology Knowledgebase




Cancer Networks: Predicting Catalytic Residues from 3D Protein Structures

SBKB [doi:10.1038/sbkb.2012.170]
Technical Highlight - November 2013
Short description: Identifying the structural features that can predict catalytic amino acids will enhance the functional assignment of unknown proteins in structural databases.

A best-case scenario of functional residue prediction. All six experimentally characterized catalytic residues in a thymidylate synthase (PDB 1LCB) were correctly predicted and are shown in red. Other residues predicted to be functional are in green. Figure courtesy of Andras Fiser.

The rapidly expanding number of structures emerging from structural genomics projects is far outpacing the rate of functional analysis. While the activities associated with new structures are sometimes evident from prior functional characterization of related proteins, 30% of the structures deposited in databases have no functional annotation, reinforcing the need for computational approaches to predict biological roles.

Sequence conservation is among the most powerful indicators of functional relevance, but its predictive power is limited by the extent of conservation and the number of related proteins identified. Because 75% of homologous proteins share less than 30% sequence identity, structural information is additionally required for reliable functional assignment. This type of “hybrid” approach would be strengthened by complementary methods that can identify functional residues within conserved regions. Toward that end, Fajardo and Fiser directly assessed the correlations among features used to distinguish functional residues from their nonfunctional counterparts, in order to determine which most reliably predict catalytic residues from sequence and structural data. They analyzed a training dataset of 439 structures and determined correlations between pairs of attributes ascribed to potential functional amino acids. The features analyzed, which included distance to the general center of mass (GCM) of the structure, relative solvent accessibility (RSA), sequence conservation, and closeness and other graph centrality measures, were then used to train neural networks to identify catalytic residues.
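To make two of these structural features concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes each residue is represented by a single coordinate (for example its C-alpha atom), uses the unweighted centroid of those coordinates as a stand-in for the GCM, and builds a residue contact graph with a hypothetical 8 Å cutoff before computing closeness centrality. The function names, the cutoff, and the toy coordinates are all illustrative assumptions.

```python
import math
from collections import deque

def centroid(coords):
    """Unweighted mean of the residue coordinates (assumed stand-in for the GCM)."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

def distances_to_centroid(coords):
    """Euclidean distance of every residue to the centroid."""
    center = centroid(coords)
    return [math.dist(c, center) for c in coords]

def contact_graph(coords, cutoff=8.0):
    """Adjacency lists linking residues closer than `cutoff` (hypothetical value, in angstroms)."""
    n = len(coords)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def closeness(adj):
    """Closeness centrality: reachable residues divided by the sum of shortest path lengths."""
    scores = []
    for start in range(len(adj)):
        dist = {start: 0}
        queue = deque([start])
        while queue:  # breadth-first search on the unweighted graph
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        scores.append((len(dist) - 1) / total if total else 0.0)
    return scores

# Toy example: five residues on a line, 5 angstroms apart (invented data).
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0),
          (15.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
dists = distances_to_centroid(coords)
adj = contact_graph(coords)
scores = closeness(adj)
```

In this toy chain the middle residue is both nearest the centroid and has the highest closeness, which illustrates why the two features tend to carry related information about how central a residue is.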

In agreement with previous reports, the authors found that sequence conservation displays the highest correlation with function, but additional parameters can be reliable guides to catalytic propensity. Both the distance of residues to the GCM and closeness could distinguish functional from nonfunctional residues; in contrast, RSA showed essentially no correlation with function. The best predictive performance was obtained from networks using distance to the GCM and amino acid type as inputs, and was optimal when residues were preselected based on sequence conservation. This approach outperformed structure-only prediction methods, and also compared favorably with currently employed sequence-based methods. The authors note that the rapidly changing composition of sequence databases requires that sequence conservation be regularly recalculated to ensure the usefulness of sequence profile-based methods. The expanded ability to annotate protein structures for which there are presently no known functions would appear to be worth this effort.
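The preselection and input steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the published pipeline: the conservation threshold of 0.8, the dictionary field names, the distance normalization, and the residue values are all hypothetical, and the neural network itself is omitted. The sketch only shows how conserved residues might be filtered and then turned into input vectors of amino acid type plus distance to the GCM.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residue types, one-letter codes

def preselect(residues, min_conservation=0.8):
    """Keep only residues whose conservation score meets a (hypothetical) threshold."""
    return [r for r in residues if r["conservation"] >= min_conservation]

def encode(residue, max_distance):
    """One-hot amino acid type followed by the distance to the GCM scaled to [0, 1]."""
    one_hot = [1.0 if residue["aa"] == aa else 0.0 for aa in AMINO_ACIDS]
    return one_hot + [residue["dist_to_center"] / max_distance]

# Invented residues for illustration; the field names are assumptions.
residues = [
    {"aa": "C", "conservation": 0.95, "dist_to_center": 4.0},
    {"aa": "L", "conservation": 0.40, "dist_to_center": 12.0},
    {"aa": "H", "conservation": 0.85, "dist_to_center": 6.0},
]

max_distance = max(r["dist_to_center"] for r in residues)
candidates = preselect(residues)
inputs = [encode(r, max_distance) for r in candidates]  # one 21-value vector per candidate
```

Each vector in `inputs` could then be passed to any classifier; filtering on conservation first means the classifier only has to separate catalytic from noncatalytic residues among positions that are already conserved.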

Beth Moorefield


  1. J.E. Fajardo and A. Fiser. Protein structure based prediction of catalytic residues.
    BMC Bioinformatics 14, 63 (2013). doi:10.1186/1471-2105-14-63

Structural Biology Knowledgebase ISSN: 1758-1338
Funded by a grant from the National Institute of General Medical Sciences of the National Institutes of Health