Technical Highlight - June 2015
Short description: A combination of sequence and structure information is used to design new repeat proteins.
The beautiful architectural symmetry seen in proteins that contain repetitive structures underlies their potential as therapeutic and experimental tools. However, our ability to engineer repeat proteins has been hampered by the limited sequence dependence of their individual modules. Now, Baker and colleagues (PSI NESG) have used computational protein design methods to make advances in this area (ref. 1).
The starting point was the assumption that sequence alignment-based structure prediction is not sufficient on its own for proteins with repeat domains. Therefore, the authors combined this sequence information with computational de novo structure generation using the Rosetta modelling software. With the sequence alignment information used to bias the structure generation, the program was able to output novel sequences for each of the six protein families tested: ankyrin, armadillo, tetratricopeptide repeat, HEAT, leucine-rich repeat (LRR) and WD40. Once the structures of these proteins had been predicted, synthetic peptides were prepared, and the proteins were expressed in Escherichia coli and assessed using standard biophysical techniques and, where possible, X-ray crystallography.
The prototypical example for this structure prediction method was the LRR protein family. LRR domains share a common structural motif and a horseshoe-shaped structure that allows binding partners to interact, usually at the inner, concave surface. The proteins can have a range of different curvatures, and the potential of protein structure design to manipulate both this curvature and the associated binding site is considerable.
LRRs are made up of repeats of a β-strand packed on an α-helix. Sequence refinement identified the conserved residues, and Rosetta was used to design the proteins. Because the caps at the ends of the protein are not part of the repeat segment, they were taken from well-characterized LRR proteins. The structures of two of the final designs were solved using X-ray crystallography (PDB 4PSJ and PDB 4PQ8) and were very similar to the predicted models. These data are the first indication that this method could be used to design new repeat protein structures.
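To make the idea of identifying conserved residues from aligned repeats concrete, the following is a minimal sketch in Python. It is not the authors' procedure and does not use Rosetta: it simply counts residues in each column of an alignment and reports the positions where one residue dominates. The function names, the 0.8 threshold and the toy sequences are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def column_profile(aligned_repeats):
    """Return (most common residue, its frequency) for each alignment column.

    Assumes pre-aligned, equal-length sequences with '-' marking gaps.
    """
    if not aligned_repeats:
        raise ValueError("need at least one aligned repeat")
    length = len(aligned_repeats[0])
    if any(len(seq) != length for seq in aligned_repeats):
        raise ValueError("all aligned repeats must have the same length")

    profile = []
    for column in zip(*aligned_repeats):
        residues = [r for r in column if r != "-"]
        if not residues:
            profile.append(("-", 0.0))
            continue
        residue, count = Counter(residues).most_common(1)[0]
        # Frequency is relative to all sequences, so gaps lower conservation.
        profile.append((residue, count / len(column)))
    return profile


def conserved_positions(aligned_repeats, threshold=0.8):
    """List (index, residue) pairs whose top residue meets the threshold."""
    return [
        (i, residue)
        for i, (residue, freq) in enumerate(column_profile(aligned_repeats))
        if freq >= threshold
    ]


def consensus_string(aligned_repeats, threshold=0.8):
    """Write conserved positions as letters and variable positions as 'x'."""
    return "".join(
        residue if freq >= threshold else "x"
        for residue, freq in column_profile(aligned_repeats)
    )


# Hypothetical toy repeats, invented for this example (not real LRR data).
TOY_REPEATS = [
    "LTSLKELDLSNNQI",
    "LPNLEELDLSGNKL",
    "LSKLTHLNLSNNRI",
    "LTNLQVLDLSHNQL",
]

if __name__ == "__main__":
    print(consensus_string(TOY_REPEATS))
    print(conserved_positions(TOY_REPEATS))
```

In a design setting, positions flagged in this way could be held fixed while the remaining positions are left free to vary; how the published method actually weights the alignment information is described in the cited paper rather than here.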
In a subsequent article by Baker and colleagues (ref. 2) and in an independent study by André and colleagues (ref. 3), bespoke LRR proteins with predefined geometries were successfully created. In both studies, the authors were able to design building blocks that could then be assembled to form the desired proteins. The systems developed should be applicable to a wide range of repeat proteins, allowing the development of these designer proteins for specific experimental and therapeutic uses.
1. F. Parmeggiani et al. A general computational approach for repeat protein design. J. Mol. Biol. 427, 563-575 (2015). doi:10.1016/j.jmb.2014.11.005
2. K. Park et al. Control of repeat-protein curvature by computational protein design. Nat. Struct. Mol. Biol. 22, 167-174 (2015). doi:10.1038/nsmb.2938
3. S. Rämisch et al. Computational design of a leucine-rich repeat protein with a predefined geometry. Proc. Natl Acad. Sci. USA 111, 17875-17880 (2014). doi:10.1073/pnas.1413638111