Technical Highlight - November 2013
Short description: The Borges algorithm uses tertiary structural information from small structural fragments in the PDB to obtain phases for X-ray diffraction data.
The “phase problem” in macromolecular crystallography arises because the diffraction experiment records X-ray intensities but not their phases, and the phases are needed to determine the structure. Heavy atom derivatives and/or anomalous scatterers are often used to obtain initial phases, but these experiments can be time-consuming and are often unsuccessful.
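As a toy illustration of why intensities alone are not enough, the sketch below computes structure factors for a one-dimensional "crystal" of three point atoms and for its mirror image. The atom positions and scattering weights are made up for the example, and the names `structure_factor`, `intensities` and `phases` are hypothetical helpers, not part of any crystallographic package. The two arrangements give identical intensities but different phases, so the measured data cannot tell them apart without phase information.

```python
import cmath
import math

def structure_factor(atoms, h):
    """F(h) for a 1D toy crystal; atoms = [(fractional x, scattering weight), ...]."""
    return sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)

def intensities(atoms, h_max=5):
    """What the detector records: |F(h)|^2 for h = 1..h_max."""
    return [abs(structure_factor(atoms, h)) ** 2 for h in range(1, h_max + 1)]

def phases(atoms, h_max=5):
    """What the detector does not record: the phase angle of F(h), in radians."""
    return [cmath.phase(structure_factor(atoms, h)) for h in range(1, h_max + 1)]

# Made-up toy structure and its mirror image (x -> -x, wrapped into the unit cell).
toy = [(0.10, 6.0), (0.35, 8.0), (0.80, 7.0)]
mirrored = [((-x) % 1.0, f) for x, f in toy]

same_intensities = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-9)
    for a, b in zip(intensities(toy), intensities(mirrored))
)
different_phases = any(
    abs(p - q) > 1e-6 for p, q in zip(phases(toy), phases(mirrored))
)
print(same_intensities, different_phases)
```

Real crystals are three-dimensional and their atoms are not point scatterers, but the loss of phase information is the same in kind.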
Phasing by molecular replacement (MR), on the other hand, places a related structural model in the unit cell so that it best fits the experimental diffraction data. However, MR is limited by the requirement for a fairly complete and closely related search model. Now, Usón and colleagues have developed Borges, a new algorithm and software tool that uses nonspecific tertiary structure fragment information from the PDB to phase diffraction data.
Borges starts with a model template, which can be any representative of a small local fold, such as two parallel or antiparallel helices, three-stranded parallel or antiparallel beta sheets, or mixed strand and helical fragments. The template is decomposed into secondary structure fragments that are described by the distribution of characteristic vectors (CVs). Each CV runs from the centroid of the α-carbons to the centroid of the carbonyl oxygens of a tripeptide, and CVs are computed for consecutive, overlapping tripeptides. For an α-helix, the CVs are roughly parallel to the helical axis, and for β-sheets, they point approximately 45° off the direction of the peptide chain. Using the CVs, Borges determines geometric relations between secondary structure fragments and, after also ascertaining tertiary structure relationships, uses this information to search the PDB and generate a library of models with geometry similar to the template. The library (which can comprise tens of thousands of models) is then clustered by geometric similarity, and a representative model is obtained for each cluster.
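The characteristic-vector idea can be sketched in a few lines. The code below is not taken from Borges: it assumes a CV is the vector from the α-carbon centroid to the carbonyl-oxygen centroid of each overlapping tripeptide, and the residue representation (a dict with `"CA"` and `"O"` coordinates) and the function names are hypothetical. The test coordinates are synthetic, chosen so the result is easy to check by eye, and do not represent real helix geometry.

```python
import math

def centroid(points):
    """Arithmetic mean of a list of (x, y, z) points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def characteristic_vectors(residues, window=3):
    """One vector per overlapping window: CA centroid -> carbonyl O centroid."""
    cvs = []
    for start in range(len(residues) - window + 1):
        chunk = residues[start:start + window]
        ca = centroid([r["CA"] for r in chunk])
        o = centroid([r["O"] for r in chunk])
        cvs.append(tuple(o[i] - ca[i] for i in range(3)))
    return cvs

def angle_deg(u, v):
    """Angle between two vectors in degrees, clamped against rounding error."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

# Synthetic coordinates (not real protein geometry): CA atoms spiral around
# the z axis and every O sits 1.2 units above its CA along z.
toy_residues = [
    {"CA": (math.cos(k), math.sin(k), 1.5 * k),
     "O": (math.cos(k), math.sin(k), 1.5 * k + 1.2)}
    for k in range(5)
]
toy_cvs = characteristic_vectors(toy_residues)
print(len(toy_cvs), angle_deg(toy_cvs[0], (0.0, 0.0, 1.0)))
```

Five residues yield three overlapping tripeptides and therefore three CVs. Angles and distances between such vectors are the kind of geometric relation that could then be compared between a template and candidate fragments.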
The phasing procedure can be adapted to the needs dictated by the crystallographic space group, but it follows the general workflow outlined below. Cluster representatives are examined by the fast rotation function at low resolution (3 Å) with Phaser, and common rotation solutions are grouped and ranked according to their rotation figures of merit. Brute-force rotation (at high resolution) is performed around the clustered rotation, followed by rigid-body refinement of individual secondary structure fragments against the rotation function, fast translation, brute-force translation to optimize positioning, filtering for packing and another round of refinement. Initial correlation coefficients are analyzed to select the most promising partial models. If any of these models produces an interpretable map after density modification, SHELXE is used to autotrace the remaining structure.
Because this method relies on short structural fragments and uses multiple representative models, it can succeed where a search with a single, larger model may fail. Indeed, the authors solved both test-case and unknown structures of all-helical, all-beta and mixed alpha/beta proteins. Borges makes excellent use of the vast amount of structural information housed within the PDB.
M. Sammito et al. Exploiting tertiary structure through local folds for crystallographic phasing.
Nat. Methods. (15 September 2013). doi:10.1038/nmeth.2644