PSI TargetDB

Statistics Summary Report for PSI Centers

Last updated: Feb 2 2012


PSI-2 Centers:

|ATCG3D| |CESG| |CHTSB| |CSMP| |ISFI| |JCSG| |MCSG| |NESG| |NYCOMPS| |NYSGXRC|

PSI-1 Centers:

|BSGC| |SECSG| |SGPP| |TB|



Target Status Statistics

Total number of targets deposited by PSI Centers to TargetDB: 260303

Table 1: Status Statistics for PSI Centers

Status | Total Number of Targets | (%) Relative to "Cloned" Targets | (%) Relative to "Expressed" Targets | (%) Relative to "Purified" Targets | (%) Relative to "Crystallized" Targets
Cloned | 176125 | 100.0 | - | - | -
Expressed | 107113 | 60.8 | 100.0 | - | -
Soluble | 26015 | 14.8 | 24.3 | - | -
Purified | 36145 | 20.5 | 33.7 | 100.0 | -
Crystallized | 10637 | 6.0 | 9.9 | 29.4 | 100.0
Diffraction-quality Crystals | 4850 | 2.8 | 4.5 | 13.4 | 45.6
Diffraction | 5148 | 2.9 | 4.8 | 14.2 | 48.4
NMR Assigned | 816 | 0.5 | 0.8 | 2.3 | -
HSQC | 1917 | 1.1 | 1.8 | 5.3 | -
Crystal Structure | 2844 | 1.6 | 2.7 | 7.9 | 26.7
NMR Structure | 724 | 0.4 | 0.7 | 2.0 | -
In PDB (Note 1) | 5166 | 2.9 | 4.8 | 14.3 | 42
Work Stopped | 94254 | - | - | - | -
Test Target | 102 | - | - | - | -
Other | 8179 | - | - | - | -



Note 1: The number of targets with status "In PDB" may not equal the number of structures determined by a project. A target may reference several PDB IDs (for example, structures of the same polypeptides with different ligands). Multiple targets in TargetDB may identify the same PDB structure when a structure results from a collaboration between different centers and each center includes the target on its target list.

Figure 1: Target Experimental Status for PSI Centers


This graph is normalized relative to number of cloned targets in TargetDB.
Targets that progressed to status "Cloned" constitute 68% of TargetDB.
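
The normalization can be checked from the counts in Table 1. The following is a minimal Python sketch, not the method used by TargetDB itself; the counts are copied from Table 1 and the total stated above it, while the variable and function names are illustrative assumptions.

```python
# Counts copied from Table 1 and from the TargetDB total stated above it.
total_targets = 260303
status_counts = {
    "Cloned": 176125,
    "Expressed": 107113,
    "Purified": 36145,
    "Crystallized": 10637,
}


def percent(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Each status normalized to the number of cloned targets, as in Figure 1.
relative_to_cloned = {
    status: percent(count, status_counts["Cloned"])
    for status, count in status_counts.items()
}

# Share of all deposited targets that progressed to "Cloned".
cloned_share = percent(status_counts["Cloned"], total_targets)

print(relative_to_cloned)
print(cloned_share)
```

The cloned share computed this way is 67.7, which rounds to the 68% quoted above.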

Table 2: Status Statistics for PSI Centers by Organism

These statistics are derived by mapping target sequences to GenBank using a >=98% sequence identity cutoff.

Organism | Total Number (Note 1) | Work Stopped | Cloned | Expressed | Purified | Crystallized | Crystal Structure | NMR Structure | In PDB
Viruses | 908 | 456 | 499 | 394 | 199 | 34 | 16 | 13 | 39
Archaea | 16440 | 6707 | 12346 | 7504 | 2788 | 705 | 174 | 62 | 402
Bacteria | 163578 | 64342 | 117372 | 74443 | 27894 | 8674 | 2306 | 514 | 4175
Prokaryota | 180013 | 71048 | 129715 | 81944 | 30681 | 9379 | 2480 | 575 | 4576
Yeast | 2856 | 1428 | 2188 | 1500 | 956 | 103 | 32 | 12 | 56
Plasmodium | 4881 | 424 | 2732 | 1181 | 185 | 61 | 17 | 0 | 17
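Trypanosoma | 5304 | 83 | 3443 | 1755 | 290 | 57 | 9 | 0 | 9
Leishmania | 8681 | 271 | 4184 | 2122 | 371 | 140 | 21 | 0 | 17
Arabidopsis | 7745 | 6517 | 3781 | 1095 | 262 | 82 | 35 | 21 | 57
Rice | 170 | 134 | 144 | 73 | 12 | 4 | 1 | 0 | 1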
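Worm | 14504 | 3235 | 12252 | 5570 | 464 | 122 | 26 | 4 | 37
Drosophila | 1088 | 579 | 289 | 246 | 39 | 5 | 4 | 3 | 7
Mouse | 8078 | 2772 | 4761 | 3383 | 823 | 244 | 65 | 34 | 101
Human | 21565 | 6585 | 10139 | 7655 | 1670 | 359 | 115 | 70 | 193
Eukaryota | 72723 | 21227 | 42055 | 23048 | 4844 | 1108 | 323 | 124 | 495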
Uncultured or unidentified | 340 | 118 | 237 | 138 | 88 | 28 | 4 | 0 | 17


Note 1: Total counts in this table may differ from the total number of targets and structures. A target is counted under more than one organism if:
- the target is mapped to different organisms
- the target is a hybrid complex (for example, a complex of human and mouse polypeptides).
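
The counting rule in this note can be illustrated with a short Python sketch. The target records below are hypothetical and are not real TargetDB entries; the function name is also an assumption made for illustration.

```python
from collections import Counter

# Hypothetical target records (not real TargetDB entries): each target lists
# every organism that its sequence or sequences map to.
targets = {
    "T1": ["Human"],
    "T2": ["Human", "Mouse"],  # one sequence mapped to two organisms
    "T3": ["Mouse"],
    "T4": ["Human", "Mouse"],  # hybrid complex of human and mouse chains
}


def count_by_organism(records):
    """Count each target once under every distinct organism it maps to."""
    counts = Counter()
    for organisms in records.values():
        counts.update(set(organisms))
    return counts


per_organism = count_by_organism(targets)
print(dict(per_organism))
print(sum(per_organism.values()), "organism-level counts for", len(targets), "targets")
```

In this toy data set four targets produce six organism-level counts, which is why the per-organism rows can add up to more than the number of targets.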

Figure 2: Source Organisms in PSI Centers



Deposited Structure Statistics for PSI Centers

Number of Released X-Ray Structures: 5213

Number of Released NMR Structures: 578

Total number of released structures from PSI Centers in the PDB: 5791

Table 3: PDB Status Statistics for Structures from PSI Centers

PDB Status | ATCG3D | BSGC | CESG | CHTSB | CSMP | GPCR | ISFI | JCSG | MCSG | MPID | MPP | MPSBC | MPSBYNMR | NESG | NYCOMPS | NYSGRC | NYSGXRC | SECSG | SGPP | TB | TEMIMPS | TMPC | TRANSPORTPDB | Total
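Total Deposited | 22 | 88 | 154 | 16 | 125 | - | 124 | 1256 | 1468 | 0 | 0 | 0 | 1 | 1022 | 26 | 82 | 103 | 29 | 24 | 1617 | 0 | 0 | 0 | 6042
Released | 22 | 88 | 152 | 16 | 125 | - | 13 | 1250 | 1463 | 0 | 0 | 0 | 1 | 1020 | 26 | 81 | 103 | 29 | 24 | 1493 | 0 | 0 | 0 | 5791
In Process | 0 | 0 | 2 | 0 | 0 | 0 | 111 | 6 | 4 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 124 | 0 | 0 | 0 | 249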
Note 1: "Total Deposited" counts all structures in the PDB, including structures released to the public and structures that are in the process of being released ("Released on Publication", "Released on Certain Date", etc.).
Note 2: Some PDB IDs are cross-referenced by different centers. Therefore, the "Total" number of structures can differ from the direct sum of the numbers of structures from individual centers.

Figure 3: Structures Released by PSI Centers by Year


Sequence Redundancy Statistics

Table 5: Sequence Redundancy Statistics for PSI Centers by Experimental Status

Sequence Identity (%) | Novel Targets (Status: Selected) | Novel Targets (Status: Cloned) | Novel Targets (Status: Expressed) | Novel Targets (Status: Purified) | Novel Targets (Status: Crystallized) | Novel Targets (Status: Crystal Structure) | Novel Targets (Status: NMR Structure) | Novel Targets (Status: in PDB)
<100 | 184506 | 131815 | 82968 | 30940 | 9799 | 2712 | 713 | 5024
<98 | 206649 | 128553 | 81163 | 30396 | 9610 | 2693 | 712 | 4997
<95 | 174766 | 126782 | 80168 | 30126 | 9562 | 2688 | 711 | 4990
<90 | 170589 | 124432 | 78874 | 29791 | 9514 | 2678 | 711 | 4977
<70 | 150586 | 112257 | 71923 | 28110 | 9207 | 2615 | 707 | 4894
<50 | 116834 | 90586 | 59158 | 24514 | 8591 | 2522 | 687 | 4687
<40 | 91306 | 72913 | 48468 | 21309 | 7910 | 2410 | 669 | 4479
<30 | 55605 | 47002 | 32492 | 15606 | 6612 | 2201 | 642 | 3945
Last updated: 12-01-10
Sequence redundancy is calculated by cluster analysis using the BLASTClust program, with the similarity threshold set to the given percent sequence identity. Please view the detailed explanation of sequence redundancy calculations and BLASTClust threshold settings. Sequence redundancy calculations are based on comparison to all protein sequences in TargetDB that are in the same experimental status category and at least 20 amino acids long.
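
As a rough illustration of threshold-based clustering, the Python sketch below groups sequences by single linkage and counts the resulting clusters. It is not BLASTClust and does not reproduce its settings: the pairwise identities are made-up values, the function name is hypothetical, and treating the number of clusters as the number of novel targets at a given threshold is an assumption.

```python
def count_clusters(sequence_ids, pairwise_identity, threshold):
    """Count single-linkage clusters; link two sequences if identity >= threshold."""
    parent = {sid: sid for sid in sequence_ids}

    def find(sid):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[sid] != sid:
            parent[sid] = parent[parent[sid]]
            sid = parent[sid]
        return sid

    for (a, b), identity in pairwise_identity.items():
        if identity >= threshold:
            parent[find(a)] = find(b)

    return len({find(sid) for sid in sequence_ids})


# Hypothetical percent identities between four made-up sequences.
ids = ["s1", "s2", "s3", "s4"]
identities = {("s1", "s2"): 99.0, ("s2", "s3"): 72.0, ("s3", "s4"): 35.0}

for threshold in (98, 70, 30):
    print(threshold, count_clusters(ids, identities, threshold))
```

Lowering the threshold merges more sequences into the same cluster, so the count falls, which mirrors the downward trend in most columns of Table 5.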

Table 6: Sequence Redundancy Statistics for Structures Released by PSI Centers by Year

Year | Released Structures | Number of Released Structures <30% Identity at Time of Release | Percent (%) of Released Structures <30% Identity at Time of Release
<= 2000 | 59 | 23 | 39
2001 | 47 | 17 | 36
2002 | 113 | 41 | 36
2003 | 229 | 96 | 42
2004 | 558 | 244 | 44
2005 | 496 | 247 | 50
2006 | 692 | 359 | 52
2007 | 760 | 430 | 57
2008 | 751 | 440 | 59
2009 | 910 | 439 | 48
2010 | 741 | 326 | 44
2011 | 398 | 148 | 37
2012 | 25 | 6 | 24
Total | 5779 | 2816 | 49
Last updated: 12-02-02
Sequence redundancy is calculated by cluster analysis using the BLASTClust program, with the similarity threshold set to the given percent sequence identity. Please view the detailed explanation of sequence redundancy calculations and BLASTClust threshold settings. Sequence redundancy calculations are based on comparison to all protein sequences in the PDB that are at least 20 amino acids long.
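
The percent column and the Total row of Table 6 can be recomputed from the two count columns. The Python sketch below uses the counts from Table 6; the function name and the whole-number rounding are assumptions chosen to match how the table appears to be presented.

```python
# (year, released structures, released structures below 30% identity), from Table 6.
rows = [
    ("<= 2000", 59, 23),
    ("2001", 47, 17),
    ("2002", 113, 41),
    ("2003", 229, 96),
    ("2004", 558, 244),
    ("2005", 496, 247),
    ("2006", 692, 359),
    ("2007", 760, 430),
    ("2008", 751, 440),
    ("2009", 910, 439),
    ("2010", 741, 326),
    ("2011", 398, 148),
    ("2012", 25, 6),
]


def percent_novel(released, novel):
    """Percentage of released structures below 30% identity, as a whole number."""
    return round(100 * novel / released)


for year, released, novel in rows:
    print(year, released, novel, percent_novel(released, novel))

total_released = sum(released for _, released, _ in rows)
total_novel = sum(novel for _, _, novel in rows)
print("Total", total_released, total_novel, percent_novel(total_released, total_novel))
```

The yearly rows sum to 5779 released structures and 2816 structures below 30% identity, in agreement with the Total row.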

Figure 4:   Comparison of Novel Structures with Number of Structures Released by PSI Centers by Year

Last updated: 12-02-02