Statistics Summary Report for PSI Centers
Last updated: Feb 2 2012
PSI-2 Centers:
|ATCG3D| |CESG| |CHTSB| |CSMP| |ISFI| |JCSG| |MCSG| |NESG| |NYCOMPS| |NYSGXRC|
- Target Status Statistics
- Table 1: Status Statistics for PSI Centers
- Figure 1: Target Experimental Status for PSI Centers
- Table 2: Status Statistics for PSI Centers by Organism
- Figure 2: Source Organisms at PSI Centers
- Deposited Structure Statistics
- Table 3: PDB Status Statistics for Structures from PSI Centers
- Figure 3: Structures Released by PSI Centers by Year
- Table 4: List of Structures Deposited by PSI Centers in the PDB
- Sequence Redundancy Statistics
Target Status Statistics
Total number of targets deposited by PSI Centers to TargetDB: 260303
Table 1: Status Statistics for PSI Centers
| Status | Total Number of Targets | (%) Relative to "Cloned" Targets | (%) Relative to "Expressed" Targets | (%) Relative to "Purified" Targets | (%) Relative to "Crystallized" Targets |
| Cloned | 176125 | 100.0 | - | - | - |
| Expressed | 107113 | 60.8 | 100.0 | - | - |
| Soluble | 26015 | 14.8 | 24.3 | - | - |
| Purified | 36145 | 20.5 | 33.7 | 100.0 | - |
| Crystallized | 10637 | 6.0 | 9.9 | 29.4 | 100.0 |
| Diffraction-quality Crystals | 4850 | 2.8 | 4.5 | 13.4 | 45.6 |
| Diffraction | 5148 | 2.9 | 4.8 | 14.2 | 48.4 |
| NMR Assigned | 816 | 0.5 | 0.8 | 2.3 | - |
| HSQC | 1917 | 1.1 | 1.8 | 5.3 | - |
| Crystal Structure | 2844 | 1.6 | 2.7 | 7.9 | 26.7 |
| NMR Structure | 724 | 0.4 | 0.7 | 2.0 | - |
| In PDB (see Note 1) | 5166 | 2.9 | 4.8 | 14.3 | 48.6 |
| Work Stopped | 94254 | - | - | - | - |
| Test Target | 102 | - | - | - | - |
| Other | 8179 | - | - | - | - |
Last updated: Feb 2 2012
Note 1: The number of targets with status "In PDB" may not equal the number of structures determined by a project. A target may reference several PDB IDs (for example, structures of the same polypeptide with different ligands). Multiple targets in TargetDB may identify the same PDB structure when a structure results from a collaboration between centers and each center includes the target on its target list.
Figure 1: Target Experimental Status for PSI Centers

Last updated: Feb 2 2012
This graph is normalized relative to number of cloned targets in TargetDB.
Targets that progressed to status "Cloned" constitute 68% of TargetDB.
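The percentage columns in Table 1 and the 68% figure above appear to be simple ratios of target counts. A minimal Python sketch, using a subset of counts copied from Table 1; the helper name `percent_of` is illustrative and not part of TargetDB:

```python
# Counts copied from Table 1 (a subset of rows) and the TargetDB total.
TOTAL_TARGETS = 260303
counts = {
    "Cloned": 176125,
    "Expressed": 107113,
    "Purified": 36145,
    "Crystallized": 10637,
    "Crystal Structure": 2844,
}


def percent_of(status: str, baseline: str) -> float:
    """Percentage of `status` targets relative to `baseline` targets, one decimal."""
    return round(100.0 * counts[status] / counts[baseline], 1)


# Share of all TargetDB targets that reached "Cloned", as a whole percent.
cloned_share = round(100.0 * counts["Cloned"] / TOTAL_TARGETS)

print(percent_of("Expressed", "Cloned"))
print(percent_of("Crystal Structure", "Crystallized"))
print(cloned_share)
```

Each column of Table 1 uses a different baseline status, so the same count yields a different percentage depending on which stage it is compared against.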
Table 2: Status Statistics for PSI Centers by Organism
These statistics are derived from mapping target sequences to GenBank using a
>=98% sequence identity cutoff.
| Organism | Total Number (see Note 1) | Work Stopped | Cloned | Expressed | Purified | Crystallized | Crystal Structure | NMR Structure | In PDB |
| Viruses | 908 | 456 | 499 | 394 | 199 | 34 | 16 | 13 | 39 |
| Archaea | 16440 | 6707 | 12346 | 7504 | 2788 | 705 | 174 | 62 | 402 |
| Bacteria | 163578 | 64342 | 117372 | 74443 | 27894 | 8674 | 2306 | 514 | 4175 |
| Prokaryota | 180013 | 71048 | 129715 | 81944 | 30681 | 9379 | 2480 | 575 | 4576 |
| Yeast | 2856 | 1428 | 2188 | 1500 | 956 | 103 | 32 | 12 | 56 |
| Plasmodium | 4881 | 424 | 2732 | 1181 | 185 | 61 | 17 | 0 | 17 |
| Trypanosoma | 5304 | 83 | 3443 | 1755 | 290 | 57 | 9 | 0 | 9 |
| Leishmania | 8681 | 271 | 4184 | 2122 | 371 | 140 | 21 | 0 | 17 |
| Arabidopsis | 7745 | 6517 | 3781 | 1095 | 262 | 82 | 35 | 21 | 57 |
| Rice | 170 | 134 | 144 | 73 | 12 | 4 | 1 | 0 | 1 |
| Worm | 14504 | 3235 | 12252 | 5570 | 464 | 122 | 26 | 4 | 37 |
| Drosophila | 1088 | 579 | 289 | 246 | 39 | 5 | 4 | 3 | 7 |
| Mouse | 8078 | 2772 | 4761 | 3383 | 823 | 244 | 65 | 34 | 101 |
| Human | 21565 | 6585 | 10139 | 7655 | 1670 | 359 | 115 | 70 | 193 |
| Eukaryota | 72723 | 21227 | 42055 | 23048 | 4844 | 1108 | 323 | 124 | 495 |
| Uncultured or unidentified | 340 | 118 | 237 | 138 | 88 | 28 | 4 | 0 | 17 |
Last updated: Feb 2 2012
Note 1:
Total counts in this table may differ from the total number of targets and structures.
A target is counted under more than one organism if:
- the target is mapped to several organisms
- the target is a hybrid complex (for example, a complex of human and mouse polypeptides).
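The effect described in this note can be illustrated with a small Python sketch. The target IDs and organism assignments below are hypothetical and chosen only to show why per-organism counts can sum to more than the number of distinct targets:

```python
from collections import Counter

# Hypothetical targets, each mapped to one or more source organisms.
targets = {
    "T1": {"Human"},
    "T2": {"Human", "Mouse"},  # e.g. a hybrid human/mouse complex
    "T3": {"Mouse"},
}

# Count every target once under each organism it is mapped to.
per_organism = Counter(org for orgs in targets.values() for org in orgs)

distinct_targets = len(targets)
summed_counts = sum(per_organism.values())

print(distinct_targets, summed_counts)
```

Here target `T2` contributes to both the human and the mouse rows, so the summed per-organism counts exceed the number of distinct targets.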
Figure 2: Source Organisms at PSI Centers

Last updated: Feb 2 2012
Deposited Structure Statistics for PSI Centers
Number of Released X-Ray Structures: 5213
Number of Released NMR Structures: 578
Total number of released structures from PSI Centers in the PDB: 5791
Table 3: PDB Status Statistics for Structures from PSI Centers
| PDB Status | ATCG3D | BSGC | CESG | CHTSB | CSMP | GPCR | ISFI | JCSG | MCSG | MPID | MPP | MPSBC | MPSBYNMR | NESG | NYCOMPS | NYSGRC | NYSGXRC | SECSG | SGPP | TB | TEMIMPS | TMPC | TRANSPORTPDB | Total |
| Total Deposited | 22 | 88 | 154 | 16 | 12 | 5 | 124 | 1256 | 1468 | 0 | 0 | 0 | 1 | 1022 | 26 | 82 | 1032 | 92 | 41 | 617 | 0 | 0 | 0 | 6042 |
| Released | 22 | 88 | 152 | 16 | 12 | 5 | 13 | 1250 | 1463 | 0 | 0 | 0 | 1 | 1020 | 26 | 81 | 1032 | 92 | 41 | 493 | 0 | 0 | 0 | 5791 |
| In Process | 0 | 0 | 2 | 0 | 0 | 0 | 111 | 6 | 4 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 124 | 0 | 0 | 0 | 249 |
Last updated: Feb 2 2012
Note 1: "Total Deposited" counts all structures in the PDB, including structures released to the public and structures that are in the process of being released ("Released on Publication", "Released on Certain Date", etc.).
Note 2: Some PDB IDs are cross-referenced by different centers, so the "Total" number of structures can differ from the direct sum of the per-center numbers.
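As a rough check of Note 2, the per-center "Released" counts from Table 3 can be summed and compared with the reported total. The Python sketch below copies the numbers from the table; attributing the whole difference to cross-referenced PDB IDs is an assumption based on Note 2, not something the table itself states:

```python
# "Released" row of Table 3, per center (the "Total" column is excluded).
released_by_center = {
    "ATCG3D": 22, "BSGC": 88, "CESG": 152, "CHTSB": 16, "CSMP": 12,
    "GPCR": 5, "ISFI": 13, "JCSG": 1250, "MCSG": 1463, "MPID": 0,
    "MPP": 0, "MPSBC": 0, "MPSBYNMR": 1, "NESG": 1020, "NYCOMPS": 26,
    "NYSGRC": 81, "NYSGXRC": 1032, "SECSG": 92, "SGPP": 41, "TB": 493,
    "TEMIMPS": 0, "TMPC": 0, "TRANSPORTPDB": 0,
}
REPORTED_TOTAL_RELEASED = 5791  # "Total" column of the "Released" row

direct_sum = sum(released_by_center.values())
# Entries counted by more than one center show up as a positive excess.
excess = direct_sum - REPORTED_TOTAL_RELEASED

print(direct_sum, excess)
```

A positive excess means some released structures are counted under more than one center, which is consistent with the cross-referencing described in Note 2.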
Figure 3: Structures Released by PSI Centers by Year
Sequence Redundancy Statistics
Table 5: Sequence Redundancy Statistics for PSI Centers by Experimental Status
| Sequence Identity (%) | Novel Targets Status: Selected | Novel Targets Status: Cloned | Novel Targets Status: Expressed | Novel Targets Status: Purified | Novel Targets Status: Crystallized | Novel Targets Status: Crystal Structure | Novel Targets Status: NMR Structure | Novel Targets Status: In PDB |
| <100 | 184506 | 131815 | 82968 | 30940 | 9799 | 2712 | 713 | 5024 |
| <98 | 206649 | 128553 | 81163 | 30396 | 9610 | 2693 | 712 | 4997 |
| <95 | 174766 | 126782 | 80168 | 30126 | 9562 | 2688 | 711 | 4990 |
| <90 | 170589 | 124432 | 78874 | 29791 | 9514 | 2678 | 711 | 4977 |
| <70 | 150586 | 112257 | 71923 | 28110 | 9207 | 2615 | 707 | 4894 |
| <50 | 116834 | 90586 | 59158 | 24514 | 8591 | 2522 | 687 | 4687 |
| <40 | 91306 | 72913 | 48468 | 21309 | 7910 | 2410 | 669 | 4479 |
| <30 | 55605 | 47002 | 32492 | 15606 | 6612 | 2201 | 642 | 3945 |
Last updated: 12-01-10
Sequence redundancy is calculated by cluster analysis using the BLASTClust program, with the similarity threshold set to the given percent sequence identity. See the detailed explanation of sequence redundancy calculations and BLASTClust threshold settings. The calculations compare each sequence against all protein sequences in TargetDB that are in the same experimental status category and at least 20 amino acids long.
Table 6: Sequence Redundancy Statistics for Structures Released by PSI Centers by Year
| Year | Released Structures | Released Structures with <30% Identity at Time of Release | Percent (%) of Released Structures with <30% Identity at Time of Release |
| <= 2000 | 59 | 23 | 39 |
| 2001 | 47 | 17 | 36 |
| 2002 | 113 | 41 | 36 |
| 2003 | 229 | 96 | 42 |
| 2004 | 558 | 244 | 44 |
| 2005 | 496 | 247 | 50 |
| 2006 | 692 | 359 | 52 |
| 2007 | 760 | 430 | 57 |
| 2008 | 751 | 440 | 59 |
| 2009 | 910 | 439 | 48 |
| 2010 | 741 | 326 | 44 |
| 2011 | 398 | 148 | 37 |
| 2012 | 25 | 6 | 24 |
| Total | 5779 | 2816 | 49 |
Last updated: 12-02-02
Sequence redundancy is calculated by cluster analysis using the BLASTClust program, with the similarity threshold set to the given percent sequence identity. See the detailed explanation of sequence redundancy calculations and BLASTClust threshold settings. The calculations compare each sequence against all protein sequences in the PDB that are at least 20 amino acids long.
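The percentage column and the "Total" row of Table 6 can be reproduced from the two count columns. A minimal Python sketch with the counts copied from the table; the helper name `percent_novel` is illustrative, and rounding to a whole percent is an assumption inferred from the printed values:

```python
# year -> (released structures, released structures with <30% identity),
# copied from Table 6.
by_year = {
    "<= 2000": (59, 23), "2001": (47, 17), "2002": (113, 41),
    "2003": (229, 96), "2004": (558, 244), "2005": (496, 247),
    "2006": (692, 359), "2007": (760, 430), "2008": (751, 440),
    "2009": (910, 439), "2010": (741, 326), "2011": (398, 148),
    "2012": (25, 6),
}


def percent_novel(released: int, novel: int) -> int:
    """Share of released structures below 30% identity, as a whole percent."""
    return round(100.0 * novel / released)


total_released = sum(released for released, _ in by_year.values())
total_novel = sum(novel for _, novel in by_year.values())

for year, (released, novel) in by_year.items():
    print(year, percent_novel(released, novel))
print("Total", total_released, total_novel,
      percent_novel(total_released, total_novel))
```

The total percentage is computed from the summed counts rather than by averaging the yearly percentages, which weights each year by the number of structures it released.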
Figure 4: Comparison of Novel Structures with Number of Structures Released by PSI Centers by Year
Last updated: 12-02-02
