PSI TargetDB

TargetDB Statistics Summary Report

Last updated: Feb 2 2012



Target Status Statistics

Total number of targets deposited by worldwide Contributing Centers in TargetDB: 290110

Table 1: TargetDB Status Statistics

Status | Total Number of Targets | (%) Relative to "Cloned" Targets | (%) Relative to "Expressed" Targets | (%) Relative to "Purified" Targets | (%) Relative to "Crystallized" Targets
Cloned | 199435 | 100.0 | - | - | -
Expressed | 126893 | 63.6 | 100.0 | - | -
Soluble | 36626 | 18.4 | 28.9 | - | -
Purified | 46994 | 23.6 | 37.0 | 100.0 | -
Crystallized | 15017 | 7.5 | 11.8 | 32.0 | 100.0
Diffraction-quality Crystals | 7290 | 3.7 | 5.7 | 15.5 | 48.5
Diffraction | 7781 | 3.9 | 6.1 | 16.6 | 51.8
NMR Assigned | 2287 | 1.1 | 1.8 | 4.9 | -
HSQC | 3471 | 1.7 | 2.7 | 7.4 | -
Crystal Structure | 5199 | 2.6 | 4.1 | 11.1 | 34.6
NMR Structure | 2162 | 1.1 | 1.7 | 4.6 | -
In PDB [1] | 8878 | 4.5 | 7.0 | 18.9 | 45
Work Stopped | 95127 | - | - | - | -
Test Target | 103 | - | - | - | -
Other | 11281 | - | - | - | -

Last updated: Feb 2 2012

Note 1:   The number of targets with status "In PDB" may not equal the number of structures determined by a project. A target may reference several PDB IDs (example: structures of the same polypeptides with different ligands). Multiple targets in TargetDB may identify the same PDB structure when a structure results from a collaboration between different centers and each center includes the target on its target list.

Figure 1: Experimental Status in TargetDB

Last updated: Feb 2 2012

This graph is normalized relative to number of cloned targets in TargetDB.
Targets that progressed to status "Cloned" constitute 69% of TargetDB.
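
The normalization used for the percentage columns of Table 1 and for the 69% figure can be reproduced from the raw counts. The following is a minimal Python sketch using counts copied from this page; the helper name `percent_of` is illustrative and not part of TargetDB.

```python
# Counts copied from Table 1 and from the deposited total stated above it.
TOTAL_TARGETS = 290110
counts = {
    "Cloned": 199435,
    "Expressed": 126893,
    "Purified": 46994,
    "Crystallized": 15017,
    "Crystal Structure": 5199,
}


def percent_of(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Share of all deposited targets that reached "Cloned" (reported above as 69%).
cloned_share = percent_of(counts["Cloned"], TOTAL_TARGETS)

# Two cells of Table 1: Expressed relative to Cloned,
# and Crystal Structure relative to Crystallized.
expressed_vs_cloned = percent_of(counts["Expressed"], counts["Cloned"])
structure_vs_crystallized = percent_of(
    counts["Crystal Structure"], counts["Crystallized"]
)

print(cloned_share, expressed_vs_cloned, structure_vs_crystallized)
```

Each percentage column in Table 1 uses a different denominator (the count in the "Cloned", "Expressed", "Purified", or "Crystallized" row), so values in different columns are not directly comparable.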


Table 2: TargetDB Status Statistics by Organism

These statistics are derived by mapping target sequences to GenBank using a sequence identity cutoff of >=98%.

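Organism | Total Number [1] | Work Stopped | Cloned | Expressed | Purified | Crystallized | Crystal Structure | NMR Structure | In PDB
Viruses | 1443 | 463 | 861 | 667 | 284 | 66 | 42 | 18 | 64
Archaea | 18351 | 6721 | 14200 | 8816 | 3909 | 1324 | 608 | 63 | 835
Bacteria | 184627 | 65088 | 133588 | 88169 | 34283 | 11546 | 3858 | 569 | 5750
Prokaryota | 202973 | 71808 | 147785 | 96982 | 38191 | 12870 | 4466 | 631 | 6584
Yeast | 3231 | 1447 | 2503 | 1766 | 1063 | 120 | 49 | 16 | 64
Plasmodium | 4978 | 425 | 2761 | 1194 | 193 | 65 | 20 | 3 | 23
Trypanosoma | 5568 | 90 | 3479 | 1786 | 294 | 58 | 10 | 0 | 10
Leishmania | 8766 | 272 | 4193 | 2127 | 373 | 142 | 21 | 0 | 17
Arabidopsis | 7792 | 6518 | 3826 | 1138 | 301 | 84 | 35 | 53 | 89
Rice | 176 | 134 | 148 | 75 | 12 | 4 | 1 | 0 | 1
Worm | 14510 | 3235 | 12257 | 5575 | 467 | 122 | 26 | 6 | 39
Drosophila | 1144 | 579 | 336 | 285 | 67 | 21 | 19 | 12 | 31
Mouse | 9588 | 2784 | 6136 | 4736 | 2008 | 481 | 157 | 685 | 837
Human | 25167 | 6611 | 12932 | 10364 | 3887 | 866 | 290 | 1231 | 1510
Eukaryota | 79542 | 21347 | 46929 | 27507 | 8071 | 1938 | 660 | 1503 | 2170
Uncultured or unidentified | 4111 | 242 | 961 | 821 | 123 | 51 | 0 | 0 | 24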

Last updated: Feb 2 2012

Note 1:   Total counts in this table may differ from the total number of targets and structures. A target is counted under more than one organism if:
- the target is mapped to different organisms
- the target is a hybrid complex (for example: a complex of human and mouse polypeptides).
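
The effect described in this note can be illustrated with a small Python sketch. The target identifiers and organism assignments below are hypothetical and chosen only to show why the per-organism rows can add up to more than the number of distinct targets.

```python
from collections import Counter

# Hypothetical target-to-organism mappings, for illustration only.
target_organisms = {
    "T1": {"Human"},
    "T2": {"Human", "Mouse"},  # hybrid complex, or a sequence mapped to both
    "T3": {"Mouse"},
}


def count_by_organism(mapping):
    """Count each target once under every organism it is assigned to."""
    counts = Counter()
    for organisms in mapping.values():
        counts.update(organisms)
    return counts


per_organism = count_by_organism(target_organisms)
sum_of_rows = sum(per_organism.values())  # what adding the table rows gives
distinct_targets = len(target_organisms)  # what a deduplicated total gives

print(dict(per_organism), sum_of_rows, distinct_targets)
```

Here target T2 contributes to both the Human and the Mouse rows, so the row sum is one larger than the number of distinct targets.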

Figure 2: Source Organisms in TargetDB

Last updated: Feb 2 2012


Deposited Structure Statistics

Number of released X-Ray structures reported to TargetDB: 7511

Number of released NMR structures reported to TargetDB: 2009

Number of released Cryo-Electron Microscopy structures reported to TargetDB: 3

Total number of released structures from worldwide Contributing Centers reported to TargetDB: 9523


Table 3: PDB Status Statistics for Structural Biology Structures

Status | All Centers | PSI Centers | Non-PSI Contributing Centers in North America | Contributing Centers in Europe | Contributing Centers in Asia
Total Deposited | 9794 | 6042 | 909 | 134 | 2713
Released | 9523 | 5791 | 894 | 130 | 2713
Release on Publication | 4 | 1 | 2 | 0 | 0
Release on Certain Date | 1 | 1 | 0 | 0 | 0
In Process | 266 | 249 | 13 | 4 | 0
Last updated: Feb 2 2012
Note 1:   Some PDB IDs are cross-referenced by different centers. Example: PDB_id 106Y is associated with the SPINE and TB centers. Therefore, the number of structures in the "All Centers" column may differ from the direct sum of the numbers of structures from projects/geographical regions.
Note 2:   "Total Deposited" counts all structures in the PDB, including structures released to the public and structures that are in the process of being released ("Release on Publication", "Release on Certain Date", etc.).
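
Both notes can be checked against the figures in Table 3. The Python sketch below copies the five counts of each row (All Centers, PSI, non-PSI North America, Europe, Asia) from the table; the variable names are illustrative.

```python
# Rows of Table 3 as (All Centers, PSI, Non-PSI North America, Europe, Asia).
table3 = {
    "Total Deposited": (9794, 6042, 909, 134, 2713),
    "Released": (9523, 5791, 894, 130, 2713),
    "Release on Publication": (4, 1, 2, 0, 0),
    "Release on Certain Date": (1, 1, 0, 0, 0),
    "In Process": (266, 249, 13, 4, 0),
}

# Note 2: "Total Deposited" is the sum of the released and pending rows.
parts = ["Released", "Release on Publication", "Release on Certain Date", "In Process"]
column_sums = tuple(sum(table3[name][i] for name in parts) for i in range(5))
totals_match = column_sums == table3["Total Deposited"]

# Note 1: the direct sum over centers/regions can exceed "All Centers",
# because a cross-referenced PDB ID is counted once per center.
deposited = table3["Total Deposited"]
released = table3["Released"]
deposited_surplus = sum(deposited[1:]) - deposited[0]
released_surplus = sum(released[1:]) - released[0]

print(totals_match, deposited_surplus, released_surplus)
```

With these figures the column totals agree in every column, and the regional sums exceed the "All Centers" values by a handful of structures for the "Total Deposited" and "Released" rows.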

Figure 3: Structures Released by Contributing Centers by Year

Last updated: Feb 2 2012


Sequence Redundancy Statistics

Table 4: TargetDB Sequence Redundancy Statistics by Experimental Status

Sequence Identity (%) | Novel Targets, Status: Selected | Novel Targets, Status: Cloned | Novel Targets, Status: Expressed | Novel Targets, Status: Purified | Novel Targets, Status: Crystallized | Novel Targets, Status: Crystal Structure | Novel Targets, Status: NMR Structure | Novel Targets, Status: in PDB
<100 | 206402 | 149520 | 98466 | 39343 | 13307 | 4494 | 2102 | 8105
<98 | 227700 | 144873 | 95602 | 38314 | 12901 | 4311 | 2096 | 7909
<95 | 193372 | 142681 | 94301 | 37943 | 12832 | 4301 | 2091 | 7891
<90 | 188106 | 139760 | 92583 | 37462 | 12760 | 4281 | 2083 | 7859
<70 | 164126 | 124799 | 83350 | 34886 | 12229 | 4153 | 1960 | 7587
<50 | 125311 | 98915 | 66929 | 29557 | 11072 | 3828 | 1782 | 6960
<40 | 96604 | 78348 | 53572 | 24854 | 9793 | 3473 | 1613 | 6334
<30 | 58408 | 49983 | 35250 | 17646 | 7863 | 2996 | 1342 | 5271
Last updated: 12-01-10
Sequence redundancy is calculated by clustering analysis using the BLASTClust program, with the similarity threshold set to the given percent sequence identity. Please view the detailed explanation of sequence redundancy calculations and BLASTClust threshold settings. The calculations compare each sequence with all protein sequences in TargetDB that are in the same experimental status category and are at least 20 amino acids long.
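
As a rough illustration of how a count of novel targets could be derived from a clustering result, the Python sketch below counts clusters after applying the 20-amino-acid length filter. It assumes a cluster listing with one cluster per line and whitespace-separated sequence identifiers, and it assumes that each cluster contributes one novel target; neither assumption is confirmed by this page. The identifiers, sequences, and function names are hypothetical.

```python
MIN_LENGTH = 20  # minimum sequence length stated in the text above


def eligible_ids(sequences):
    """Return the identifiers of sequences at least MIN_LENGTH residues long."""
    return {seq_id for seq_id, seq in sequences.items() if len(seq) >= MIN_LENGTH}


def count_clusters(cluster_text, allowed_ids):
    """Count clusters that contain at least one eligible sequence.

    Assumed input format: one cluster per line, identifiers separated by
    whitespace.
    """
    clusters = 0
    for line in cluster_text.splitlines():
        members = [m for m in line.split() if m in allowed_ids]
        if members:
            clusters += 1
    return clusters


# Hypothetical sequences and a hypothetical clustering at one identity threshold.
sequences = {"T1": "M" * 25, "T2": "M" * 30, "T3": "M" * 10, "T4": "M" * 22}
cluster_text = "T1 T2\nT3\nT4\n"

novel_targets = count_clusters(cluster_text, eligible_ids(sequences))
print(novel_targets)
```

In this toy input, T1 and T2 fall into one cluster, T3 is excluded by the length filter, and T4 forms its own cluster. Lower identity thresholds merge more sequences into each cluster, which is consistent with the counts in Table 4 decreasing from the <95 row down to the <30 row.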

Table 5: Sequence Redundancy Statistics for Structures Released by Contributing Centers in the PDB by Year

Year | Released Structures | Number of Released Structures <30% Sequence Identity at Time of Release | Percent (%) of Released Structures <30% Sequence Identity at Time of Release
<= 2000 | 100 | 32 | 32
2001 | 73 | 23 | 32
2002 | 171 | 54 | 32
2003 | 416 | 141 | 34
2004 | 958 | 363 | 38
2005 | 1065 | 347 | 33
2006 | 1161 | 428 | 37
2007 | 1610 | 538 | 33
2008 | 1107 | 485 | 44
2009 | 1130 | 456 | 40
2010 | 986 | 341 | 35
2011 | 693 | 166 | 24
2012 | 41 | 7 | 17
Total | 9511 | 3381 | 36
Last updated: 12-02-02
Sequence redundancy is calculated by clustering analysis using the BLASTClust program, with the similarity threshold set to the given percent sequence identity. Please view the detailed explanation of sequence redundancy calculations and BLASTClust threshold settings. The calculations compare each sequence with all protein sequences in the PDB that are at least 20 amino acids long.
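
The percentage column and the "Total" row of Table 5 follow from the two count columns. The Python sketch below recomputes them from the yearly figures in the table; rounding to the nearest whole percent is an assumption that happens to match every row.

```python
# (year label, released structures, released structures with <30% identity),
# copied from Table 5.
rows = [
    ("<= 2000", 100, 32), ("2001", 73, 23), ("2002", 171, 54),
    ("2003", 416, 141), ("2004", 958, 363), ("2005", 1065, 347),
    ("2006", 1161, 428), ("2007", 1610, 538), ("2008", 1107, 485),
    ("2009", 1130, 456), ("2010", 986, 341), ("2011", 693, 166),
    ("2012", 41, 7),
]


def novel_percent(released: int, novel: int) -> int:
    """Percentage of released structures below 30% identity, to the nearest integer."""
    return round(100 * novel / released)


percent_by_year = {year: novel_percent(rel, nov) for year, rel, nov in rows}
total_released = sum(rel for _, rel, _ in rows)
total_novel = sum(nov for _, _, nov in rows)
total_percent = novel_percent(total_released, total_novel)

print(total_released, total_novel, total_percent)
```

The yearly counts add up to the totals shown in the last row of the table.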

Figure 4: Comparison of Novel Structures with Number of Structures Released By Contributing Centers

Last updated: 12-02-02

Summary Statistics Reports by Project or Geographical Region: