TargetDB | Sequence Redundancy Analysis
The number of unique (non-redundant) targets and structures in TargetDB is calculated monthly by clustering protein sequences with the BLASTClust program.
BLASTClust program thresholds:
- -S similarity threshold is set as a percentage of identical residues: 100, 90, 70, 50, or 30.
- -L minimum length coverage is set to default: 0.9
- -b is set to F (false), indicating that the coverage specified by the -S and -L thresholds is required on only one sequence of a pair.
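As an illustration only, the settings above could be assembled into one BLASTClust command line per identity threshold. The Python sketch below builds the argument lists without running anything. The `blastclust` executable name, the `-i`, `-o`, and `-p` options (taken from the legacy NCBI BLAST package), and the file names are assumptions that this page does not state.

```python
# Identity thresholds listed above, in percent.
IDENTITY_THRESHOLDS = [100, 90, 70, 50, 30]


def blastclust_command(fasta_path, identity, out_prefix="clusters"):
    """Build (not run) a BLASTClust argument list for one threshold.

    fasta_path and out_prefix are hypothetical names used for illustration.
    """
    return [
        "blastclust",
        "-i", fasta_path,                      # input FASTA file (assumed option)
        "-o", f"{out_prefix}_{identity}.txt",  # cluster list output (assumed option)
        "-p", "T",                             # protein input (assumed option)
        "-S", str(identity),                   # percent identical residues
        "-L", "0.9",                           # minimum length coverage
        "-b", "F",                             # coverage required on only one sequence of a pair
    ]


# One command per threshold; "targets.fasta" is a placeholder file name.
commands = [blastclust_command("targets.fasta", t) for t in IDENTITY_THRESHOLDS]
```

Each list could be passed to `subprocess.run` on a machine where the legacy BLAST tools are installed.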
Protein sequences shorter than 20 amino acids are excluded from clustering.
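A minimal sketch of such a length filter, assuming the sequences are available as FASTA text. The function names and the example records are hypothetical and are not part of TargetDB.

```python
MIN_LENGTH = 20  # sequences with fewer residues are excluded


def read_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def long_enough(records, min_length=MIN_LENGTH):
    """Keep only records whose sequence has at least min_length residues."""
    return [(h, s) for h, s in records if len(s) >= min_length]


# Made-up example: a 3-residue peptide and a 22-residue sequence.
example = ">short\nMKT\n>ok\nMKTAYIAKQRQISFVKSHFSRQ\n"
kept = long_enough(read_fasta(example))
```

Only the record named `ok` remains in `kept`, since the first sequence has just three residues.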
Target sequence redundancy calculations are based on comparison with all protein sequences in TargetDB that are in the same experimental status category.
Sequence redundancy calculations for structures released in the PDB are based on comparison to all protein sequences in the PDB.
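To show how a non-redundant count could be derived from a clustering run, the sketch below counts clusters in a BLASTClust-style output. It assumes the output lists one cluster per line with whitespace-separated sequence identifiers. The identifiers and the function name are made up for illustration.

```python
def count_clusters(cluster_text):
    """Return (number of clusters, number of sequences) for a cluster listing.

    Assumes one cluster per line, identifiers separated by whitespace.
    """
    clusters = [line.split() for line in cluster_text.splitlines() if line.strip()]
    return len(clusters), sum(len(c) for c in clusters)


# Made-up listing: six sequences grouped into three clusters.
example_output = "T1 T4 T7\nT2 T3\nT5\n"
n_unique, n_total = count_clusters(example_output)
```

Here `n_unique` is the number of clusters, which is the non-redundant count at the chosen threshold, and `n_total` is the number of sequences that were clustered.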
