Sequence Redundancy vs. Method in the Protein Data Bank
April 2007 by Eric Martz for Protein Explorer.
Contributor                         Method        Entries   Portion NMR   Non-Redundant Sequences [1]
NOT Structural Genomics (90%)       X-Ray         ~33,392   13%           20%
                                    NMR             5,025                 56% [2]
Structural Genomics (10%) [5]       X-Ray           3,189   26%           60%
                                    NMR             1,131                 57%
Total                               X-Ray + NMR    42,813   15%           25% [3,4]
  1. <30% sequence identity using RCSB's method (see note 4 below).
  2. 17% of NMR results have ligand, while 76% of X-ray results have ligand. (65% of structural genomics X-ray results have ligand.)
  3. This value was the same, 25%, prior to structural genomics.
  4. For "Total" non-redundant sequences, RCSB gives 25% (via global alignment) while OCA gives 16% (via local alignment).
  5. Structural genomics has contributed 10% of all entries in the Protein Data Bank. 86% of entries from structural genomics were deposited since 2004. During this time, structural genomics contributed 20% of all new entries.
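
The "Portion NMR" column and the contributor percentages can be rechecked from the entry counts alone. The following is a minimal sketch in Python, not part of the original page: the names ENTRIES, nmr_portion and contributor_share are illustrative, and the counts are simply copied from the table (the non-structural-genomics X-ray count is approximate there, so the results are approximate too).

```python
# Entry counts copied from the table above; the non-structural-genomics
# X-ray count is approximate in the source ("~33,392").
ENTRIES = {
    "not_structural_genomics": {"xray": 33392, "nmr": 5025},
    "structural_genomics": {"xray": 3189, "nmr": 1131},
}


def nmr_portion(counts):
    """Percentage of a contributor's entries that were solved by NMR."""
    return 100.0 * counts["nmr"] / (counts["xray"] + counts["nmr"])


def contributor_share(entries, name):
    """Percentage of all X-ray + NMR entries that come from one contributor."""
    total = sum(sum(c.values()) for c in entries.values())
    return 100.0 * sum(entries[name].values()) / total


if __name__ == "__main__":
    for name, counts in ENTRIES.items():
        print(f"{name}: NMR portion {nmr_portion(counts):.0f}%, "
              f"share of entries {contributor_share(ENTRIES, name):.0f}%")
```

With these counts the sketch gives an NMR portion of about 13% outside structural genomics and about 26% within it, and contributor shares of about 90% and 10%, in line with the table.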
See also: