Early and Recent Redundancy & Molecular Weight
in the Protein Data Bank
by Eric Martz, April 2007, for Protein Explorer
Method  Era / Contributor                   Non-Redundant   Median
                                            Sequences [1]   Molecular Weight
------  ----------------------------------  -------------   ----------------
X-Ray   First 270 (1972-1987)               33%             24,000
        First 1,000 (1972-1991)             27%             24,000
        First 5,000 (1972-1996)             24%             31,000
        All (1972-present)                  25%             44,500
        Structural Genomics (2003-present)  60%             47,000
        Traditional (2003-present)          25%             50,000
NMR     First 1,000 (1989-June 1997)        54%              7,900
        All (1989-present)                  55%              8,600
        Structural Genomics (2003-present)  57%             11,250
        Traditional (2003-present)          68%              9,500

1. Non-redundant means <30% sequence identity, using RCSB's method.
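
The two statistics in the table can be illustrated with a minimal sketch. It assumes that the non-redundant percentage is the number of sequence clusters (at the <30% identity cutoff) divided by the number of entries; the source does not say exactly how the figure was computed. The function name, entry IDs, cluster IDs, and weights are all hypothetical toy values, not Protein Data Bank data.

```python
from statistics import median


def nonredundant_percent(cluster_of):
    """Percent of entries left when one representative per cluster is kept.

    cluster_of maps an entry ID to its sequence-cluster ID (assumed to come
    from clustering at a <30% sequence identity cutoff).
    """
    if not cluster_of:
        raise ValueError("no entries")
    return 100.0 * len(set(cluster_of.values())) / len(cluster_of)


# Hypothetical toy data: four entries, two of which share a cluster.
clusters = {"e1": "c1", "e2": "c1", "e3": "c2", "e4": "c3"}
weights = {"e1": 14000, "e2": 14300, "e3": 24000, "e4": 50000}

pct = nonredundant_percent(clusters)  # 3 clusters / 4 entries
med = median(weights.values())        # middle of the sorted weights

print(f"non-redundant: {pct:.0f}%  median MW: {med:,.0f}")
```

Under this reading, a low percentage means many entries are close relatives of one another, and a high percentage means most entries are distinct sequences.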