What percentage of the human proteome has known structure?
- ~40,000 genes in the human genome.
- "Known"? Drug companies solve a large number of structures, but most are not deposited in the Protein Data Bank.
- ~45,000 entries in the Protein Data Bank:
- ~7,000 sequence-distinct entries of good quality.
- ~1,500 of these are human.
- These entries are mostly single domains or fragments of proteins.
Answer (empirical): ~2%
Answer (homology modeling): ~40% of domains, so ~20% of whole proteins?
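The arithmetic behind these two answers can be sketched roughly as follows. This is only an illustration, not necessarily how the figures above were derived. The gene count, the number of human entries, and the 40% domain figure come from the list above; the factor of one half (`WHOLE_PROTEIN_FACTOR`) is an assumption introduced here to reflect that entries are mostly single domains or fragments. It is the same halving implied by going from ~40% of domains to ~20% of whole proteins.

```python
# Rough figures quoted in the text above.
HUMAN_GENES = 40_000              # approximate gene count used in the text
HUMAN_PDB_ENTRIES = 1_500         # sequence-distinct, good-quality human entries
HOMOLOGY_DOMAIN_FRACTION = 0.40   # domains reachable by homology modeling

# ASSUMPTION (not stated in the text): an entry covering a single domain or
# fragment counts as roughly half of a whole protein.
WHOLE_PROTEIN_FACTOR = 0.5


def estimate_coverage():
    """Return rough coverage fractions of the human proteome."""
    # Upper bound: treat every human entry as one fully solved protein.
    upper_bound = HUMAN_PDB_ENTRIES / HUMAN_GENES
    # Discount for entries that are only domains or fragments.
    empirical = upper_bound * WHOLE_PROTEIN_FACTOR
    # Apply the same discount to the homology-modeling domain figure.
    homology = HOMOLOGY_DOMAIN_FRACTION * WHOLE_PROTEIN_FACTOR
    return {
        "upper_bound": upper_bound,
        "empirical": empirical,
        "homology": homology,
    }


if __name__ == "__main__":
    for name, fraction in estimate_coverage().items():
        print(f"{name}: {fraction * 100:.2f}%")
```

Under these assumptions the upper bound is 3.75%, the discounted empirical figure is just under 2%, and the homology-modeling figure is 20%, in line with the rough answers above.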
Solution: Structural Genomics?
This estimate does not take into account redundancy among the ~40,000 human genes. If you know how to estimate that redundancy, please tell me (emartz@microbio.umass.edu).
by Eric Martz, University of Massachusetts, July 2003 (revised February 2004, September 2005, October 2006, April 2007)
Similar rough estimates were stated by Kevin Karplus in October 2006.