What percentage of the human proteome has known structure?
- ~40,000 genes in the human genome.
Drug companies solve a large number of structures
but most are not deposited in the Protein Data Bank.
- ~45,000 entries in the Protein Data Bank:
sequence-distinct entries of good quality.
- ~1,500 of these are
- These entries are mostly single domains or fragments of proteins.
Answer (homology modeling): ~40% of domains, so
~20% of whole proteins?
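
As a rough illustration, the percentages above can be turned into approximate protein counts. This is only a sketch: it assumes one protein per gene (an assumption not made in the text above), it ignores the redundancy caveat, and the names HUMAN_GENES, PROTEIN_COVERAGE and implied_count are hypothetical labels chosen for this example.

```python
# Back-of-envelope counts implied by the rough estimates above.
# ASSUMPTION: one protein per gene; redundancy among genes is ignored.
HUMAN_GENES = 40_000        # ~40,000 genes in the human genome
DOMAIN_COVERAGE = 0.40      # ~40% of domains (with homology modeling)
PROTEIN_COVERAGE = 0.20     # ~20% of whole proteins (a tentative figure)


def implied_count(total: int, fraction: float) -> int:
    """Return the approximate number of items covered by a fraction of a total."""
    return round(total * fraction)


proteins_covered = implied_count(HUMAN_GENES, PROTEIN_COVERAGE)
proteins_uncovered = HUMAN_GENES - proteins_covered

print(f"Whole proteins with known or modelable structure: ~{proteins_covered:,}")
print(f"Whole proteins without: ~{proteins_uncovered:,}")
```

Under these assumptions, roughly 8,000 whole proteins would be covered and roughly 32,000 would not. Both figures inherit the uncertainty of the ~20% estimate.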
Solution: Structural Genomics?
This estimate does not take into account redundancy among the ~40,000 human
genes. If you know how to estimate that redundancy, please
Eric Martz, University of Massachusetts, July 2003 (revised February 2004,
September 2005, October 2006, April 2007)
Similar rough estimates were stated by
Kevin Karplus in October 2006.