Structural Genomics

Crystallography alone will not be able to solve a large percentage of protein structures in the near future.

Structural genomics is an initiative to obtain structures for a large percentage of proteins through homology modeling:

  1. Identify sequence families for which no empirical template exists for homology modeling.
  2. Choose some members of each family as "targets".
    1. The Protein Data Bank maintains a target registry.
    2. As of February 2004, nearly 50,000 targets had been registered, 75% of them during the previous year (Graph).
  3. Solve a target from each family by high-throughput crystallography, providing a new template.
    1. This is the bottleneck.
    2. Once a target is solved, infer its function from similar structures of known function; the inference must then be confirmed biochemically (Zhang & Kim).
  4. Homology model all members of each family, using the new templates.
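
The four steps above can be sketched as a small bookkeeping routine. This is only an illustration of the workflow, not any center's actual software: the `Family` class, the function names, the rule of picking the first few members as targets, and the `solve` callback (a stand-in for a high-throughput crystallography attempt) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Family:
    """A sequence family (hypothetical record used only for this sketch)."""
    name: str
    members: List[str]
    template: Optional[str] = None  # member with an empirical structure, if any


def families_needing_templates(families: List[Family]) -> List[Family]:
    # Step 1: families for which no empirical template exists.
    return [f for f in families if f.template is None]


def choose_targets(family: Family, n: int = 2) -> List[str]:
    # Step 2: pick some members as targets (assumption: simply the first n).
    return family.members[:n]


def solve_first(targets: List[str], solve: Callable[[str], bool]) -> Optional[str]:
    # Step 3: try targets in order; the first one solved becomes the template.
    for target in targets:
        if solve(target):
            return target
    return None


def run_pipeline(families: List[Family], solve: Callable[[str], bool],
                 n_targets: int = 2) -> Dict[str, str]:
    """Return a map of member -> template for every member that can be modeled."""
    models: Dict[str, str] = {}
    for family in families_needing_templates(families):
        family.template = solve_first(choose_targets(family, n_targets), solve)
        if family.template is None:
            continue  # no target was solved; the family still lacks a template
        # Step 4: homology model all other members on the new template.
        for member in family.members:
            if member != family.template:
                models[member] = family.template
    return models
```

A family stays unmodeled whenever none of its targets can be solved, which is why step 3 is the bottleneck: registering several targets per family only raises the chance that one of them yields a structure.
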
There are 9 NIH-funded structural genomics centers/consortia in the USA, plus commercial efforts and centers in other countries, for a total of more than 20 centers involving more than 70 institutions.

By one estimate (Vitkup et al.), obtaining templates for 90% of proteins would require solving 16,000 new sequence-unrelated structures. This is about three times the world output of crystallography to date. About 3% of this goal has been attained.
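
The arithmetic behind these figures can be worked through as a rough sketch. It assumes the rounded numbers quoted above (16,000 structures, "about three times", "about 3%") can be treated as exact, so the results are order-of-magnitude indications only; the function and key names are made up for illustration.

```python
def template_goal_estimate(needed: int = 16_000,
                           output_multiple: float = 3.0,
                           fraction_attained: float = 0.03) -> dict:
    """Derive rough counts from the rounded figures quoted in the text."""
    # "About three times the world output to date" implies this output so far.
    implied_output_to_date = needed / output_multiple
    # "About 3% of this goal has been attained."
    solved_so_far = needed * fraction_attained
    remaining = needed - solved_so_far
    return {
        "implied_output_to_date": round(implied_output_to_date),
        "solved_so_far": round(solved_so_far),
        "remaining": round(remaining),
    }


if __name__ == "__main__":
    for key, value in template_goal_estimate().items():
        print(f"{key}: {value}")
```

Under those assumptions, roughly 480 of the needed structures have been solved and about 15,500 remain.
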

by Eric Martz, University of Massachusetts, July 2003 (updated February 2004)

Further reading: