What Is "Structural Genomics"?

Crystallography will not be able to solve a large percentage of protein structures anytime soon.

Structural genomics is an initiative to obtain structures for a large percentage of proteins by homology modeling. There are 9 NIH-funded structural genomics centers/consortia in the USA, plus commercial efforts and centers in other countries, for a total of more than 20 centers (more than 70 institutions).

Structural Genomics
  1. Identify sequence families for which no empirical template exists for homology modeling.
  2. Choose some members of each family as "targets".
    1. The Protein Data Bank maintains a target registry.
    2. In March 2008, 166,000 targets were registered (Graph).
  3. Solve a target from each family by high-throughput crystallography, providing a new template.
    1. This is the bottleneck (Graph).
    2. Structural Genomics results contain sequence redundancy.
    3. Infer function from similar structures of known function, to be confirmed biochemically (Zhang & Kim).
  4. Homology model all members of each family, using the new templates.
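
The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the family names, member identifiers, and the functions `families_without_template`, `choose_targets`, and `modelable_fraction` are hypothetical, and the experimental work of step 3 is represented simply by adding the chosen targets to the set of solved structures.

```python
from typing import Dict, List, Set


def families_without_template(families: Dict[str, List[str]],
                              solved: Set[str]) -> List[str]:
    """Step 1: names of families in which no member has a solved structure."""
    return sorted(name for name, members in families.items()
                  if not any(m in solved for m in members))


def choose_targets(families: Dict[str, List[str]], solved: Set[str],
                   per_family: int = 1) -> Dict[str, List[str]]:
    """Step 2: take the first `per_family` members of each template-less family."""
    return {name: families[name][:per_family]
            for name in families_without_template(families, solved)}


def modelable_fraction(families: Dict[str, List[str]],
                       solved: Set[str]) -> float:
    """Step 4: fraction of all proteins whose family has at least one template."""
    total = sum(len(members) for members in families.values())
    covered = sum(len(members) for members in families.values()
                  if any(m in solved for m in members))
    return covered / total if total else 0.0


# Toy data (hypothetical): three sequence families, one solved structure.
families = {
    "famA": ["A1", "A2", "A3"],
    "famB": ["B1", "B2"],
    "famC": ["C1"],
}
solved = {"A2"}

before = modelable_fraction(families, solved)
targets = choose_targets(families, solved)

# Step 3 stand-in: assume every chosen target is solved experimentally.
solved_after = solved | {t for members in targets.values() for t in members}
after = modelable_fraction(families, solved_after)

print(targets)
print(before, after)
```

In this toy case a single new structure per family is enough to make every member of that family modelable, which is the leverage the approach depends on.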

By one estimate (Vitkup et al., 2001), obtaining templates for 90% of proteins would require solving 16,000 new sequence-unrelated structures. This is several times the world output of crystallography to date. About 12% of this goal has been attained by Structural Genomics Projects. Some more recent estimates (Chandonia & Brenner, 2006) would lower these percentages.
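
As a rough check on what these figures imply, the arithmetic can be written out as below. The two inputs are the numbers quoted above; the variable names are only illustrative, and the result inherits all the uncertainty of the original estimate.

```python
# Figures quoted in the text above.
TEMPLATES_NEEDED = 16_000   # new sequence-unrelated structures (Vitkup et al., 2001)
PERCENT_ATTAINED = 12       # "about 12%" attained by Structural Genomics Projects

# Integer arithmetic keeps the result exact.
solved_so_far = TEMPLATES_NEEDED * PERCENT_ATTAINED // 100
remaining = TEMPLATES_NEEDED - solved_so_far

print(solved_so_far)  # 1920
print(remaining)      # 14080
```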

by Eric Martz, University of Massachusetts, July 2003 (updated February 2004, June 2006, April 2007).

Further reading: