How are 3D macromolecular structures obtained?
  1. Empirical determination

    1. X-ray crystallography (88% of PDB entries)

    2. Solution NMR spectroscopy (11% of PDB entries)

    3. Electron microscopy or diffraction (0.5% of PDB entries)
      • Generally not atomic resolution.

  2. Theory

    1. Comparative ("Homology") Modeling.
      1. The need for an empirically determined template limits it to ~40% of domain sequences.
      2. Errors in sequence alignment of target with template give errors in the model.
      3. Insertions, deletions, disordered template regions cannot be reliably modeled.
      4. Can be reliable for the main-chain fold and for which residues are on the surface vs. buried; side-chain positions are unreliable.
      5. ~70% success when >=60% sequence identity.
      6. ~10% failure rate even when >=90% sequence identity. (Peitsch et al., 1998)

    2. Ab initio theoretical modeling.
      • Secondary structure: ~70% accuracy.
      • Tertiary structure: accuracy too low for most purposes:
        • 2005: For about 1/4 of domains shorter than 85 residues, predictions were within 1.5 Å (C-alpha RMS deviation) of the true structure (Bradley, Misura & Baker, 2005). (Cf. independent determinations of the same protein, which differ by ~0.5 Å.)
        • ~150 CPU-days needed per prediction. Orders of magnitude more computing power would be needed for large proteins.
        • 2008 results.
      • Biennial competitions: CASP.
      • Docking competition: CAPRI.
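
The accuracy figures quoted above (1.5 Å for predictions, ~0.5 Å between independent determinations) are C-alpha RMS deviations. As a rough illustration of what that number measures, here is a minimal sketch in Python. It assumes the two structures have already been optimally superimposed and that their C-alpha atoms are paired one-to-one; the function name `ca_rmsd` and the coordinates are hypothetical, made up for this example, and real comparisons would first perform a least-squares superposition.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in Å) between paired C-alpha positions.

    Assumption: both coordinate lists are already superimposed and list
    the same residues in the same order. No fitting is done here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of C-alpha atoms")
    if not coords_a:
        raise ValueError("at least one atom pair is required")

    # Sum of squared distances between each pair of matching atoms.
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2

    # Mean over atom pairs, then square root.
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates: a three-residue model in which every C-alpha
# is displaced by 1.5 Å along z relative to the reference structure.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
reference = [(0.0, 0.0, 1.5), (3.8, 0.0, 1.5), (7.6, 0.0, 1.5)]

print(ca_rmsd(model, reference))  # 1.5
```

Because the deviations are squared before averaging, a few badly placed residues raise the RMS value more than many small errors do, which is one reason a model can have the correct overall fold and still have a large RMS deviation.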

by Eric Martz, University of Massachusetts, 2003. Most recent update: May 2012.


Further Reading: