How are 3D macromolecular structures obtained?
  1. Empirical determination

    1. X-ray crystallography (83% of PDB entries)

    2. Solution NMR spectroscopy (15% of PDB entries)

    3. Electron microscopy or diffraction (0.3% of PDB entries)

  2. Theory

    1. Comparative ("Homology") Modeling.
      1. The need for an empirically determined template structure limits it to ~40% of cases.
      2. Errors in sequence alignment of target with template give errors in the model.
      3. Insertions and deletions cannot be reliably modeled.
      4. Can be reliable for the main-chain fold and for which residues are on the surface vs. buried; side-chain positions are unreliable.
      5. ~70% success when >=60% sequence identity.
      6. ~10% failure rate even when >=90% sequence identity. (Peitsch et al., 1998)

    2. Ab initio theoretical modeling.
      • Secondary structure: ~70% accuracy.
      • Tertiary structure: accuracy too low for most purposes:
        • 2005: In about 1/4 of domains <85 residues, predictions are within 1.5 Å (C-alpha RMS deviation) of the true structure (Bradley, Misura & Baker, 2005). (For comparison, independent determinations of the same protein differ by ~0.5 Å.)
        • ~150 CPU-days needed per prediction. Orders of magnitude more computing power would be needed for large proteins.
      • Biennial competitions (held every two years): CASP.
      • Docking competition: CAPRI.
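
The sequence-identity thresholds quoted under comparative modeling (>=60%, >=90%) refer to the fraction of aligned positions at which target and template have the same residue. A minimal sketch of that calculation follows. The function name `percent_identity` and the example sequences are made up for illustration, and the choice of denominator (columns with no gap in either sequence) is an assumption; other conventions exist and give somewhat different numbers.

```python
def percent_identity(aligned_target, aligned_template, gap="-"):
    """Percent identity over the aligned (ungapped) columns of a pairwise alignment.

    Both strings must already be aligned, i.e. have equal length with gap
    characters inserted. Columns with a gap in either sequence are skipped.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    aligned = 0
    identical = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == gap or b == gap:
            continue  # insertion or deletion: not counted in the denominator
        aligned += 1
        if a == b:
            identical += 1
    if aligned == 0:
        raise ValueError("no aligned residue pairs")
    return 100.0 * identical / aligned


# Hypothetical alignment: 9 ungapped columns, 8 of them identical.
target = "MKT-AYIAKQ"
template = "MKTLAYLAKQ"
print(round(percent_identity(target, template), 1))  # 88.9
```

Because the result depends on the alignment itself, an alignment error changes the identity value as well as the model built from it.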

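The C-alpha RMS figures quoted above (1.5 Å for predictions, ~0.5 Å between independent determinations) are root-mean-square distances between paired alpha carbons. The sketch below shows only the arithmetic. It assumes the two structures have already been superimposed and that residues are paired in order; it does not perform the optimal-fit step that real comparison tools apply first. The function name `ca_rmsd` and the coordinates are invented for illustration.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMS deviation (in the input units, e.g. Å) between paired C-alpha positions.

    coords_a and coords_b are equal-length lists of (x, y, z) tuples.
    No superposition is done here; the inputs are assumed pre-aligned in space.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Made-up coordinates in Å: every atom of the model is displaced by 1.5 Å in y.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
reference = [(0.0, 1.5, 0.0), (3.8, 1.5, 0.0), (7.6, 1.5, 0.0)]
print(round(ca_rmsd(model, reference), 2))  # 1.5
```

Since the deviations are squared before averaging, a few badly placed residues can dominate the value even when most of the fold is correct.
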
by Eric Martz, University of Massachusetts, 2003. Most recent update: May 2009.


Further Reading: