Lac Repressor Binding to DNA
for the Atlas of Macromolecules in Protein Explorer
Copyright © by Eric Martz, September 2004.
Permission is given to use this resource, or portions thereof, in websites or presentations provided the source is cited.

Many proteins that recognize specific DNA sequences are thought to bind first nonspecifically to DNA largely by interactions with the DNA backbone phosphates. The protein is thought to slide along the DNA until it encounters the cognate sequence, where it binds in the grooves to the specific DNA sequence, with higher affinity. The lactose operon repressor of E. coli is such a case.

1OSL: Lac repressor bound to nonspecific DNA 18-mer. Note flexibility of carboxy termini and straightness of DNA.
1L1M: Lac repressor bound to specific DNA 23-mer. Note bend in DNA and relative fold stability of carboxy termini.
A number of crystal structures show that sequence-specific DNA-binding protein domains often bend the DNA double helix (e.g. 1L1M). In July, 2004, NMR structures were published for the E. coli lactose operon repressor DNA-binding domains bound nonspecifically to DNA (1OSL by Kalodimos, Biris, Bonvin, Levandoski, Guennuegues, Boelens, and Kaptein). Here, the DNA was modeled as a straight B-form double helix, docked to the protein. The length of the DNA had to be an exact fit to the protein (18 base pairs); longer segments precluded structure determination by solution NMR, presumably because the protein slides along the DNA.

Below are morphs of the nonspecific protein-DNA complex converting to the specific complex. In these morphs, the longer specific DNA (23 base pairs) was substituted at the outset for the shorter nonspecific DNA (18 base pairs). (This avoids having the DNA change in length and sequence during the animation.) Consequently, in the first frame, the details of the nonspecific interactions between the protein and the DNA are incorrect. The correct interactions can be observed in the empirical result, 1OSL. The only frame in these morphs that is accurate in detail is the final one, which is the empirical result in 1L1M. Morph animations are intended to help you see the large structural differences between two empirical results (the first and last frames). This morph does not represent an attempt to predict the actual trajectory of the conformational change. For more about the history and goals of molecular morphs, see the Protein Morpher.

Below are Questions for Students.

Animated multi-GIFs can be pasted into PowerPoint. Permission is given to copy these multi-GIFs into presentations or websites provided the source is cited.

Lac as sticks animated in Chime
Sticks script for this animation.

Spacefilling lac animated in Chime
Spacefill script for this animation.

Specific contacts of lac to DNA, animated in Chime:
With or without DNA dot surface.

Clicking on an atom identifies it in the browser status line.
Contacts scripts for this animation with or without DNA dot surface.

Morph in Protein Explorer

Render and color as you wish.

Questions for Students.

You may wish to review the structure of the B-form DNA double helix to help in answering some of these questions.

  1. Why does the lac repressor bind to DNA nonspecifically?

  2. When the lac repressor binds nonspecifically to DNA, what part of the DNA double helix does it bind to?

  3. Does DNA have a net charge, and if so, is it negative or positive in aqueous solution at pH 7?

  4. What kinds of chemical bonds are likely to be involved in nonspecific binding of the repressor protein to DNA?

  5. Does specific binding of lac repressor to DNA disrupt any of the Watson-Crick hydrogen bonds between the base pairs in the DNA strands?

  6. How do proteins such as the lac repressor recognize specific nucleotide sequences in a DNA double helix?

  7. What kinds of chemical bonds are involved in specific binding of the repressor protein to DNA?

  8. Does the lac repressor recognize specific bases in the major or minor grooves of the DNA?

  9. Why does the lac repressor bend the DNA double helix when it recognizes its specific nucleotide sequence?
Answers are available to teaching faculty who inquire by email, providing evidence of their faculty positions, such as a reference to a school or college website listing faculty.

Methods. The morph is a linear interpolation of atomic coordinates. The empirical start and finish models are taken from ensembles obtained by NMR. The animation begins with model 9 of 1OSL and ends with model 14 of 1L1M. Model 14 of 1L1M was chosen because it is the most representative (according to Olderado). Model 9 of 1OSL was chosen because its flexible carboxy termini are in the vicinity of those in 1L1M.

The straight, near-B-form nonspecific 18-mer DNA in 1OSL model 9 was replaced with a straight B-form model of the specific 23-mer DNA sequence in 1L1M (GAATT GTGAG CGGAT AACAA TTT). This theoretical model was generated by Model It (see the instructions under Molecules, Sources of PDB Files, under DNA Tools). (Some of the hydrogen atom names in the PDB file generated by Model It were misaligned so that DeepView did not include them; these were corrected with a text editor by shifting them one character position to the right.) The specific DNA model was aligned to the nonspecific DNA in 1OSL at three points (phosphorus atoms) using DeepView's "Fit molecules from selection". The alignment points were chosen by inspection of 1OSL model 9 to represent contacts of the DNA backbone with the protein. Both T3's of 1OSL were aligned with both T4's in the theoretical model, and one T11 with A14. The RMS deviation was 1.57 Å for the three phosphorus atom pairs, and 7.22 Å for 18 phosphorus atom pairs. (Better alignments were possible but were not explored further.)
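
The RMS deviation quoted above is the root-mean-square distance between paired atoms once the two models have been superposed. The following is a minimal Python sketch of that calculation only. The function name rmsd and the coordinates are hypothetical, chosen for illustration; they are not taken from 1OSL or from the Model It file, and DeepView's fitting routine itself is not reproduced here.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of
    (x, y, z) coordinates, paired by position. Assumes the two sets
    have already been superposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))


# Made-up coordinates (in Å) for three paired phosphorus atoms.
# The pairs are 1, 2 and 2 Å apart, respectively.
model_a = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 5.0, 0.0)]
model_b = [(1.0, 0.0, 0.0), (5.0, 2.0, 0.0), (0.0, 5.0, 2.0)]

value = rmsd(model_a, model_b)
print(f"RMSD = {value:.2f} Å")
```

Because the deviations are squared before averaging, a few widely separated pairs dominate the result, which is consistent with the value rising when all 18 phosphorus pairs are included rather than only the three used for fitting.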

The interpolation is linear. No corrections were made to keep interatomic distances or angles chemically realistic. Because MDL Chime and RasMol assign covalent bonds based on interatomic distances, and because some of these distances become unrealistically short during linear interpolation, spurious bonds appear in some regions of the interpolated frames during the animation. These appear as "wads" of bonds or "cobwebs" as shown to the right of the proline ring in the small green and black figure at the right.
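
The way spurious bonds arise can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the bonding algorithm of Chime or RasMol: the single 1.9 Å cutoff, the function names, and the coordinates of the two atoms are all hypothetical. Two atoms that are 3.5 Å apart (not bonded) in both end models are moved along straight lines that happen to cross.

```python
import math

# Assumed, illustrative cutoff in Å. The actual rules used by Chime and
# RasMol are more detailed and are not reproduced here.
BOND_CUTOFF = 1.9


def lerp(p, q, t):
    """Linear interpolation between points p and q at fraction t (0..1)."""
    return tuple(a + t * (b - a) for a, b in zip(p, q))


def spurious_bond_frames(a_start, a_end, b_start, b_end,
                         n_intermediate=10, cutoff=BOND_CUTOFF):
    """Return the indices of frames in which atoms A and B come within
    the cutoff distance while each moves linearly from start to end."""
    steps = n_intermediate + 1
    flagged = []
    for k in range(steps + 1):
        t = k / steps
        d = math.dist(lerp(a_start, a_end, t), lerp(b_start, b_end, t))
        if d <= cutoff:
            flagged.append(k)
    return flagged


# Hypothetical atoms: 3.5 Å apart at both ends, but their paths cross.
flagged = spurious_bond_frames(
    a_start=(0.0, 0.0, 0.0), a_end=(4.0, 0.0, 0.0),
    b_start=(3.5, 0.0, 0.0), b_end=(0.5, 0.0, 0.0),
)
print(flagged)
```

In this toy case neither end frame is flagged, yet several intermediate frames are, which is the same effect that produces the "cobwebs" in the animation.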

The interpolation was done with the freely available DOS program morph2.exe, inserting 10 (or in some cases 12) frames between the two empirical end points (after all hydrogens were removed with DOS program striph.exe). The resulting 12-model file was 1.6 megabytes, and was gzipped to 0.4 megabytes. For the morph showing specific contacts as hydrogen bonds, 8 carefully selected hydrogen atoms were re-inserted into the start and end models prior to doing the interpolation.
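
As a rough illustration of what inserting 10 frames between two end points means, here is a minimal Python sketch of linear interpolation of coordinates. It is not the code of morph2.exe; the function name interpolate_frames and the two-atom coordinates are hypothetical, and the sketch assumes that both end models list the same atoms in the same order.

```python
def interpolate_frames(start, end, n_intermediate=10):
    """Return a list of frames: the start model, n_intermediate linearly
    interpolated models, and the end model. Each model is a list of
    (x, y, z) tuples, with atoms in the same order in start and end."""
    if len(start) != len(end):
        raise ValueError("start and end must contain the same atoms")
    steps = n_intermediate + 1
    frames = []
    for k in range(steps + 1):
        t = k / steps  # 0.0 at the start model, 1.0 at the end model
        frame = [
            tuple(a + t * (b - a) for a, b in zip(p, q))
            for p, q in zip(start, end)
        ]
        frames.append(frame)
    return frames


# Hypothetical two-atom models (coordinates in Å).
start = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
end = [(11.0, 22.0, -11.0), (1.5, 11.0, 0.0)]

frames = interpolate_frames(start, end, n_intermediate=10)
print(len(frames))  # 10 interpolated frames plus the two end points
```

Each atom moves the same fraction of its total displacement in every step, so nothing in this procedure preserves bond lengths or angles in the intermediate frames.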

There are five hydrogen-bonded donor-acceptor pairs of atoms spacefilled in the "Specific contacts" animation: one pair in the minor groove, plus two pairs in the major groove for each of the two protein chains. Three of these five pairs were selected from among those listed by Kalodimos et al. in Fig. 2C after correcting the DNA sequence numbers in Fig. 2C to agree with those in 1L1M. All three of these hydrogen bonds involve protein chain A. The two hydrogen bonds shown for chain B involve the same donor-acceptor pairs on the other side of 1L1M, verified by inspection to be engaged in hydrogen bonding. Visualization of the hydrogen atoms in these hydrogen bonds was crucial to confirm a hydrogen-bonded conformation, and required an enhancement to the "Contacts Controls" in Protein Explorer's QuickViews: a checkbox that displays hydrogen atoms for the contacting atoms (to be released in PE version 2.42).

In the nonspecific interaction, Kalodimos et al. report salt bridges to the DNA backbone phosphates involving Arg22, Arg35, Lys33, Lys37, and His29. In model 9 of 1OSL, the only two of these I could see were Arg22B.NH1 3.0 Å from A3D.O2P, and His29B.ND1 2.8 Å from G2D.O2P. Therefore I did not make an animation showing the nonspecific salt bridges. Presumably the salt bridges are more prominent in NMR models other than 9, but I did not explore this.

Feedback to Eric Martz.