This tutorial is for people who have tried Protein Explorer (PE), completed the 1-Hour Tour, consulted the Help, Index & Glossary, yet want a more comprehensive tour of PE's capabilities. Many hours are needed to do all of this Tutorial. It is the basis for a Molecular Visualization Lab Course. You can also use the Tutorial as a reference manual. To find out more about just one topic, use the Contents below.
This tutorial was last revised in fall, 2000, for PE version
1.72. Therefore, it does not yet incorporate new features of PE
such as the
1-Hour Tour, the new Animation capabilities, several enhancements
in Contact Surface displays, the new integration with EBI's
Probable Quaternary Structures site, new information on crystal
contacts, the pI server now linked under the Polarity5 color
scheme, SELECT Clicked (in QuickViews), or the existence of the
Help/Index/Glossary or FAQ.
New advanced features not incorporated below include
the introduction to homology modeling and methods for
mutating residues with DeepView.
The "Features of the molecule" control panel (introduced in PE 2.1, July 2003)
is not incorporated in this tutorial.
Nor does the tutorial cover using the "PE Site Map" (introduced in PE
2.1, July 2003) to navigate within PE.
HIV protease with Q8261 inhibitor bound (1HVH).
If the top left says Advanced Explorer, it means the previous user has left PE in Expert mode. Find and click the small link Preferences at the bottom of the lower left frame. Scroll to the bottom of the Preferences list and uncheck "I'm an expert". Click the [Save] button. Now close the PE window (or click on Quit). Now re-do the previous two steps.
The busy signal will come
on. After a few moments, it will change to the green ready signal.
As you proceed below, keep an eye on
this indicator; don't push buttons until PE is ready. Clicking
buttons before PE has caught up has been known to confuse PE
(making it permanently busy). If this happens, just restart it --
no damage is done other than losing your session.
will be indicated as [Toggle Spinning].
"PDB" refers not only to the Protein Data Bank, but also to a data file format in which the atomic coordinates for a 3D macromolecular structure can be stored. The PDB format (see www.umass.edu/microbio/rasmol/pdb.htm for details) is one of the most commonly used formats for atomic coordinate files. In order for PE to display a molecule, it must be given an atomic coordinate file. Such a file in the PDB format for 1d66 has the standard filename "1d66.pdb".
Whenever you're not sure what an atom is, click on it to get an identification report.
, which is visible on any control
panel in PE. The Molecule Information window should appear.
At the top is a brief description of the molecule. This is obtained by Protein Explorer from information in the PDB file header (HEADER and COMPND records). (PDB record names are limited to 6 characters.) The PDB header is the portion of the PDB file prior to the beginning of the atomic coordinates (ATOM or HETATM records). In some PDB files, there are multiple COMPND records (lines), in which case the Molecule Information window shows only the first line (due to a limitation in Chime).
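If you want to see for yourself what "header" means here, the idea can be sketched in a few lines of Python. This is only an illustration, not what PE or Chime actually runs: the function names and the sample text are made up, and the sketch simply assumes that the record name is whatever sits in the first 6 columns of each line.

```python
from typing import Dict, List, Tuple

# Record names that mark the start of the atomic coordinates.
COORDINATE_RECORDS = ("ATOM", "HETATM")


def read_header_records(pdb_text: str) -> Dict[str, List[str]]:
    """Group header lines by record name, stopping at the first coordinate record."""
    records: Dict[str, List[str]] = {}
    for line in pdb_text.splitlines():
        name = line[:6].strip()  # record names occupy the first 6 columns
        if name in COORDINATE_RECORDS:
            break  # the header ends where the coordinates begin
        if name:
            records.setdefault(name, []).append(line[6:].rstrip())
    return records


def brief_description(records: Dict[str, List[str]]) -> Tuple[str, str]:
    """Return the first HEADER line and only the first COMPND line."""
    header = records.get("HEADER", [""])[0].strip()
    compnd = records.get("COMPND", [""])[0].strip()
    return header, compnd


# Made-up sample text, for illustration only (not a real PDB entry).
SAMPLE = "\n".join([
    "HEADER    ILLUSTRATIVE ENTRY",
    "COMPND    FIRST COMPND LINE",
    "COMPND   2 SECOND COMPND LINE",
    "ATOM      1  N   MET A   1      10.000  10.000  10.000",
])

if __name__ == "__main__":
    print(brief_description(read_header_records(SAMPLE)))
```

Note that the second COMPND line is read but never shown by `brief_description`, which mirrors the first-line-only behavior described above.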
Clicking the PDB File header link will display the entire header, obtained from the PDB website. Note whether there is more than one COMPND record. Other important information is the deposition date (top line), the organism (SOURCE records), the authors and literature citations, and the resolution in REMARK 2. The uncertainty of the position of an atom is roughly one fifth to one tenth of the resolution for high-quality data (R-factor 0.20 or less, succinctly explained on p 160 by Rhodes). For NMR results or theoretical models, resolution values are not applicable, and so not given. The HET, HETNAM, and FORMUL records are very useful to figure out what the cryptic 3-letter residue codes mean for hetero residues (see next section below). The HELIX and SHEET records are used by Chime to display secondary structure cartoons and colors.
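The rule of thumb relating resolution to positional uncertainty is easy to try out. The sketch below assumes the REMARK 2 line has the common form "RESOLUTION.    2.70 ANGSTROMS."; the function names and sample text are invented for illustration, and the estimate is only meaningful for high-quality crystallographic data as noted above.

```python
import re
from typing import Optional, Tuple


def parse_resolution(pdb_text: str) -> Optional[float]:
    """Return the resolution from REMARK 2, or None if it is not given as a number."""
    for line in pdb_text.splitlines():
        if line.startswith("REMARK   2") and "RESOLUTION" in line:
            match = re.search(r"(\d+(?:\.\d+)?)\s+ANGSTROMS", line)
            return float(match.group(1)) if match else None
    return None


def position_uncertainty(resolution: float) -> Tuple[float, float]:
    """Rough range of atomic position uncertainty: one tenth to one fifth of resolution."""
    return resolution / 10.0, resolution / 5.0


# Made-up sample header lines, for illustration only.
SAMPLE = "\n".join([
    "REMARK   2",
    "REMARK   2 RESOLUTION.    2.70 ANGSTROMS.",
])

if __name__ == "__main__":
    res = parse_resolution(SAMPLE)
    if res is not None:
        low, high = position_uncertainty(res)
        print(f"resolution {res} A, uncertainty roughly {low:.2f} to {high:.2f} A")
```

For NMR results or theoretical models no number is present, so `parse_resolution` returns None and no estimate should be made.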
An important record is EXPDTA, which tells you the method for determining the coordinates. 1d66 lacks an EXPDTA record. There is a large effort underway at the PDB to clean up the "legacy" PDB data to make the files uniform and their contents more machine readable, but many files have not yet been processed.
You can enter any PDB ID code into the slot in the Molecule Information Window, and these links will fetch the corresponding header. Enter 1BL8 and view its header -- it has the EXPDTA record. Notice that it has multiple COMPND records.
You are expected to use the Molecule Information Window whenever you load a new molecule and need to know more about it. From now on in this tutorial, we won't remind you.
When you click on a hetero group, its name is messaged. (We'll use "messaged" to mean "displayed in the message box" at the lower left.) However, these names are limited to 3 characters and are usually cryptic. The first place to check is the PDB file header, namely the records HET, HETNAM, and FORMUL.
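Looking up a cryptic code in the HETNAM records can also be automated. The following is a minimal sketch, assuming the usual fixed-column layout of HETNAM lines (hetero code in columns 12-14, name text from column 16); the function name and the "XYZ" entry are hypothetical, and joining continuation lines with a space is a simplification.

```python
from typing import Dict


def hetero_names(pdb_text: str) -> Dict[str, str]:
    """Map each hetero code to its name, joining continuation lines with a space."""
    names: Dict[str, str] = {}
    for line in pdb_text.splitlines():
        if line.startswith("HETNAM"):
            het_id = line[11:14].strip()  # columns 12-14: hetero code
            text = line[15:70].strip()    # columns 16-70: name text
            names[het_id] = (names.get(het_id, "") + " " + text).strip()
    return names


# Sample lines built piece by piece so the column positions are explicit.
# "XYZ" is a made-up code used only for illustration.
SAMPLE = "\n".join([
    "HETNAM" + " " * 5 + "XYZ" + " " + "EXAMPLE COMPOUND",
    "HETNAM" + " " * 3 + "2" + " " + "XYZ" + " " + "NAME CONTINUED",
    "HETNAM" + " " * 5 + " CD" + " " + "CADMIUM ION",
])

if __name__ == "__main__":
    for code, name in hetero_names(SAMPLE).items():
        print(code, "=", name)
```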
A good place to find out more is the Hetero-compound Information Centre - Uppsala of Gerard Kleywegt (HIC-Up). Click on the tiny link Search this site at the left, then the link (under item #2, Search Tips) QuickXS pop-up menu. Enter the 3-letter code in the slot under QuickXS II. In the result, scroll to the bottom to see a 2D structure as a GIF. Other useful links are the MDL Chime page (scroll to the bottom to see the hetero compound in Chime), and list of PDB files containing this compound.
If you should be so unlucky as to encounter a hetero compound with a one-letter or two-letter name, HIC-Up won't work. Go to PDBSum (University College London), enter the PDB ID code, then at the very bottom of the result page, see the links for the hetero compounds.
Re-enter 1d66 in the slot at the Molecule Information Window, and look at RCSB's Structure Explorer page.
An important link is the Medline link which allows you to read the abstract (and in some cases, full text) of the original article.
Although they are hard to see on the blue background, the links down the left side are very powerful.
From the FirstView page, click on Explore More, which takes you to QuickViews. QuickViews is a menu system with extensive context-triggered help. It enables powerful visual exploration entirely from menus. Before QuickViews became available in summer, 2000, this kind of exploration required learning a large number of teletype-style commands ("RasMol command language").
Windows only: it is OK to drag the top edge of the frame containing
the message box down, to make more room for the middle help frame of QuickViews.
Macintosh: dragging frame boundaries in PE usually causes Netscape
to freeze or crash.
Some commands cause Chime to issue reports or messages. For example, all SELECT menu operations report the number of atoms selected (try a few and watch the message box). Notice that the number of atoms selected is also displayed below the molecule, so that it remains in view when the flow of messages scrolls the original atoms selected message out of view. As stated above, for most purposes you can ignore the information in the message box.
Notice that nothing changed in the image.
This is an important principle.
After selecting, you must specify how to render and/or color
the selected atoms.
By now you have noticed that every time you use one of the QuickViews menus, information about your choice appears in the middle frame. This information is often quite important, but there is no reason to repeat it here in the tutorial, so be sure to read it as you try each new menu option! Below, you will not be reminded to read this help, and it will be assumed that you have read it.
Bear in mind that a real protein in aqueous medium at body temperature is vibrating a great deal from thermal motion. This means that some portions of alpha helices may fit the criteria for "alpha helix" at one instant, but not at another.
Also bear in mind that the DNA-binding domain crystallized for 1d66 is not the entire protein. The abstract of the paper (obtained by raising the Molecule Information Window, clicking on Structure Explorer, then the Medline link) describes the protein in 1d66 as a "65-residue, N-terminal fragment of the yeast transcriptional activator, GAL4". In fact, the complete protein is 881 residues long! The presence of the missing C-terminal 816 amino acids may influence the secondary structure in some regions of the fragment in 1d66.
Read the help in the middle frame.
Now you can tell which end is which, and it is easier to
trace visually the chain sequence through folded domains.
Colors are assigned globally by residue number so that only the longest chain
has red and blue ends. Shorter chains will begin and end with colors
assigned to the same residue numbers in the longest chain.
In this case, the DNA chains are shorter than the protein chains.
Notice that the 2 DNA chains are numbered consecutively, not independently
(but there are 3 different ways DNA double helices may be numbered in
PDB files).
SELECT All
DISPLAY Spacefill
COLOR Polarity2
Use the [Water] button to hide water.
Optional:
Press the [Slab] button.
See the figure below for a brief explanation of the result.
Rotate the molecule, carefully inspecting the two long alpha
helices that don't touch the DNA.
Also try COLOR Polarity3 and Polarity5.
Slabbing in Chime.
1. The molecule is cut through the center.
In Chime, unlike in the figure at right, the slab plane is always parallel to the screen, and it is the portion in front of the slab plane which is hidden. In Chime, one sees the cut face and all atoms behind it (view 3 at right).
The term "slab" as used in RasMol/Chime is
somewhat of a misnomer.
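The difference between Chime's single cutting plane and a true slab (a slice with thickness, which needs two planes) can be shown with a few coordinates. This is a sketch only: the function names are invented, atoms are reduced to bare (x, y, z) tuples, and it assumes the z axis points out of the screen toward the viewer.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, z); z assumed to increase toward the viewer


def chime_style_slab(atoms: List[Point], plane_z: float) -> List[Point]:
    """One plane: hide everything in front of it, keep the cut face and all behind."""
    return [atom for atom in atoms if atom[2] <= plane_z]


def true_slab(atoms: List[Point], front_z: float, back_z: float) -> List[Point]:
    """Two planes: keep only the slice between them."""
    return [atom for atom in atoms if back_z <= atom[2] <= front_z]


# Three made-up atoms: behind the center, at the center, in front of it.
ATOMS = [(0.0, 0.0, -5.0), (0.0, 0.0, 0.0), (0.0, 0.0, 5.0)]

if __name__ == "__main__":
    print("one plane at z=0:", chime_style_slab(ATOMS, 0.0))
    print("true slab from z=-2 to z=2:", true_slab(ATOMS, 2.0, -2.0))
```

With one plane through the center, the front atom disappears but the rear atom stays visible; with two planes, only the central atom remains.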
returns).
The movie highlights two amino acids from one chain in 1d66.
Alpha carbons are colored green.
Answer the question on the dipeptide movie.
The balls and sticks are everything not chain A, noncovalently bound to it, including DNA and chain B. Locate three regions of contacts: chain B, DNA backbone (phosphates), and DNA bases.
SELECT Ligand
Locating residues in 3D from their sequence positions is the default function of Seq3D, "Show clicked" (see radio buttons at top of Seq3D). For this purpose, it helps to press the button [Show All as Backbones] first, to simplify the view. Now, with 1d66, try clicking a few residues in the sequence in the lower panel. Try changing the display menu ("Show clicked residues in") to Spacefill, and then clicking additional residues.
Whenever you rotate the molecule with the mouse, the Seq3D window disappears behind the main Protein Explorer window. To bring it back into view: On Windows, click the Seq3D button on the taskbar; on Macintosh, use the Communicator menu.
How are the catalytic site residues bound to the HAA ligand?
Optional to help you see noncovalent bonding relations:
(I) Chime's built-in "hbonds" display shows only the protein backbone-to-backbone hydrogen bonds in regions of recognizable secondary structure, plus Watson-Crick nucleotide base-pair hbonds. It shows none of the hbonds involving sidechains, no hbonds between chains, and no hbonds involving water. To see the bonds it does display:
returns). Answer the question
"Real bonds vs. backbones and backbone-to-backbone hydrogen bonds".
Turning to DNA:
Average length of a hydrogen bond. Actual hbonds vary between 2.5 and 3.5 Angstroms.
(B) Contact surfaces. A contact surface, colored by distance, gives a useful overview of all polar and hydrophobic interactions between any two arbitrary groups of atoms. While hydrogen bonds are not shown as bonds, their positions can be deduced from the proximities of donors and acceptors near the contact surface. Previously in this tutorial we used contact surfaces to answer Does the Gal4 DNA-binding domain recognize a DNA sequence? and to see What holds the Cd ions in place? Later in this tutorial, under Advanced Explorer, we will see how to display transparent contact surfaces showing the proximal atoms on both sides.
(C) The Noncovalent Bond Finder (NCBF). Also accessed from Advanced Explorer, the NCBF allows a detailed, distance-based exploration of hydrogen bonds. Again, NCBF does not display hbonds as bonds, but leaves their assignment to your judgement based on donor-acceptor distances.
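To get a feel for what "distance-based" means, here is a minimal sketch that lists pairs of nitrogen and oxygen atoms lying 2.5 to 3.5 Angstroms apart. It is not the NCBF's actual procedure: the function name, the atom labels, and the choice to consider only N and O are assumptions, and the sketch makes no attempt to exclude covalently bonded neighbors or to check geometry. As with the NCBF, deciding which candidates are real hbonds is left to your judgement.

```python
import math
from typing import List, Tuple

# (label, element, (x, y, z)); labels are arbitrary names for illustration
Atom = Tuple[str, str, Tuple[float, float, float]]


def candidate_hbonds(
    atoms: List[Atom], min_dist: float = 2.5, max_dist: float = 3.5
) -> List[Tuple[str, str, float]]:
    """Return N/O atom pairs whose separation falls in the hbond distance range."""
    polar = [atom for atom in atoms if atom[1] in ("N", "O")]
    pairs = []
    for i, (label_a, _, xyz_a) in enumerate(polar):
        for label_b, _, xyz_b in polar[i + 1:]:
            distance = math.dist(xyz_a, xyz_b)  # Euclidean distance (Python 3.8+)
            if min_dist <= distance <= max_dist:
                pairs.append((label_a, label_b, round(distance, 2)))
    return pairs


# Made-up coordinates, in Angstroms.
ATOMS: List[Atom] = [
    ("N1", "N", (0.0, 0.0, 0.0)),
    ("O1", "O", (3.0, 0.0, 0.0)),
    ("O2", "O", (10.0, 0.0, 0.0)),
    ("C1", "C", (0.0, 3.0, 0.0)),
]

if __name__ == "__main__":
    print(candidate_hbonds(ATOMS))
```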
(D) Assignment of hbonds by external programs. A number of programs are available on the web that can calculate the positions of hydrogen bonds. A planned enhancement is to enable one or more of these programs to display hydrogen bonds in PE. One such program is the HBPLUS routine of Thornton and McDonald.
Salt Bridges:
Initially, the salt bridges are colored by chain. This makes it easy to spot interchain bridges, such as the one between the larger chain A and peptide chain C here.
Cation-Pi Interactions (still with 1b07):
Individuals may find true stereoscopic viewing helpful for some kinds of images. This can be done without special equipment, but the ease with which it can be learned, and the visual fatigue which results, varies among individuals. My recommendation is that you give it a try, but if you find it too hard, don't worry about it. You can get along fine without it.
Press the [Stereo] button. Now there is a split image which can be viewed in stereo. There are two kinds of split images: for convergent ("cross-eyed") or divergent ("wall-eyed") viewing. Convergent viewing is straightforward even with large images, such as on a computer screen. However, divergent viewing becomes more difficult when the separation distance between images exceeds the interpupillary distance between your eyes. Therefore PE shows convergent stereo by default. Some people find one mode of viewing easier than the other. If you have difficulty achieving convergent stereo, or find it uncomfortable, try divergent.
Convergent stereo viewing. (It is easiest if a friend reads this to you while you do it.) If you wear reading glasses, put them on. Turn off spinning. Press the [Stereo] button until the image is split. Position your head directly in front of the split image, not off to the side. Put your finger midway between the two images, near the top. Pick part of the image near the top that is easy to distinguish. Focus on your finger, and move it slowly towards your nose, keeping your focus on your finger. In the background, you should see the two images moving towards each other. The goal is to have them superimpose perfectly. If one is slightly higher than the other, tilt your head to the left or right until the alignment is perfect. Keeping your finger in focus, move it slowly towards or away from your nose until the images align perfectly. At that point, you should see the depth in the 3D view, and you can shift your attention away from your finger to the molecule. With practice, you can do this without using your finger, just crossing your eyes slightly until the images align.
Divergent stereo viewing. First, click on Preferences below the message box, uncheck Stereo convergent, and click [Back]. Now click [Stereo] until the image is split again. Some people experienced with divergent viewing can align the images even when they are widely separated, but beginners should make sure the distance between images is slightly less than the distance between the pupils of your eyes (about two inches). The quickest way to reduce the separation is:
(It is easiest if a friend reads this to you while you do it.) If you wear reading glasses, put them on. Turn off spinning. Put your nose between the two images, almost touching the computer screen. Don't worry about focus -- the image will be blurry, but you should see only ONE image. Pick a distinguishable reference point, such as the top of the image -- you want to see only one reference point, blurry but aligned. If you see two partially overlapping images, adjust the distance between images to be closer to the distance between your eyes. Once you see one (blurry) image with your nose almost touching the screen, move your head slowly away from the screen, keeping your eyes relaxed, gazing to infinity, with no effort to focus. The goal is to keep the central image aligned as you move away. Often you will need to tilt your head slightly to improve the alignment. As you move away, you should perceive three images -- the one in the middle is the aligned one. As you get far enough to focus clearly, you should see depth. With practice, some people can just gaze off to infinity and align the images (without starting close to the screen), and some can learn to do this even when the distance between images exceeds their interpupillary distance.
More information on viewing stereo pairs, including those printed in journals.
All published macromolecular structures are available from the Protein Data Bank, which has mirror websites around the world. So the PDB is the most comprehensive place to look for proteins, DNA, RNA, and polysaccharides. If the molecule you want is a popular one, you may find it most easily at PDB at a Glance, a subject-categorized list. If your molecule is more esoteric, the best way to start searching the entire PDB dataset is with PDB Lite, a simple and clear search interface designed for nonspecialists who use the PDB infrequently. If you need a more advanced search, try the PDB's high-powered SearchFields (www.rcsb.org). Some searches can be done better with Jaime Prilusky's OCA (an enhanced version of what was offered by the former PDB when it was at Brookhaven National Laboratory). This link takes you to PDB Lite, where you'll see a link to OCA that also mentions cases that can be done better with one or the other searcher: OCA via PDB Lite.
Protein Explorer can display any PDB file, not just those from the Protein Data Bank proper. PDB files can be obtained through the web for thousands of small organic molecules, for theoretical models (e.g. lipid bilayers), and for noncovalent assemblies such as virus capsids. Several sources are listed at the Molecules Galore page of the Molecular Visualization Freeware site.
For probable quaternary structures or "biomolecules", including virus capsids, search for the subunit module at the Protein Data Bank (see above). In PDB Lite "View/Analyze/Save" page, look near the bottom of the page for a link to Likely Quaternary Molecular Structure. In SearchFields, click on "Other Sources", then on the "MacroMolecule" link (if there is one -- not all structures have this link). For example, if you search for "poliovirus", the hits will include 2plv.pdb, and the Quaternary Structure link will offer a 31 megabyte file 2plv.mmol which includes all 60 subunits in the icosahedral capsid.
Please note that all of the above links are available from PE's main Entry Options page -- so you don't have to come back here to find these sources of molecules.
If you have no Internet connection, you can still explore molecules with PE. You will have to download and install PE and also download the PDB files of interest. You could download these items on a different computer which has an Internet connection, and transfer them to your computer via diskette, zip disk, CD, or other means.
There are a number of tricky details which can cause problems. Therefore, we strongly recommend that you use PDB Lite. It has detailed, click-by-click instructions specifically for Windows 95/98/NT or Windows 3.1 or Macintosh PPC. You get these instructions after you have found the molecule of interest, gone to the final screen ("View/Analyze/Save"), and clicked on the link Save xxxx.pdb. You can view this final screen directly:
If you are already viewing the molecule in PE, you can save the PDB file directly from Chime:
When you use the menus or buttons, PE sends commands to Chime. Frequent users of PE may wish to learn to enter commands directly. For some goals, directly entered commands are more efficient than the menus or buttons, and some results can be achieved only with manually entered commands.
Groups of commands sent to Chime in a single package are called command scripts, or just scripts. Most of the menus and buttons in PE display their scripts in PE's message window. One exception is the FirstView script that creates the first image you see after loading a new molecule. The FirstView script can be messaged with an option accessed with the [Message Control] button near the message box. In QuickViews, the most complex DISPLAY scripts are not messaged (Cation-pi, Salt Br.). However, a link in QuickViews middle help window will display these scripts.
The easiest way to begin learning the command language is by watching the messages generated by PE's buttons and menus. Then try entering these commands (or variations on them) in the slot above the message window, and observing what they do to the image. Be aware that on Windows only, messages appear in reverse order, newest at the top. This avoids having the newest messages always out of view. (The order can be changed in the Preferences. On Macintosh, Netscape's form box is intelligent enough to scroll to the bottom automatically when new text is added, so the message order defaults to newest at the bottom.)
When entering commands, PE's command aliases can save a lot of typing. For example, typing "s bb" is expanded automatically to "select backbone". To view a complete list of aliases, click on the Aliases link below the message box. As explained there, it is easy to add, delete, or modify aliases to suit your preferences.
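The idea behind alias expansion can be sketched as a simple word-by-word lookup. This is an assumption about how such a feature might work, not PE's actual code: the table below contains only the two aliases implied by the "s bb" example, and whole-word substitution is a guess at the rule. See the Aliases link for the real list.

```python
from typing import Dict

# Hypothetical alias table; only the "s bb" example comes from this tutorial.
ALIASES: Dict[str, str] = {"s": "select", "bb": "backbone"}


def expand_aliases(command: str, aliases: Dict[str, str] = ALIASES) -> str:
    """Replace each whole word that is an alias with its expansion."""
    return " ".join(aliases.get(word, word) for word in command.split())


if __name__ == "__main__":
    print(expand_aliases("s bb"))
```

Words that are not in the table pass through unchanged, so a fully typed command such as "select all" is not altered.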
In addition to the extensive command language understood by Chime, there are a few commands understood by Protein Explorer (intercepted and not forwarded to Chime). The most useful: typing a comma as the first character in the command slot immediately recalls the previous command (without pressing Enter). More information, and a complete list of these can be displayed by pressing the blue question mark near the command entry slot.
Here are some sources of a more systematic introduction to the command language.
Here are some tricky issues to be aware of concerning the command language.
Chime's Menu. Clicking on the MDL frank below the molecule image, at the bottom right corner, opens Chime's menu. This is a powerful menu worth getting familiar with. Many of its actions can be done better in QuickViews because they are better organized in those menus, and are accompanied by help and color keys. However, in the Select branch of Chime's menus you can see most of the predefined terms that Chime understands. This is a handy place to look up terms you may need to complete a manually entered select command.
Note that Chime's Select menu messages its commands, but none of the other Chime menu branches (such as Display and Color) do.
Command scripts can be saved into plain text files (.txt, ASCII or DOS format), then played back later in PE. Script filenames should always end in ".spt" (conventionally mapped to MIME type application/x-spt). In order to run them from your local disk, you must download PE and set a project folder (see the Project Folder link beneath the message window).
Chime-saved scripts. Chime can automatically generate a script that will produce the image displayed. Click on MDL (lower right corner of Chime) and select Edit, Copy Chime Script. Then paste the contents of the clipboard into a text editor (Windows: Wordpad or Word; Macintosh: BBEdit or Word), being sure to save it as plain text. Chime-saved scripts tend to be unnecessarily long and may take an unnecessarily long time to produce the image -- see the method for Shortening Scripts Saved from RasMol or Chime. For more information on creating web-deliverable tutorials with your scripts, see Presenting RasMol-Saved Scripts in Chime.
Initial view of 4-model NMR file 1abt.pdb, containing disulfide bonds.
Multiple models typically occur in PDB files resulting from NMR studies. They may also occur in PDB files designed to show conformational changes (morphs), or structural alignments of two or more molecules. Your first clue that you are looking at a multiple-model ensemble depends on whether PE is in expert mode or not.
In either mode, when you go to Advanced Explorer, a special option will appear at the top of the menu, NMR Model Selection.
Let's try some examples:
Although there is no tutorial here, there is extensive documentation. Be sure to press all the blue question marks and read the help! Notice the multiple, thin backbone traces. When the initial view appears, notice the report "Number of Models" in the message window (it doesn't appear at all for 1-model files).
Brief introductions to NMR methods for determining macromolecular structures are in the Nature of 3D Structural Data overview at the PDB, many biochemistry textbooks, and Branden & Tooze. Unlike X-ray crystallography, NMR methods yield an ensemble of models, all consistent with the experimental data. At the PDB can be found data files containing anywhere from two (1hpn) to more than 40 models (1yuj: careful, more than 6 megabytes!). The authors of these data files have made a judgement as to how many models to provide. Sometimes the authors also deposit a PDB file containing one model which is an energy-minimized average of the ensemble (1cfd [191 kilobytes] is an average of 1cfc [4.7 megabytes]).
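Outside of PE, you can check how many models a file contains by counting its MODEL records. A minimal sketch, assuming each model begins with a line whose record name is MODEL and that a file with no MODEL records holds a single model; the function name and sample text are illustrative.

```python
def count_models(pdb_text: str) -> int:
    """Count MODEL records; a file without any is treated as a single model."""
    n_models = sum(
        1 for line in pdb_text.splitlines() if line[:6].strip() == "MODEL"
    )
    return n_models if n_models else 1


# Made-up two-model sample, for illustration only.
SAMPLE = "\n".join([
    "MODEL        1",
    "ATOM      1  CA  GLY A   1       0.000   0.000   0.000",
    "ENDMDL",
    "MODEL        2",
    "ATOM      1  CA  GLY A   1       0.100   0.000   0.000",
    "ENDMDL",
])

if __name__ == "__main__":
    print("Number of Models:", count_models(SAMPLE))
```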
Occasionally multiple models will be published as the result of X-ray diffraction studies (1cm4 [451 kb]). More commonly, sidechains of certain residues may be given multiple positions (R134, Y192, P195, R207, L216, R219 in 1lkk [340 kb]), perhaps because of evidence for multiple conformations. Because Chime assigns bonds dynamically based on interatomic distances, and because typically these sidechain conformers are not designated as separate models in the PDB file, Chime creates "nests" of inappropriate bonds in these areas.
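Residues with multiple sidechain positions can be spotted before loading a file by looking at the alternate location indicator, which sits in column 17 of ATOM and HETATM records. The sketch below is illustrative only: the helper names are invented, the sample records are made up (coordinates are omitted for brevity), and only the column positions are taken from the PDB format.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def atom_line(serial: int, name: str, alt: str, res: str, chain: str, seq: int) -> str:
    """Build the first 26 columns of an ATOM record (coordinates omitted)."""
    return f"ATOM  {serial:>5} {name:<4}{alt}{res:>3} {chain}{seq:>4}"


def residues_with_alternates(pdb_text: str) -> Dict[Tuple[str, int, str], List[str]]:
    """Map (chain, residue number, residue name) to its alternate location codes."""
    alternates = defaultdict(set)
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            alt_loc = line[16]  # column 17: alternate location indicator
            if alt_loc != " ":
                key = (line[21], int(line[22:26]), line[17:20].strip())
                alternates[key].add(alt_loc)
    # Keep only residues that really have more than one position.
    return {key: sorted(codes) for key, codes in alternates.items() if len(codes) > 1}


# Made-up records: one arginine atom given two positions, one ordinary leucine atom.
SAMPLE = "\n".join([
    atom_line(1, " CZ", "A", "ARG", "A", 134),
    atom_line(2, " CZ", "B", "ARG", "A", 134),
    atom_line(3, " CA", " ", "LEU", "A", 10),
])

if __name__ == "__main__":
    print(residues_with_alternates(SAMPLE))
```

Residues reported this way are the places where you should expect the "nests" of inappropriate bonds.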
Illegal atoms in PDB files. The PDB format requires that every atom which is not a member of one of the standard 20 amino acids or 5 nucleotides be designated as a hetero atom (HETATM record). PDB-format files obtained from sources other than the Protein Data Bank proper sometimes contain "illegal" atoms which should be designated HETATM but are instead designated ATOM. To ensure that such atoms are obvious, they are rendered in ball and stick in PE's initial view (see shared\view1.spt, shared\view1nmr.spt).
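A quick check for such "illegal" atoms is to list the residue names that appear in ATOM records but are not standard. The sketch below is not how PE's view1.spt does it; the function names and sample records are invented. It assumes the nucleotide names A, C, G, T and U used in older PDB files; more recent files may name DNA residues DA, DC, DG and DT, which this sketch would flag.

```python
from typing import List

STANDARD_AMINO_ACIDS = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}
STANDARD_NUCLEOTIDES = {"A", "C", "G", "T", "U"}  # assumption: older naming
STANDARD_RESIDUES = STANDARD_AMINO_ACIDS | STANDARD_NUCLEOTIDES


def atom_line(record: str, serial: int, name: str, res: str, chain: str, seq: int) -> str:
    """Build the first 26 columns of an ATOM or HETATM record (coordinates omitted)."""
    return f"{record:<6}{serial:>5} {name:<4} {res:>3} {chain}{seq:>4}"


def suspicious_residue_names(pdb_text: str) -> List[str]:
    """Return nonstandard residue names found in ATOM (not HETATM) records."""
    flagged = set()
    for line in pdb_text.splitlines():
        if line[:6].strip() == "ATOM":
            res_name = line[17:20].strip()  # columns 18-20: residue name
            if res_name not in STANDARD_RESIDUES:
                flagged.add(res_name)
    return sorted(flagged)


# Made-up records: a cadmium ion wrongly given as ATOM, and water correctly as HETATM.
SAMPLE = "\n".join([
    atom_line("ATOM", 1, " CA", "ALA", "A", 1),
    atom_line("ATOM", 2, "CD", "CD", "A", 101),
    atom_line("HETATM", 3, " O", "HOH", "A", 201),
])

if __name__ == "__main__":
    print(suspicious_residue_names(SAMPLE))
```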
One-residue chains. There are over 90 cases of PDB files containing one-residue chains (e.g. 4csm, 4jdw, 1arj, 1rnm). Since there is no backbone trace for a single residue, the alpha carbons of amino acids and phosphorus atoms of nucleic acids are shown as small spheres in the initial view offered by PE. This guarantees that the rare one-residue chain will not be completely invisible. (An example with two 2-residue chains is 1dn8.)
"slab" -- a misnomer? The term "slab" means a slice with thickness. Many molecular graphics programs, such as Mage, have a true slab mode in which both the portions in front of, and behind, the slab are hidden. Since in RasMol & Chime there is only one slicing plane, instead of the two needed to make a true slab, the term "slab" is somewhat of a misnomer in these programs. (See set slabmode for some interesting variations which are available.)
use of chime's menu to look at range of residues. 1qmg has 4 heteros and all 20 aa's. 1d66 lacks GMF. NB residues < 3 chars absent, e.g. CD, A T G C.