Syllabus for
Protein 3D Structure Visualization & Structural Bioinformatics

Section of Computer Science
Graduate School of Frontier Biosciences, Osaka University (Japan), May 7, 11-14, 2015

This document is on-line at   workshops.molviz.org
Schedule of Times and Rooms


Lead Instructor: Professor Eric Martz
Author of FirstGlance in Jmol and Member of the Proteopedia Development Team.
Professor Emeritus, University of Massachusetts, Amherst -- emartz@microbio.umass.edu


Organizer and Primary Co-Instructor: Professor Keiichi Namba.
Co-Instructor Tohru Minamino.
Teaching assistants Yurika Yamada, Péter Horváth, Tomofumu Sakai.
Thanks to Kana Moriya Nishimura for arrangements.

Goals: This course will prepare students to understand and incorporate 3D macromolecular structure into their research and teaching. The principles of protein structure will be reviewed, including noncovalent bonds. Structural bioinformatics and genomics will be introduced. Students will learn what percentage of proteins have known 3D structures, and the importance of crystallographic models compared to homology models, or theoretical models.

Using laptop computers, students will learn how to find 3D protein molecular models for proteins in their research, how to construct homology models, and how to use FirstGlance in Jmol (adopted by the journal Nature) to investigate key structural features.
Protein structure will be related to function, evolutionary conservation and multiple-sequence alignments, and drug design. "Biological units" (specific oligomers) will be constructed and visualized. Students will learn how to prepare customized publication-quality molecular images, animations for Powerpoint slides, and how to effectively and intuitively communicate function-structure relationships with online molecular scenes in the Proteopedia.Org wiki.

Each student will prepare Powerpoint slides capturing the concepts and skills they have learned. All the software is web browser-based, easy to use, works on Windows or Mac OS X, requires no installation, is free and open-source, and is expected to be available for years to come.

    Get Started:
        Use the Chrome browser
        Installing and Enabling Java
    Each student please get started:

  1. If you brought your own laptop, you are welcome to use it. (iPads will be too slow.) Lab iMac login: ask instructor.
  2. Use the Chrome browser. If you do not have it, take a few minutes to install it (www.google.com/chrome). (Chrome will be faster/smoother than Safari, and Firefox is even slower for this software. Internet Explorer and Opera are unusably slow with this software.)
  3. In the Chrome browser, go to our syllabus: Workshops.MolviZ.Org.
  4. Now you can see this document in your browser. Go to Atlas.MolviZ.Org.
  5. In the Atlas, choose any molecule deemed Straightforward and click on the link to FirstGlance. After a minute or so to load, you should see a rotating molecule. Have a look around at the information, views and tools in FirstGlance.
  6. If you have any difficulty or the molecule does not appear, or does not rotate, ask for help!

  7. Installing and Enabling Java:

    You can see molecules without Java, using JSmol, in the two main tools of this course: FirstGlance.Jmol.Org and Proteopedia.Org. For some other websites that display molecules in Jmol, you will need to install and enable Java.

    Java will also make Proteopedia and FirstGlance run faster and smoother.

    In this course, we will use all 3 forms of Jmol that work in web browsers:
    • JSmol (no Java)
    • Jmol_S (signed Java applet)
    • Jmol (unsigned Java applet, deprecated)
    See four forms of Jmol.

    1. Follow these instructions for installing/updating and enabling Java.
    2. TAs and instructors will help you!
    3. Can you see this Gal4:DNA complex with Java (Jmol_S)?

    4. If you got Java to work, use the Preferences tab in FirstGlance to make Java the default. Then this link to Gal4:DNA should use Java (Jmol_S).

    5. Some sites use an older, unsigned Jmol Java applet. An example is this explanation of Protein Secondary Structure. Can you see the molecule? Make sure to Enable Unsigned Java Applets for this website. Later in the course, you may want to enable other websites.

    I. Protein Data Bank & PDB Codes
    Crystallographic Resolution
    Proteopedia.Org
  1. The Protein Data Bank (PDB) -- World Wide: -- USA:RCSB -- Japan:PDBj -- Europe:PDBe
  2. PDB identification code examples:
    • 1pgb a small protein domain, one chain.
    • 2hhd four protein chains with ligands.
    • 1d66 protein and DNA.
    • 104d DNA/RNA hybrid.
    • 9ins protein hormone.

    Figure: An electron density map at 2.5 Å resolution.

  3. Proteopedia.Org (Part I).
    1. Main page: green links connect text to molecular scenes.
    2. Molecules explained by users. Example:
    3. Explanations of structural biology terms and concepts, e.g. asymmetric unit, Protein Data Bank, hydrogen bonds, temperature value, etc. all at About Macromolecular Structure.
    4. Pages in Japanese, Chinese, Arabic, Turkish, Russian, etc.

  4. X-Ray Crystallography and Resolution
    • 85% of models in the PDB come from X-ray crystallography experiments.
    • X-ray crystallography produces an electron density map (EDM). USE WINDOWS!
    • The average uncertainty in an EDM is measured by its Resolution in Ångstroms:
      • 1.2 Å Excellent -- backbone and most sidechains very clear. Some hydrogens resolved.
      • 2.5 Å Good -- backbone and many sidechains clear.
      • 3.5 Å OK -- backbone and bulky sidechains mostly clear.
      • 5.0 Å Poor -- backbone mostly clear; sidechains not clear.
      • See the MOVIE.
    • See also Quality Assessment for Molecular Models.
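
    As a rough aid for reading the Resolution value reported for a model, the quality labels listed above can be sketched in a few lines of Python. This is only an illustration: the syllabus gives 1.2, 2.5, 3.5 and 5.0 Å as examples, and treating each one as the upper bound of its category is an assumption made here. The function name describe_resolution is hypothetical.

```python
def describe_resolution(resolution_angstroms: float) -> str:
    """Map a crystallographic resolution (in Å) to a rough quality label.

    Assumption: each example value in the syllabus (1.2, 2.5, 3.5 Å) is
    treated as the upper bound of its category, and anything worse than
    3.5 Å is labeled "Poor". These cutoffs are illustrative only.
    """
    if resolution_angstroms <= 0:
        raise ValueError("resolution must be a positive number of Ångstroms")
    if resolution_angstroms <= 1.2:
        return "Excellent"
    if resolution_angstroms <= 2.5:
        return "Good"
    if resolution_angstroms <= 3.5:
        return "OK"
    return "Poor"


# Smaller numbers mean less uncertainty in the electron density map.
for value in (1.0, 2.5, 3.0, 5.0):
    print(f"{value:.1f} Å -> {describe_resolution(value)}")
```

    Resolution is an average over the whole map, so parts of a "Good" model can still be poorly resolved.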

    II. Finding molecular models of interest.
    Begin Powerpoint Slides.

  1. Finding molecular models of interest:
    Each student: please find a 3D model of a molecule related to your research or interests.
    You will use the model that you select for the rest of the class, and for your Powerpoint report.
      Ideal Model
    • PDB code for an X-ray crystal structure with good resolution <= 2.5 Ångstroms.
    • Has protein.
    • Has ligand(s).
    • Has a value for Rfree (free R).

    How To Find Models:
    • Go to UniProt.Org.
    • Find your molecule. Ask for help if needed.
    • Click on the blue Sequences button at the left.
    • Make a note of the length of the full-length amino acid sequence.
    • Click on the blue Structure button in the left column. (Examples: yeast gal4; human pla2g6; zebrafish acetylcholinesterase.)

    • Follow instructions for the Simple search, and if necessary the Advanced search, at
      Is There An Empirical Model? ("Empirical" means determined by X-ray or NMR.)
    • For any model that you find, pay attention to how much of the full-length sequence it covers.

    • Please do NOT use a homology model for your Powerpoint Assignment.

    Browsing: If you can't find a model for your protein, or you don't have a molecule in mind, look at one of these sites and pick one.
  2. Begin your Powerpoint Slides (Later, you will email them to Prof. Martz).
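
    The "Ideal Model" checklist and the advice to note how much of the full-length sequence a model covers can be written down as a small Python sketch. The record type CandidateModel, its field names, and the example values are all hypothetical; you would fill them in by hand from what you read at UniProt and the PDB.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CandidateModel:
    """Facts about one PDB entry, typed in by hand (hypothetical record)."""
    pdb_code: str
    method: str                  # e.g. "X-ray" or "NMR"
    resolution: Optional[float]  # in Å; None if not reported
    has_protein: bool
    has_ligand: bool
    free_r: Optional[float]      # None if no Rfree value is reported
    residues_in_model: int       # residues of your protein present in the model
    full_length: int             # length of the full-length sequence


def coverage_percent(model: CandidateModel) -> float:
    """Percentage of the full-length sequence that the model covers."""
    if model.full_length <= 0:
        raise ValueError("full_length must be positive")
    return 100.0 * model.residues_in_model / model.full_length


def meets_ideal_criteria(model: CandidateModel) -> bool:
    """Apply the four 'Ideal Model' bullets from the syllabus."""
    return (
        model.method == "X-ray"
        and model.resolution is not None
        and model.resolution <= 2.5
        and model.has_protein
        and model.has_ligand
        and model.free_r is not None
    )


# Made-up values for illustration; "xxxx" is a placeholder, not a real entry.
example = CandidateModel("xxxx", "X-ray", 2.1, True, True, 0.24, 150, 600)
print(meets_ideal_criteria(example))
print(f"{coverage_percent(example):.0f}% of the full-length sequence")
```

    A model that passes the checklist may still cover only one domain, which is why coverage is reported separately.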

    III. Review of Protein Chemistry and Structure.
    Introduction to Structural Bioinformatics.

  1. Central Dogma: DNA → mRNA → Protein.     DNA structure in Jmol / Estructura del ADN
  2. 20 Amino acids
  3. Polypeptide chain geometry and steric restrictions
  4. Covalent and non-covalent chemical bonds
  5. Typical hydrogen bond within a protein: hydrogen donor atom is covalently bonded to hydrogen; acceptor atom is not.
  6. Secondary Structure
  7. Folding: hydrophobic collapse
  8. Protein folds cannot be reliably predicted from sequence alone (using ab initio theory).

  9. Introduction to Structural Bioinformatics
    • Why do we care about 3D macromolecular structure?
    • What are 3D structure data?
    • Where do 3D structure data come from?
    • How much 3D structure knowledge do we have?
    • Primary and Derived 3D Structure Databases
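
    For review, the Central Dogma in item 1 (DNA, then mRNA, then protein) can be sketched in Python. The sketch assumes the standard genetic code, assumes the input is the coding strand, and assumes the reading frame starts at the first base. The function names transcribe and translate are chosen here for illustration.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """DNA coding strand -> mRNA (same sequence, with U in place of T)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """mRNA -> one-letter protein sequence, stopping at the first stop codon."""
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        codon = mrna[start:start + 3].upper().replace("U", "T")
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCCAAATAA"  # made-up example: Met, Ala, Lys, stop
mrna = transcribe(dna)
print(mrna)
print(translate(mrna))
```

    The sequence obtained this way is only the primary structure; as item 8 notes, the fold cannot be reliably predicted from it alone.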

    IV. FirstGlance in Jmol for exploring any macromolecule.
    FirstGlance in Jmol (Part I).

  1. To start FirstGlance, go to FirstGlance.Jmol.Org and enter the PDB code, or upload your homology model.

    Unusually large models may take a long time to display and be sluggish to manipulate in FirstGlance. For such cases, using Java will enable much better performance in FirstGlance. Java is not needed for most models. Ask the instructor for advice. Here are instructions for Installing and Enabling Java.

  2. Explore 1izh in FirstGlance.
    1. Introduction
    2. Molecule Information Tab
      1. Year, Method.
      2. Resolution.
      3. Free R.
      4. Chain details.
      5. Sequences: Crystallized vs. Full Length. Alignment at UniProt (1d66).
      6. Abstract.
      7. Citations.
      8. Text contents of the PDB file.
    3. Views tab
      1. Top 3 rows of views:
        Secondary Structure / Cartoon / N->C Rainbow
        Composition / Hydrophobic/Polar / Charge..
        Local Uncertainty / Vines / Thin Backbone
      2. Buttons.
        Ligands+ / Water / Slab
          Potassium channel (1R3J) showing membrane surface planes (from OPM).
      3. 1pgb: Hydrophobic core: Hydrophobic/Polar, then Slab.
      4. 1pgb: Amphipathic helices and strands. (In FirstGlance, use Isolate.. on each end of a helix or strand.)
      5. Compare with the Hydrophobic/Polar View of 1bl8 or 7ahl.
    4. Resources tab
      1. See lipid bilayer boundaries (1bl8 or 7ahl).
    5. 1pgb: Tools tab with Views.
      1. Salt bridges.
      2. Cation-pi interactions.
      3. Distances.
      4. Salt bridges in Charge View (Red sidechain touching blue sidechain).
      5. Charges with Slab on.
      6. Sidechain distributions in Vines View (rings buried; charges on surface).
      7. Find (review Chart of AA): PHE, (VAL,LEU,ILE), ASN, THR

  3. Explore 9ins in FirstGlance.
    1. Tools tab
      1. Disulfides/S/Se

  4. Explore 3onz in FirstGlance. (Letter O not numeral zero!)
    1. Molecule Information Tab
      1. Two chains, not sequence identical.
      2. Missing residues.
      3. Ligands+ and non-standard residues
    2. Views tab
      1. Ligands button; smaller ligands.
      2. Hide (chain, toluene, isolated His).
    3. Tools tab
      1. Contacts
      2. Non-covalent interactions for HEM in chain A (blue chain).
    4. Resources tab
      1. Biological unit.


  5. Continue preparing slides to answer the Powerpoint Questions.
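
    Before entering a PDB code into FirstGlance, a quick format check can catch typing slips. The Python sketch below assumes the classic four-character form: a digit from 1 to 9 followed by three letters or digits. That pattern is an assumption of this sketch, and the function name looks_like_pdb_code is hypothetical.

```python
import re

# Assumed pattern for classic PDB codes: one digit 1-9, then three
# letters or digits. Upper and lower case are both accepted.
_PDB_CODE = re.compile(r"[1-9][A-Za-z0-9]{3}")


def looks_like_pdb_code(text: str) -> bool:
    """Return True if text has the shape of a four-character PDB code."""
    return _PDB_CODE.fullmatch(text.strip()) is not None


for code in ("1pgb", "104d", "3onz", "3onzz", "pgb1"):
    print(code, looks_like_pdb_code(code))
```

    A format check cannot catch the letter O versus numeral zero mix-up noted for 3onz, because the mistyped code is also well-formed. Only looking up the entry will show whether you reached the molecule you intended.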

    V. Introduction to Multiple Sequence Alignment (MSA) and Conservation
    ConSurf Server
    Structure of Atomic Coordinate ("PDB") Files
  1. Evolutionary conservation identifies functional sites in protein molecules.

    1. See Introduction to evolutionary conservation.

      Effect of mutation on protein function | Genetic consequence                           | Example
      Function LOST**                        | CONSERVED: mutation LOST from gene pool       | R133C*
      None                                   | NOT conserved: mutation remains in gene pool  | E143?*
      * in methyl CpG binding protein 2 (MeCP2), 3c2i:

         ASASPKQRRS IIRDRGPMYD DPTLPEGWTR KLKQRKSGRS AGKYDVYLIN
         PQGKAFRSKV ELIMYFEKVG DTSLDPNDFD FTVTGRGSPS RHHHHHH
               ^          ^

      ** R133C causes Rett syndrome, a severe neurological disorder.
      Gray: disordered in crystal, absent in model 3c2i.

    2. In Proteopedia, show Evolutionary Conservation. Example: 3c2i
    3. Enzyme example: ConSurf-colored sequence -- enolase 4enl in Proteopedia -- 4enl ConSurf Result -- enolase in Wikipedia.

    4. Multiple sequence alignments reveal conservation: MSA for 4ENL in black and white (printed handout).
    5. Detail of MSA with color

    6. ConSurf Mechanism.   (Details of Mechanism).
    7. There are two ConSurf Servers:
      1. ConSurfDB (DataBase) NOT WORKING IN APRIL-MAY 2015.
        • Pre-calculated for every chain in the PDB.
        • Results are shown in Proteopedia.
        • Multiple Sequence Alignments typically include proteins of more than one function, so some conservation may be hidden.
      2. ConSurf
        • Set up each job by hand.
        • Easily select sequences for a single protein function, revealing conservation (within a family of proteins performing a single function) that may be hidden in ConSurfDB.

  2. Atomic Coordinate Files
    • Formats
      • Crystallographers: "PDB format" (Human-readable; based on 1970's system that used 1928-design paper punch cards)
      • Protein Data Bank: mmCIF (macromolecular Crystallographic Information File)
      • US National Center for Biotechnology Information: ASN.1
    • Structure of PDB Files:   What are 3D structure data?
    • PDB files are plain text -- they can be edited with a text editor.
    • Examine PDB file for 4phv at pdb.org
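
    Because PDB files are plain text with fixed-width columns (a legacy of punch cards), one atom line can be read by slicing character positions. The Python sketch below follows the column layout of the ATOM record in the PDB format. The sample line uses made-up coordinates, not values copied from 4phv, and the function name parse_atom_line is hypothetical.

```python
# One ATOM record assembled piece by piece so the columns are explicit.
# The numbers are made up for illustration.
SAMPLE_LINE = (
    "ATOM  "      # columns 1-6: record name
    "    1"       # 7-11: atom serial number
    " "           # 12: blank
    " N  "        # 13-16: atom name
    " "           # 17: alternate location indicator
    "MET"         # 18-20: residue name
    " "           # 21: blank
    "A"           # 22: chain identifier
    "   1"        # 23-26: residue sequence number
    "    "        # 27-30: insertion code, then blanks
    "  27.340"    # 31-38: x coordinate (Å)
    "  24.430"    # 39-46: y coordinate (Å)
    "   2.614"    # 47-54: z coordinate (Å)
    "  1.00"      # 55-60: occupancy
    "  9.67"      # 61-66: temperature value (B factor)
    "          "  # 67-76: blank
    " N"          # 77-78: element symbol
)


def parse_atom_line(line: str) -> dict:
    """Extract the main fields from one ATOM or HETATM line."""
    if not line.startswith(("ATOM", "HETATM")):
        raise ValueError("not an ATOM or HETATM record")
    return {
        "serial": int(line[6:11]),
        "atom_name": line[12:16].strip(),
        "residue_name": line[17:20].strip(),
        "chain_id": line[21],
        "residue_number": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
        "occupancy": float(line[54:60]),
        "temperature_value": float(line[60:66]),
        "element": line[76:78].strip(),
    }


print(parse_atom_line(SAMPLE_LINE))
```

    Python slices count from zero and exclude the end position, so columns 31-38 become line[30:38].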

    VI. Evolutionary Conservation with ConSurf-DB
    Authoring Molecular Scenes in Proteopedia
    Publication-Quality Images & Animations for Powerpoint
    As you complete each section today, record your results in your Powerpoint Slides.

  1. Evolutionary Conservation: Follow the instructions for Question 14 to show conservation in your PDB code.
  2. If you have a serious research interest in the conservation pattern of your molecule (not required):
    1. You will want to do a ConSurf run where you limit the multiple sequence alignment to proteins with the same function as your molecule. Instructions.
    2. Using FirstGlance from ConSurf, you can see the conservation levels of amino acids contacting a moiety of interest.


  3. Author two scenes in Proteopedia.Org (Part II):
    1. See the help and movies under Want to Contribute? at the Main Page of Proteopedia.

    2. Login as "Student". Ask for the password.
    3. Go to the page Sandbox Reserved NN, where NN is the number assigned to you. For example, if you are assigned number 12, go to the page titled Sandbox Reserved 12.
    4. Click the tab, at the top, edit this page.
    5. Keep the {{Template:...}} at the top, but delete anything else that you did not put in this page.

    6. Click the 3D button (above the box) to insert a Jmol.
    7. Put your PDB code in the load parameter of the applet tag.
    8. Save the page (click Save page twice). You should see your molecule.

    9. Edit again, and show the Scene authoring tools.
    10. Use the load molecule tab to load your PDB code.
    11. Customize your scene: select, represent (display), color, label as you wish.
    12. Option: If you wish, you may copy a scene from FirstGlance into Proteopedia: Instructions.
    13. Use the save scene tab to save your scene.
    14. Paste the scene tag into the box above (the page text).
    15. Save the page (click Save page twice).

    16. Try the green link you made.
    17. Put a snapshot into a Powerpoint slide.
    18. Create a second scene and green link for your second Proteopedia Powerpoint slide.

      If you would like to contribute permanent content to Proteopedia, please apply for an account and password: click on request account.

  4. Continue preparing slides to answer the Powerpoint Questions.
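
    To make the idea of conservation in a multiple sequence alignment concrete, here is a deliberately simple Python sketch that scores each alignment column by the fraction of sequences sharing its most common residue. This is not ConSurf's method, which is more sophisticated; it is a simpler stand-in for illustration. The toy alignment is made up, and the function name column_conservation is hypothetical.

```python
from collections import Counter
from typing import List


def column_conservation(aligned_sequences: List[str]) -> List[float]:
    """Score each column: fraction of sequences with the most common residue.

    Gaps ("-") never count as the conserved residue, but they do count in
    the denominator, so a gappy column scores lower.
    """
    if len({len(seq) for seq in aligned_sequences}) != 1:
        raise ValueError("need at least one sequence, all of the same length")
    scores = []
    for column in zip(*aligned_sequences):
        residues = [residue for residue in column if residue != "-"]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


# Made-up toy alignment of four sequences, five columns wide.
toy_alignment = ["MKRLA", "MKRVA", "MRRIA", "MK-LA"]
for position, score in enumerate(column_conservation(toy_alignment), start=1):
    print(position, f"{score:.2f}")
```

    In the toy alignment, columns 1 and 5 are fully conserved and column 4 is the most variable. A simple identity score like this treats a Leu/Ile/Val swap the same as any other change, which is one reason real tools do more.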

    VII. FirstGlance in Jmol -- Part II
    Solution NMR
    Isoelectric Point
    Intrinsically Unstructured Proteins
    FirstGlance in Jmol (Part II)
  1. Solution Nuclear Magnetic Resonance (NMR)
    • Gives an ensemble of multiple models consistent with the data.
      Examples: 1abt, 1cfc, 1jsa.
    • Differences between models can reflect flexible thermal motion in solution, or simply uncertainty due to a lack of enough data. Nothing in the PDB file tells you which is the case. You need to contact the authors.
    • There is nothing in the PDB file that measures reliability. (Unlike X-ray data, where Resolution, R, and R-free measure reliability.)
  2. Charge:
    • 1d66 and nucleotide binding.
    • Challenge: how can protein charge be changed in seconds, without changing the pH?
    • Calculate the isoelectric point (pI) and charge at pH 7 for one chain of your protein:
      1. Follow the instructions under Charge.. in the Views tab.

  3. Intrinsically Unstructured / Natively Disordered Proteins
      About 10% of proteins are thought to be fully disordered to support their functions, and 40% of eukaryotic proteins have at least one long disordered region. Examples.
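
    For those curious how an isoelectric point and a charge at pH 7 (item 2 above) can be estimated from sequence alone, here is a minimal Python sketch using the Henderson-Hasselbalch relation. It is not a description of how FirstGlance does its calculation. The pKa values are approximate and are an assumption of this sketch; published tables differ, and real pKa values shift with the local environment in a folded protein.

```python
# Assumed approximate pKa values; other tables give somewhat different numbers.
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERMINUS_PKA = 9.0
C_TERMINUS_PKA = 2.0


def net_charge(sequence: str, ph: float) -> float:
    """Estimated net charge of one chain (one-letter sequence) at a given pH."""
    seq = sequence.upper()
    # Fraction protonated (positively charged) for basic groups.
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERMINUS_PKA))
    for residue, pka in POSITIVE_PKA.items():
        charge += seq.count(residue) / (1.0 + 10 ** (ph - pka))
    # Fraction deprotonated (negatively charged) for acidic groups.
    charge -= 1.0 / (1.0 + 10 ** (C_TERMINUS_PKA - ph))
    for residue, pka in NEGATIVE_PKA.items():
        charge -= seq.count(residue) / (1.0 + 10 ** (pka - ph))
    return charge


def isoelectric_point(sequence: str, tolerance: float = 0.001) -> float:
    """Find the pH where the estimated net charge is zero, by bisection."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        middle = (low + high) / 2.0
        if net_charge(sequence, middle) > 0:
            low = middle   # still positive: the pI lies at higher pH
        else:
            high = middle
    return (low + high) / 2.0


peptide = "ACDEFGHIKLMNPQRSTVWY"  # made-up example: one of each amino acid
print(f"charge at pH 7: {net_charge(peptide, 7.0):+.2f}")
print(f"estimated pI: {isoelectric_point(peptide):.2f}")
```

    Bisection works here because the estimated charge only decreases as pH rises. The sketch treats every cysteine as ionizable, which overstates the negative charge when cysteines are in disulfide bonds.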

    VIII. Flagellar Assembly
    Structural Bioinformatics and Genomics.
    Homology (Comparative) Modeling
  1. Introduction to bacterial flagellar assembly:

  2. Structural Genomics: Worldwide Protein 3D Structure Knowledge
    1. How are 3D macromolecular structures obtained? Crystallography, NMR, and homology modeling.
    2. What fraction of the human proteome has known structure? A few percent.
    3. Is Structural Genomics the answer? Not in the next few years.
    4. Intrinsically unstructured proteins.

  3. Modeling vs. Visualization

  4. Homology (comparative) modeling: Introduction.
    1. Automated homology modeling: submit sequences (after removing any His tag!) to Swiss-Model (click on Automated Mode).
    2. Compare homology models from various methods at LOMETS. Here is an example study comparing multiple homology models. See especially comparisons in supplementary figures S4-S6.
    3. See if a structure of your molecule is in the Structural Genomics pipeline. Submit your sequence to the SG TargetDB. (Ask for help interpreting the results.)
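
    The advice in item 4.1 to remove any His tag before submitting a sequence for homology modeling can be sketched in Python. The sketch assumes a tag is a run of six or more histidines at either end of the chain (optionally after an initial Met); linker residues next to the tag are left in place, so check the result by eye. The function name remove_his_tag is hypothetical. The example is the crystallized MeCP2 sequence shown earlier in this syllabus for 3c2i.

```python
import re

# Assumption: a His tag is a terminal run of at least six histidines.
_C_TERMINAL_TAG = re.compile(r"H{6,}$")
_N_TERMINAL_TAG = re.compile(r"^(M?)H{6,}")


def remove_his_tag(sequence: str) -> str:
    """Strip a terminal His tag from a one-letter amino acid sequence."""
    cleaned = "".join(sequence.split()).upper()  # drop spaces and line breaks
    cleaned = _C_TERMINAL_TAG.sub("", cleaned)
    cleaned = _N_TERMINAL_TAG.sub(r"\1", cleaned)  # keep an initial Met
    return cleaned


crystallized = """
ASASPKQRRS IIRDRGPMYD DPTLPEGWTR KLKQRKSGRS AGKYDVYLIN
PQGKAFRSKV ELIMYFEKVG DTSLDPNDFD FTVTGRGSPS RHHHHHH
"""
without_tag = remove_his_tag(crystallized)
print(len(without_tag), without_tag[-10:])
```

    Histidine runs inside the sequence are left alone, since only the ends are matched.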

    IX. Publication Quality Images and Animations with Polyview-3D
    Finishing Powerpoint Questions
    Animation from Polyview-3D.
  1. Make Animated PowerPoint Slides and Publication-Quality Images easily with Polyview-3D.

    • PyMol: popular with crystallographers. Beautiful views but not user friendly.
    • Just fill out an easy form, submit it, and (shortly) voila!
    • Center and orient the molecule as you wish.
    • Coloring can be customized. Highlight residues that you specify.
    • Accepts PDB files obtained from ConSurf to color your figures or slides by evolutionary conservation. Instructions.

    • Once you have an animation from Polyview-3D:
    • Windows Powerpoint: Simply drag the animation directly from the Polyview-3D web page and drop it into a Powerpoint slide.
    • Mac OS X Powerpoint: (This works in Powerpoint:Mac 2008.)
      1. Control-Click (Right-Click) on the animation in the Polyview-3D web page, and select Save Image As ...
      2. Save the image to the Desktop.
      3. Drag the image file (filename ending in .gif) from the Desktop and drop it into a Powerpoint slide.
      4. The animation will run only when the slide is projected.

  2. You are now prepared to finish your Powerpoint Questions. Please email the completed PPT file to
    emartz AT microbio DOT umass DOT edu.


Additional Resources.
    Probably we will not have time in class to spend on these resources. Links are provided here in case you are interested to look at these later.
  1. Jmol in Scientific Journals
    1. Interactive 3D Complements in Proteopedia (pages that complement journal articles, similar to supplementary materials).
    2. FirstGlance in Jmol is used in Nature (via the buttons) and other journals.


    Simplified SV40 Virus Capsid.
  2. Specific Oligomers vs. Crystal Contacts
  3. Animations & Morphing

    Lac repressor bending the DNA operon.


    For Teachers and Future Teachers

  4. High School Teacher's Resources.

  5. Bird Flu: N1 vs. Tamiflu Lesson Plan:
    • See links to background, lesson plan, morph animations of induced fit, and a cavity near Tamiflu at Proteopedia: Eric Martz's Favorites

  6. MolviZ.Org
    1. DNA, Hemoglobin, Antibody
    2. Lipid Bilayers and Gramicidin Channel
    3. Collagen
    4. Water & Ice & hydrogen bonding
    5. Toobers in Science Education

  7. About Protein Structure
  8. Building a web page that shows your favorite molecules for research or teaching.
  9. T shirts and mugs!


Keep in touch!