RNA Secondary Structure Prediction with Mfold

IGS 350/550 Computer Laboratory

M. Rice / M. Weir

(modified from a module developed by Kelly Thayer)


Programs

Database

Objectives

RNA molecules play several important roles in the cell, including

RNA secondary structures play important roles in these functions. In class, we will discuss the Nussinov algorithm for predicting RNA structures. In today's lab, we will use Zuker's Mfold algorithm to predict structures of a tRNA. Then the predicted structures will be compared to the known crystal structure.


Step 1. Manual Prediction of Secondary Structure

See if you can predict possible secondary structures of the following portion of tRNAPhe: UCCUGUGUUCGAUCCACAGAA. Predicting the stem-loop structure of this short sequence by hand is fairly easy. However, predicting the structure of longer sequences by hand quickly becomes impractical.
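To check your manual prediction, here is a minimal Python sketch of the Nussinov base-pair maximization approach mentioned in the Objectives. It is an illustration only, not the Mfold algorithm: it simply maximizes the number of nested pairs and ignores free energies. Allowing G-U wobble pairs and requiring at least 3 unpaired bases in a hairpin loop are assumptions made for this sketch.

```python
def can_pair(a, b):
    """Watson-Crick pairs plus the G-U wobble pair (an assumption of this sketch)."""
    return {a, b} in ({"A", "U"}, {"G", "C"}, {"G", "U"})


def nussinov(seq, min_loop=3):
    """Return (max_pairs, pairs); pairs is a sorted list of 1-based (i, j)."""
    n = len(seq)
    if n == 0:
        return 0, []
    # best[i][j] = maximum number of nested pairs in seq[i..j]
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = max(best[i + 1][j], best[i][j - 1])  # i or j unpaired
            if can_pair(seq[i], seq[j]):
                score = max(score, best[i + 1][j - 1] + 1)  # i pairs with j
            for k in range(i + 1, j):  # split into two independent parts
                score = max(score, best[i][k] + best[k + 1][j])
            best[i][j] = score

    # Trace back one optimal set of pairs (there may be several).
    pairs = []
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue
        if best[i][j] == best[i + 1][j]:
            stack.append((i + 1, j))
        elif best[i][j] == best[i][j - 1]:
            stack.append((i, j - 1))
        elif can_pair(seq[i], seq[j]) and best[i][j] == best[i + 1][j - 1] + 1:
            pairs.append((i + 1, j + 1))
            stack.append((i + 1, j - 1))
        else:
            for k in range(i + 1, j):
                if best[i][j] == best[i][k] + best[k + 1][j]:
                    stack.append((i, k))
                    stack.append((k + 1, j))
                    break
    return best[0][n - 1], sorted(pairs)


if __name__ == "__main__":
    count, pairs = nussinov("UCCUGUGUUCGAUCCACAGAA")
    print(count, pairs)
```

Because the sketch counts pairs rather than energies, it may report pairs that a thermodynamic method such as Mfold would reject. Comparing its output with your hand prediction and with the Mfold result in Step 2 is a useful exercise.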


Step 2.  Using Mfold to Predict Secondary Structure

To fold a longer RNA sequence, go to http://www.bioinfo.rpi.edu/applications/mfold/old/rna/ and follow these instructions:

For small RNA sequences, you can perform an immediate calculation that displays the outputs in your browser's window. Also, you can modify the output display features as you like, e.g. image resolution, structure format, base numbering frequency, and structure annotation.

At the bottom of the form, click the Fold RNA button.

The output should look like the following with three predicted RNA structures.


Step 3. Evaluating the Folding Results

Scroll down to the list of structures where several files are presented for each structure.   For example, the .ct file contains a listing of the base pairs (columns 1 and 5).
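If you save a .ct file to disk, the base pairs can be pulled out of columns 1 and 5 with a few lines of Python. The sketch below assumes the usual .ct layout: a header line that starts with the sequence length, followed by one line per nucleotide in which column 5 holds the pairing partner (0 if unpaired). The sample text and its dG value are made up for illustration; they are not Mfold output.

```python
def parse_ct(text):
    """Return a sorted list of 1-based base pairs (i, j) with i < j."""
    lines = [ln.split() for ln in text.strip().splitlines()]
    header, rows = lines[0], lines[1:]
    length = int(header[0])  # first header field is assumed to be the length
    pairs = set()
    for fields in rows[:length]:
        i = int(fields[0])   # column 1: nucleotide index
        j = int(fields[4])   # column 5: pairing partner, 0 if unpaired
        if j > i:            # record each pair once
            pairs.add((i, j))
    return sorted(pairs)


# Hypothetical 10-nt hairpin, for illustration only.
SAMPLE_CT = """\
10 dG = -1.0 toy_hairpin
1 G 0 2 10 1
2 G 1 3 9 2
3 G 2 4 8 3
4 A 3 5 0 4
5 A 4 6 0 5
6 A 5 7 0 6
7 A 6 8 0 7
8 C 7 9 3 8
9 C 8 10 2 9
10 C 9 0 1 10
"""

if __name__ == "__main__":
    print(parse_ct(SAMPLE_CT))
```

To use it on your own results, read the saved file with `open(path).read()` and pass the text to `parse_ct`.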

The free energy value (dG) indicates how energetically favorable each predicted structure is.  The more negative the value, the more stable the structure; structures with positive dG are unfavorable.  Which of your three structures is the most stable, and by how much?
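One way to put a dG difference in perspective is to convert it to an equilibrium population ratio using the Boltzmann relation, ratio = exp(-(dG_a - dG_b) / RT). The Python sketch below does this under two assumptions: that the structures are in simple equilibrium with each other, and that folding is at 37 degrees C (310.15 K). The dG values in the example are hypothetical placeholders; substitute the values from your own output.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def relative_population(dg_a, dg_b, temp_k=310.15):
    """Equilibrium ratio [A]/[B] for free energies dg_a, dg_b in kcal/mol."""
    return math.exp(-(dg_a - dg_b) / (R_KCAL * temp_k))


def rank_structures(dgs):
    """Return [(name, dG, dG minus best dG)], most stable first."""
    best = min(dgs.values())
    ordered = sorted(dgs.items(), key=lambda item: item[1])
    return [(name, dg, dg - best) for name, dg in ordered]


if __name__ == "__main__":
    # Hypothetical dG values (kcal/mol); replace with your Mfold results.
    example = {"structure 1": -22.0, "structure 2": -21.0, "structure 3": -19.5}
    for name, dg, ddg in rank_structures(example):
        ratio = relative_population(min(example.values()), dg)
        print(f"{name}: dG = {dg:.1f}, ddG = {ddg:.1f}, best/this = {ratio:.1f}")
```

Even a difference of about 1 kcal/mol corresponds to a several-fold difference in population under these assumptions, which is why small dG differences between predicted structures deserve attention.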

To look at a predicted structure, click on jpg.   You can adjust the output from there, with the same options that are available on the query page. [The results will appear in a new window which may be on top; otherwise, look for it on the menu bar at the bottom of your screen.]

How do the structures predicted by Mfold compare to your manual predictions (from Step 1)?

Set the option button and then click on the image to redraw the structure with the following options:

Clicking on jpg also opened a window called Loop Free Energy Decomposition.  The Helix values are the sum of their stacking interactions.  What is the single most stabilizing interaction?  Which interactions are destabilizing?

You can also compare the dot plots for the structures.  At the bottom of the output page, go to the Dot plot folding comparison for tRNA and select the following options:

This shows the dot plot for the RNAs.  The color coding shows which structure the dot pertains to, including overlap possibilities.  (The magnification options should not be needed here because the RNA is small enough that all dots are clearly visible.)

Adjust the percent suboptimality parameter until you find two other structures that are within 3 kcal/mol of the current most stable structure.  Could some of these alternative structures be the functional one (or ones)?  The secondary structure prediction with the lowest free energy is not necessarily the functional conformation of the RNA.

In our example, some of the tRNA nucleotides are covalently modified (see Alberts et al. and Lodish et al.). How are the predicted structures affected if we use the following unmodified sequence?

GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGAUCCACAGAAUUCGCAA


Step 4. Comparison of 2D prediction to a Crystal Structure

We will now look for the 3D structure in the Nucleic Acid Database at Rutgers and compare the secondary and tertiary structures.  From the NDB homepage at http://ndbserver.rutgers.edu/, search for the NDB ID TRNA06 -- this is the ID for PHE tRNA in yeast. In the "Coordinates + Structure Data" section, click on "Biological Unit coordinates (PDB format)". This retrieves the information about the crystal structure in a PDB format. Paste the text into a Notepad file and save it with the name 1TRA.pdb.
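Before opening the file in a viewer, it can help to confirm which residues it contains. The following Python sketch lists the residues in a PDB-format file, assuming the standard fixed-column layout (residue name in columns 18-20, chain ID in column 22, residue number in columns 23-26). The helper `fake_atom_line` and its sample residues are hypothetical and exist only to make the example self-contained; they are not taken from 1TRA.

```python
def residues_from_pdb(lines, chain=None):
    """Return [(residue_number, residue_name)] in file order, one per residue."""
    residues = []
    last_key = None
    for line in lines:
        if len(line) < 26 or line[:6].strip() not in ("ATOM", "HETATM"):
            continue
        res_name = line[17:20].strip()   # columns 18-20
        chain_id = line[21]              # column 22
        res_seq = int(line[22:26])       # columns 23-26
        if res_name == "HOH":            # skip water molecules
            continue
        if chain is not None and chain_id != chain:
            continue
        key = (chain_id, res_seq)
        if key != last_key:              # new residue starts here
            residues.append((res_seq, res_name))
            last_key = key
    return residues


def fake_atom_line(record, serial, atom, res_name, chain_id, res_seq):
    """Build the first 26 columns of a PDB coordinate line (illustration only)."""
    return f"{record:<6}{serial:>5} {atom:<4} {res_name:>3} {chain_id}{res_seq:>4}"


if __name__ == "__main__":
    sample = [
        fake_atom_line("ATOM", 1, " P", "G", "A", 1),
        fake_atom_line("ATOM", 2, " C1'", "G", "A", 1),
        fake_atom_line("ATOM", 3, " P", "C", "A", 2),
        fake_atom_line("HETATM", 4, " P", "H2U", "A", 16),
    ]
    print(residues_from_pdb(sample))
```

For your saved file, something like `residues_from_pdb(open("1TRA.pdb"))` should work. Residue names longer than one letter usually indicate modified nucleotides, which is worth keeping in mind when you compare against the unmodified sequence used in Step 3.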

[Alternatively, you can obtain the pdb file using Protein Explorer.
Run the Protein Explorer application at http://molvis.sdsc.edu/protexpl/index.htm. From the Startup options, go to "Find Any Molecules PDB ID Code". There are several search options -- for example, molecules R US allows you to enter key words. Enter appropriate key words for tRNA for PHE in yeast. (The pdb code is 1TRA -- click on this link; you can now download the raw text or Motifs-RasMol.) The pdb file can be viewed using RasMol.]

The following instructions show how to use this file to view the molecule in RasMol. (RasMol is available on campus, or you can download your own copy from http://www.umass.edu/microbio/rasmol/.) [Also, with appropriate MS Explorer configuration, you can use the Protein Explorer web interface to view structures instead of RasMol.  Protein Explorer may not be as widely available on campus, but it can be run from the Protein Explorer web server.]

Using RasMol

Run the RasMol application and open your file by dragging its icon over the black display box.  There is one difference between the best predicted structure of your tRNA and the crystal structure: do you notice any base pair that Mfold missed?  Check what was observed in the crystal structure.  To type commands in RasMol, open the command line by clicking the icon called RasMol command line on the menu bar at the bottom of your screen.  You can then use typed commands together with the pull-down menus to adjust your view.

To obtain a labeled view of the structure in RasMol, select the Display option "Wireframe", the Colours option "Group", and the Options "Labels". You can see how the clover leaf structure folds on itself -- two of the stem loops fold back towards the main structure.

In addition, here are some useful commands in RasMol that you can use in comparing the known tertiary (3D) structure of the tRNA and the predicted secondary (2D) structures. Equivalent operations are also possible in Protein Explorer.

The preceding figure demonstrates what you can do with a combination of these commands.

By selectively coloring the end points of predicted stems, compare the predicted base pairing with the actual base pairing. How do the actual and predicted structures differ?

[From Alberts et al. Molecular Biology of the Cell]

After looking at the crystal structure, can you speculate why the structure predicted by Mfold differs from the crystal structure?
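The pair-by-pair comparison in this step can also be done programmatically once you have written down both sets of base pairs. The Python sketch below compares a predicted pair list with a reference pair list and reports shared, missed, and extra pairs. The pair lists in the example are hypothetical placeholders, not the actual tRNAPhe pairs; enter the pairs you observe yourself.

```python
def compare_pairs(predicted, reference):
    """Compare two collections of base pairs given as (i, j) tuples."""
    pred = {tuple(sorted(p)) for p in predicted}
    ref = {tuple(sorted(p)) for p in reference}
    shared = pred & ref
    return {
        "shared": sorted(shared),
        "missed": sorted(ref - pred),   # in the reference, not predicted
        "extra": sorted(pred - ref),    # predicted, not in the reference
        # fraction of reference pairs that were predicted
        "sensitivity": len(shared) / len(ref) if ref else 0.0,
        # fraction of predicted pairs that are in the reference
        "ppv": len(shared) / len(pred) if pred else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical pair lists, for illustration only.
    predicted = [(1, 20), (2, 19), (3, 18), (5, 15)]
    reference = [(1, 20), (2, 19), (3, 18), (4, 17)]
    for key, value in compare_pairs(predicted, reference).items():
        print(key, value)
```

Note that a secondary structure comparison of this kind only covers nested base pairs; tertiary contacts seen in the crystal structure will show up as "missed" pairs, which may help you answer the question above.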


Step 5. Use Another RNA Sequence

You can use the list of RNA databases at the beginning of this lab to search for other RNA molecule sequences.   Find a molecule of interest to you and answer the following questions:

For example, you could examine yeast 5S ribosomal RNA:

5'
GGUUGCGGCCAUAUCUACCAGAAAGCACCGUUUCCCGUCCGAUCAACUGUGUUAAGCUGGUAGAGCCUGACCGAGUAGUGUAUGGGUGACCAUACGCGAAACUCAGGUGCUGCAAUCU
3'

Compare your predicted structure with the following accepted 2D structure from structural studies (http://www.rna.icmb.utexas.edu).


Step 6. Review of Lab Objectives


Assignment
  1. Print the picture with the best prediction for tRNAPhe and label the picture with energy values listed for helices and loops.
  2. Draw your optimal predicted secondary structure for tRNAPhe above using the (i) circle and (ii) parentheses representations.
  3. Do the modeling algorithms always predict structures correctly? If not, what is the value of these algorithms? Can you imagine ways to improve the algorithms -- what additional information might you incorporate?
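For item 2 of the assignment, the parentheses representation can be generated from a list of base pairs, which is a convenient way to check a hand-drawn version. The following Python sketch converts between a pair list and a parentheses string; it assumes a nested structure without pseudoknots, and the hairpin in the example is a toy structure, not tRNAPhe.

```python
def to_dot_bracket(length, pairs):
    """Write 1-based base pairs as a string of '(', ')' and '.' characters."""
    chars = ["."] * length
    for i, j in pairs:
        i, j = min(i, j), max(i, j)
        chars[i - 1] = "("   # 5' partner opens
        chars[j - 1] = ")"   # 3' partner closes
    return "".join(chars)


def from_dot_bracket(structure):
    """Recover the sorted list of 1-based pairs from a parentheses string."""
    stack, pairs = [], []
    for pos, ch in enumerate(structure, start=1):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            pairs.append((stack.pop(), pos))
    return sorted(pairs)


if __name__ == "__main__":
    toy_pairs = [(1, 10), (2, 9), (3, 8)]  # toy 10-nt hairpin
    text = to_dot_bracket(10, toy_pairs)
    print(text)
    print(from_dot_bracket(text))
```

The same pair list also gives the circle representation: place the nucleotides evenly around a circle and draw a chord for each pair. In a nested structure, no two chords cross.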

Copyright Wesleyan University 2006