Dot-parenthesis notation

From Christoph's Personal Wiki

Latest revision as of 01:38, 3 October 2009

The dot-parenthesis notation is used in bioinformatics to describe the secondary structure of RNA (including tRNA, rRNA, etc.).

Examples

  • Example #1 (Simple case):
>AB013372
GCGCCCGUAGCUCAAUUGGAUAGAGCGUUUGACUACGGAUCAAAAGGUUAGGGGUUCGACUCCUCUCGGGCGCG
(((((((..((((.........)))).((((((....).))))).....(((((.....)))))..))))))).

In the above example, the first row is the name of the sequence (e.g., an accession number or organism name), the second row is the sequence itself, and the third row is its secondary structure in dot-parenthesis notation.

The string of dots and parentheses, which usually represents a predicted structure, must be the same length as the sequence. A dot indicates that the corresponding nucleotide is unpaired. If nucleotides i and j are paired, where i < j, a left parenthesis '(' is written at position i and a right parenthesis ')' at position j.
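The pairing rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the code of any particular prediction server: the function names `parse_dot_parenthesis` and `check_length` are hypothetical, positions are 0-based (the text above does not fix a convention), and the stack-based matching assumes the structure is properly nested.

```python
def parse_dot_parenthesis(structure):
    """Return a sorted list of 0-based (i, j) base pairs with i < j.

    Hypothetical helper: each '(' is matched with the nearest
    following unmatched ')' by means of a stack.
    """
    stack = []  # positions of '(' that have not been matched yet
    pairs = []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def check_length(sequence, structure):
    """Raise ValueError unless the structure is as long as the sequence."""
    if len(sequence) != len(structure):
        raise ValueError(
            f"sequence has {len(sequence)} symbols, "
            f"structure has {len(structure)}"
        )


if __name__ == "__main__":
    # Toy input, not taken from the examples on this page.
    sequence = "GGAACCA"
    structure = "((..))."
    check_length(sequence, structure)
    print(parse_dot_parenthesis(structure))  # [(0, 5), (1, 4)]
```

Any position that does not appear in a returned pair corresponds to a dot, i.e. an unpaired nucleotide.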

  • Example #2:
>structure
ssssddd.................(((.......)))...............................
>gca_bovine
AGCCCUGUGGUGAAUUUACACGUUGAAUUGCAAAUUCAGAGAAGCAGCUUCAAU-UCUGCCGGGGCUU
>gca_chicken
GACUCUGUAGUGAAGU-UCAUAAUGAGUUGCAAACUCGUUGAUGUACACUAA-AGUGUGCCGGGGUCU
>gca_mouse
GGUCUUAAGGUGAUA-UUCAUGUCGAAUUGCAAAUUCGAAGGUGUAGAGAAAU-CUCUACUAAGACUU
>gca_rat
AGCCUUAAGGUGAUU-AUCAUGUCGAAUUGCAAAUUCGAAGGUGUAGAGAAUCU-UCUACUAAGGCUU

In the above example, a single dot-parenthesis string describes the secondary structure shared by all four aligned sequences (from four different organisms) listed below it. The four 's' characters in the "structure" row force those nucleotides to be treated as unpaired by the prediction algorithm, and the three 'd' characters force those nucleotides to be paired with something. The middle part is forced to form the specific base pairs indicated by the parentheses.
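The constraint symbols described above can be read with a short routine like the following. It is a sketch under stated assumptions, not the input parser of any real server: the function name `read_constraints` and the dictionary keys are hypothetical, positions are 0-based alignment columns, and a dot is assumed to mean "no constraint", which the text above does not state explicitly.

```python
def read_constraints(constraint):
    """Split a constraint row into its three kinds of constraint.

    Hypothetical helper. Assumes:
      's'       -> column forced to be unpaired
      'd'       -> column forced to be paired with something
      '(', ')'  -> a specific, properly nested base pair
      '.'       -> no constraint (an assumption, see lead-in)
    """
    forced_unpaired = []
    forced_paired = []
    forced_pairs = []
    stack = []  # positions of '(' that have not been matched yet
    for pos, symbol in enumerate(constraint):
        if symbol == "s":
            forced_unpaired.append(pos)
        elif symbol == "d":
            forced_paired.append(pos)
        elif symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            forced_pairs.append((stack.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return {
        "unpaired": forced_unpaired,
        "paired": forced_paired,
        "pairs": sorted(forced_pairs),
    }


if __name__ == "__main__":
    # Toy constraint row, much shorter than the one in Example #2.
    print(read_constraints("ssd.(..)"))
```

Because the row describes alignment columns, the same positions apply to every sequence in the alignment, including columns where a sequence has a gap character.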

External links

  • MARNA — a Multiple Alignment of RNAs prediction server (uses dot-parenthesis notation; "Example #1").
  • Pfold — an RNA fold prediction server (uses dot-parenthesis notation; "Example #2").