NEXUS file format
Latest revision as of 04:55, 13 September 2006
NEXUS is a file format used by many popular phylogenetics programs, such as GDA, PAUP*, Mesquite, ModelTest, MrBayes, and MacClade. NEXUS file names often have a .nxs or .nex extension.
The NEXUS format conveys data organized according to the character state data model, in which the features of operational taxonomic units (OTUs), such as species, individuals, genes, or genomes, are observable states of underlying homologous characters. For instance, in a protein sequence alignment, proteins are the OTUs, alignment columns are characters, and amino acids (or gaps) are states. Evolutionary analyses typically treat differences as the result of state transitions that take place on the branches of a tree, so a NEXUS file also provides a means to represent trees, using the standard Newick (a.k.a. New Hampshire) format.
Syntactic structure
The syntactic structure of a NEXUS file is as follows:
```
#NEXUS
begin < blockname >;
    < command > < argument > [additional argument];
    [ < another command with args >; ]
end;
[ < another block with commands > ]
```
- The syntax for the TREES block is:

```
BEGIN TREES;
    [Translate arbitrary-token-used-in-tree-description valid-taxon-name
        [, arbitrary-token-used-in-tree-description valid-taxon-name ...];]
    [Tree [*] tree-name=tree-specification;]
END;
```
- Example of a TREES block in a NEXUS file:

```
BEGIN TAXA;
    TaxLabels Scarabaeus Drosophila Aranaeus;
END;

BEGIN TREES;
    Translate beetle Scarabaeus, fly Drosophila, spider Aranaeus;
    Tree tree1 = ((1,2),3);
    Tree tree2 = ((beetle,fly),spider);
    Tree tree3 = ((Scarabaeus,Drosophila),Aranaeus);
END;
```
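The three trees in the TREES example describe the same topology: one refers to taxa by number, one by Translate token, and one by name. A minimal Python sketch of how a reader might normalize such labels is given below. The function name `resolve_labels` is hypothetical, and the sketch assumes that labels contain no quotes or spaces and that only tip labels (those following `(` or `,`) need resolving.

```python
import re

def resolve_labels(newick, taxlabels, translate=None):
    """Replace taxon numbers and Translate tokens in a tree with taxon names."""
    translate = translate or {}

    def lookup(match):
        spacing, token = match.group(1), match.group(2)
        if token in translate:        # token defined by the Translate command
            return spacing + translate[token]
        if token.isdigit():           # taxon number: 1-based position in TaxLabels
            return spacing + taxlabels[int(token) - 1]
        return spacing + token        # already a taxon name

    # Only labels that directly follow '(' or ',' are tips; branch lengths
    # (after ':') and internal labels (after ')') are left untouched.
    return re.sub(r"(?<=[(,])(\s*)([^\s(),:;]+)", lookup, newick)


taxa = ["Scarabaeus", "Drosophila", "Aranaeus"]
tokens = {"beetle": "Scarabaeus", "fly": "Drosophila", "spider": "Aranaeus"}
for tree in ["((1,2),3)", "((beetle,fly),spider)", "((Scarabaeus,Drosophila),Aranaeus)"]:
    print(resolve_labels(tree, taxa, tokens))
```

All three calls print the same tree, written with full taxon names.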
Each of the pre-defined types of public blocks may appear only once. The TAXA block is the only required block. There are some restrictions on the ordering of blocks and on the ordering of commands within a block. Application-specific "private" blocks are also possible. NEXUS keywords are not case-sensitive; block names are shown here in upper case only as a mnemonic aid.
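The block structure described above (a `#NEXUS` header, then `begin ...; ... end;` blocks whose commands end in semicolons, with case-insensitive keywords) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name `split_blocks` is hypothetical, bracketed comments are dropped without handling nesting, and quoted names that contain semicolons are not supported.

```python
import re

def split_blocks(text):
    """Return a list of (BLOCKNAME, [command, ...]) pairs from NEXUS text."""
    # Drop [...] comments; nested comments are not handled in this sketch.
    text = re.sub(r"\[[^\]]*\]", "", text).lstrip()
    if not text.upper().startswith("#NEXUS"):
        raise ValueError("missing #NEXUS header")
    body = text[len("#NEXUS"):]

    blocks = []
    current = None
    for raw in body.split(";"):            # every command ends with ';'
        command = " ".join(raw.split())    # collapse runs of whitespace
        if not command:
            continue
        words = command.split()
        keyword = words[0].upper()         # keywords are not case-sensitive
        if keyword == "BEGIN":
            current = (words[1].upper(), [])
            blocks.append(current)
        elif keyword in ("END", "ENDBLOCK"):
            current = None
        elif current is not None:
            current[1].append(command)
    return blocks


example = """#NEXUS
BEGIN TAXA;
  Dimensions NTax=4;
  TaxLabels fish frog snake mouse;
END;
begin trees;
  Tree best=(fish, (frog, (snake, mouse)));
end;
"""
for name, commands in split_blocks(example):
    print(name, commands)
```

Block names are upper-cased on the way in so that `begin trees;` and `BEGIN TREES;` are treated alike.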
Some important public blocks

Name | Description |
---|---|
TAXA | specifies OTUs in data set |
CHARACTERS | specifies characters |
SETS | assigns names to sets of objects (characters, states, taxa/OTUs, etc.) |
ASSUMPTIONS | houses assumptions about the data or gives general directions as to how to treat them (e.g., which characters are to be excluded from consideration) |
CODONS | specifies codons and their genetic codes |
DATA | equivalent to a CHARACTERS block in which the NewTaxa subcommand is included in the Dimensions command |
TREES | stores information about trees |
UNALIGNED | |
DISTANCES | contains distance matrices |

Source: Maddison et al., 1997
Some important commands

Name | Block | Description |
---|---|---|
TaxLabels | TAXA, CHARACTERS | specifies the names of the taxa |
CharLabels | CHARACTERS | label for a character (column) |
StateLabels | CHARACTERS | label for a state (the type of an instance of a character) |
CharStateLabels | CHARACTERS | combined label for a character and its states |
CharSet | SETS | specifies and names a set of characters |
TaxSet | SETS | gives a name to a set of OTUs |
GeneticCode | CODONS | specifies a genetic code |
CodeSet | CODONS | associates a code with a CharSet or TaxSet |
Tree | TREES | specifies a "Newick tree" |
CodonPosSet | | |
StateSet | | |
ChangeSet | | |
TreeSet | | |
CharPartition | | defines a partition of characters |
TaxPartition | | defines a partition of taxa |
TreePartition | | defines a partition of trees |
UserType | | |
WtSet | | specifies the weight of each character (standard object definition command) |
TypeSet | | specifies the type assigned to each character as used in parsimony analysis |
ExSet | | specifies which characters are to be excluded from consideration |
AncStates | | specifies ancestral states |
Common commands | | |
Dimensions | | specifies the number of taxa (NTax) and/or characters (NChar) |
Format | | specifies the format of the data Matrix (a crucial command) |
Eliminate | | specifies a list of characters that are to be excluded from consideration |
Matrix | | contains a sequence of taxon names and state information for each taxon |

Source: Maddison et al., 1997
Format subcommands
The following are possible formatting subcommands:
- DataType = { standard | DNA | RNA | nucleotide | protein | continuous }
- RespectCase
- Missing
- Gap
- Symbols
- Equate
- MatchChar
- [No]Labels
- Transpose
- Interleave
- Items
- StatesFormat
- [No]Tokens
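As an illustration of how these subcommands appear in practice, the following Python sketch turns a Format command such as the ones in the example files into a dictionary. The function name `parse_format` is hypothetical, and the sketch assumes that values contain no spaces, so quoted values such as a multi-symbol Symbols list are not handled.

```python
def parse_format(command):
    """Parse e.g. 'Format DataType=DNA Gap=- Interleave;' into a dict."""
    # Surround '=' with spaces so 'Gap=-' and 'Gap = -' tokenize identically.
    tokens = command.strip().rstrip(";").replace("=", " = ").split()
    if not tokens or tokens[0].upper() != "FORMAT":
        raise ValueError("not a Format command")

    options = {}
    i = 1
    while i < len(tokens):
        key = tokens[i].upper()            # subcommand names are case-insensitive
        if i + 2 < len(tokens) and tokens[i + 1] == "=":
            options[key] = tokens[i + 2]   # key=value subcommand
            i += 3
        else:
            options[key] = True            # bare flag such as Interleave
            i += 1
    return options


print(parse_format("Format DataType=DNA Interleave=yes Gap=- Missing=?;"))
print(parse_format("Format datatype=protein gap=- interleave;"))
```

Bare flags are stored as `True`, while `key=value` subcommands keep the value as written.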
NEXUS Objects
Many of the commands in a NEXUS file define objects or specify characteristics of them. All objects can be labeled (given names). Duplicate names should be avoided, as should names that differ only in case.
List of currently defined objects:
- taxa
- characters
- states
- trees
- genetic codes
- sets (of taxa, characters, states, classes of changes between states, trees)
- partitions (of taxa, trees, characters)
- weight sets
- types
- type sets
- character exclusion sets
- ancestral states
- codon position sets
Example Trees
Two character state trees and the NEXUS commands that define them. The first tree has an unnamed state.
```
2     3       4     6
 \   /         \   /
  \ /           \ /
   *             3  5
   |             | /
   |             |/
   1             2
   |             |
   |             |
   0             1
 first         second

USERTYPE first (CSTREE) = (((2,3))1)0;
USERTYPE second (CSTREE) = (((4,6)3,5)2)1;
```
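Reading the two descriptions against the figure, each label written after a closing parenthesis names the state that sits directly below the states inside the parentheses. Under that reading, a small recursive parser can recover the "state below" relation. This is a sketch only: the function name `parse_cstree` and the `*1`, `*2`, ... placeholders for unnamed states are assumptions made here, not part of the format.

```python
def parse_cstree(spec):
    """Return {state: state_below_it} for a character state tree description."""
    spec = spec.strip().rstrip(";")
    below = {}
    pos = 0
    unnamed = 0

    def node():
        nonlocal pos, unnamed
        children = []
        if pos < len(spec) and spec[pos] == "(":
            pos += 1                       # consume '('
            children.append(node())
            while spec[pos] == ",":
                pos += 1                   # consume ','
                children.append(node())
            if spec[pos] != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1                       # consume ')'
        start = pos
        while pos < len(spec) and spec[pos] not in "(),":
            pos += 1
        label = spec[start:pos].strip()
        if not label:                      # unnamed state ('*' in the figure)
            unnamed += 1
            label = f"*{unnamed}"
        for child in children:
            below[child] = label
        return label

    root = node()
    below[root] = None                     # the bottom state has nothing below it
    return below


print(parse_cstree("(((2,3))1)0;"))
print(parse_cstree("(((4,6)3,5)2)1;"))
```

For the first tree, states 2 and 3 both map to the unnamed state `*1`, which maps to 1, which maps to 0.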
Example NEXUS files
Basic
```
#NEXUS
BEGIN TAXA;
      dimensions ntax=4;
      taxlabels A B C D;
END;

BEGIN CHARACTERS;
      dimensions nchar=5;
      format datatype=protein gap=-;
      charlabels 1 2 3 4 Five;
      matrix
A     MA-LL
B     MA-LE
C     MEATY
D     ME-TE
      ;
END;

BEGIN TREES;
      tree "basic bush" = ((A:1,B:1):1,(C:1,D:1):1);
END;
```
Simple example (original from paper)
```
#NEXUS
BEGIN TAXA;
      Dimensions NTax=4;
      TaxLabels fish frog snake mouse;
END;

BEGIN CHARACTERS;
      Dimensions NChar=20;
      Format DataType=DNA;
      Matrix
        fish   ACATA GAGGG TACCT CTAAG
        frog   ACATA GAGGG TACCT CTAAG
        snake  ACATA GAGGG TACCT CTAAG
        mouse  ACATA GAGGG TACCT CTAAG
      ;
END;

BEGIN TREES;
      Tree best=(fish, (frog, (snake, mouse)));
END;
```
Complex example (original from paper)
Note: block names are marked in bold (<b>); commands are underlined (<u>).
```
BEGIN <b>TAXA</b>;
      <u>DIMENSIONS</u> ntax=26;
      <u>TAXLABELS</u>
            O_volvulus_AAB64227.1
            O_volvulus_AAB64226.1
            C_elegans_AAF39759.1
            C_elegans_AAA83577.1
            S_cerevisiae_CAA89634.1
            C_albicans_AAC12872.1
            S_pombe_CAB57444.1
            N_crassa_AAA63780.1
            M_musculus_AAA40121.1
            C_capitata_AAA57249.1
            D_virilis_CAA32060.1
            D_erecta_AAF23595.1
            D_orena_AAF23594.1
            D_teissieri_AAF23599.1
            D_yakuba_AAF23598.1
            D_melanogaster_AAF50095.1
            D_mauritiana_AAF23597.1
            D_sechellia_AAF23596.1
            D_simulans_CAA33720.1
            Z_mays_AAB49913.1
            O_sativa_AAC14464.1
            O_sativa_AAC14465.1
            A_thaliana_AAF99769.1
            P_tremuloides_AAD01605.1
            A_thaliana_BAB09468.1
            A_thaliana_AAD29823.2;
END;

BEGIN <b>CHARACTERS</b>;
      <u>DIMENSIONS</u> ntax=26 nchar=30;
      <u>FORMAT</u> datatype=protein gap=- missing=?;
      <u>CHARLABELS</u> 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120;
      <u>MATRIX</u>
M_musculus_AAA40121.1      QGTIHFEQKASGE--PVVLSGQITGLTE-G
C_capitata_AAA57249.1      KGTVHFEQQDAKS--PVLVTGEVNGLAK-G
N_crassa_AAA63780.1        KGTVIFEQESESA--PTTITYDISGNDPNA
<font color="red">--stuff deleted here--</font>
D_simulans_CAA33720.1      KGTVFFEQESSGT--PVKVSGEVCGLAK-G
S_cerevisiae_CAA89634.1    SGVVKFEQASESE--PTTVSYEIAGNSPNA
S_pombe_CAB57444.1         SGVVTFEQVDQNS--QVSVIVDLVGNDANA;
END;

BEGIN <b>ASSUMPTIONS</b>;
      <u>WTSET</u> MySoapWeights (VECTOR) = 1 1 1 1 1 1 1 1 0.83 0.8 0.8 0.8 0.8 0.8 0.71 0.71 1 1 1 1 1 1 1 1 1 1 1 1 1 1;
END;

BEGIN <b>TREES</b>;
      <u>TREE</u> "Cu-Zn Superoxide Dismutase" =
(((((O_volvulus_AAB64227.1:0.31741,O_volvulus_AAB64226.1:0.13498):
0.20268[1],(C_elegans_AAF39759.1:0.14579,C_elegans_AAA83577.1:0.27311):0.2533[1]):0.12655[0.98],
((S_cerevisiae_CAA89634.1:0.28255,C_albicans_AAC12872.1:0.25631):0.08358[0.91],(S_pombe_CAB57444.1:
0.3159,N_crassa_AAA63780.1:0.1635):0.11954[0.97]):0.17514[1]):0.08988[0.77],(M_musculus_AAA40121.1:
0.49149,(C_capitata_AAA57249.1:0.18945,(D_virilis_CAA32060.1:0.11453,(((D_erecta_AAF23595.1:0.00661,
D_orena_AAF23594.1:0.00769):0.00497[0.92],(D_teissieri_AAF23599.1:0.004,D_yakuba_AAF23598.1:0.01012):
0.0073[0.87]):0.01271[0.88],(((D_melanogaster_AAF50095.1:0.00836,D_mauritiana_AAF23597.1:0.00552):
0.00203[0.28],D_sechellia_AAF23596.1:0.01103):0.00398[0.7],D_simulans_CAA33720.1:0.00595):0.00739[0.75]):
0.11795[1]):0.11754[1]):0.12932[1]):0.10326[1]):0.0712[0.9],(((((Z_mays_AAB49913.1:0.05142,
O_sativa_AAC14464.1:0.09031):0.02799[0.98],O_sativa_AAC14465.1:0.06915):0.05245[0.99],
(A_thaliana_AAF99769.1:0.17064,P_tremuloides_AAD01605.1:0.1075):0.08023[1]):0.08596[1],
A_thaliana_BAB09468.1:0.46052):0.06401[0.75],A_thaliana_AAD29823.2:0.42442):0.14252[0.94]);
END;
```
DNA
```
#NEXUS
BEGIN DATA;
      Dimensions NTax=10 NChar=705;
      Format DataType=DNA Interleave=yes Gap=- Missing=?;
      Matrix
Cow      ATGGC ATATC CCATA CAACT AGGAT TCCAA GATGC AACAT CACCA ATCAT AGAAG AACTA
Carp     ATGGCACACCCAACGCAACTAGGTTTCAAGGACGCGGCCATACCCGTTATAGAGGAACTT
Chicken  ATGGCCAACCACTCCCAACTAGGCTTTCAAGACGCCTCATCCCCCATCATAGAAGAGCTC
Human    ATGGCACATGCAGCGCAAGTAGGTCTACAAGACGCTACTTCCCCTATCATAGAAGAGCTT
Loach    ATGGCACATCCCACACAATTAGGATTCCAAGACGCGGCCTCACCCGTAATAGAAGAACTT
Mouse    ATGGCCTACCCATTCCAACTTGGTCTACAAGACGCCACATCCCCTATTATAGAAGAGCTA
Rat      ATGGCTTACCCATTTCAACTTGGCTTACAAGACGCTACATCACCTATCATAGAAGAACTT
Seal     ATGGCATACCCCCTACAAATAGGCCTACAAGATGCAACCTCTCCCATTATAGAGGAGTTA
Whale    ATGGCATATCCATTCCAACTAGGTTTCCAAGATGCAGCATCACCCATCATAGAAGAGCTC
Frog     ATGGCACACCCATCACAATTAGGTTTTCAAGACGCAGCCTCTCCAATTATAGAAGAATTA

Cow      CTTCACTTTCATGACCACACGCTAATAATTGTCTTCTTAATTAGCTCATTAGTACTTTAC
Carp     CTTCACTTCCACGACCACGCATTAATAATTGTGCTCCTAATTAGCACTTTAGTTTTATAT
Chicken  GTTGAATTCCACGACCACGCCCTGATAGTCGCACTAGCAATTTGCAGCTTAGTACTCTAC
Human    ATCACCTTTCATGATCACGCCCTCATAATCATTTTCCTTATCTGCTTCCTAGTCCTGTAT
Loach    CTTCACTTCCATGACCATGCCCTAATAATTGTATTTTTGATTAGCGCCCTAGTACTTTAT
Mouse    ATAAATTTCCATGATCACACACTAATAATTGTTTTCCTAATTAGCTCCTTAGTCCTCTAT
Rat      ACAAACTTTCATGACCACACCCTAATAATTGTATTCCTCATCAGCTCCCTAGTACTTTAT
Seal     CTACACTTCCATGACCACACATTAATAATTGTGTTCCTAATTAGCTCATTAGTACTCTAC
Whale    CTACACTTTCACGATCATACACTAATAATCGTTTTTCTAATTAGCTCTTTAGTTCTCTAC
Frog     CTTCACTTCCACGACCATACCCTCATAGCCGTTTTTCTTATTAGTACGCTAGTTCTTTAC

Cow      ATTATTTCACTAATACTAACGACAAAGCTGACCCATACAAGCACGATAGATGCACAAGAA
Carp     ATTATTACTGCAATGGTATCAACTAAACTTACTAATAAATATATTCTAGACTCCCAAGAA
Chicken  CTTCTAACTCTTATACTTATAGAAAAACTATCA---TCAAACACCGTAGATGCCCAAGAA
Human    GCCCTTTTCCTAACACTCACAACAAAACTAACTAATACTAACATCTCAGACGCTCAGGAA
Loach    GTTATTATTACAACCGTCTCAACAAAACTCACTAACATATATATTTTGGACTCACAAGAA
Mouse    ATCATCTCGCTAATATTAACAACAAAACTAACACATACAAGCACAATAGATGCACAAGAA
Rat      ATTATTTCACTAATACTAACAACAAAACTAACACACACAAGCACAATAGACGCCCAAGAA
Seal     ATTATCTCACTTATACTAACCACGAAACTCACCCACACAAGTACAATAGACGCACAAGAA
Whale    ATTATTACCCTAATGCTTACAACCAAATTAACACATACTAGTACAATAGACGCCCAAGAA
Frog     ATTATTACTATTATAATAACTACTAAACTAACTAATACAAACCTAATGGACGCACAAGAG

Cow      GTAGAGACAATCTGAACCATTCTGCCCGCCATCATCTTAATTCTAATTGCTCTTCCTTCT
Carp     ATCGAAATCGTATGAACCATTCTACCAGCCGTCATTTTAGTACTAATCGCCCTGCCCTCC
Chicken  GTTGAACTAATCTGAACCATCCTACCCGCTATTGTCCTAGTCCTGCTTGCCCTCCCCTCC
Human    ATAGAAACCGTCTGAACTATCCTGCCCGCCATCATCCTAGTCCTCATCGCCCTCCCATCC
Loach    ATTGAAATCGTATGAACTGTGCTCCCTGCCCTAATCCTCATTTTAATCGCCCTCCCCTCA
Mouse    GTTGAAACCATTTGAACTATTCTACCAGCTGTAATCCTTATCATAATTGCTCTCCCCTCT
Rat      GTAGAAACAATTTGAACAATTCTCCCAGCTGTCATTCTTATTCTAATTGCCCTTCCCTCC
Seal     GTGGAAACGGTGTGAACGATCCTACCCGCTATCATTTTAATTCTCATTGCCCTACCATCA
Whale    GTAGAAACTGTCTGAACTATCCTCCCAGCCATTATCTTAATTTTAATTGCCTTGCCTTCA
Frog     ATCGAAATAGTGTGAACTATTATACCAGCTATTAGCCTCATCATAATTGCCCTTCCATCC

Cow      TTACGAATTCTATACATAATAGATGAAATCAATAACCCATCTCTTACAGTAAAAACCATA
Carp     CTACGCATCCTGTACCTTATAGACGAAATTAACGACCCTCACCTGACAATTAAAGCAATA
Chicken  CTCCAAATCCTCTACATAATAGACGAAATCGACGAACCTGATCTCACCCTAAAAGCCATC
Human    CTACGCATCCTTTACATAACAGACGAGGTCAACGATCCCTCCCTTACCATCAAATCAATT
Loach    CTACGAATTCTATATCTTATAGACGAGATTAATGACCCCCACCTAACAATTAAGGCCATG
Mouse    CTACGCATTCTATATATAATAGACGAAATCAACAACCCCGTATTAACCGTTAAAACCATA
Rat      CTACGAATTCTATACATAATAGACGAGATTAATAACCCAGTTCTAACAGTAAAAACTATA
Seal     TTACGAATCCTCTACATAATGGACGAGATCAATAACCCTTCCTTGACCGTAAAAACTATA
Whale    TTACGGATCCTTTACATAATAGACGAAGTCAATAACCCCTCCCTCACTGTAAAAACAATA
Frog     CTTCGTATCCTATATTTAATAGATGAAGTTAATGATCCACACTTAACAATTAAAGCAATC

Cow      GGACATCAGTGATACTGAAGCTATGAGTATACAGATTATGAGGACTTAAGCTTCGACTCC
Carp     GGACACCAATGATACTGAAGTTACGAGTATACAGACTATGAAAATCTAGGATTCGACTCC
Chicken  GGACACCAATGATACTGAACCTATGAATACACAGACTTCAAGGACCTCTCATTTGACTCC
Human    GGCCACCAATGGTACTGAACCTACGAGTACACCGACTACGGCGGACTAATCTTCAACTCC
Loach    GGGCACCAATGATACTGAAGCTACGAGTATACTGATTATGAAAACTTAAGTTTTGACTCC
Mouse    GGGCACCAATGATACTGAAGCTACGAATATACTGACTATGAAGACCTATGCTTTGATTCA
Rat      GGACACCAATGATACTGAAGCTATGAATATACTGACTATGAAGACCTATGCTTTGACTCC
Seal     GGACATCAGTGATACTGAAGCTATGAGTACACAGACTACGAAGACCTGAACTTTGACTCA
Whale    GGTCACCAATGATATTGAAGCTATGAGTATACCGACTACGAAGACCTAAGCTTCGACTCC
Frog     GGCCACCAATGATACTGAAGCTACGAATATACTAACTATGAGGATCTCTCATTTGACTCT

Cow      TACATAATTCCAACATCAGAATTAAAGCCAGGGGAGCTACGACTATTAGAAGTCGATAAT
Carp     TATATAGTACCAACCCAAGACCTTGCCCCCGGACAATTCCGACTTCTGGAAACAGACCAC
Chicken  TACATAACCCCAACAACAGACCTCCCCCTAGGCCACTTCCGCCTACTAGAAGTCGACCAT
Human    TACATACTTCCCCCATTATTCCTAGAACCAGGCGACCTGCGACTCCTTGACGTTGACAAT
Loach    TACATAATCCCCACCCAGGACCTAACCCCTGGACAATTCCGGCTACTAGAGACAGACCAC
Mouse    TATATAATCCCAACAAACGACCTAAAACCTGGTGAACTACGACTGCTAGAAGTTGATAAC
Rat      TACATAATCCCAACCAATGACCTAAAACCAGGTGAACTTCGTCTATTAGAAGTTGATAAT
Seal     TATATGATCCCCACACAAGAACTAAAGCCCGGAGAACTACGACTGCTAGAAGTAGACAAT
Whale    TATATAATCCCAACATCAGACCTAAAGCCAGGAGAACTACGATTATTAGAAGTAGATAAC
Frog     TATATAATTCCAACTAATGACCTTACCCCTGGACAATTCCGGCTGCTAGAAGTTGATAAT

Cow      CGAGTTGTACTACCAATAGAAATAACAATCCGAATGTTAGTCTCCTCTGAAGACGTATTA
Carp     CGAATAGTTGTTCCAATAGAATCCCCAGTCCGTGTCCTAGTATCTGCTGAAGACGTGCTA
Chicken  CGCATTGTAATCCCCATAGAATCCCCCATTCGAGTAATCATCACCGCTGATGACGTCCTC
Human    CGAGTAGTACTCCCGATTGAAGCCCCCATTCGTATAATAATTACATCACAAGACGTCTTG
Loach    CGAATGGTTGTTCCCATAGAATCCCCTATTCGCATTCTTGTTTCCGCCGAAGATGTACTA
Mouse    CGAGTCGTTCTGCCAATAGAACTTCCAATCCGTATATTAATTTCATCTGAAGACGTCCTC
Rat      CGGGTAGTCTTACCAATAGAACTTCCAATTCGTATACTAATCTCATCCGAAGACGTCCTG
Seal     CGAGTAGTCCTCCCAATAGAAATAACAATCCGCATACTAATCTCATCAGAAGATGTACTC
Whale    CGAGTTGTCTTACCTATAGAAATAACAATCCGAATATTAGTCTCATCAGAAGACGTACTC
Frog     CGAATAGTAGTCCCAATAGAATCTCCAACCCGACTTTTAGTTACAGCCGAAGACGTCCTC

Cow      CACTCATGAGCTGTGCCCTCTCTAGGACTAAAAACAGACGCAATCCCAGGCCGTCTAAAC
Carp     CATTCTTGAGCTGTTCCATCCCTTGGCGTAAAAATGGACGCAGTCCCAGGACGACTAAAT
Chicken  CACTCATGAGCCGTACCCGCCCTCGGGGTAAAAACAGACGCAATCCCTGGACGACTAAAT
Human    CACTCATGAGCTGTCCCCACATTAGGCTTAAAAACAGATGCAATTCCCGGACGTCTAAAC
Loach    CACTCCTGGGCCCTTCCAGCCATGGGGGTAAAGATAGACGCGGTCCCAGGACGCCTTAAC
Mouse    CACTCATGAGCAGTCCCCTCCCTAGGACTTAAAACTGATGCCATCCCAGGCCGACTAAAT
Rat      CACTCATGAGCCATCCCTTCACTAGGGTTAAAAACCGACGCAATCCCCGGCCGCCTAAAC
Seal     CACTCATGAGCCGTACCGTCCCTAGGACTAAAAACTGATGCTATCCCAGGACGACTAAAC
Whale    CACTCATGGGCCGTACCCTCCTTGGGCCTAAAAACAGATGCAATCCCAGGACGCCTAAAC
Frog     CACTCGTGAGCTGTACCCTCCTTGGGTGTCAAAACAGATGCAATCCCAGGACGACTTCAT

Cow      CAAACAACCCTTATATCGTCCCGTCCAGGCTTATATTACGGTCAATGCTCAGAAATTTGC
Carp     CAAGCCGCCTTTATTGCCTCACGCCCAGGGGTCTTTTACGGACAATGCTCTGAAATTTGT
Chicken  CAAACCTCCTTCATCACCACTCGACCAGGAGTGTTTTACGGACAATGCTCAGAAATCTGC
Human    CAAACCACTTTCACCGCTACACGACCGGGGGTATACTACGGTCAATGCTCTGAAATCTGT
Loach    CAAACCGCCTTTATTGCCTCCCGCCCCGGGGTATTCTATGGGCAATGCTCAGAAATCTGT
Mouse    CAAGCAACAGTAACATCAAACCGACCAGGGTTATTCTATGGCCAATGCTCTGAAATTTGT
Rat      CAAGCTACAGTCACATCAAACCGACCAGGTCTATTCTATGGCCAATGCTCTGAAATTTGC
Seal     CAAACAACCCTAATAACCATACGACCAGGACTGTACTACGGTCAATGCTCAGAAATCTGT
Whale    CAAACAACCTTAATATCAACACGACCAGGCCTATTTTATGGACAATGCTCAGAGATCTGC
Frog     CAAACATCATTTATTGCTACTCGTCCGGGAGTATTTTACGGACAATGTTCAGAAATTTGC

Cow      GGGTCAAACCACAGTTTCATACCCATTGTCCTTGAGTTAGTCCCACTAAAGTACTTTGAA
Carp     GGAGCTAATCACAGCTTTATACCAATTGTAGTTGAAGCAGTACCTCTCGAACACTTCGAA
Chicken  GGAGCTAACCACAGCTACATACCCATTGTAGTAGAGTCTACCCCCCTAAAACACTTTGAA
Human    GGAGCAAACCACAGTTTCATGCCCATCGTCCTAGAATTAATTCCCCTAAAAATCTTTGAA
Loach    GGAGCAAACCACAGCTTTATACCCATCGTAGTAGAAGCGGTCCCACTATCTCACTTCGAA
Mouse    GGATCTAACCATAGCTTTATGCCCATTGTCCTAGAAATGGTTCCACTAAAATATTTCGAA
Rat      GGCTCAAATCACAGCTTCATACCCATTGTACTAGAAATAGTGCCTCTAAAATATTTCGAA
Seal     GGTTCAAACCACAGCTTCATACCTATTGTCCTCGAATTGGTCCCACTATCCCACTTCGAG
Whale    GGCTCAAACCACAGTTTCATACCAATTGTCCTAGAACTAGTACCCCTAGAAGTCTTTGAA
Frog     GGAGCAAACCACAGCTTTATACCAATTGTAGTTGAAGCAGTACCGCTAACCGACTTTGAA

Cow      AAATGATCTGCGTCAATATTA---------------------TAA
Carp     AACTGATCCTCATTAATACTAGAAGACGCCTCGCTAGGAAGCTAA
Chicken  GCCTGATCCTCACTA------------------CTGTCATCTTAA
Human    ATA---------------------GGGCCCGTATTTACCCTATAG
Loach    AACTGGTCCACCCTTATACTAAAAGACGCCTCACTAGGAAGCTAA
Mouse    AACTGATCTGCTTCAATAATT---------------------TAA
Rat      AACTGATCAGCTTCTATAATT---------------------TAA
Seal     AAATGATCTACCTCAATGCTT---------------------TAA
Whale    AAATGATCTGTATCAATACTA---------------------TAA
Frog     AACTGATCTTCATCAATACTA---GAAGCATCACTA------AGA
      ;
END;
```
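In an interleaved matrix such as the DNA example, each taxon name recurs once per block of rows, and the full sequence for a taxon is the concatenation of its chunks in order. The Python sketch below shows one way to reassemble them. The function name `read_matrix` is hypothetical, and the sketch assumes that every row starts with a taxon label containing no spaces and that the matrix lines have already been extracted from the file.

```python
def read_matrix(lines):
    """Concatenate (possibly interleaved) matrix rows into {taxon: sequence}."""
    sequences = {}
    for line in lines:
        line = line.strip()
        if not line or line == ";":        # blank separator or end of Matrix
            continue
        # The first token is the taxon name; the remaining tokens are sequence
        # chunks, which may be split by spaces for readability.
        name, *chunks = line.rstrip(";").split()
        sequences[name] = sequences.get(name, "") + "".join(chunks)
    return sequences


rows = [
    "Cow   ATGGC ATATC",
    "Carp  ATGGCACACC",
    "",
    "Cow   CTTCA",
    "Carp  CTTCA",
    ";",
]
for taxon, sequence in read_matrix(rows).items():
    print(taxon, sequence, len(sequence))
```

A reader would normally check each assembled length against the NChar value given in the Dimensions command.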
Amino Acid
```
#NEXUS
Begin data;
      Dimensions ntax=10 nchar=234;
      Format datatype=protein gap=- interleave;
      Matrix
Cow      MAYPMQLGFQDATSPIMEELLHFHDHTLMIVFLISSLVLYIISLMLTTKLTHTSTMDAQE
Carp     MAHPTQLGFKDAAMPVMEELLHFHDHALMIVLLISTLVLYIITAMVSTKLTNKYILDSQE
Chicken  MANHSQLGFQDASSPIMEELVEFHDHALMVALAICSLVLYLLTLMLMEKLS-SNTVDAQE
Human    MAHAAQVGLQDATSPIMEELITFHDHALMIIFLICFLVLYALFLTLTTKLTNTNISDAQE
Loach    MAHPTQLGFQDAASPVMEELLHFHDHALMIVFLISALVLYVIITTVSTKLTNMYILDSQE
Mouse    MAYPFQLGLQDATSPIMEELMNFHDHTLMIVFLISSLVLYIISLMLTTKLTHTSTMDAQE
Rat      MAYPFQLGLQDATSPIMEELTNFHDHTLMIVFLISSLVLYIISLMLTTKLTHTSTMDAQE
Seal     MAYPLQMGLQDATSPIMEELLHFHDHTLMIVFLISSLVLYIISLMLTTKLTHTSTMDAQE
Whale    MAYPFQLGFQDAASPIMEELLHFHDHTLMIVFLISSLVLYIITLMLTTKLTHTSTMDAQE
Frog     MAHPSQLGFQDAASPIMEELLHFHDHTLMAVFLISTLVLYIITIMMTTKLTNTNLMDAQE

Cow      VETIWTILPAIILILIALPSLRILYMMDEINNPSLTVKTMGHQWYWSYEYTDYEDLSFDS
Carp     IEIVWTILPAVILVLIALPSLRILYLMDEINDPHLTIKAMGHQWYWSYEYTDYENLGFDS
Chicken  VELIWTILPAIVLVLLALPSLQILYMMDEIDEPDLTLKAIGHQWYWTYEYTDFKDLSFDS
Human    METVWTILPAIILVLIALPSLRILYMTDEVNDPSLTIKSIGHQWYWTYEYTDYGGLIFNS
Loach    IEIVWTVLPALILILIALPSLRILYLMDEINDPHLTIKAMGHQWYWSYEYTDYENLSFDS
Mouse    VETIWTILPAVILIMIALPSLRILYMMDEINNPVLTVKTMGHQWYWSYEYTDYEDLCFDS
Rat      VETIWTILPAVILILIALPSLRILYMMDEINNPVLTVKTMGHQWYWSYEYTDYEDLCFDS
Seal     VETVWTILPAIILILIALPSLRILYMMDEINNPSLTVKTMGHQWYWSYEYTDYEDLNFDS
Whale    VETVWTILPAIILILIALPSLRILYMMDEVNNPSLTVKTMGHQWYWSYEYTDYEDLSFDS
Frog     IEMVWTIMPAISLIMIALPSLRILYLMDEVNDPHLTIKAIGHQWYWSYEYTNYEDLSFDS

Cow      YMIPTSELKPGELRLLEVDNRVVLPMEMTIRMLVSSEDVLHSWAVPSLGLKTDAIPGRLN
Carp     YMVPTQDLAPGQFRLLETDHRMVVPMESPVRVLVSAEDVLHSWAVPSLGVKMDAVPGRLN
Chicken  YMTPTTDLPLGHFRLLEVDHRIVIPMESPIRVIITADDVLHSWAVPALGVKTDAIPGRLN
Human    YMLPPLFLEPGDLRLLDVDNRVVLPIEAPIRMMITSQDVLHSWAVPTLGLKTDAIPGRLN
Loach    YMIPTQDLTPGQFRLLETDHRMVVPMESPIRILVSAEDVLHSWALPAMGVKMDAVPGRLN
Mouse    YMIPTNDLKPGELRLLEVDNRVVLPMELPIRMLISSEDVLHSWAVPSLGLKTDAIPGRLN
Rat      YMIPTNDLKPGELRLLEVDNRVVLPMELPIRMLISSEDVLHSWAIPSLGLKTDAIPGRLN
Seal     YMIPTQELKPGELRLLEVDNRVVLPMEMTIRMLISSEDVLHSWAVPSLGLKTDAIPGRLN
Whale    YMIPTSDLKPGELRLLEVDNRVVLPMEMTIRMLVSSEDVLHSWAVPSLGLKTDAIPGRLN
Frog     YMIPTNDLTPGQFRLLEVDNRMVVPMESPTRLLVTAEDVLHSWAVPSLGVKTDAIPGRLH

Cow      QTTLMSSRPGLYYGQCSEICGSNHSFMPIVLELVPLKYFEKWSASML-------
Carp     QAAFIASRPGVFYGQCSEICGANHSFMPIVVEAVPLEHFENWSSLMLEDASLGS
Chicken  QTSFITTRPGVFYGQCSEICGANHSYMPIVVESTPLKHFEAWSSL------LSS
Human    QTTFTATRPGVYYGQCSEICGANHSFMPIVLELIPLKIFEM-------GPVFTL
Loach    QTAFIASRPGVFYGQCSEICGANHSFMPIVVEAVPLSHFENWSTLMLKDASLGS
Mouse    QATVTSNRPGLFYGQCSEICGSNHSFMPIVLEMVPLKYFENWSASMI-------
Rat      QATVTSNRPGLFYGQCSEICGSNHSFMPIVLEMVPLKYFENWSASMI-------
Seal     QTTLMTMRPGLYYGQCSEICGSNHSFMPIVLELVPLSHFEKWSTSML-------
Whale    QTTLMSTRPGLFYGQCSEICGSNHSFMPIVLELVPLEVFEKWSVSML-------
Frog     QTSFIATRPGVFYGQCSEICGANHSFMPIVVEAVPLTDFENWSSSML-EASL--
      ;
End;
```
References
- Maddison DR, Swofford DL, and Maddison WP (1997). NEXUS: An extensible file format for systematic information. Syst Biol 46:590-621. (PDF)
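External links
- molevol.org - NEXUS: http://www.molevol.org/camel/projects/nexus/
- Some NEXUS information: http://tolweb.org/nexus/
- Wikipedia article on cladogram: http://en.wikipedia.org/wiki/Cladogram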