Difference between revisions of "Clustal"

From Christoph's Personal Wiki

Latest revision as of 02:18, 13 July 2012

Clustal is a widely used, freely available family of multiple sequence alignment programs that can quickly align hundreds of nucleic acid or amino acid sequences.

The two main variants are ClustalW and ClustalX; the other entries in the list are later or modified variants:

  • ClustalW: This version has a command-line interface.
  • ClustalX: This version has a graphical user interface. It is available for Windows, Mac OS, and Unix/Linux.
  • Clustal Omega: the latest addition to the Clustal family. It offers a significant increase in scalability over previous versions.
  • ClustalV
  • ClustalG: The package is a rewrite of the well-known Clustal series of alignment packages. The main new feature of ClustalG is the recognition of input word sequences of up to six characters.

Input/Output

The program accepts a wide range of input formats, including NBRF/PIR, FASTA, EMBL/Swiss-Prot, Clustal, GCG/MSF, GCG9 RSF, and GDE.

The output can be written in one or more of the following formats: Clustal, NBRF/PIR, GCG/MSF, PHYLIP, GDE, NEXUS.

Multiple sequence alignment

There are 3 main steps:

  1. Do a pairwise alignment
  2. Create a phylogenetic tree (or use a user-defined tree)
  3. Use the phylogenetic tree to carry out a multiple alignment

These steps are done automatically when you select "Do Complete Alignment". Other options are "Do Alignment from guide tree" and "Produce guide tree only".
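
Step 1 above can be illustrated with a generic dynamic-programming global alignment (Needleman-Wunsch). This is only a sketch of the idea, not Clustal's own pairwise code; the function name and the match, mismatch, and gap values are assumptions chosen for the example.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score of the best global alignment of a and b.

    Uses a linear gap penalty; all scoring values are illustrative
    placeholders, not Clustal defaults.
    """
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]
```

For example, global_alignment_score("ACGT", "AGT") returns 1 with these values: three matches and one gap.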

Profile alignments

Pairwise alignments are computed for all against all sequences, and similarities are stored in a matrix. This is then converted into a distance matrix, where the distance measures reflect the evolutionary distance between each pair of sequences.
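
The conversion from similarities to distances can be sketched as follows. The sketch assumes that similarity is measured as fractional identity over the aligned, non-gap columns and that distance is one minus that value; the scoring Clustal actually uses may differ. The function names and the input layout (a dictionary of already aligned pairs) are hypothetical.

```python
def identity(a, b):
    """Fraction of identical residues over columns where neither row has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def distance_matrix(names, pairwise):
    """Build a symmetric distance matrix from pairwise alignments.

    pairwise maps (name_i, name_j) to a tuple of the two aligned strings.
    Distance is taken as 1 - identity (an assumption for this sketch).
    """
    dist = {n: {m: 0.0 for m in names} for n in names}
    for (n, m), (a, b) in pairwise.items():
        d = 1.0 - identity(a, b)
        dist[n][m] = d
        dist[m][n] = d
    return dist
```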

From this distance matrix, a guide tree (a phylogenetic tree) is constructed using a neighbour-joining clustering algorithm. The tree determines the order in which pairs of sequences are aligned and combined with previous alignments. Sequences are progressively aligned at each branch point, starting from the least distant pair of sequences.
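
The way a distance matrix fixes the order of alignment can be sketched with a simple clustering loop. Note that this sketch uses average-linkage clustering as a stand-in, not the neighbour-joining algorithm named above, because it is shorter and still shows the "closest pair first" behaviour. The function name and the "+" naming of merged clusters are assumptions.

```python
def merge_order(dist):
    """Return the order in which clusters are joined, closest pair first.

    dist is a nested dict: dist[a][b] is the distance between a and b.
    Average linkage is used here as a simplification of neighbour joining.
    """
    clusters = {name: (name,) for name in dist}  # cluster id -> members
    d = {a: dict(row) for a, row in dist.items()}  # working copy
    order = []
    while len(clusters) > 1:
        ids = sorted(clusters)
        # Pick the pair of current clusters with the smallest distance.
        a, b = min(
            ((x, y) for i, x in enumerate(ids) for y in ids[i + 1:]),
            key=lambda p: d[p[0]][p[1]],
        )
        new = a + "+" + b
        size_a, size_b = len(clusters[a]), len(clusters[b])
        d[new] = {}
        for c in ids:
            if c in (a, b):
                continue
            # Size-weighted average distance from the merged cluster to c.
            avg = (d[a][c] * size_a + d[b][c] * size_b) / (size_a + size_b)
            d[new][c] = avg
            d[c][new] = avg
        clusters[new] = clusters.pop(a) + clusters.pop(b)
        order.append((a, b))
    return order
```

With three sequences where s1 and s2 are the closest pair, the first join is ("s1", "s2"), and the merged cluster is then joined with s3.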

Settings

Sequences can be aligned using the default settings, but it is sometimes useful to customise the parameters.

The main parameters are the gap opening penalty and the gap extension penalty.
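
How the two penalties interact can be sketched with an affine gap cost. The sketch assumes that a gap of length n costs the opening penalty plus n times the extension penalty; some programs charge the extension only for residues after the first. The function names and the values 10.0 and 0.5 are placeholders for illustration, not Clustal defaults.

```python
import re


def gap_cost(length, gap_open=10.0, gap_extend=0.5):
    """Cost of one gap of the given length under an affine model."""
    if length < 0:
        raise ValueError("gap length must be non-negative")
    if length == 0:
        return 0.0
    return gap_open + gap_extend * length


def total_gap_cost(aligned, gap_open=10.0, gap_extend=0.5):
    """Sum the affine cost of every run of '-' in one aligned sequence."""
    return sum(
        gap_cost(len(run), gap_open, gap_extend)
        for run in re.findall(r"-+", aligned)
    )
```

With these values a single gap of length 3 costs 11.5, while three separate gaps of length 1 cost 31.5, so a high opening penalty favours fewer, longer gaps.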

Download

  • Latest versions:
    • ClustalW: 2.1 (2010-11-17)
    • Clustal Omega: 1.1.0 (2012-04-25)
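  • Other:
    • The EBI version (latest version is 2.0.12; 2009-09-24) is available from the European Bioinformatics Institute FTP server (ftp://ftp.ebi.ac.uk/pub/software/clustalw2/). Choose unix for Unix/Linux, mac for Mac OS, or dos for Windows.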

See also

  • T-Coffee

References

  • Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG, Thompson JD (2003). Multiple sequence alignment with the Clustal series of programs. Nucleic Acids Research 31:3497-3500.
  • Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG (1997). The ClustalX windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Research 25:4876-4882.
  • Higgins DG, Thompson JD, Gibson TJ (1996). Using CLUSTAL for multiple sequence alignments. Methods Enzymol 266:383-402.
  • Thompson JD, Higgins DG, Gibson TJ (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research 22:4673-4680.
  • Higgins DG, Bleasby AJ, Fuchs R (1992). CLUSTAL V: improved software for multiple sequence alignment. CABIOS 8:189-191.
  • Higgins DG, Sharp PM (1989). Fast and sensitive multiple sequence alignments on a microcomputer. CABIOS 5:151-153.
  • Higgins DG, Sharp PM (1988). CLUSTAL: a package for performing multiple sequence alignment on a microcomputer. Gene 73:237-244.

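External links

  • Clustal homepage: http://www.clustal.org/
  • ClustalW, from the EMBL: http://www.ebi.ac.uk/clustalw/
  • ClustalX, from the Strasbourg Bioinformatics Platform, France: http://bips.u-strasbg.fr/fr/Documentation/ClustalX/
  • ClustalV, from INFOBIOGEN: http://www.infobiogen.fr/doc/ClustalW/clustalv.html
  • Wikipedia article on Clustal: http://en.wikipedia.org/wiki/Clustal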