Difference between revisions of "T-Coffee"

From Christoph's Personal Wiki
'''T-Coffee''' ('''T'''ree-based '''C'''onsistency '''O'''bjective '''F'''unction '''F'''or alignm'''e'''nt '''E'''valuation) is a multiple sequence alignment program that uses a progressive approach. It generates a library of pairwise alignments to guide the multiple sequence alignment. It can also combine previously computed multiple sequence alignments and, in the latest versions, use structural information from [[Protein Data Bank|PDB]] files (3D-Coffee). It has advanced features for evaluating alignment quality and some capacity to identify occurrences of motifs (Mocca). It produces alignments in the aln format ([[Clustal]]) by default, but can also write PIR, MSF and FASTA formats. The input files are always in [[FASTA format]]. Note that what T-Coffee calls "Clustal" format differs enough from the output of ClustalW/X that many programs supporting Clustal format cannot read it. Fortunately, ClustalX ''can'' import T-Coffee output, so the simplest fix is usually to import T-Coffee's output into ClustalX and then re-export it.

<blockquote>'''Note''': The latest version is '''[http://www.tcoffee.org/Packages/ 6.92]''' (2008-09-12)</blockquote>

==Installation==
*Requirements: gcc, g77, make, CPAN

 tar -zxvf t_coffee.tar.gz
 cd T-COFFEE_distribution_Version_X.XX
 ./install

The installation procedure is semi-interactive: it prompts with questions at several points. You can interrupt it at any time and resume it later.

The install procedure carries out three distinct tasks:
#compilation of T-Coffee (C program);
#compilation and installation of [http://www.soaplite.com/ SOAP::Lite] ([[Perl]] module); and
#download, compilation and installation of all the T-Coffee companion packages required for all possible T-Coffee flavors (<code>tcoffee, expresso, 3dcoffee, mcoffee, rcoffee</code>).
Except for T-Coffee itself, the installer only installs the packages that are NOT already on your computer. If you want a lighter or more specific installation, you can try any of the following:
 ./install tcoffee
 ./install rcoffee
 ./install expresso
 ./install 3dcoffee
 ./install mcoffee

While installing <code>SOAP::Lite</code>, CPAN will ask you many questions: answer "Yes" to all, or press Return to keep the default values. If everything went well, the procedure will have created two executables in the bin directory: <code>t_coffee</code> and <code>TMalign</code> (make sure these executables are on your <code>$PATH</code>).
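
One way to confirm that the executables are reachable is a small shell check. The sketch below is only an illustration: <code>check_on_path</code> is a hypothetical helper name, not something shipped with T-Coffee, and it relies only on the standard <code>command -v</code> lookup.

```shell
#!/bin/sh
# check_on_path is a hypothetical helper (not part of T-Coffee).
# It reports, for each name given, whether an executable of that
# name can be found on $PATH, and returns non-zero if any is missing.
check_on_path() {
    missing=0
    for exe in "$@"; do
        if command -v "$exe" >/dev/null 2>&1; then
            echo "found: $exe"
        else
            echo "missing: $exe"
            missing=1
        fi
    done
    return "$missing"
}

# Example call for the two executables named above:
#   check_on_path t_coffee TMalign
```

If either name is reported as missing, adding the installer's bin directory to <code>$PATH</code> in your shell start-up file is the usual remedy.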

If you have not managed to install <code>SOAP::Lite</code>, you can re-install it at any time (from anywhere) using steps 1-2.

If you cannot log in as root, or if this procedure does not work for some other reason, consult your system administrator and/or go directly to the CPAN repository of <code>SOAP::Lite</code>. You will still be able to use the most basic functions of T-Coffee.

IMPORTANT: The purpose of <code>SOAP::Lite</code> is to let T-Coffee use the [http://www.ebi.ac.uk/Tools/webservices/ EBI webservices] such as webblast. [[BLAST]] adds many capabilities to T-Coffee; if you cannot install SOAP, we suggest you go to the "[http://www.tcoffee.org/Documentation/t_coffee/t_coffee_technical.htm Installing BLAST for T-Coffee]" section of the ''Technical Documentation'' (in the "Installation" section). There you will find alternative ways of using BLAST without SOAP. The same document contains all the information required for a full installation of T-Coffee.
  
 
==Usage==
 t_coffee foo.fa
The default output is:
*<code>foo.aln</code> (the multiple sequence alignment); and
*<code>foo.dnd</code> (the guide tree in [[Newick phylogenetic tree format|Newick format]])
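
The naming pattern above can be sketched in shell. This is a minimal illustration that assumes the input is given as a bare file name with a single extension; <code>default_outputs</code> is a hypothetical helper that only computes names and does not run <code>t_coffee</code>.

```shell
#!/bin/sh
# default_outputs is a hypothetical helper (not part of T-Coffee).
# Given an input name such as foo.fa, it prints the two default
# output names described above: the alignment and the guide tree.
default_outputs() {
    input=$1
    stem=${input%.*}    # strip the last extension, e.g. ".fa"
    echo "$stem.aln $stem.dnd"
}

# Possible use after an alignment run (requires t_coffee on $PATH):
#   t_coffee foo.fa && ls -l $(default_outputs foo.fa)
```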

==Residue types==
*charged: <span style="background-color:#f00; font-weight:bold;">KRDE</span>
*polar: <span style="background-color:deeppink; font-weight:bold;">NQST</span>
*aliphatic: <span style="background-color:#0f0; font-weight:bold;">ILMV</span>
*aromatic: <span style="background-color:#ff0; font-weight:bold;">FYW</span>
*others: <span style="background-color:#666; font-weight:bold;">APCGH</span>
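
The grouping above can be written as a simple lookup. The sketch below is an illustration only: <code>residue_type</code> is a hypothetical function name, it expects a single upper-case one-letter amino acid code, and it reports anything outside the five listed groups as <code>unknown</code>.

```shell
#!/bin/sh
# residue_type is a hypothetical helper (not part of T-Coffee).
# It maps one upper-case one-letter residue code to the group
# names used in the list above.
residue_type() {
    case $1 in
        K|R|D|E)   echo charged ;;
        N|Q|S|T)   echo polar ;;
        I|L|M|V)   echo aliphatic ;;
        F|Y|W)     echo aromatic ;;
        A|P|C|G|H) echo others ;;
        *)         echo unknown ;;   # gaps, lower case, ambiguity codes
    esac
}

# Example: residue_type W
```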
  
 
==See also==
*[[Clustal]]
*[[MUSCLE]]
*[http://wikiomics.org/wiki/Bioinfo_tutorial Bioinfo tutorial] &mdash; by Wikiomics.org
*[http://www.inf.fu-berlin.de/inst/ag-bio/FILES/ROOT/Projects/Algorithms/lisa/index.php LiSA] &mdash; a software platform for structural alignment algorithms (including "T-Lara: RNA multiple Structural Alignment").
*[http://biwww2.informatik.uni-freiburg.de/Software/MARNA/index.html MARNA] &mdash; a server for '''M'''ultiple '''A'''lignment of '''RNA'''s.
*[http://pbil.univ-lyon1.fr/software/seaview.html SeaView] &mdash; a graphical multiple sequence alignment editor that can read and write various alignment formats (NEXUS, MSF, CLUSTAL, FASTA, PHYLIP, MASE). It allows manual editing of the alignment and can run DOT-PLOT or CLUSTALW/MUSCLE to improve the alignment locally.
*[http://tcoffee.vital-it.ch/cgi-bin/Tcoffee/tcoffee_cgi/index.cgi?stage1=1&daction=MCOFFEE::Advanced M-Coffee] &mdash; computes a multiple sequence alignment and the associated phylogenetic tree by combining the output of several multiple sequence alignment packages (PCMA, Poa, Mafft, Muscle, T-Coffee, ClustalW, ProbCons, DialignT).

==References==
*O'Sullivan O, Suhre K, Abergel C, Higgins DG, Notredame C (2004). "[http://www.tcoffee.org/Publications/Pdf/3DCoffee.pdf 3DCoffee: Combining Protein Sequences and Structures within Multiple Sequence Alignments][[Image:Icon_pdf.gif|PDF]]". ''J Mol Biol, 340:385-395''.
*Notredame C, Higgins D, Heringa J (2000). "[http://www.tcoffee.org/Publications/Pdf/tcoffee.pdf T-Coffee: A novel method for multiple sequence alignments][[Image:Icon_pdf.gif|PDF]]". ''J Mol Biol, 302:205-217''.
*Notredame C, Holm L, Higgins DG (1998). "[http://www.tcoffee.org/Publications/Ps_pdf/coffee.pdf COFFEE: A New Objective Function For Multiple Sequence Alignment][[Image:Icon_pdf.gif|PDF]]". ''Bioinformatics, 14(5):407-422''.
*Notredame C, Abergel C (2003). "[http://www.tcoffee.org/Publications/Pdf/core.pp.pdf Using Multiple Alignment Methods to Assess the Quality of Genomic Data Analysis][[Image:Icon_pdf.gif|PDF]]", in ''Bioinformatics and Genomes: Current Perspectives'', M. Andrade, Editor, Horizon Scientific Press, p. 30-50. (details on the colour scheme)
  
 
==External links==
*[http://www.tcoffee.org T-Coffee Home Page and Server] &mdash; includes a stand-alone version for download.
*[http://www.tcoffee.org/Documentation/t_coffee/t_coffee_tutorial.htm T-Coffee tutorial]
*[http://tcoffee.vital-it.ch/Doc/doc3.html Format documentation]
*[http://www.tcoffee.org/Documentation/t_coffee/t_coffee_technical.htm Technical notes]
  
 
[[Category:Bioinformatics]]

Latest revision as of 03:25, 22 September 2008
