List of EMBOSS programs
From Christoph's Personal Wiki
Revision as of 07:32, 7 September 2007 by Christoph
Contents
- 1 ACD file utilities
- 2 Merging sequences to make a consensus
- 3 Finding differences between sequences
- 4 Dot plot sequence comparisons
- 5 Global sequence alignment
- 6 Local sequence alignment
- 7 Multiple sequence alignment
- 8 Publication-quality display
- 9 Enzyme kinetics calculations
- 10 Manipulation and display of sequence annotation
- 11 Hidden Markov model analysis
- 12 Information and general help for users
- 13 Menu interface(s)
- 14 Nucleic acid secondary structure
- 15 Codon usage analysis
- 16 Composition of nucleotide sequences
- 17 CpG island detection and analysis
- 18 Predictions of genes and other genomic features
- 19 Nucleic acid motif searches
- 20 Nucleic acid sequence mutation
- 21 Primer prediction
- 22 Nucleic acid profile generation and searching
- 23 Nucleic acid repeat detection
- 24 Restriction enzyme sites in nucleotide sequences
- 25 Transcription factors, promoters and terminator prediction
- 26 Translation of nucleotide sequence to protein sequence
- 27 Phylogenetic consensus methods
- 28 Phylogenetic continuous character methods
- 29 Phylogenetic discrete character methods
- 30 Phylogenetic distance matrix methods
- 31 Phylogenetic gene frequency methods
- 32 Phylogenetic tree drawing methods
- 33 Phylogenetic molecular sequence methods
- 34 Protein secondary structure
- 35 Protein tertiary structure
- 36 Composition of protein sequences
- 37 Protein motif searches
- 38 Protein sequence mutation
- 39 Protein profile generation and searching
- 40 Testing tools, not for general use
- 41 Database installation
- 42 Database indexing
- 43 Utility tools
ACD file utilities
- acdc
- ACD compiler
- acdpretty
- ACD pretty printing utility
- acdtable
- Creates an HTML table from an ACD file
- acdtrace
- ACD compiler on-screen trace
- acdvalid
- ACD file validation
Merging sequences to make a consensus
- cons
- Creates a consensus from multiple alignments
- megamerger
- Merge two large overlapping nucleic acid sequences
- merger
- Merge two overlapping sequences
Finding differences between sequences
- diffseq
- Find differences between nearly identical sequences
Dot plot sequence comparisons
- dotmatcher
- Displays a thresholded dotplot of two sequences
- dotpath
- Non-overlapping wordmatch dotplot of two sequences
- dottup
- Displays a wordmatch dotplot of two sequences
- polydot
- Displays all-against-all dotplots of a set of sequences
Global sequence alignment
- est2genome
- Align EST and genomic DNA sequences
- needle
- Needleman-Wunsch global alignment
- stretcher
- Finds the best global alignment between two sequences
- esim4
- Align an mRNA to a genomic DNA sequence
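As a rough illustration of what a Needleman-Wunsch global alignment computes, here is a minimal Python sketch of the scoring recurrence. It is not the code used by needle or stretcher: the function name and the match/mismatch/gap values are made up for this example, and it assumes a simple linear gap penalty and returns only the score, not the alignment itself.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch score with a linear gap penalty (illustrative values only)."""
    # prev[j] is the best score for aligning the rows of `a` seen so far with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b costs i gaps
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # a[i-1] aligned against a gap
            left = curr[j - 1] + gap  # b[j-1] aligned against a gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]
```

Only two rows of the dynamic programming table are kept, so memory grows with the length of the second sequence rather than with the product of both lengths.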
Local sequence alignment
- matcher
- Finds the best local alignments between two sequences
- seqmatchall
- All-against-all comparison of a set of sequences
- supermatcher
- Match large sequences against one or more other sequences
- water
- Smith-Waterman local alignment
- wordfinder
- Match large sequences against one or more other sequences
- wordmatch
- Finds all exact matches of a given size between two sequences
Multiple sequence alignment
- edialign
- Local multiple alignment of sequences
- emma
- Multiple alignment program - interface to ClustalW program
- infoalign
- Information on a multiple sequence alignment
- plotcon
- Plot quality of conservation of a sequence alignment
- prettyplot
- Displays aligned sequences, with colouring and boxing
- showalign
- Displays a multiple sequence alignment
- tranalign
- Align nucleic coding regions given the aligned proteins
- mse
- Multiple Sequence Editor
Publication-quality display
- abiview
- Reads an ABI file and displays the trace
- cirdna
- Draws circular maps of DNA constructs
- lindna
- Draws linear maps of DNA constructs
- pepnet
- Displays proteins as a helical net
- pepwheel
- Shows protein sequences as helices
- prettyplot
- Displays aligned sequences, with colouring and boxing
- prettyseq
- Output sequence with translated ranges
- remap
- Display sequence with restriction sites, translation etc
- seealso
- Finds programs sharing group names
- showalign
- Displays a multiple sequence alignment
- showdb
- Displays information on the currently available databases
- showfeat
- Show features of a sequence
- showseq
- Display a sequence with features, translation etc
- sixpack
- Display a DNA sequence with 6-frame translation and ORFs
- textsearch
- Search sequence documentation. Slow, use SRS and Entrez!
Enzyme kinetics calculations
- findkm
- Find Km and Vmax for an enzyme reaction
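For readers unfamiliar with how Km and Vmax can be estimated from rate measurements, the sketch below fits the Hanes-Woolf linearisation of the Michaelis-Menten equation, S/v = S/Vmax + Km/Vmax, by ordinary least squares. This is only one possible approach and is not taken from the findkm source; the function name and the synthetic data are assumptions made for the example.

```python
def hanes_woolf_fit(substrate, velocity):
    """Estimate (Km, Vmax) from the line S/v = S/Vmax + Km/Vmax by least squares."""
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, velocity)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # slope of the line is 1 / Vmax
    intercept = mean_y - slope * mean_x  # intercept of the line is Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


# Synthetic, noise-free data generated from Km = 2 and Vmax = 10 (made up for illustration).
S = [0.5, 1.0, 2.0, 4.0, 8.0]
v = [10 * s / (2 + s) for s in S]
km, vmax = hanes_woolf_fit(S, v)
```

With real, noisy measurements a direct non-linear fit of the Michaelis-Menten curve is often preferred, because linearised plots distort the error structure.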
Manipulation and display of sequence annotation
- coderet
- Extract CDS, mRNA and translations from feature tables
- extractfeat
- Extract features from a sequence
- maskfeat
- Mask off features of a sequence
- showfeat
- Show features of a sequence
- twofeat
- Finds neighbouring pairs of features in sequences
Hidden Markov model analysis
- oalistat
- Statistics for multiple alignment files
- ohmmalign
- Align sequences with an HMM
- ohmmbuild
- Build HMM
- ohmmcalibrate
- Calibrate a hidden Markov model
- ohmmconvert
- Convert between HMM formats
- ohmmemit
- Extract HMM sequences
- ohmmfetch
- Extract HMM from a database
- ohmmindex
- Index an HMM database
- ohmmpfam
- Align single sequence with an HMM
- ohmmsearch
- Search sequence database with an HMM
- ehmmalign
- Align sequences to an HMM profile
- ehmmbuild
- Build a profile HMM from an alignment
- ehmmcalibrate
- Calibrate HMM search statistics
- ehmmconvert
- Convert between profile HMM file formats
- ehmmemit
- Generate sequences from a profile HMM
- ehmmfetch
- Retrieve an HMM from an HMM database
- ehmmindex
- Create a binary SSI index for an HMM database
- ehmmpfam
- Search one or more sequences against an HMM database
- ehmmsearch
- Search a sequence database with a profile HMM
Information and general help for users
- infoalign
- Information on a multiple sequence alignment
- infoseq
- Displays some simple information about sequences
- seealso
- Finds programs sharing group names
- showdb
- Displays information on the currently available databases
- textsearch
- Search sequence documentation. Slow, use SRS and Entrez!
- tfm
- Displays a program's help documentation manual
- whichdb
- Search all databases for an entry
- wossname
- Finds programs by keywords in their one-line documentation
Menu interface(s)
- emnu
- Simple menu of EMBOSS applications
Nucleic acid secondary structure
- einverted
- Finds DNA inverted repeats
- vrnaalifold
- RNA alignment folding
- vrnaalifoldpf
- RNA alignment folding with partition
- vrnacofold
- RNA cofolding
- vrnacofoldconc
- RNA cofolding with concentrations
- vrnacofoldpf
- RNA cofolding with partitioning
- vrnadistance
- RNA distances
- vrnaduplex
- RNA duplex calculation
- vrnaeval
- RNA eval
- vrnaevalpair
- RNA eval with cofold
- vrnafold
- Calculate secondary structures of RNAs
- vrnafoldpf
- Secondary structures of RNAs with partition
- vrnaheat
- RNA melting
- vrnainverse
- RNA sequences matching a structure
- vrnalfold
- Calculate locally stable secondary structures of RNAs
- vrnaplot
- Plot vrnafold output
- vrnasubopt
- Calculate RNA suboptimals
Codon usage analysis
- cai
- CAI codon adaptation index
- chips
- Codon usage statistics
- codcmp
- Codon usage table comparison
- cusp
- Create a codon usage table
- syco
- Synonymous codon usage Gribskov statistic plot
Composition of nucleotide sequences
- banana
- Bending and curvature plot in B-DNA
- btwisted
- Calculates the twisting in a B-DNA sequence
- chaos
- Create a chaos game representation plot for a sequence
- compseq
- Count composition of dimer/trimer/etc words in a sequence
- dan
- Calculates DNA or RNA/DNA melting temperature
- freak
- Residue/base frequency table or plot
- isochore
- Plots isochores in large DNA sequences
- sirna
- Finds siRNA duplexes in mRNA
- wordcount
- Counts words of a specified size in a DNA sequence
CpG island detection and analysis
- cpgplot
- Plot CpG rich areas
- cpgreport
- Reports all CpG rich regions
- geecee
- Calculates fractional GC content of nucleic acid sequences
- newcpgreport
- Report CpG rich areas
- newcpgseek
- Reports CpG rich regions
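The two quantities usually examined when looking for CpG-rich regions are the fractional GC content and the ratio of observed to expected CpG dinucleotides. The Python sketch below computes both for a single sequence or window. It is not the algorithm of cpgplot or geecee; the function names are hypothetical, and window sizes and thresholds are left to the caller.

```python
def gc_fraction(seq):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0  # no CpG is possible without both C and G
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() finds every occurrence
    return cpg * len(seq) / (c * g)
```

To scan a long sequence, these functions would be applied to successive windows and the windows exceeding the chosen thresholds reported.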
Predictions of genes and other genomic features
- getorf
- Finds and extracts open reading frames (ORFs)
- marscan
- Finds MAR/SAR sites in nucleic sequences
- plotorf
- Plot potential open reading frames
- showorf
- Pretty output of DNA translations
- sixpack
- Display a DNA sequence with 6-frame translation and ORFs
- syco
- Synonymous codon usage Gribskov statistic plot
- tcode
- Fickett TESTCODE statistic to identify protein-coding DNA
- wobble
- Wobble base plot
Nucleic acid motif searches
- dreg
- Regular expression search of a nucleotide sequence
- fuzznuc
- Nucleic acid pattern search
- fuzztran
- Protein pattern search after translation
- marscan
- Finds MAR/SAR sites in nucleic sequences
Nucleic acid sequence mutation
- msbar
- Mutate sequence beyond all recognition
- shuffleseq
- Shuffles a set of sequences maintaining composition
Primer prediction
- eprimer3
- Picks PCR primers and hybridization oligos
- primersearch
- Searches DNA sequences for matches with primer pairs
- stssearch
- Search a DNA database for matches with a set of STS primers
Nucleic acid profile generation and searching
- profit
- Scan a sequence or database with a matrix or profile
- prophecy
- Creates matrices/profiles from multiple alignments
- prophet
- Gapped alignment for profiles
Nucleic acid repeat detection
- einverted
- Finds DNA inverted repeats
- equicktandem
- Finds tandem repeats
- etandem
- Looks for tandem repeats in a nucleotide sequence
- palindrome
- Looks for inverted repeats in a nucleotide sequence
Restriction enzyme sites in nucleotide sequences
- recoder
- Remove restriction sites but maintain same translation
- redata
- Search REBASE for enzyme name, references, suppliers etc
- remap
- Display sequence with restriction sites, translation etc
- restover
- Find restriction enzymes producing specific overhang
- restrict
- Finds restriction enzyme cleavage sites
- showseq
- Display a sequence with features, translation etc
- silent
- Silent mutation restriction enzyme scan
Transcription factors, promoters and terminator prediction
- tfscan
- Scans DNA sequences for transcription factors
Translation of nucleotide sequence to protein sequence
- backtranambig
- Back translate a protein sequence to ambiguous codons
- backtranseq
- Back translate a protein sequence
- coderet
- Extract CDS, mRNA and translations from feature tables
- plotorf
- Plot potential open reading frames
- prettyseq
- Output sequence with translated ranges
- remap
- Display sequence with restriction sites, translation etc
- showorf
- Pretty output of DNA translations
- showseq
- Display a sequence with features, translation etc
- sixpack
- Display a DNA sequence with 6-frame translation and ORFs
- transeq
- Translate nucleic acid sequences
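As a minimal illustration of translating a nucleotide sequence in a chosen forward reading frame, the following Python sketch uses the standard genetic code. It is not how transeq is implemented: the names are hypothetical, only forward frames 1 to 3 are handled, and any codon containing an ambiguity code is simply reported as X.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, listed in TCAG order for the first, second and third base.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=1):
    """Translate forward frame 1, 2 or 3; '*' marks a stop, 'X' an unknown codon."""
    seq = dna.upper().replace("U", "T")  # accept RNA input as well
    start = frame - 1
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(start, len(seq) - 2, 3)
    )
```

A six-frame translation, as displayed by sixpack, would additionally translate the three frames of the reverse complement.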
Phylogenetic consensus methods
- econsense
- Majority-rule and strict consensus tree
- fconsense
- Majority-rule and strict consensus tree
- ftreedist
- Distances between trees
- ftreedistpair
- Distances between two sets of trees
Phylogenetic continuous character methods
- econtml
- Continuous character Maximum Likelihood method
- econtrast
- Continuous character Contrasts
- fcontrast
- Continuous character Contrasts
Phylogenetic discrete character methods
- eclique
- Largest clique program
- edollop
- Dollo and polymorphism parsimony algorithm
- edolpenny
- Penny algorithm Dollo or polymorphism
- efactor
- Multistate to binary recoding program
- emix
- Mixed parsimony algorithm
- epenny
- Penny algorithm, branch-and-bound
- fclique
- Largest clique program
- fdollop
- Dollo and polymorphism parsimony algorithm
- fdolpenny
- Penny algorithm Dollo or polymorphism
- ffactor
- Multistate to binary recoding program
- fmix
- Mixed parsimony algorithm
- fmove
- Interactive mixed method parsimony
- fpars
- Discrete character parsimony
- fpenny
- Penny algorithm, branch-and-bound
Phylogenetic distance matrix methods
- efitch
- Fitch-Margoliash and Least-Squares Distance Methods
- ekitsch
- Fitch-Margoliash method with contemporary tips
- eneighbor
- Phylogenies from distance matrix by N-J or UPGMA method
- ffitch
- Fitch-Margoliash and Least-Squares Distance Methods
- fkitsch
- Fitch-Margoliash method with contemporary tips
- fneighbor
- Phylogenies from distance matrix by N-J or UPGMA method
Phylogenetic gene frequency methods
- egendist
- Genetic Distance Matrix program
- fcontml
- Gene frequency and continuous character Maximum Likelihood
- fgendist
- Compute genetic distances from gene frequencies
Phylogenetic tree drawing methods
- fdrawgram
- Plots a cladogram- or phenogram-like rooted tree diagram
- fdrawtree
- Plots an unrooted tree diagram
- fretree
- Interactive tree rearrangement
Phylogenetic molecular sequence methods
- distmat
- Creates a distance matrix from multiple alignments
- ednacomp
- DNA compatibility algorithm
- ednadist
- Nucleic acid sequence Distance Matrix program
- ednainvar
- Nucleic acid sequence Invariants method
- ednaml
- Phylogenies from nucleic acid Maximum Likelihood
- ednamlk
- Phylogenies from nucleic acid Maximum Likelihood with clock
- ednapars
- DNA parsimony algorithm
- ednapenny
- Penny algorithm for DNA
- eprotdist
- Protein distance algorithm
- eprotpars
- Protein parsimony algorithm
- erestml
- Restriction site Maximum Likelihood method
- eseqboot
- Bootstrapped sequences algorithm
- fdiscboot
- Bootstrapped discrete sites algorithm
- fdnacomp
- DNA compatibility algorithm
- fdnadist
- Nucleic acid sequence Distance Matrix program
- fdnainvar
- Nucleic acid sequence Invariants method
- fdnaml
- Estimates nucleotide phylogeny by maximum likelihood
- fdnamlk
- Estimates nucleotide phylogeny by maximum likelihood with a molecular clock
- fdnamove
- Interactive DNA parsimony
- fdnapars
- DNA parsimony algorithm
- fdnapenny
- Penny algorithm for DNA
- fdolmove
- Interactive Dollo or Polymorphism Parsimony
- ffreqboot
- Bootstrapped genetic frequencies algorithm
- fproml
- Protein phylogeny by maximum likelihood
- fpromlk
- Protein phylogeny by maximum likelihood with a molecular clock
- fprotdist
- Protein distance algorithm
- fprotpars
- Protein parsimony algorithm
- frestboot
- Bootstrapped restriction sites algorithm
- frestdist
- Distance matrix from restriction sites or fragments
- frestml
- Restriction site Maximum Likelihood method
- fseqboot
- Bootstrapped sequences algorithm
- fseqbootall
- Bootstrapped sequences algorithm
Protein secondary structure
- garnier
- Predicts protein secondary structure
- helixturnhelix
- Report nucleic acid binding motifs
- hmoment
- Hydrophobic moment calculation
- pepcoil
- Predicts coiled coil regions
- pepnet
- Displays proteins as a helical net
- pepwheel
- Shows protein sequences as helices
- tmap
- Displays membrane spanning regions
- topo
- Draws an image of a transmembrane protein
Protein tertiary structure
- psiphi
- Phi and psi torsion angles from protein coordinates
- domainreso
- Remove low resolution domains from a DCF file
- domainalign
- Generate alignments (DAF file) for nodes in a DCF file
- domainrep
- Reorder DCF file to identify representative structures
- seqalign
- Extend alignments (DAF file) with sequences (DHF file)
- seqfraggle
- Removes fragment sequences from DHF files
- seqsearch
- Generate PSI-BLAST hits (DHF file) from a DAF file
- seqsort
- Remove ambiguously classified sequences from DHF files
- seqwords
- Generates DHF files from keyword search of UniProt
- libgen
- Generate discriminating elements from alignments
- matgen3d
- Generate a 3D-1D scoring matrix from CCF files
- rocon
- Generates a hits file from comparing two DHF files
- rocplot
- Performs ROC analysis on hits files
- siggen
- Generates a sparse protein signature from an alignment
- siggenlig
- Generate ligand-binding signatures from a CON file
- sigscan
- Generate hits (DHF file) from a signature search
- sigscanlig
- Search ligand-signature library & write hits (LHF file)
- contacts
- Generate intra-chain CON files from CCF files
- interface
- Generate inter-chain CON files from CCF files
Composition of protein sequences
- backtranambig
- Back translate a protein sequence to ambiguous codons
- backtranseq
- Back translate a protein sequence
- charge
- Protein charge plot
- checktrans
- Reports STOP codons and ORF statistics of a protein
- compseq
- Count composition of dimer/trimer/etc words in a sequence
- emowse
- Protein identification by mass spectrometry
- freak
- Residue/base frequency table or plot
- iep
- Calculates the isoelectric point of a protein
- mwcontam
- Shows molecular weights that match across a set of files
- mwfilter
- Filter noisy molecular weights from mass spec output
- octanol
- Displays protein hydropathy
- pepinfo
- Plots simple amino acid properties in parallel
- pepstats
- Protein statistics
- pepwindow
- Displays protein hydropathy
- pepwindowall
- Displays protein hydropathy of a set of sequences
Protein motif searches
- antigenic
- Finds antigenic sites in proteins
- digest
- Protein proteolytic enzyme or reagent cleavage digest
- epestfind
- Finds PEST motifs as potential proteolytic cleavage sites
- fuzzpro
- Protein pattern search
- fuzztran
- Protein pattern search after translation
- helixturnhelix
- Report nucleic acid binding motifs
- oddcomp
- Find protein sequence regions with a biased composition
- patmatdb
- Search a protein sequence with a motif
- patmatmotifs
- Search a PROSITE motif database with a protein sequence
- pepcoil
- Predicts coiled coil regions
- preg
- Regular expression search of a protein sequence
- pscan
- Scans proteins using PRINTS
- sigcleave
- Reports protein signal cleavage sites
- omeme
- Motif detection
- emast
- Motif detection
- ememe
- Motif detection
Protein sequence mutation
- msbar
- Mutate sequence beyond all recognition
- shuffleseq
- Shuffles a set of sequences maintaining composition
Protein profile generation and searching
- profit
- Scan a sequence or database with a matrix or profile
- prophecy
- Creates matrices/profiles from multiple alignments
- prophet
- Gapped alignment for profiles
Testing tools, not for general use
- crystalball
- Answers every drug discovery question about a sequence
- myseq
- Demonstration of sequence reading
- mytest
- Demonstration of sequence reading
Database installation
- aaindexextract
- Extract data from AAINDEX
- cutgextract
- Extract data from CUTG
- printsextract
- Extract data from PRINTS
- prosextract
- Build the PROSITE motif database for use by patmatmotifs
- rebaseextract
- Extract data from REBASE
- tfextract
- Extract data from TRANSFAC
- cathparse
- Generates DCF file from raw CATH files
- domainnr
- Removes redundant domains from a DCF file
- domainseqs
- Adds sequence records to a DCF file
- domainsse
- Add secondary structure records to a DCF file
- scopparse
- Generate DCF file from raw SCOP files
- ssematch
- Search a DCF file for secondary structure matches
- allversusall
- Sequence similarity data from all-versus-all comparison
- seqnr
- Removes redundancy from DHF files
- domainer
- Generates domain CCF files from protein CCF files
- hetparse
- Converts heterogen group dictionary to EMBL-like format
- pdbparse
- Parses PDB files and writes protein CCF files
- pdbplus
- Add accessibility & secondary structure to a CCF file
- pdbtosp
- Convert swissprot:PDB codes file to EMBL-like format
- sites
- Generate residue-ligand CON files from CCF files
Database indexing
- dbiblast
- Index a BLAST database
- dbifasta
- Database indexing for fasta file databases
- dbiflat
- Index a flat file database
- dbigcg
- Index a GCG formatted database
- dbxfasta
- Database b+tree indexing for fasta file databases
- dbxflat
- Database b+tree indexing for flat file databases
- dbxgcg
- Database b+tree indexing for GCG formatted databases
Utility tools
- embossdata
- Finds or fetches data files read by EMBOSS programs
- embossversion
- Writes the current EMBOSS version number