Difference between revisions of "GenBank"

From Christoph's Personal Wiki
 
*[http://www.ncbi.nlm.nih.gov/genomes/static/euk_g.html List of completed eukaryotic genomes] (NCBI)
 
*[ftp://ftp.ncbi.nih.gov/genbank/docs/FTv6_6.html The DDBJ/EMBL/GenBank Feature Table: Definition] — version 6.6, 2006-10.
 
[[Category:Bioinformatics]]

Revision as of 22:53, 16 April 2007

GenBank (also known as the Genetic Sequence Data Bank) is an open-access, annotated collection of all publicly available nucleotide sequences and their protein translations.[1][2] The database is produced at the National Center for Biotechnology Information (NCBI).

Statistics

  • GenBank Flat File Release 158.0 (2007-02-15)
    • 67,218,344 loci, 71,292,211,453 bases, from 67,218,344 reported sequences.[3]
    • Uncompressed, the Release 158.0 flatfiles require roughly 251 GB (sequence files only) or 263 GB (including the 'short directory', 'index' and the *.txt files).
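
As a rough sanity check on these figures, the average record length and the flat-file storage cost per base can be worked out directly. The sketch below uses only the numbers quoted above. It assumes that "GB" means 10^9 bytes, which the source does not state, and the constant and function names are illustrative.

```python
# Figures quoted above for GenBank Flat File Release 158.0.
LOCI = 67_218_344
BASES = 71_292_211_453
SEQUENCE_FILES_GB = 251  # uncompressed, sequence files only

# Assumption: 1 GB = 10**9 bytes (the source does not say which convention it uses).
BYTES_PER_GB = 10**9


def mean_bases_per_locus(bases: int, loci: int) -> float:
    """Average number of bases per locus (record)."""
    return bases / loci


def flatfile_bytes_per_base(size_gb: float, bases: int) -> float:
    """Uncompressed flat-file bytes needed per sequence base."""
    return size_gb * BYTES_PER_GB / bases


print(f"{mean_bases_per_locus(BASES, LOCI):.1f} bases per locus on average")
print(f"{flatfile_bytes_per_base(SEQUENCE_FILES_GB, BASES):.2f} flat-file bytes per base")
```

Under these assumptions the average comes to roughly 1,060 bases per locus, and the annotated flat files take about 3.5 bytes per base.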

Note: You can find the current release number by issuing the following command:

lynx --dump ftp://ftp.ncbi.nih.gov/genbank/GB_Release_Number
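
The same lookup can be scripted. The following is a minimal sketch using Python's standard `urllib.request`, which can open `ftp://` URLs. It assumes that `GB_Release_Number` is a small plain-text file containing only the release number; the function names `parse_release_number` and `fetch_release_number` are hypothetical helpers, not part of any NCBI tool.

```python
from urllib.request import urlopen

# URL taken from the lynx command above.
RELEASE_URL = "ftp://ftp.ncbi.nih.gov/genbank/GB_Release_Number"


def parse_release_number(text: str) -> str:
    """Return the release number with surrounding whitespace removed.

    Assumption: the file holds nothing but the release number.
    """
    value = text.strip()
    if not value:
        raise ValueError("release-number file is empty")
    return value


def fetch_release_number(url: str = RELEASE_URL, timeout: float = 30.0) -> str:
    """Download the release-number file and return its parsed contents."""
    with urlopen(url, timeout=timeout) as response:
        return parse_release_number(response.read().decode("ascii"))


# Usage (requires network access to the FTP server):
# print(fetch_release_number())
```

Keeping the parsing separate from the download makes it possible to check the parsing step without contacting the server.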

See also

References

  1. Benton D (1990). "Recent changes in the GenBank On-line Service". Nucleic Acids Research, 18(6):1517–1520.
  2. Benson DA et al. (2006). "GenBank". Nucleic Acids Research, 34(Database):D16–D20.
  3. NCBI-GenBank Flat File Release 158.0 - Distribution Release Notes ('gbrel.txt') — 2007-02-15.

External links