Difference between revisions of "GenBank"
From Christoph's Personal Wiki
Revision as of 09:50, 17 April 2007
The GenBank (aka Genetic Sequence Data Bank) sequence database is an open access, annotated collection of all publicly available nucleotide sequences and their protein translations.[1][2] The database is produced by the National Center for Biotechnology Information (NCBI).
Statistics
- GenBank Flat File Release 158.0 (2007-02-15)
- 67,218,344 loci, 71,292,211,453 bases, from 67,218,344 reported sequences.[3]
- Uncompressed, the Release 158.0 flatfiles require roughly 251 GB (sequence files only) or 263 GB (including the 'short directory', 'index', and the *.txt files).
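As a rough illustration of what these figures imply, the short sketch below derives the mean sequence length and the space taken by the auxiliary files. It uses only the numbers quoted above; the variable names are illustrative, and reading 1 GB as 10**9 bytes is an assumption.

```python
# Figures quoted above for GenBank Flat File Release 158.0 (2007-02-15).
loci = 67_218_344
bases = 71_292_211_453
sequence_files_gb = 251   # uncompressed, sequence files only
all_files_gb = 263        # including short directory, index and *.txt files

# Average number of bases per locus.
mean_bases_per_locus = bases / loci

# Space taken by the short directory, index and *.txt files.
auxiliary_files_gb = all_files_gb - sequence_files_gb

# Assumption: 1 GB = 10**9 bytes; the release notes may use another convention.
bytes_per_base = sequence_files_gb * 10**9 / bases

print(f"mean bases per locus: {mean_bases_per_locus:.1f}")
print(f"auxiliary files: {auxiliary_files_gb} GB")
print(f"flatfile bytes per base: {bytes_per_base:.2f}")
```

The bytes-per-base figure is only indicative: flatfiles carry annotation and formatting in addition to the sequence itself.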
Note: You can find the current release number by issuing the following command:
lynx --dump ftp://ftp.ncbi.nih.gov/genbank/GB_Release_Number
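Where lynx is not available, the same lookup can be sketched in Python with the standard library. This is a minimal sketch, not a tested client: the function names are hypothetical, and it assumes the file body contains the release number as plain text (for example "158" or "158.0").

```python
import re
import urllib.request

# URL taken from the lynx command above.
RELEASE_URL = "ftp://ftp.ncbi.nih.gov/genbank/GB_Release_Number"


def parse_release_number(text: str) -> str:
    """Return the first number (e.g. '158' or '158.0') found in the text.

    Assumption: the file holds the release number as plain text.
    """
    match = re.search(r"\d+(?:\.\d+)?", text)
    if match is None:
        raise ValueError("no release number found")
    return match.group(0)


def fetch_release_number(url: str = RELEASE_URL, timeout: float = 30.0) -> str:
    """Download the release-number file and extract the number from it."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("ascii", errors="replace")
    return parse_release_number(body)
```

Calling `fetch_release_number()` needs network access to the NCBI FTP server; `parse_release_number` can be tried offline on any string.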
Selected Eukaryotic genomes
Note: The following are not part of the main NCBI GenBank database.
- Fungi
- Saccharomyces cerevisiae (Baker's Yeast)
- Schizosaccharomyces pombe (Fission Yeast)
- Plants
- Arabidopsis thaliana
- Vertebrates
- Canis familiaris (Dog)
- Gallus gallus (Chicken)
- Homo sapiens (Human)
- Mus musculus (Mouse)
- Rattus norvegicus (Rat)
- Invertebrates
- Apis mellifera (Honey bee)
- Caenorhabditis elegans (Nematode)
- Drosophila melanogaster (Fruit fly)
- Other
- Encephalitozoon cuniculi (an intracellular parasite)
GenBank entries in the eukaryotic database
For details please refer to the NCBI genome FTP site at: ftp://ftp.ncbi.nih.gov/genomes/ and the list of completed eukaryotic genomes (NCBI).
See the complete list here: contig list (73,867 entries; 4.5 MB).
See also
- Tab file format (aka "gb2tab")
- build_gbff_cu.pl: Build a non-redundant cumulative GenBank flatfile from a set of GenBank Incremental Update (GIU) flatfiles provided by the NCBI. Documentation can be found here.
- ffidx.pl: Generate an index file containing the sequence identifier and byte-offset of each record in a flatfile which contains biological sequence data.
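The idea behind a byte-offset index, as described for ffidx.pl above, can be sketched as follows. This is not the ffidx.pl implementation: the function name is hypothetical, and it assumes that every record starts with a LOCUS line whose second field serves as the sequence identifier.

```python
import io
from typing import BinaryIO, Iterator, Tuple


def index_flatfile(handle: BinaryIO) -> Iterator[Tuple[str, int]]:
    """Yield (identifier, byte_offset) for each record in a GenBank flatfile.

    Assumption: a record begins with a line starting with 'LOCUS', and the
    second whitespace-separated field on that line is the identifier.
    """
    offset = 0
    for line in handle:  # binary mode, so len(line) counts bytes
        if line.startswith(b"LOCUS"):
            fields = line.split()
            if len(fields) > 1:
                yield fields[1].decode("ascii", errors="replace"), offset
        offset += len(line)


# Small in-memory demonstration with two made-up, heavily truncated records.
sample_bytes = (
    b"LOCUS       AB000001  10 bp\n"
    b"ORIGIN\n"
    b"        1 acgtacgtac\n"
    b"//\n"
    b"LOCUS       AB000002  4 bp\n"
    b"//\n"
)
entries = list(index_flatfile(io.BytesIO(sample_bytes)))
for identifier, byte_offset in entries:
    print(f"{identifier}\t{byte_offset}")
```

With such an index, a single record can be read by opening the flatfile in binary mode, calling `seek(byte_offset)`, and reading up to the terminating `//` line, without scanning the whole file.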
References
- [1] Benton D (1990). "Recent changes in the GenBank On-line Service". Nucleic Acids Research, 18(6):1517–1520.
- [2] Benson DA et al. (2006). "GenBank". Nucleic Acids Research, 34(Database):D16–D20.
- [3] NCBI-GenBank Flat File Release 158.0 - Distribution Release Notes ('gbrel.txt'), 2007-02-15.
External links
- GenBank (overview)
- Directory containing full GenBank flat file releases (NCBI)
- Genomes (NCBI)
- List of completed eukaryotic genomes (NCBI)
- The DDBJ/EMBL/GenBank Feature Table: Definition — version 6.6, 2006-10.