RefSeq

NCBI's Reference Sequence (RefSeq) database is a collection of taxonomically diverse, non-redundant, and richly annotated sequences representing naturally occurring molecules of DNA, RNA, and protein.[1] It includes sequences from plasmids, organelles, viruses, archaea, bacteria, and eukaryotes. Each RefSeq is constructed wholly from sequence data submitted to the International Nucleotide Sequence Database Collaboration (INSDC). Like a review article, a RefSeq is a synthesis of information integrated across multiple sources at a given time. RefSeqs provide a foundation for uniting sequence data with genetic and functional information, and they serve as reference standards for purposes ranging from genome annotation to reporting the locations of sequence variation in medical records. The RefSeq collection is available without restriction and can be retrieved in several ways: by searching or following links in NCBI resources, including PubMed, Nucleotide, Protein, Gene, and Map Viewer; by searching with a sequence via BLAST; and by downloading from the RefSeq FTP site.
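As a rough illustration of programmatic retrieval, the sketch below builds a request for a single record through NCBI's E-utilities `efetch` endpoint. This route is not among those listed above and is included only as an assumption about how a script might fetch a record. The helper names `build_efetch_url` and `fetch_record` are hypothetical, and `NM_000546` is used purely as an example accession.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: NCBI's public E-utilities efetch endpoint is used here;
# the article itself only names searching, links, BLAST, and FTP.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, db="nuccore", rettype="fasta"):
    """Build an efetch URL for one accession (hypothetical helper)."""
    params = {
        "db": db,             # Entrez database, e.g. "nuccore" or "protein"
        "id": accession,      # accession of the record to retrieve
        "rettype": rettype,   # requested record format
        "retmode": "text",    # plain-text response
    }
    return f"{EFETCH_URL}?{urlencode(params)}"


def fetch_record(accession, db="nuccore", rettype="fasta"):
    """Download the record as text; requires network access."""
    with urlopen(build_efetch_url(accession, db, rettype)) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Only prints the URL; call fetch_record() to actually download.
    print(build_efetch_url("NM_000546"))
```

Keeping URL construction separate from the download makes the request easy to inspect before any network call is made. Bulk downloads are probably better served by the FTP site mentioned above.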

Statistics

RefSeq Release 54 (11 July 2012):

Proteins: 16,393,342

Organisms: 17,605

References

[1] Kim Pruitt, Garth Brown, Tatiana Tatusova, and Donna Maglott (2002). "The Reference Sequence (RefSeq) Database". NCBI. Bookshelf ID: NBK21091. Last update: 6 April 2012.

External links

RefSeq FTP site

