Exome Project

From Christoph's Personal Wiki
Revision as of 23:43, 14 July 2012 by Christoph (Talk | contribs) (See also)

The Exome Project

The National Heart, Lung, and Blood Institute (NHLBI) and National Human Genome Research Institute (NHGRI) have funded a new program known as the Exome Project. The goal of this project is to develop cost-effective, high-throughput sequencing of the protein coding regions of the human genome for application in well-phenotyped populations. Three groups are currently funded to test and implement approaches in four key areas — sample preparation, target capture, sequencing, and data management and analysis — to generate an integrated resequencing pipeline with the potential to reduce the cost of exome analysis.[1]
ABSTRACT: Exome sequencing, the targeted sequencing of the subset of the human genome that is protein coding, is a powerful and cost-effective new tool for dissecting the genetic basis of diseases and traits that have proved to be intractable to conventional gene-discovery strategies. Over the past 2 years, experimental and analytical approaches relating to exome sequencing have established a rich framework for discovering the genes underlying unsolved Mendelian disorders. Additionally, exome sequencing is being adapted to explore the extent to which rare alleles explain the heritability of complex diseases and health-related traits. These advances also set the stage for applying exome and whole-genome sequencing to facilitate clinical diagnosis and personalized disease-risk profiling.[2]

Background

The exome is the part of the genome formed by exons, the coding portions of genes that are expressed. Providing the genetic blueprint used in the synthesis of proteins and other functional gene products, the exome is the most functionally relevant part of the genome and, therefore, the most likely to contribute to the phenotype of an organism. The exome of the human genome consists of roughly 180,000 exons, constituting about 1% of the total genome, or about 30 megabases of DNA.[3] Although it comprises a very small fraction of the genome, the exome is thought to harbor 85% of disease-causing mutations.[4] Exome sequencing has proved to be an efficient strategy for determining the genetic basis of more than two dozen Mendelian, or single-gene, disorders.[2]
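The size figures above can be cross-checked with a little arithmetic. The sketch below uses the 30 megabases and 180,000 exons quoted in the text; the total genome size of about 3.1 gigabases is an assumption added here for illustration, not a number taken from the cited sources.

```python
# Rough consistency check of the exome-size figures quoted in the text.
EXOME_BASES = 30_000_000        # ~30 megabases (from the text)
EXON_COUNT = 180_000            # ~180,000 exons (from the text)
GENOME_BASES = 3_100_000_000    # ASSUMPTION: ~3.1 Gb haploid human genome


def exome_fraction(exome_bases, genome_bases):
    """Fraction of the genome occupied by the exome."""
    return exome_bases / genome_bases


def mean_exon_length(exome_bases, exon_count):
    """Average exon length in base pairs implied by the two figures."""
    return exome_bases / exon_count


fraction = exome_fraction(EXOME_BASES, GENOME_BASES)
print(f"Exome fraction of genome: {fraction:.2%}")
print(f"Mean exon length: {mean_exon_length(EXOME_BASES, EXON_COUNT):.0f} bp")
```

Under that assumed genome size, the exome comes out at close to 1% of the genome, in line with the figure quoted above.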

Examples of research projects using exome sequencing include the nonprofit Personal Genome Project (PGP), the NIH-funded Exome Project, the NHGRI-funded Mendelian Exome Project, the NHLBI Grand Opportunity Exome Sequencing Project and the microarray-based Nimblegen SeqCap EZ Exome from Roche Applied Science.

Current Exome Project Participants

  • Broad Institute
    • Stacey Gabriel
    • Chad Nusbaum
  • Harvard Medical School
    • George Church
    • Jonathan Seidman
    • Kun Zhang
  • University of Washington
    • Deborah Nickerson
    • Jay Shendure
    • Phil Green
    • Evan Eichler
  • NHLBI
    • Weiniu Gan
    • Alan Michelson
    • Deborah Applebaum-Bowden
  • NHGRI
    • Lu Wang

Glossary

Mendelian disorders 
Phenotypes caused by a mutation (or mutations) in a single gene and inherited in a dominant, recessive or X-linked pattern.
Penetrance 
The proportion of individuals with a specific phenotype among carriers of a particular genotype.
Locus heterogeneity 
The appearance of phenotypically similar characteristics resulting from mutations at different genetic loci. Differences in effect size or in replication between studies and samples are often ascribed to different loci leading to the same disease.
Genome-wide association studies (GWASs)
Studies that search for a population association between a phenotype and a particular allele by screening loci (most commonly by genotyping SNPs) across the entire genome.
Complex traits 
Traits that are influenced by the environment and/or through a combination of variants in at least several genes, each of which has a small effect.
Heritability 
The proportion of the total phenotypic variation in a given characteristic that can be attributed to additive genetic effects.
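As a compact restatement of this definition, using conventional symbols that do not appear in the source (V_A for the additive genetic variance and V_P for the total phenotypic variance), narrow-sense heritability is usually written as:

```latex
h^2 = \frac{V_A}{V_P}
```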
Next-generation DNA sequencing 
Highly parallelized DNA-sequencing technologies that produce many hundreds of thousands or millions of short reads (25–500 bp) for a low cost and in a short time.
Exome 
The subset of a genome that is protein coding. In addition to the exome, commercially available capture probes target non-coding exons, sequences flanking exons and microRNAs.
Homozygosity mapping 
Narrowing down the location of a gene underlying a trait by searching for regions of the genome in which both chromosomal segments are inherited identical by descent.
Sequence depth 
The average number of times each base of a given genome has been sequenced:
Coverage = (Number of Reads) * (Read Length) / (Genome Size)
Sequencing depth represents the (often average) number of nucleotides contributing to a portion of an assembly. On a genome basis, it means that, on average, each base has been sequenced a certain number of times (10X, 20X, ...). For a specific nucleotide, it represents the number of sequences that added information about that nucleotide. Depth varies considerably between genomic regions. Consequently, an average sequencing depth of 30X can leave many small portions of a genome unsequenced while others receive far more sequences.
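The depth formula above can be evaluated directly, and inverted to estimate how many reads a run needs. This is a minimal sketch: the function names and the example numbers (100 bp reads, a 30-megabase target, 30X depth) are illustrative assumptions, and it idealizes by treating every read as landing on the target.

```python
import math


def fold_coverage(num_reads, read_length, target_size):
    """Average depth: (number of reads * read length) / target size."""
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    return num_reads * read_length / target_size


def reads_needed(depth, read_length, target_size):
    """Invert the formula: reads required to reach a given average depth."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return math.ceil(depth * target_size / read_length)


# ASSUMED example values: 30 Mb target, 100 bp reads, 30X average depth.
target = 30_000_000
n = reads_needed(depth=30, read_length=100, target_size=target)
print(n, fold_coverage(n, 100, target))
```

Because this is an average, it says nothing about how evenly the reads are spread; individual regions can still end up with far lower or higher depth.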
Coverage 
A term used with three distinct meanings:
  1. the theoretical "fold-coverage" of a shotgun sequencing experiment: number of reads * read length / target size
  2. the theoretical or empirical "breadth-of-coverage" of an assembly: assembly size / target size
  3. the empirical average "depth-of-coverage" of an assembly: number of reads * read length / assembly size
(1) and (3) are not the same because of sequencing error and un-clonable/un-mappable regions of the genome. Lander-Waterman theory deals with the relationship between (1) and (2).
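The three meanings can be computed side by side on toy numbers. The sketch below is illustrative only: the function names and input values are hypothetical, and `expected_breadth` uses the standard idealized Lander-Waterman result that, with reads placed uniformly at random, the expected fraction of the target covered at least once at fold-coverage c is 1 - e^(-c).

```python
import math


def fold_coverage(num_reads, read_length, target_size):
    """Meaning (1): theoretical fold-coverage of the experiment."""
    return num_reads * read_length / target_size


def breadth_of_coverage(assembly_size, target_size):
    """Meaning (2): fraction of the target represented in the assembly."""
    return assembly_size / target_size


def depth_of_coverage(num_reads, read_length, assembly_size):
    """Meaning (3): empirical average depth over the assembled bases."""
    return num_reads * read_length / assembly_size


def expected_breadth(fold_cov):
    """Idealized Lander-Waterman link between (1) and (2): 1 - e^(-c)."""
    return 1.0 - math.exp(-fold_cov)


# HYPOTHETICAL toy values, chosen only to show that (1) and (3) differ.
reads, read_len, target, assembly = 1_000, 100, 50_000, 40_000
print(fold_coverage(reads, read_len, target))       # meaning (1)
print(breadth_of_coverage(assembly, target))        # meaning (2)
print(depth_of_coverage(reads, read_len, assembly)) # meaning (3)
print(round(expected_breadth(2.0), 4))              # predicted breadth at c = 2
```

With these numbers the assembly is smaller than the target, so the same reads give a higher depth over the assembly than the fold-coverage computed over the full target.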
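Monogenic 
Caused by variation in a single gene; typical of simple and rare diseases.
Multigenic 
Influenced by variants in multiple genes; typical of complex and common diseases.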

References

  1. The Exome Project — from the Genome Sciences Dept. at the University of Washington.
  2. Bamshad, MJ; Ng SB, Bigham AW, Tabor HK, Emond MJ, Nickerson DA, Shendure J (27 September 2011). "Exome sequencing as a tool for Mendelian disease gene discovery". Nature Reviews Genetics 12(11): 745-755. PMID 21946919. DOI:10.1038/nrg3031.
  3. Ng, SB; Turner EH, Robertson PD, Flygare SD, Bigham AW, Lee C, Shaffer T, Wong M, Bhattacharjee A, Eichler EE, Bamshad M, Nickerson DA, Shendure J (10 September 2009). "Targeted capture and massively parallel sequencing of 12 human exomes". Nature 461(7261): 272-276. DOI:10.1038/nature08250.
  4. Choi, M; Scholl UI, Ji W, Liu T, Tikhonova IR, Zumbo P, Nayir A, Bakkaloğlu A, Ozen S, Sanjad S, Nelson-Williams C, Farhi A, Mane S, Lifton RP (10 November 2009). "Genetic diagnosis by whole exome capture and massively parallel DNA sequencing". PNAS 106(45): 19096-19101. DOI:10.1073/pnas.0910672106.

Further reading

  • Chakravarti A (2011). "Genomic contributions to Mendelian disease". Genome Res. 21: 643-644. DOI:10.1101/gr.123554.111 .
