Guide to Analysis of DNA Microarray Data
From Christoph's Personal Wiki
Latest revision as of 12:34, 5 January 2006
Guide to Analysis of DNA Microarray Data (ISBN 0471656046) by Steen Knudsen.
Bibliography
- Paperback: 184 pages
- Publisher: Wiley-Liss; 2nd Edition; 2 March 2004
- Language: English
- ISBN: 0471656046
Table of Contents
- Preface
- Acknowledgments
- 1. Introduction to DNA Microarray Technology
- 1.1 Hybridization
- 1.2 Gold Rush?
- 1.3 The Technology behind DNA Microarrays
- 1.3.1 Affymetrix GeneChip Technology
- 1.3.2 Spotted Arrays
- 1.3.3 Digital Micromirror Arrays
- 1.3.4 Inkjet Arrays
- 1.3.5 Bead Arrays
- 1.3.6 Serial Analysis of Gene Expression (SAGE)
- 1.4 Parallel Sequencing on Microbead Arrays
- 1.4.1 Emerging Technologies
- 1.5 Example: Affymetrix vs. Spotted Arrays
- 1.6 Summary
- 1.7 Further Reading
- 2. Overview of Data Analysis
- 3. Image Analysis
- 3.1 Gridding
- 3.2 Segmentation
- 3.3 Intensity Extraction
- 3.4 Background Correction
- 3.5 Software
- 3.5.1 Free Software for Array Image Analysis
- 3.5.2 Commercial Software for Array Image Analysis
- 3.6 Summary
- 3.7 Further Reading
- 4. Basic Data Analysis
- 4.1 Normalization
- 4.1.1 One or More Genes are Assumed Expressed at Constant Rate
- 4.1.2 Sum of Genes is Assumed Constant
- 4.1.3 Subset of Genes is Assumed Constant
- 4.1.4 Majority of Genes Assumed Constant
- 4.1.5 Spike Controls
- 4.2 Dye Bias, Spatial Bias, Print Tip Bias
- 4.3 Expression Indices
- 4.3.1 Average Difference
- 4.3.2 Signal
- 4.3.3 Model-Based Expression Index
- 4.3.4 Robust Multiarray Average
- 4.3.5 Position Dependent Nearest Neighbor Model
- 4.4 Detection of Outliers
- 4.5 Fold Change
- 4.6 Significance
- 4.6.1 Multiple Conditions
- 4.6.2 Nonparametric Tests
- 4.6.3 Correction for Multiple Testing
- 4.6.4 Example I: t-Test and ANOVA
- 4.6.5 Example II: Number of Replicates
- 4.7 Mixed Cell Populations
- 4.8 Summary
- 4.9 Further Reading
- 5. Visualization by Reduction of Dimensionality
- 5.1 Principal Component Analysis
- 5.2 Example 1: PCA on Small Data Matrix
- 5.3 Example 2: PCA on Real Data
- 5.4 Summary
- 5.5 Further Reading
- 6. Cluster Analysis
- 6.1 Hierarchical Clustering
- 6.2 K-means Clustering
- 6.3 Self-Organizing Maps
- 6.4 Distance Measures
- 6.4.1 Example: Comparison of Distance Measures
- 6.5 Gene Normalization
- 6.6 Visualization of Clusters
- 6.6.1 Example: Visualization of Gene Clusters in Bladder Cancer
- 6.7 Summary
- 6.8 Further Reading
- 7. Beyond Cluster Analysis
- 7.1 Function Prediction
- 7.2 Discovery of Regulatory Elements in Promoter Regions
- 7.2.1 Example 1: Discovery of Proteasomal Element
- 7.2.2 Example 2: Rediscovery of Mlu Cell Cycle Box (MCB)
- 7.3 Summary
- 7.4 Further Reading
- 8. Automated Analysis, Integrated Analysis and Systems Biology
- 8.1 Integrated Analysis
- 8.2 Systems Biology
- 8.3 Further Reading
- 9. Reverse Engineering of Regulatory Networks
- 9.1 The Time-Series Approach
- 9.2 The Steady-State Approach
- 9.3 Limitations of Network Modeling
- 9.4 Example 1: Steady-State Model
- 9.5 Example 2: Steady-State Model on Bacillus Data
- 9.6 Example 3: Linear Time-Series Model
- 9.7 Further Reading
- 10. Molecular Classifiers
- 10.1 Feature Selection
- 10.2 Validation
- 10.3.1 Nearest Neighbor
- 10.3.2 Nearest Centroid
- 10.3.3 Neural Networks
- 10.3.4 Support Vector Machine
- 10.4 Performance Evaluation
- 10.5 Example I: Classification of Bladder Cancer Subtypes
- 10.6 Example II: Classification of SRBCT Cancer Subtypes
- 10.7 Summary
- 10.8 Further Reading
- 11. The Design of Probes for Arrays
- 11.1 Selection of Genes for an Array
- 11.2 Gene Finding
- 11.3 Selection of Regions Within Genes
- 11.4 Selection of Primers for PCR
- 11.4.1 Example: Finding PCR Primers for Gene AF105374
- 11.5 Selection of Unique Oligomer Probes
- 11.6 Remapping of Probes
- 11.7 Further Reading
- 12. Genotyping and Resequencing Chips
- 12.1 Example: Neural Networks for GeneChip Prediction
- 12.2 Further Reading
- 13. Experiment Design and Interpretation of Results
- 13.1 Factorial Designs
- 13.2 Designs for Two-Channel Arrays
- 13.3 Hypothesis Driven Experiments
- 13.4 Independent Verification
- 13.5 Interpretation of Results
- 13.6 Limitations of Expression Analysis
- 13.6.1 Relative Versus Absolute RNA Quantification
- 13.7 Further Reading
- 14. Software Issues and Data Formats
- 14.1 Standardization Efforts
- 14.2 Databases
- 14.3 Standard File Format
- 14.4 Software for Clustering
- 14.4.1 Example: Clustering with ClustArray
- 14.5 Software for Statistical Analysis
- 14.5.1 Example: Statistical Analysis with R
- 14.5.2 The Affy Package of Bioconductor
- 14.5.3 Commercial Statistics Packages
- 14.6 Summary
- 14.7 Further Reading
- Appendix A: Web Resources: Commercial Software Packages
- References
- Index
Keywords
vector angle distance, patient axes, variance between replicates, positive regulatory effect, gene expression data, negative regulatory effect, microarray data, spotted array, probe pairs, gene finding, fold change, reference pool, functional annotation, microarray experiments, nonlinear data, four genes, genetic network, patient category, expression array, six replicates, interaction matrix, first principal component, regulatory networks, function prediction, distance matrix