BTEP: Microarray Workshop (2 day)

When: Sep. 22nd, 2015 - Sep. 23rd, 2015 9:30 am - 4:30 pm

To Know

Where:

Bldg 10: FAES Classroom 7 (B1C206)

Organizer:

Bioinformatics Training and Education Program (BTEP)

Presented By:

Maggie Cam (NCI CCBR), Parthav Jailwala (CCBR), Xiaowen Wang (Partek)

Class Files:

Files

This class has ended.

About this Class

Learn the basics of microarray gene expression analysis using Partek Genomics Suite and open source tools. As we walk through a hands-on analysis of a cancer dataset, you will learn the principles of experimental design, batch correction, and statistics, and how to extract biological meaning from the results using tools such as gene set and pathway analysis.

PLEASE NOTE: This 2 day workshop is a BYOC (Bring Your Own Computer) class, so please bring a laptop. Government-issued or personal computers are permitted. We can supply only a very limited number of computers, so if you want to take the class but cannot bring your own computer, please say so in the Comment section of the registration form.

Directions to FAES Classroom 7 (B1C206) can be found here: http://www.faes.org/announcements/directions_faes_classrooms_nih_campus

Day 1 - AM (9:30-11:30): Introductory Lecture
(Maggie Cam, PhD - CCR, NCI)

Introduction

Historical Perspective
Microarray Technologies, Sample Processing Methods
Microarray comparisons to RNA-Seq

Data Analysis

Experimental Design
QC methods
Preprocessing: Normalization and low level analysis algorithms

Statistical Analysis

Common statistical models used for analysis of microarray data
Examples of blocking
Batch effects and removal methods

Visualization and Clustering

Volcano Plot
Principal Components Analysis
Hierarchical Clustering
K-means Clustering

Validation and Downstream Analysis

Validation methods
Gene Ontology Enrichment and Pathway analysis tools
Major Software applications
Public Repositories of Microarray Data

Day 1 - PM (2:00-4:30 pm): Hands-on Gene Expression Data Analysis in Partek Genomics Suite
(Xiaowen Wang, PhD - Partek)

Attendees will learn how to use the basic features of Partek Genomics Suite for the analysis of gene expression data. An Affymetrix gene expression dataset will be used to work through the Gene Expression workflow:

Import data
Perform QA/QC of imported data
Exploratory data analysis
Detect differential expression (ANOVA)
Gene list creation

Day 2 - AM (9:30-11:30): Hands-on Gene Expression Data Analysis in Partek Genomics Suite - Continued
(Xiaowen Wang, PhD - Partek)

Biological interpretation
Visualization (PCA, histogram, box plot, dot plot, volcano plot, interaction plot, heatmap, etc.)

Day 2 - PM (1:30-2:30): GEO2R
(Parthav Jailwala, MSc - CCBR, NCI)

GEO2R is an interactive web tool that allows users to compare two or more groups of samples in a GEO Series in order to identify genes that are differentially expressed across experimental conditions. GEO2R performs comparisons on the original submitter-supplied processed data tables using the GEOquery and limma R packages from the Bioconductor project. Bioconductor is an open source software project based on the R programming language that provides tools for the analysis of high-throughput genomic data. The GEOquery R package parses GEO data into R data structures that can be used by other R packages. The limma (Linear Models for Microarray Analysis) R package has emerged as one of the most widely used statistical tools for identifying differentially expressed genes. It handles a wide range of experimental designs and data types and applies multiple-testing corrections to P-values to help control the number of false positives. Thus, GEO2R provides a simple interface that allows users to perform R statistical analysis without command line expertise.
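To make the idea of a multiple-testing correction concrete, here is a minimal sketch of the Benjamini-Hochberg false discovery rate adjustment, written in plain Python for illustration. It is not GEO2R or limma code; the function name `benjamini_hochberg` and the example P-values are assumptions made up for this sketch, and the adjustment a given GEO2R run applies depends on the options selected.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Illustrative helper only; not part of GEO2R, GEOquery, or limma.
    """
    n = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for five genes.
raw = [0.001, 0.008, 0.039, 0.041, 0.60]
for p, q in zip(raw, benjamini_hochberg(raw)):
    print(f"raw p = {p:.3f}  adjusted p = {q:.5f}")
```

Genes whose adjusted value falls below a chosen threshold (0.05 is a common convention) would be reported as differentially expressed. Note that the third gene has a raw P-value under 0.05 but an adjusted value just above it.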

Lecture

Background on GEO datasets
What is GEO2R and how can it help you
How to use GEO2R
Options and features
Limitations and caveats
Hands-on exercise

Day 2 - PM (2:30-3:30): DAVID
(David/Dawei Huang, M.D. - LMB, CCR, NCI)

The Database for Annotation, Visualization and Integrated Discovery (DAVID) provides a comprehensive set of functional annotation tools that help investigators understand the biological meaning behind large lists of genes.

Lecture

Brief principle of DAVID gene enrichment analysis
Term-centric analysis of a large gene list
Gene-centric analysis of a large gene list
Pathway map view of a large gene list
Nature Protocols 4:44 (http://www.nature.com/nprot/journal/v4/n1/abs/nprot.2008.211.html)
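The principle behind gene enrichment analysis can be sketched with a one-sided hypergeometric test: given how many genes in the background belong to an annotation term, how surprising is the number of term members found in the submitted gene list? The Python sketch below is illustrative only. The function name `enrichment_p_value` and the counts are assumptions, and DAVID itself reports a modified, more conservative variant of this test (the EASE score) that is not reproduced here.

```python
from math import comb


def enrichment_p_value(list_hits, list_size, background_hits, background_size):
    """One-sided hypergeometric P(X >= list_hits).

    list_hits:        genes in the submitted list annotated with the term
    list_size:        total genes in the submitted list
    background_hits:  genes in the background annotated with the term
    background_size:  total genes in the background
    """
    max_hits = min(list_size, background_hits)
    total = comb(background_size, list_size)
    # Sum the probabilities of seeing list_hits or more term members.
    tail = sum(
        comb(background_hits, k)
        * comb(background_size - background_hits, list_size - k)
        for k in range(list_hits, max_hits + 1)
    )
    return tail / total


# Hypothetical counts: 2 of 4 list genes hit a term that covers
# 5 of 20 background genes.
p = enrichment_p_value(list_hits=2, list_size=4,
                       background_hits=5, background_size=20)
print(f"enrichment p-value = {p:.4f}")
```

In practice this test is repeated for every annotation term, so the resulting P-values also need a multiple-testing correction.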

Day 2 - PM (3:30-4:30): Gene Set Enrichment Analysis (GSEA)
(Maggie Cam, PhD - CCR, NCI)

GSEA is a computational method that determines which (if any) a priori defined sets of genes are significantly differentially expressed, as an ensemble, between two biological states. It is an open-source program developed by the Broad Institute: http://www.broadinstitute.org/gsea/index.jsp

Lecture

The general approach of gene set enrichment methods and comparison with DAVID
How GSEA measures differential expression for each set of genes
Controlling effects of multiple comparisons in GSEA (false discovery rate)
The Broad Institute library of groups of gene sets (MSigDB)
What files and formats are needed for GSEA
User options and running GSEA
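As a rough illustration of how a gene set can be scored as an ensemble, the sketch below computes an unweighted running-sum enrichment score over a ranked gene list: the sum steps up at each gene in the set and down at each gene outside it, and the score is the largest deviation from zero. This is a simplified Python sketch, not the Broad Institute implementation. The GSEA software can additionally weight each step by the gene's ranking statistic and assesses significance by permutation, neither of which is shown. The function name `enrichment_score` and the gene labels are assumptions.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (largest deviation from zero)."""
    members = set(gene_set)
    n_hits = sum(1 for gene in ranked_genes if gene in members)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    hit_step = 1.0 / n_hits    # step up when a gene is in the set
    miss_step = 1.0 / n_miss   # step down when it is not
    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        running += hit_step if gene in members else -miss_step
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranked list, most up-regulated gene first.
ranked = ["A", "B", "C", "D", "E", "F"]
print(enrichment_score(ranked, {"A", "B"}))  # set concentrated at the top
print(enrichment_score(ranked, {"E", "F"}))  # set concentrated at the bottom
```

A set concentrated at the top of the ranking scores near +1 and a set concentrated at the bottom scores near -1, while a set scattered evenly through the list stays close to zero.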

Hands-on

Loading the GSEA required input files for an example dataset
Choosing and setting values in the GSEA GUI
Rank-based analysis
Full dataset analysis
Understanding the GSEA outputs and judging significance in the results

Files

david.pptx
demo_list.txt
GEO2R.pptx
GSEA.zip
GSEA_Lecture_9_2015.pptx
Gene20expression20training20handout-NCI_0.pdf
Microarray_Lecture_9_2015_0.pptx
license_0.zip
GX_training_data.zip
