Bioinformatics Training and Education Program

Exome-Seq Data Analysis Workshop (2-day)

 When: Mar. 18th, 2015 - Mar. 19th, 2015 9:30 am - 4:30 pm

To Know

Presented By:
Sohela Shah (Ingenuity)
Class Files:
This class has ended.

About this Class

This workshop will cover the basics of exome-seq analysis, including downstream interpretation of variants using a variety of open-source and commercial web tools (Golden Helix, IGV, Ingenuity Variant Analysis, GeneGrid (Genomatix), MuPIT/CRAVAT).

Day 1 - AM (9:30-12:30) Introductory Lectures
(Chunhua Yan, PhD - CBIIT)

  • Next generation sequencing technology
  • Exome sequencing (Cost, Speed, Gene coverage, Biological implication)
  • Experimental design (Sample size, Coverage, Sample submission)
  • Mutation calling (DREAM Challenge, Genome in a Bottle)

(Chih-Hao Hsu, PhD - CBIIT)

  • VCF
  • Visualization
  • Mutation call software overview and algorithms
  • Databases (1000 genomes, ClinVar, cBio, …)

(Li Jia, MSc - CCBR)

  • Lessons learned from experimental design
  • Best practices in the CCBR workflow (including discussion of the benchmark, GATK, and other tools used in tech dev)
  • ANNOVAR annotation and filtering
  • How to collaborate with CCBR – guide to success

Day 1 - PM  (1:30-4:30) 

Golden Helix
(Bryce Christensen, PhD - Golden Helix)

  • Cancer gene panel analysis
  • Whole exome tumor/normal analysis
  • Whole exome trio analysis
  • Whole exome extended family analysis with PhoRank
  • Population-based NGS workflows including collapsing methods

Integrative Genomics Viewer (IGV)
(Online Tutorial: self-guided)
Click here to view the Tutorial (needs VPN, for NIH only)

The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations.

  • Visualizing variant (VCF) and alignment (BAM) files using IGV
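
For attendees who have not opened a VCF file before, the following is a minimal sketch in Python of how a single VCF data line splits into its eight fixed, tab-separated columns. The record shown is an invented example (not workshop data), and `parse_vcf_line` is a hypothetical helper written for illustration; it is not part of IGV or any tool covered in the sessions.

```python
# Minimal sketch: split one VCF data line into its eight fixed columns.
# The example record is invented for illustration only.

def parse_vcf_line(line):
    """Return a dict of the fixed VCF columns for one tab-separated data line."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, vid, ref, alt, qual, flt, info = fields[:8]

    # INFO is a semicolon-separated list of KEY=VALUE pairs or bare flags.
    info_dict = {}
    if info != ".":
        for item in info.split(";"):
            key, _, value = item.partition("=")
            info_dict[key] = value if value else True

    return {
        "chrom": chrom,
        "pos": int(pos),                  # 1-based position
        "id": vid,
        "ref": ref,
        "alt": alt.split(","),            # multiple ALT alleles are comma-separated
        "qual": None if qual == "." else float(qual),
        "filter": flt,
        "info": info_dict,
    }


example = "chr1\t12345\t.\tA\tG,T\t50.0\tPASS\tDP=100;SOMATIC"
record = parse_vcf_line(example)
print(record)
```

Real VCF files also carry header lines starting with `#` and optional per-sample genotype columns after the eighth field, which this sketch ignores.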

Day 2 - AM  (9:30-12:30)
(Susan Dombrowski, PhD - Genomatix)

Genomic variants like SNPs or small InDels are of major interest to biologists and clinicians alike. Identification of the relevant variants within a genome is crucial for the understanding of molecular mechanisms and diagnostics of rare or common diseases.
GeneGrid enables you to reduce the millions of variants generated by today's NGS experiments to the few or even the single relevant one(s) with a few clicks and generate a detailed report of the findings. Variants of interest and their associated alignment files can be visualized in the context of Genomatix' curated genomic data content, and literature and pathway analysis of variants of interest can also be performed within the same application. In this session, a publicly available cancer exome-seq dataset (normal/tumor) will be used as a case study to showcase the features and functionality of GeneGrid for use in clinical studies.

Ingenuity Variant Analysis
(Sohela Shah, PhD - Ingenuity)

Ingenuity Variant Analysis combines analytical tools and integrated content to help you rapidly identify and prioritize variants by drilling down to a small, targeted subset of compelling variants based both upon published biological evidence and your own knowledge of disease biology. With Variant Analysis, you can interrogate your variants from multiple biological perspectives, explore different biological hypotheses, and identify the most promising variants for follow-up.
QIAGEN Ingenuity Variant Analysis training will include uploading, annotating, and searching samples, and setting up, reviewing, and exporting analyses. We will review the different filter settings, particularly focusing on the genetic analysis and statistical association.

Day 2 - PM  (1:30-4:30) CRAVAT/MuPIT - Analysis of Genomic Variants
(Michael Ryan - Johns Hopkins University)

CRAVAT is a free tool for high-throughput analysis of sequencing variants.  CRAVAT is funded by NCI’s Informatics Technology for Cancer Research program.  CRAVAT accepts very large variant data files and returns a wide variety of annotations and scores that help with identification of important variants.  CRAVAT is a cancer-focused analysis package tailored to the needs of cancer studies.  The workshop will provide some background on CRAVAT and lots of hands-on exercises to learn how to use the tool and interpret the results.

MuPIT (mupit.icm.jhu) is a sister tool to CRAVAT that shows mutations on 3D protein structures.  Clusters of mutations in 3D space are not always apparent from the position of mutations on a protein sequence.  For proteins with solved structures, MuPIT can show the position of mutations from your study along with a variety of structural annotations (e.g. the position of a DNA binding site).  MuPIT also includes a pre-built database of TCGA mutations so an investigator’s mutations can be viewed in the context of mutations and mutation clusters from other cancer studies.  The focus of the workshop will be a series of exercises to learn how to visualize your mutations in MuPIT, how CRAVAT and MuPIT are integrated, and how to manipulate, investigate, and understand the results.
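
To illustrate why clusters in 3D space can be invisible on the linear sequence, here is a small Python sketch that flags residue pairs that are far apart in sequence but close together in space. The coordinates, the 8 Å distance cutoff, the minimum sequence separation of 20 residues, and the `close_pairs` helper are all assumptions made up for this example; this is not MuPIT's method or data.

```python
import math

# Hypothetical mutated residue positions -> (x, y, z) coordinates in angstroms.
# These values are invented for illustration.
coords = {
    12: (0.0, 0.0, 0.0),
    13: (3.8, 0.0, 0.0),
    150: (2.0, 3.0, 1.0),
    300: (40.0, 35.0, 20.0),
}


def close_pairs(coords, max_dist=8.0, min_seq_sep=20):
    """Return (res_a, res_b, distance) for pairs near in 3D but far in sequence."""
    pairs = []
    residues = sorted(coords)
    for i, a in enumerate(residues):
        for b in residues[i + 1:]:
            d = math.dist(coords[a], coords[b])  # Euclidean distance (Python 3.8+)
            if d <= max_dist and b - a >= min_seq_sep:
                pairs.append((a, b, round(d, 2)))
    return pairs


pairs = close_pairs(coords)
for a, b, d in pairs:
    print(f"residues {a} and {b}: {b - a} positions apart in sequence, {d} A apart in 3D")
```

In this toy data, residue 150 sits next to residues 12 and 13 in space despite being more than 130 positions away in sequence, while residue 300 is isolated; a sequence-only view would not group 150 with the others.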


  • Advanced Control of CRAVAT and MuPIT.pptx
  • CRAVAT_Workshop.pptx
  • MutationCalling.pptx
  • BTEP_exome-seq_03182015.pdf
  • Exome_BTEP_Mar18_2015.pptx
  • Data Analysis for Exome Sequencing Data_0318.ppt