The Art and Science of Data Mining (1-day)
When: Jun. 19th, 2017 9:30 am - 4:00 pm
About this Class
Harvesting the Wealth of TCGA Data
The Cancer Genome Atlas (TCGA) is a large-scale study that has catalogued genomic data from more than 20 types of cancer, including mutations, copy number variation, mRNA and miRNA gene expression, and DNA methylation. Because it is publicly distributed, it has become a major resource for cancer researchers for target discovery, biological interpretation, and assessment of the clinical impact of genes of interest. This workshop will familiarize the audience with the types of data available and with analytical tools that enable end users to mine TCGA data easily and effectively. It will provide training on two applications: (a) cBioPortal for Cancer Genomics, an open-source tool, and (b) TCGA Premier through BioDiscovery Nexus, an NCI-licensed commercial tool.
NOTE: This is a BYOC (Bring Your Own laptop Computer) class. Government-issued or personal computers are permitted. We can supply only a very limited number of computers, so if you want to take the class but cannot bring your own computer, please say so in the Comment section of the registration form.
WORKSHOP AGENDA
- A personal computer or computing device with an Internet browser (tested browsers: Internet Explorer 11.0 and above, Firefox 3.0 and above, Safari, and Google Chrome) with JavaScript enabled. A Java Runtime Environment is needed to launch the Integrative Genomics Viewer (IGV). A vector graphics editor is needed to view and edit the SVG files of OncoPrints downloaded from cBioPortal. Examples of software supporting SVG are Adobe Illustrator (http://www.adobe.com/products/illustrator.html) and Inkscape (http://inkscape.org/).
12:30 - 1:00 pm LUNCH BREAK
Presenter: Andrea O Hara, PhD
NCI’s site license includes unlimited access to TCGA Premier, a database of re-processed, curated, and reviewed TCGA samples. Nexus Copy Number is platform-independent copy number analysis software that includes co-visualization of sequence variants and gene expression data at both the individual and population-wide levels. With an easy-to-use visual interface, Nexus Copy Number allows for quick review and detailed analysis of population-wide copy number alterations across the entire genome. In this workshop, you will learn how to use Nexus Copy Number to mine TCGA copy number data. TCGA contains various types of genomic data from a wide variety of cancers, including several rare tumor types. The training session will focus on accessing TCGA data within the software and on a detailed evaluation of one TCGA data set to identify statistically significant changes within the sample population.
Learning Objectives:
* Access and integration of CNV, sequence variant and RNA-Seq expression TCGA data directly from Nexus.
* Visualization and statistical approaches for discovery.
* Sample stratification by clinical annotation factors or biomarkers.
* Finding CNVs predictive of survival or other outcome data.
* Generation of publication-ready figures and charts during analysis.