ncibtep@nih.gov

Bioinformatics Training and Education Program

Data Visualization with R

 When: Apr. 5th, 2022 1:00 pm - 2:00 pm

This class has ended.
To Know
  • Where: Online Webinar
  • Organized By: BTEP
  • Presented By: Joe Wu (BTEP), Alex Emmons (BTEP)

About this Class

Welcome to the Data Visualization with R course series! Here, we hope to help you establish the foundations for generating publication-quality plots in R. We will mostly be using ggplot2 (https://ggplot2.tidyverse.org/), a powerful yet easy-to-learn R package for exploring data visually and producing publication-quality figures.

This series will include 6 lessons over 6 weeks. Lessons will be held online on Tuesdays at 1 pm. Each lesson lasts 1 hour and is followed immediately by a 1-hour help session. Registering here will register you for all 6 lessons; you do not need to register for each lesson individually.

This course series is geared toward those with little to no experience with R. You will not need to install R on your computer for this class. Instead, we will be using R through DNAnexus, a cloud platform for bioinformatics analysis. After registering for the class, register for a free DNAnexus account at https://www.dnanexus.com. You will need to send your username to ncibtep@nih.gov to finish setting up your DNAnexus account for course access.

In this series, we will show you how to import data into R and then generate some common plots, such as scatter plots, histograms, bar plots, box-and-whisker plots, and heatmaps. We will also learn how to customize these plots using the grammar of graphics philosophy on which ggplot2 is based, and we will learn how to generate multi-panel figures (i.e., subplots).

The same meeting link can be used for all 6 lessons.

Meeting link: https://cbiit.webex.com/cbiit/j.php?MTID=me81c20b6f217db351033f8ecd4694550

Lesson 1, April 5, 2022: Introduction to plot types
In lesson 1, we will answer the question: Why R for data visualization? In addition, we will introduce the various plot types that will be generated throughout the course and will showcase related plots that you will be able to create in the future using the foundational skills gained over the next 6 weeks. Lesson 1 will not be hands-on, so there will be no coding yet.

Lesson 2, April 12, 2022: Basics of ggplot2
In lesson 2, we will focus on the basics of ggplot2, including the grammar of graphics philosophy and its application. This lesson will provide a hands-on introduction to the ggplot2 syntax, geom functions, mapping and aesthetics, and plot layering.

Lesson 3, April 19, 2022: Scatter plots and ggplot2 customization
In lesson 3, we will continue the discussion on the grammar of graphics, with a focus on ggplot2 plot customization, including axis labels, coordinate systems, axis scales, and themes. This hands-on lesson will showcase these features of plot building through the generation of increasingly complex scatter plots, using data included with a base R installation as well as RNA-Seq data.

Lesson 4, April 26, 2022: Visualizing summary statistics with histograms, bar plots, and box plots
In lesson 4, we will learn to generate plots that help visualize summary statistics, including bar plots with error bars, histograms, and box-and-whisker plots.

Lesson 5, May 3, 2022: Visualizing clusters with heatmaps
In lesson 5, we will introduce the heatmap and dendrogram as tools for visualizing clusters in data.

Lesson 6, May 10, 2022: Combining multiple plots to create a figure panel
In lesson 6, we will focus on generating subplots and multi-plot figure panels using ggplot2-associated packages. This will allow us to meet any figure limitations that scientific journals may have.

Course Materials: https://btep.ccr.cancer.gov/docs/data-visualization-with-r/