Course Overview

Welcome to the Data Visualization with R Series

A series of lessons designed to introduce learners to the R package ggplot2

This course comprises a series of lessons for scientists with beginner-level experience in R. Its primary purpose is to introduce learners to data visualization in R, with a focus on ggplot2 and related packages.

Course objectives

  1. Learn how to generate basic plot types in ggplot2
  2. Understand how basic plot types can be customized to generate more complex plots

What this course is not!

This course will not:

  1. Make anyone an R expert
  2. Make anyone a ggplot2 expert

Note

While this course may be useful to learners with intermediate R experience who would like to learn more about ggplot2, the pace of the course will be set assuming a beginner level of understanding.

Course Expectations

This course consists of six lessons, each lasting 1 to 1.25 hours, held over a period of three weeks. Each lesson will be followed by a 45-minute help session in which students can ask questions and/or get individual help with their data.

Lesson 1: Introduction to plot types

This lesson will answer the burning question: why use R for data visualization? We will also introduce the plot types that will be generated throughout the course and showcase related plots that you will be able to create in the future using the foundational skills gained over the next three weeks. This lesson is not hands-on, so there is no coding yet; the hands-on portion of the series starts with Lesson 2. In the help session afterwards, we will assist anyone having trouble with their DNAnexus accounts.

Lesson 2: Getting Started with ggplot2

Lesson 2 will focus on the basics of ggplot2, including the grammar of graphics philosophy and its application. This lesson will provide a hands-on introduction to ggplot2 syntax, geom functions, aesthetic mappings, and plot layering.
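
As a preview of the syntax covered in this lesson, here is a minimal sketch of a layered plot. It assumes ggplot2 is installed and uses the mtcars data set that ships with R; the choice of data set and variables is ours for illustration and is not necessarily what the lesson uses.

```r
library(ggplot2)

# Data and aesthetic mappings are declared once in ggplot()
# and inherited by every layer added with `+`.
p <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg)) +
  # Layer 1: points, with an extra mapping of color to cylinder count
  geom_point(aes(color = factor(cyl))) +
  # Layer 2: a single linear trend line drawn over the points
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Printing the object draws the plot
print(p)
```

Each `+` adds a layer on top of the previous one, which is the core idea behind plot layering in the grammar of graphics.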

Lesson 3: Scatter plots and Non-data elements of ggplot2 customization

Lesson 3 will continue the discussion of the grammar of graphics, with a focus on ggplot2 plot customization, including axis labels, coordinate systems, axis scales, and themes. This hands-on lesson will showcase these features of plot building through the generation of increasingly complex scatter plots, using data included with a base R installation as well as RNA-Seq data.
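
The sketch below illustrates the kinds of non-data customization named above on a simple scatter plot. It assumes ggplot2 is installed and uses the iris data set included with R; the specific labels, limits, and theme are illustrative choices, not the ones used in the lesson.

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Sepal.Length, y = Petal.Length, color = Species)) +
  geom_point() +
  # Axis labels and title
  labs(title = "Iris measurements",
       x = "Sepal length (cm)",
       y = "Petal length (cm)") +
  # Axis scale: show the y axis on a log10 scale
  scale_y_log10() +
  # Coordinate system: zoom the x axis without dropping any data
  coord_cartesian(xlim = c(4, 8)) +
  # Theme: change the overall non-data appearance
  theme_bw()

print(p)
```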

Lesson 4: Visualizing summary statistics with ggplot2

It is common to obtain summary statistics for a dataset to understand parameters like mean, standard deviation, and distribution. In this lesson, we will learn to generate plots that help visualize summary statistics, including bar plots with error bars, histograms, and box-and-whisker plots.
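
As a rough sketch of the three plot types mentioned above, the example below summarizes the mtcars data set that ships with R. It assumes ggplot2 is installed; the variable names summary_df, bar_plot, hist_plot, and box_plot are ours, and using the standard deviation for the error bars is one choice among several.

```r
library(ggplot2)

# Mean and standard deviation of mpg for each cylinder group
means <- tapply(mtcars$mpg, mtcars$cyl, mean)
sds   <- tapply(mtcars$mpg, mtcars$cyl, sd)
summary_df <- data.frame(
  cyl      = factor(names(means)),
  mean_mpg = as.numeric(means),
  sd_mpg   = as.numeric(sds)
)

# Bar plot of the means with error bars of +/- one standard deviation
bar_plot <- ggplot(summary_df, aes(x = cyl, y = mean_mpg)) +
  geom_col() +
  geom_errorbar(aes(ymin = mean_mpg - sd_mpg, ymax = mean_mpg + sd_mpg),
                width = 0.2)

# Histogram showing the distribution of mpg across all cars
hist_plot <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10)

# Box-and-whisker plot of mpg for each cylinder group
box_plot <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot()

print(bar_plot)
```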

Lesson 5: Visualizing clusters with heatmaps

Lesson 5 will introduce the heatmap and dendrogram as tools for visualizing clusters in data. This lesson will primarily use the R package pheatmap.
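
A minimal sketch of a clustered heatmap with pheatmap follows. It assumes the pheatmap package is installed and uses the mtcars data set for illustration; the lesson itself may use different data and options.

```r
library(pheatmap)

# pheatmap expects a numeric matrix; rows are cars, columns are variables
mat <- as.matrix(mtcars)

# scale = "column" converts each variable to z-scores so that variables
# with large values do not dominate the color range. Rows and columns are
# clustered hierarchically and drawn with dendrograms.
# silent = TRUE builds the heatmap without drawing it immediately.
hm <- pheatmap(mat,
               scale = "column",
               cluster_rows = TRUE,
               cluster_cols = TRUE,
               silent = TRUE)

# Draw the heatmap on the current graphics device
grid::grid.newpage()
grid::grid.draw(hm$gtable)
```

The returned object also holds the row and column clustering results (hm$tree_row and hm$tree_col), which can be inspected separately from the drawing.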

Lesson 6: Combining multiple plots to create a figure panel

Scientific journals almost always limit the number of figures that can be included in a publication. Don't fret: in Lesson 6 we will focus on generating subplots and multi-plot figure panels using ggplot2-associated packages.
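
The course description does not name the packages used for figure panels. As one possible approach, the sketch below assumes the patchwork package (along with ggplot2) is installed and combines three plots of the mtcars data set into a single tagged panel; the lesson may use a different package.

```r
library(ggplot2)
library(patchwork)

# Three individual plots to combine
p1 <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) + geom_boxplot()
p3 <- ggplot(mtcars, aes(x = mpg)) + geom_histogram(bins = 10)

# `|` places plots side by side and `/` stacks them vertically;
# plot_annotation() adds panel tags A, B, C.
panel <- (p1 | p2) / p3 +
  plot_annotation(tag_levels = "A")

print(panel)
```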

Required Course Materials

To participate in this class you will need your government-issued computer and a reliable internet connection. You do not need to download or install any software to participate in the class.