ncibtep@nih.gov

Bioinformatics Training and Education Program

Introductory R for Novices: Introduction to Data Wrangling

When: June 17, 2025 - July 8, 2025

About this Course

This course, designed for novices, will introduce the essential R packages and functions often used to explore, clean, transform, and summarize data. The content is similar to that of past introductory R courses, but the pace will be much slower to benefit novices.
 
Why learn R? R is a great resource for statistical analysis, data visualization, and report generation. R also provides packages and functions specific to the analysis of -omics data through efforts like Bioconductor.
 
This course is the second part of a larger 3-part course designed for novices:
 
Part 1: Getting Started with R 
Part 2: Introduction to Data Wrangling 
Part 3: Introduction to Data Visualization 
 
Topics covered in this course (Part 2) focus on wrangling data stored in data frames or tibbles and include concepts such as reshaping, subsetting, summarizing, mutating, and joining data. 
 
Prerequisites:
This course is recommended for attendees familiar with the skills learned in Part 1: Getting Started with R.
 
Course materials:  
We will use R on Biowulf for this course to avoid issues with R and package installations. To use R on Biowulf, you must have an NIH HPC account. If you do not have a Biowulf account, you can take this course using a local R installation; however, we will not be able to troubleshoot package installation issues during class. Because we will use packages from the tidyverse, anyone not using R on Biowulf will need to install them with install.packages("tidyverse") before the first lesson.
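
For attendees working from a local R installation, the setup step above might look like the following. This is only a minimal sketch, not an official course script: it assumes an internet connection and a configured CRAN mirror, and it installs the tidyverse only when it is not already present.

```r
# Install the tidyverse only if it is missing (assumes a CRAN mirror is configured)
if (!requireNamespace("tidyverse", quietly = TRUE)) {
  install.packages("tidyverse")
}

# Attach the core tidyverse packages, including tidyr and dplyr
library(tidyverse)
```

Running this before the first lesson should surface any installation problems early.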
Description

This lesson will introduce the philosophy of tidy data and key concepts and packages used for data wrangling with R. 

Details
When
Tue, Jun 17, 2025 - 2:00 pm - 3:00 pm
Where
Online
Description

In this lesson, we will learn how to tidy messy data using functions from the tidyverse package, tidyr.  The primary focus will be on reshaping data from wide to long format or vice versa.  
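
As a rough illustration of the kind of reshaping this lesson describes, the sketch below converts a small table from wide to long format and back using tidyr's pivot_longer() and pivot_wider(). The data frame and its column names (gene, sample1, sample2) are hypothetical and are not taken from the course materials.

```r
library(tidyr)

# Hypothetical wide table: one row per gene, one column per sample
wide <- data.frame(
  gene    = c("geneA", "geneB"),
  sample1 = c(10, 20),
  sample2 = c(30, 40)
)

# Wide to long: the sample columns collapse into a name/value pair
long <- pivot_longer(
  wide,
  cols      = c(sample1, sample2),
  names_to  = "sample",
  values_to = "count"
)

# Long to wide: spread the name/value pair back into one column per sample
wide_again <- pivot_wider(long, names_from = sample, values_from = count)
```

Here long holds one row per gene and sample combination, while wide_again has the same layout as the starting table.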

Details
When
Tue, Jun 24, 2025 - 2:00 pm - 3:00 pm
Where
Online
Description

This lesson will introduce the tidyverse package, dplyr. Attendees will primarily learn how to filter rows and select columns from data frames.
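
To give a sense of what filtering rows and selecting columns looks like in practice, here is a minimal sketch using dplyr's filter() and select(). The samples data frame and its columns are made up for illustration and are not part of the course data.

```r
library(dplyr)

# Hypothetical sample sheet
samples <- data.frame(
  sample_id  = c("s1", "s2", "s3", "s4"),
  tissue     = c("liver", "brain", "liver", "kidney"),
  read_count = c(1200, 850, 430, 990),
  batch      = c(1, 1, 2, 2)
)

# Keep liver samples with more than 500 reads, then keep two columns
liver_high <- samples %>%
  filter(tissue == "liver", read_count > 500) %>%
  select(sample_id, read_count)
```

filter() works on rows and select() works on columns, so the two are often chained as shown.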

 

Details
When
Tue, Jul 01, 2025 - 2:00 pm - 3:00 pm
Where
Online
Description

This lesson will introduce the "split-apply-combine" approach to data analysis and the key players in the dplyr package used to implement this type of workflow.  
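
One common way to express split-apply-combine with dplyr is to pair group_by() with summarize(). The sketch below assumes that pairing is representative of the workflow; the lesson itself may cover other functions. The samples data frame is hypothetical.

```r
library(dplyr)

# Hypothetical sample sheet
samples <- data.frame(
  sample_id  = c("s1", "s2", "s3", "s4"),
  tissue     = c("liver", "brain", "liver", "kidney"),
  read_count = c(1200, 850, 430, 990)
)

# Split by tissue, apply summary functions, combine into one row per tissue
by_tissue <- samples %>%
  group_by(tissue) %>%
  summarize(
    n_samples  = n(),
    mean_reads = mean(read_count)
  )
```

The result has one row for each tissue, with the number of samples and the mean read count in that group.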

Details
When
Tue, Jul 08, 2025 - 2:00 pm - 3:00 pm
Where
Online
Description

This is the final lesson in the course Introductory R for Novices: Introduction to Data Wrangling. This lesson will show attendees how to join multiple data frames and transform and create new variables using dplyr.
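
As an illustration of joining data frames and creating new variables, the sketch below uses dplyr's left_join() and mutate(). The two tables, their columns, and the choice of a left join are assumptions made for the example rather than details from the course.

```r
library(dplyr)

# Hypothetical tables that share a sample_id key
counts <- data.frame(
  sample_id  = c("s1", "s2", "s3"),
  read_count = c(1200, 850, 430)
)
metadata <- data.frame(
  sample_id = c("s1", "s2", "s4"),
  tissue    = c("liver", "brain", "kidney")
)

# Keep every row of counts and add matching metadata; unmatched rows get NA
joined <- left_join(counts, metadata, by = "sample_id")

# Create a new variable from an existing column
joined <- mutate(joined, reads_thousands = read_count / 1000)
```

Sample s3 has no entry in metadata, so its tissue value is NA after the join.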

Details
When
Tue, Jul 15, 2025 - 2:00 pm - 3:00 pm
Where
Online