ncibtep@nih.gov

Bioinformatics Training and Education Program

Taming Messy Data: Practical R Wrangling with the Tidyverse

When: Feb. 23, 2026, 1:00 pm - 2:30 pm

Learning Level: Intermediate

To Know

Where: Online
Organizer: NIH Library
Presented By: Doug Joubert (NIH Library)

About this Class

This 90-minute online training equips participants with powerful data wrangling techniques using R and the tidyverse, a cohesive collection of R packages designed to make data science workflows more intuitive and efficient through consistent syntax and design principles. Designed for both beginners and those looking to refine their skills, the training addresses the challenges posed by messy datasets.

By the end of this training, attendees will be able to:

  • Diagnose and address common data quality issues in clinical datasets.
  • Apply systematic approaches to clean and standardize text, dates, and numerical values.
  • Transform messy data and handle missing values using tidyverse functions, including appropriate imputation strategies.
  • Design reproducible, automated data-cleaning workflows with tidyverse tools for transformation and aggregation.

Requirements 

Attendees are expected to have a basic understanding of R and RStudio. Before the session, attendees should have done the following:

  • Installed R and RStudio.
  • Reviewed our R basics training on the NIH Data Services: On Demand Content YouTube Playlist, if new to R.