ncibtep@nih.gov

Bioinformatics Training and Education Program

Introduction to Data Wrangling Using Python: Part 2 of 2

When: Mar. 11th, 2026, 10:00 am - 11:00 am

Learning Level: Intermediate

To Know

Where:
Online
Organizer:
NIH Library
Presented By:
Cindy Sheffield (NIH Library)

About this Class

This one-hour online training, the second session of the two-part series, focuses on reshaping and enriching the cleaned patient dataset to prepare it for analysis and reporting. Attendees will practice splitting and recombining columns (for example, separating full names into first and last names), converting columns to appropriate data types, and engineering new fields such as outlier indicators and blood pressure status labels. The session also covers merging multiple tables (patient details, contact information, and subsets of records) and filtering or subsetting data to answer specific analytical questions.

By the end of this training, attendees will be able to:

  • Reshape and restructure data by splitting and combining columns, changing data types, and reordering or selecting relevant fields.
  • Engineer clinically useful features, including z-score–based outlier flags, hypertension indicators, and combined status columns for downstream models or dashboards.
  • Merge and join DataFrames using common keys (such as patient ID) to bring together core data with supplemental tables like contact information.
  • Filter and subset records based on multiple conditions (for example, patients with diabetes and abnormal blood pressure) to create analysis-ready datasets.
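
As a rough preview of these skills, the sketch below walks through each one with pandas on a tiny made-up table. It is not the class script: the column names, the sample values, the z-score threshold of 2, and the 130 mmHg systolic cutoff are all assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical patient table; every name and value here is invented.
patients = pd.DataFrame({
    "patient_id": [1, 2, 3, 4, 5, 6, 7, 8],
    "full_name": ["Ana Diaz", "Ben Cho", "Cy Lee", "Di Roy",
                  "Eli Fox", "Flo Kim", "Gus Ray", "Hal Orr"],
    "systolic_bp": ["118", "122", "125", "119", "121", "120", "124", "190"],
    "has_diabetes": [False, True, False, False, True, False, False, True],
})
# Hypothetical supplemental table sharing the patient_id key.
contacts = pd.DataFrame({
    "patient_id": [1, 2, 3, 8],
    "phone": ["555-0101", "555-0102", "555-0103", "555-0108"],
})

# Reshape: split full names into first and last names on the first space.
patients[["first_name", "last_name"]] = (
    patients["full_name"].str.split(" ", n=1, expand=True)
)

# Change data types: blood pressure arrived as text, so convert to integers.
patients["systolic_bp"] = patients["systolic_bp"].astype(int)

# Engineer features: a z-score outlier flag (assumed threshold |z| > 2)
# and a hypertension indicator (assumed cutoff of 130 mmHg systolic).
bp = patients["systolic_bp"]
patients["bp_zscore"] = (bp - bp.mean()) / bp.std()
patients["bp_outlier"] = patients["bp_zscore"].abs() > 2
patients["hypertensive"] = bp >= 130

# Merge: a left join keeps every patient, with or without contact details.
merged = patients.merge(contacts, on="patient_id", how="left")

# Filter: patients who are both diabetic and hypertensive.
subset = merged[merged["has_diabetes"] & merged["hypertensive"]]
print(subset[["patient_id", "first_name", "last_name", "systolic_bp", "phone"]])
```

A left join is used here so that patients without contact information stay in the table, with missing phone values. An inner join would silently drop them.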

Attendees are expected to have:

  • Attended Introduction to Data Wrangling Using Python: Part 1 of 2
  • Basic Python coding knowledge
  • Familiarity with an IDE (such as Colab or Jupyter Notebooks), including loading script and data files into it

Requirements: 

  • Participants will receive a script file and data files prior to the training. These should be loaded into the IDE and ready to use before the training session begins.

You can register for Part 1 in this series via the link below: 

https://www.nihlibrary.nih.gov/training/introduction-data-wrangling-using-python-part-1-2