ncibtep@nih.gov

Bioinformatics Training and Education Program

Classes & Events

class_id details description start_date Venues learning_levels Topic Tags delivery_method presenters Organizer seminar_series class_title
1836
Organized By:
CBIIT
Description

Learn how to visualize sequencing and analysis results effectively.

This session describes how to use the web-based, interactive R Shiny version of OmicCircos to construct circular plots with desired biological features. Example data from human and mouse genomes will be used to demonstrate over thirty plot functions along with color selection, annotation, labeling, and zoom capabilities. A user guide, a take-home video, and sample plots from publications will be provided.

start_date: 2025-07-01 13:00:00
Venues: Online Webinar
learning_levels: Beginner
Topic Tags: Software
delivery_method: Online
presenters: Chunhua Yan (CBIIT), Daoud Meerzaman (CBIIT)
Organizer: CBIIT
seminar_series: 0
class_title: Visualization Tools for Genomic Data
1838
Organized By:
NIH Library
Description

In this hour-and-a-half online training, attendees will learn how to call MATLAB from Python and how to call Python libraries from MATLAB. Attendees will use MATLAB’s Python integration to improve the compatibility and usability of their code.

By the end of this training, attendees will be able to:  

  • Call Python libraries  
  • Call user-defined Python commands, scripts, and modules  
  • Package MATLAB algorithms to be called from Python  

Attendees are expected to have some prior knowledge of Python libraries and/or MATLAB. This introductory training is taught by MathWorks. Installing MATLAB is not required.

start_date: 2025-07-01 13:00:00
Venues: Online Webinar
learning_levels: Beginner
Topic Tags: Programming
delivery_method: Online
presenters: MathWorks
Organizer: NIH Library
seminar_series: 0
class_title: Work Together: MATLAB and Python Interoperability
1830
Organized By:
BTEP
Description

This lesson will introduce the tidyverse package, dplyr. Attendees will primarily learn how to filter rows and select columns from data frames.

 

start_date: 2025-07-01 14:00:00
Venues: Online
learning_levels: Beginner
Topic Tags: Programming
delivery_method: Online
presenters: Alex Emmons (BTEP)
Organizer: BTEP
seminar_series: 0
class_title: Subsetting Data with dplyr
1839
Organized By:
NIH Library
Description

This 45-minute online training provides a high-level overview of recent developments in artificial intelligence (AI). Each session highlights emerging trends, tools, and use cases in the evolving AI landscape, with an emphasis on practical relevance and responsible use. Whether you're just getting started or looking to stay current, this training offers timely insights in a concise format.  

By the end of this training, attendees will be able to:   

  • Summarize key trends and developments in AI 
  • Identify new tools, capabilities, or applications relevant to their work 
  • Describe considerations for ethical and responsible use of AI technologies 

Attendees are not expected to have any prior knowledge to be successful in this training.

start_date: 2025-07-02 12:00:00
Venues: Online Webinar
learning_levels: Beginner
Topic Tags: Artificial Intelligence (AI)
delivery_method: Online
presenters: Alicia Lillich (NIH Library)
Organizer: NIH Library
seminar_series: 0
class_title: AI Update: What's New in Artificial Intelligence?
1840
Organized By:
NIH Library
Description

This hour-and-a-half online training will examine how humans process and encode visual information and how visual attributes can be utilized to create effective visualizations. This will focus on enhancing graphic literacy, exploring methods for making better visualizations, and using stakeholder needs to guide your design choices.

By the end of this training, attendees will be able to:

  • Discuss the value of data visualization and key visualization goals
  • Provide an introduction to human perception and its role in visualization
  • Describe the principles of visual encoding.
  • Provide an overview of core visualization techniques
  • Outline the steps for effectively presenting your visualizations to different audiences.

start_date: 2025-07-07 10:00:00
Venues: Online Webinar
learning_levels: Beginner
Topic Tags: Data
delivery_method: Online
presenters: Doug Joubert (NIH Library)
Organizer: NIH Library
seminar_series: 0
class_title: Principles of Effective Data Visualization
1860
Organized By:
CBIIT
Description

Join Drs. Eytan Ruppin (presenter) and Timothy Shaw (moderator) as they present on four approaches for predicting how patients respond to checkpoint immunotherapy.

  • Approach #1: Predicting patient response from the tumor bulk transcriptome
  • Approach #2: Predicting response directly from the blood via simple routine lab tests and the tumor mutational burden
  • Approach #3: Predicting patient immunotherapy response from the tumor histopathological images
  • Approach #4: Building predictors of the tumor microenvironment and developing spatially grounded biomarkers of treatment response

start_date: 2025-07-08 12:00:00
Venues: Online Webinar
learning_levels: Any
Topic Tags: Cancer
delivery_method: Online
presenters: Eytan Ruppin MD Ph.D (CCR Cancer Data Science Lab)
Organizer: CBIIT
seminar_series: 0
class_title: Predicting Patients' Response to Immunotherapy from Tumor Histopathology and Blood: Computational Science in Immuno-oncology
1831
Organized By:
BTEP
Description

This lesson will introduce the "split-apply-combine" approach to data analysis and the key players in the dplyr package used to implement this type of workflow.  

start_date: 2025-07-08 14:00:00
Venues: Online
learning_levels: Beginner
Topic Tags: Programming
delivery_method: Online
presenters: Alex Emmons (BTEP)
Organizer: BTEP
seminar_series: 0
class_title: Summarizing Data with dplyr
1865
Coding Club Seminar Series

Organized By:
BTEP
Description
This session of the BTEP Coding Club will demonstrate the use of R programming to perform decision tree analysis, survival tree analysis, and random forest. This event complements a Statistics for Lunch event, "Decision Trees, Survival Trees, and Random Forest", organized by the Advanced Biomedical Computational Science group at the Frederick National Laboratory for Cancer Research. The Statistics for Lunch event will provide a theoretical introduction to these topics, while this coding club session will focus on practical implementation using R.
The session will cover the following:
1. Decision Tree Analysis
The decision tree analysis will use the “kyphosis” dataset to predict the absence or presence of kyphosis (a type of deformation) following corrective spinal surgery.
2. Survival Tree Analysis
The survival tree analysis uses the recurrence-free survival time from a prospective randomized clinical trial conducted by the German Breast Cancer Study Group.
3. Random Forest
Random forest will be applied to the German Credit Data set, which contains 20 variables for 1000 individuals, to determine whether they should or should not receive a loan of a given amount.
This class requires knowledge and experience with R programming.

start_date: 2025-07-09 11:00:00
Venues: Online Webinar
learning_levels: Intermediate
Topic Tags: Programming, Statistics
delivery_method: Online
presenters: Brian Luke (Advanced Biomedical Computational Science ABCS)
Organizer: BTEP
seminar_series: 1
class_title: Decision Trees, Survival Trees, and Random Forest: Practical Examples with R Programming
1866
Organized By:
CDSL
Description

From image analysis to federated learning and multimodal modeling, generative AI can be used to study observational succession across scales and data types. At DCEG's Data Science and Engineering Research Group (DSERG), we are developing user-facing, FAIR, privacy-preserving infrastructure for cancer research based on numerical embeddings shared across and between data types. This presentation is configured as a show-and-tell activity with live applications ranging from digital pathology to real-time epidemiology trackers. Particular focus will be placed on showing where and how genAI modeling exposes a numeric representation of the underlying latent space.

Dr. Jonas Almeida leads a multidisciplinary program of data science and engineering research that combines systems biology, computational statistics, and software engineering for biomedical applications. The primary focus of his research is to accelerate the investigation of epidemiologic and genetic causes of cancer by developing innovative digital methods that advance the computational research infrastructure for precision prevention.

start_date: 2025-07-09 11:00:00
Venues: Building 35A, Room 610 (NIH Bethesda Campus)
learning_levels: Any
Topic Tags: Artificial Intelligence (AI)
delivery_method: Hybrid
presenters: Jonas Almeida (NCI DCEG)
Organizer: CDSL
seminar_series: 0
class_title: Generative AI, it's all about Embeddings
1832
Organized By:
BTEP
Description

This is the final lesson in the course Introductory R for Novices: Introduction to Data Wrangling. This lesson will show attendees how to join multiple data frames and transform and create new variables using dplyr.

start_date: 2025-07-15 14:00:00
Venues: Online
learning_levels: Beginner
Topic Tags: Programming
delivery_method: Online
presenters: Alex Emmons (BTEP)
Organizer: BTEP
seminar_series: 0
class_title: Joining and Transforming Data with dplyr
1810
Organized By:
NIDDK
Description

NIDDK Biostats Seminar Series: From Research Study Design to Collecting, Managing, and Analyzing Data.

Learning Objectives

1. The learner should know the difference between observational studies, clinical trials (drug and non-drug studies), and secondary data (new data from stored samples, existing data) as defined for the NIH Clinical Center and how study development differs for each.

2. The learner should understand the development process, know the timeline, and know the resources available for successful protocol development.

3. The learner should understand the purpose and scope of ClinicalTrials.gov.

4. The learner should be able to identify and understand key data elements and each step of trial registration and reporting.

5. The learner should be able to understand the differences between a scientific hypothesis and a statistical hypothesis.

6. The learner should be able to translate scientific hypotheses into statistical design elements: study design, primary outcomes, statistical hypotheses, sample size calculation, and statistical analysis plan.

 

Tentative Webinar Outline:

2:30-3:00pm – Dr. Paige Studlack (Clinical Protocol Coordinator, NIDDK)

Research study types, timelines, and process for successful protocol development, IRB approval, and study initiation at the NIH, with particular emphasis on NIDDK resources and processes.

3:00-3:30pm – Dr. Elizabeth Wright (Mathematical Statistician, Biostatistics Program Office, NIDDK)

Understanding ClinicalTrials.gov elements and how they are used in trial registration and reporting for studies at the NIH.

3:30-4:00pm – Dr. Sungyoung Auh (Mathematical Statistician, Biostatistics Program Office, NIDDK)

Translating scientific questions to needed statistical design elements for research study planning, documentation, completion, and reporting.

start_date: 2025-07-17 14:30:00
Venues: Online Webinar
learning_levels: Beginner
delivery_method: Online
presenters: Sungyoung Auh PhD (NIDDK), Paige Studlack (NIDDK), Elizabeth Wright (NIDDK)
Organizer: NIDDK
seminar_series: 0
class_title: NIDDK Biostats Seminar Series: Initiation, Regulatory Requirements, and Statistical Design for Research Studies Conducted at the NIH
1861
Organized By:
NCI
Description

Protecting private data when training AI models is a topic of increasing concern. During this event, participants will discuss the use of synthetic data for privacy-preserving AI.

start_date: 2025-07-22 11:00:00
learning_levels: Any
Topic Tags: Artificial Intelligence (AI)
delivery_method: Online
presenters: Heidi Hanson (ORNL), Pearse Keane (Univ College London), Danielle Bitterman (Harvard Medical School)
Organizer: NCI
seminar_series: 0
class_title: Cancer AI Conversations: Using Synthetic Data for Privacy-Preserving AI
1841
Organized By:
NIH Library
Description

This one-and-a-half-hour online training covers the basic principles of FAIR (Findable, Accessible, Interoperable, Reusable) data and why it is important to make your data FAIR. This is an introductory-level training.

By the end of this training, attendees will be able to:

  • Define FAIR data
  • Explain what purpose FAIR data serves
  • Apply FAIR data principles to make data findable, accessible, interoperable, and reusable

start_date: 2025-07-22 12:00:00
Venues: Online Webinar
learning_levels: Beginner
Topic Tags: Data
delivery_method: Online
presenters: Raisa Ionin (NIH Library)
Organizer: NIH Library
seminar_series: 0
class_title: How to Make Your Data FAIR
1842
Organized By:
NIH Library
Description

This one-hour training, provided by a presenter from SAS, will demonstrate tips and tricks to make your SAS code run more efficiently. There are at least six ways to do most things in SAS, so understanding some coding guidelines can help to guide efficient decisions. Attendees are expected to have some working experience with SAS 9.4 or to have attended an introductory SAS class, such as SAS® Programming 1: Essentials.  

By the end of this training, attendees will be able to:

  • Measure performance of SAS code
  • Describe how to create readable code
  • Discuss tips for basic coding recommendations and developing code

start_date: 2025-07-23 11:00:00
Venues: Online Webinar
learning_levels: Intermediate
Topic Tags: Software
delivery_method: Online
presenters: Instructor (SAS)
Organizer: NIH Library
seminar_series: 0
class_title: Top 20 Ways to Optimize Your SAS Code
1843
Organized By:
NIH Library
Description

The "Data Visualization in R" series focuses on using ggplot and the broader tidyverse ecosystem to create insightful and customizable visualizations. It covers key principles of data visualization, from basic plots to advanced techniques, emphasizing the flexibility and power of ggplot within a tidy data workflow. By the end of the series, participants will be proficient in building plots using the tidyverse ecosystem. 

This hour-and-a-half online training will explore the topics of perception and cognition and how these apply to data visualization. This training will also teach you how to visualize your data using ggplot2. We will start by creating a simple scatterplot and use that to introduce aesthetic mappings and geometric objects, the fundamental building blocks of ggplot2. You must have taken the Introduction to R and RStudio training to be successful in this training.

You can also register for the other training in this series: Data Visualization in R: Customizations, Part 2 of 2.

By the end of this training, participants should be able to: 

  • Describe how perception and cognition inform visualizations.
  • Distinguish between aesthetic mappings and geometric objects, the fundamental building blocks of ggplot.
  • Create a simple scatterplot.
  • Create a plot and save it in a high-resolution format.

Attendees are expected to have a basic understanding of R and RStudio. To proceed, attendees should have done the following:

  • Installed R and RStudio.
  • Gained a basic understanding of R and RStudio.
  • Reviewed our R basics training on the NIH Data Services: On Demand Content YouTube Playlist, if new to R.

start_date: 2025-07-24 12:00:00
Venues: Online Webinar
learning_levels: Intermediate
Topic Tags: Programming
delivery_method: Online
presenters: Doug Joubert (NIH Library)
Organizer: NIH Library
seminar_series: 0
class_title: Data Visualization in R: Introduction to ggplot, Part 1 of 2
1814
Organized By:
NIDDK
Description

NIDDK Biostats Seminar Series: From Research Study Design to Collecting, Managing, and Analyzing Data.

Learning Objectives:

1. To delineate features of REDCap to support project management for research studies.

2. To outline steps to create detailed data collection plans which fulfill regulatory requirements.

3. To identify principled approaches to data collection and management.

4. To explain the connections between research rigor and reproducibility.

 

Outline:

2:30-3:00pm – Matthew Breymaier (Informatics Specialist, Office of the Clinical Director, NIDDK), Sai Theja (Senior Data Analyst, Office of the Clinical Director, NIDDK)

REDCap – functionality and basics of setup, and how different types of studies (longitudinal vs. cross-sectional, etc.) can be designed in REDCap, with emphasis on NIDDK REDCap.

3:00-4:00pm – Dr. Kenneth Wilkins (Mathematical Statistician, Biostatistics Program Office, NIDDK)

Document organization and access as part of study planning: regulatory, clinical, and case report forms 

Data Management and Sharing Plans

Data Management for Reproducibility

Start date: 2025-07-24 14:30:00
Venue: Online Webinar
Learning level: Beginner
Topic tags: Statistics
Delivery method: Online
Presenters: Kenneth Wilkins (NIDDK), Matthew Breymaier (NIDDK), Sai Theja (NIDDK)
Organizer: NIDDK
Seminar series: 0
Class title: NIDDK Biostats Seminar Series: Principles of Data Collection and Management
1844
Organized By:
NIH Library
Description

The "Data Visualization in R" series focuses on using ggplot and the broader tidyverse ecosystem to create insightful and customizable visualizations. It covers key principles of data visualization, from basic plots to advanced techniques, emphasizing the flexibility and power of ggplot within a tidy data workflow. By the end of the series, participants will be proficient in building plots using the tidyverse ecosystem. 

This one-and-a-half-hour online training builds on the topics covered in the Data Visualization in R: Introduction to ggplot training. It emphasizes advanced customization techniques in ggplot for creating effective and clear visualizations. Participants will build on the foundational skills learned in Part 1 of the series and apply various customization options, such as faceting, labeling, themes, and color scales. You must have taken the Data Visualization in R: Introduction to ggplot, Part 1 of 2 training to be successful in this training.

By the end of this training, attendees should be able to:

  • Create a scatterplot in ggplot
  • Facet a plot
  • Demonstrate options for customizing the title and axes
  • Apply different ggplot themes

Attendees are expected to have a basic understanding of R and RStudio. To proceed, attendees should have done the following:

  • Installed R and RStudio.
  • Gained a basic understanding of R and RStudio.
  • Reviewed our R basics training on the NIH Data Services: On Demand Content YouTube Playlist, if you are new to R.

You can register for the other training in this series via the link below:

  • Data Visualization in R: Introduction to ggplot, Part 1 of 2
Start date: 2025-07-28 10:00:00
Venue: Online Webinar
Learning level: Intermediate
Topic tags: Programming
Delivery method: Online
Presenters: Doug Joubert (NIH Library)
Organizer: NIH Library
Seminar series: 0
Class title: Data Visualization in R: Customization, Part 2 of 2
1862
Organized By:
BTEP
Description

Globus is GUI-based software for efficiently transferring large datasets, such as those generated by Next Generation Sequencing (NGS), to and from high-performance computing systems such as NIH’s Biowulf. This demonstration-only class will show participants how to access and set up Globus for transferring data to Biowulf from a local computer as well as from a sequencing facility's data management environment (DME). This class is open only to NIH staff. A meeting link will be provided upon approval of registration.

Registration link: https://cbiit.webex.com/weblink/register/re8dc373ea594662a5e3b0e92a71582fb

Start date: 2025-07-29 14:00:00
Venue: Online Webinar
Learning level: Beginner
Delivery method: Online
Presenters: Joe Wu (BTEP)
Organizer: BTEP
Seminar series: 0
Class title: Beginner's Guide to Data Transfer using Globus
1812
Distinguished Speakers Seminar Series

Organized By:
BTEP
Description

The role of computational science in biomedical research has typically been downstream of experiments, where it plays important roles in signal processing, data integration, pattern detection, and hypothesis testing. But this is changing, and predictive models are now being used to generate and test hypotheses in silico. In this talk, Dr. Pollard will share examples from human genetics, where they have built deep learning models of 3D chromatin interactions that take only sequence as input and then used them to interpret disease variants. This strategy leads to causal hypotheses and enables them to prioritize variants with predicted functional effects. Experiments designed using model outputs are accelerating the rate of discoveries, shedding light on genetic mechanisms in cancer and developmental disorders. This prediction-first strategy exemplifies Dr. Pollard's vision for a more proactive, rather than reactive, role for computational science in biomedical research.

Start date: 2025-07-31 13:00:00
Venue: Online Webinar
Learning level: Any
Topic tags: Omics
Delivery method: Online
Presenters: Katie Pollard (UCSF)
Organizer: BTEP
Seminar series: 1
Class title: Predicting Genetic Variants that Alter 3D Genome Folding in Cancer and Developmental Disorders
1815
Organized By:
NIDDK
Description

NIDDK Biostats Seminar Series: From Research Study Design to Collecting, Managing, and Analyzing Data.

Learning Objectives:

Be able to identify, load, and use R resources/packages based upon needs and experience level with R.  

1. For beginners, know how to load R Commander, import data, and navigate the GUI.  

2. For those interested in learning more about coding/functions, how to use R Swirl to learn foundations for functions and coding higher level operations (loops, combining functions, and building new functions).  

3. For regular users of R, how to use tidyverse for data manipulation, organization, and preparation for analysis.  

4. For those using R for research work, how to utilize R Markdown for appropriate and thorough project documentation and management.  

 

Outline:

2:30-3:00pm – Beginner level (Dr. Wilkins, Mathematical Statistician, Biostatistics Program Office, NIDDK)

How to get the basics accomplished: load data, navigate the R Commander GUI, and export data.

3:00-3:30pm – Intermediate level (Dr. Leary, Chief, Biostatistics Program Office, NIDDK)

Data manipulation and organization for analysis, with a focus on tools for more complex coding and functionality.

3:30-4:00pm – Advanced topics (Dr. Leary)

Leveraging R Markdown and other resources for project management, documentation, and archiving.

Start date: 2025-07-31 14:30:00
Venue: Online Webinar
Learning level: Beginner
Topic tags: Statistics
Delivery method: Online
Presenters: Emily Leary (NIDDK), Kenneth Wilkins (NIDDK)
Organizer: NIDDK
Seminar series: 0
Class title: NIDDK Biostats Seminar Series: R is for All
1863
Organized By:
NIH Library
Description

This one-hour online training will provide a high-level overview of Python coding concepts, as well as some of the integrated development environments (IDEs), such as Jupyter notebooks, used for Python coding. Python is a programming language used for data science, specifically data analysis, statistical analysis, and visualization of results. The training will feature the following IDEs: Google Colaboratory’s Jupyter Notebook, and Anaconda’s Spyder, Jupyter Notebook, and JupyterLab. This overview training will demonstrate how these skills can boost productivity, rigor, and transparency in reporting research findings.

By the end of the training, attendees will be able to:

  • Recognize four freely available IDEs for Python coding

  • Identify fundamental components of Python code

  • Understand how and why notebooks support rigor and transparency in analysis

Attendees are not expected to have any prior knowledge of Python coding or the IDEs to be successful in this training.

If you choose to follow along with Google Colab or Jupyter Notebooks, these IDEs should be installed and ready to go. Code will be provided during the training for this option.

Start date: 2025-08-07 10:00:00
Venue: Online Webinar
Learning level: Beginner
Topic tags: Programming
Delivery method: Online
Presenters: Cindy Sheffield (NIH Library)
Organizer: NIH Library
Seminar series: 0
Class title: Python for Data Science: How to Get Started, What to Learn, and Why
1864
Organized By:
NHLBI
Description

Join the National Heart, Lung, and Blood Institute (NHLBI) for a hybrid workshop to explore how medicine will be transformed by the current artificial intelligence (AI) revolution. Participants will engage with leading experts to learn the current state and visionary outlook of the field and to identify research gaps and opportunities in AI, focusing on clinical decision support.

The primary aim of this workshop is to explore how AI can be used to aid in diagnosing and treating heart, lung, blood, and sleep (HLBS) disorders. AI is a computer science field focused on creating systems that perform tasks requiring human intelligence. These tasks include learning, reasoning, problem-solving, perception, language understanding, and interaction. AI includes subfields like machine learning, natural language processing, robotics, and computer vision. In biomedical research and health care, AI analyzes complex datasets, enhances diagnostic accuracy, personalizes treatment plans, and improves healthcare delivery.

Additionally, the workshop aligns with the broader mission of NHLBI to promote the prevention and treatment of heart, lung, and blood diseases, and enhance the health of all individuals so that they can live longer and more fulfilling lives.

Start date: 2025-09-09 09:00:00
Venue: Bethesda, BLDG 45 Natcher Conference Center
Learning level: Any
Topic tags: Artificial Intelligence (AI)
Delivery method: Hybrid
Organizer: NHLBI
Seminar series: 0
Class title: AI for Clinical Decision Support in Heart, Lung, Blood and Sleep Disorders Workshop
1827
Organized By:
NCI Office of Data Sharing
Description

Please use this link to access overview, registration, and other information:

https://events.cancer.gov/nci/ods-data-jamboree

Childhood cancer is a rare disease with ~15,000 cases diagnosed annually in the United States in individuals younger than 20 years. Despite extensive efforts made over the last two decades by programs such as the National Institutes of Health (NIH)'s Gabriella Miller Kids First Program and NCI's Therapeutically Applicable Research to Generate Effective Treatments (TARGET) and Childhood Cancer Data Initiative (CCDI) to generate, collect, and share data, pediatric and adolescent and young adult (AYA) cancer datasets remain underutilized. Finding and accessing datasets, building specific pediatric cancer cohorts, and aggregating or linking datasets from various data systems still present tremendous challenges for the wider community. To overcome these barriers and raise awareness of existing childhood cancer data resources that can inform better diagnosis and treatment options for children, this data jamboree will bring together researchers and citizen scientists with diverse expertise and experience to collaborate and explore scientific or other questions using childhood cancer data. The goals of the jamboree include:

  • Promoting access and reuse of pediatric cancer data and raising awareness about the availability of these datasets.
  • Promoting interdisciplinary collaborations to expand the size, technical, and scientific diversity of the pediatric cancer research community.
  • Promoting development of new methods and tools for data analysis.
  • Identifying gaps and limitations of existing data and resources, including barriers to real-time access to the data.
Start date: 2025-09-29 08:30:00
Venue: FAES Classrooms and Terrace (Building 10, Bethesda)
Learning level: Any
Topic tags: Data
Delivery method: In-Person
Presenters: Emily Boja (NCI), Jaime M. Guidry Auvil Ph.D. (CBIIT)
Organizer: NCI Office of Data Sharing
Seminar series: 0
Class title: Enhancing Childhood Cancer Data Sharing and Utility