ncibtep@nih.gov

Bioinformatics Training and Education Program

Classes & Events

class_id details description start_date Venues learning_levels Topic Tags delivery_method presenters Organizer seminar_series class_title
1982
Organized By:
CBIIT
Description

Attend this webinar to learn more about XNAT Scout—a new extension of the XNAT imaging informatics platform that’s designed to close the gap between artificial intelligence (AI) model development and clinical deployment.

Washington University’s Dr. Daniel Marcus will introduce XNAT Scout’s architecture, key capabilities, and early deployment experiences. XNAT Scout provides structured tools for assembling training cohorts, managing annotations, benchmarking models, and monitoring performance over time. Integrated with XNAT’s mature imaging workflows and governance frameworks, it enables reproducible validation, multi-site collaboration, and deployment pathways aligned with clinical interoperability and security requirements. By unifying data curation, evaluation, and operationalization in one platform, XNAT Scout accelerates translation and supports health systems in safely adopting AI at scale.

XNAT is a globally used, open-source imaging informatics platform funded by the NCI Informatics Technology for Cancer Research (ITCR) program.

2026-01-22 11:00:00 Online Any Artificial Intelligence (AI) Online Daniel Marcus (Washington University School of Medicine in St. Louis) CBIIT 0 XNAT Scout: Enabling Translational AI
2012
Organized By:
NCI Center for Cancer Training
Description

This is the second class of the Statistical Analysis of Research Data (SARD) series offered by NCI Center for Cancer Training. In this class participants will learn about:

  • Inferential statistics 
  • Two-sample tests of means
  • Small- versus large-sample considerations

This class will take place virtually over Webex and will be recorded. For questions, contact Terry Moody (moodyt@bprb.nci.nih.gov). Note that registration is limited to 100.

2026-01-23 13:00:00 Online Any Statistics Online Paul Thurman (Columbia University) NCI Center for Cancer Training 0 Inferential Statistics
2015
Organized By:
CCR Single Cell Analysis and Sequencing Facilities
Description

A discussion examining the opportunities, risks, and real-world considerations of using generative AI in healthcare delivery.

2026-01-27 11:00:00 Online Any Artificial Intelligence (AI) Online Nigam Shah MBBS PhD (Stanford University), Aaron Boussina PhD (UCSD), Sarah Dunsmore PhD (NCATS) CCR Single Cell Analysis and Sequencing Facilities 0 Cancer AI Conversations: Evaluating Generative AI Use in Healthcare Delivery
1984
Organized By:
NCI
Description

Join the NCI Cohort Consortium for a webinar on innovative approaches to improving data interoperability across cohort studies. The session will highlight efforts to apply the Observational Medical Outcomes Partnership Common Data Model to survey data and introduce tools, including Code Map capability, that enable mapping across data models based on Common Data Element semantic concepts. These approaches are designed to enhance data integration and support collaborative cancer cohort research.

2026-01-27 12:00:00 Online Any Programming Online Denise Warzel (CBIIT), Nicole Gerlanc (NCI/DCEG) NCI 0 Improving Data Interoperability Across Cohort Studies
2013
Organized By:
NCI Center for Cancer Training
Description

This is the third class of the Statistical Analysis of Research Data (SARD) series offered by NCI Center for Cancer Training. Here, participants will learn about:

  • Bivariate statistics
  • Linear regression
  • Regression diagnostics

This class will take place virtually over Webex and will be recorded. For questions, contact Terry Moody (moodyt@bprb.nci.nih.gov). Note that registration is limited to 100.

2026-01-27 13:00:00 Online Any Statistics Online Paul Thurman (Columbia University) NCI Center for Cancer Training 0 Bivariate Statistics
2016
Organized By:
NCI
Description

The SeqSPACE Planning Committee is pleased to announce the second webinar of our two-part webinar series highlighting the work of four junior investigators selected through our recent call for abstracts. Part 2 will feature presentations by Drs. Fei Chen and Harriett Fuller.

Dr. Fei Chen is an assistant professor of clinical population and public health sciences at Keck School of Medicine of USC. Her research focuses on understanding the genetic contribution to risk of common cancer, with a special emphasis on prostate cancer, breast cancer, and multiple primary malignancies (MPMs) in cancer survivors. Dr. Chen will examine the impact of germline pathogenic variants in cancer predisposition genes on the risk of prostate cancer and explore development of cancer risk models.

Dr. Harriett Fuller is a postdoctoral research fellow in genetic epidemiology at Fred Hutchinson Cancer Center. Her research leverages multi-omics data to identify prospective molecular biomarkers for prostate cancer, with a particular emphasis on detecting markers of aggressive disease across population groups. She has expertise in molecular and genetic epidemiology, including biomarker prediction, polygenic risk score development, and Mendelian randomization, as well as a strong background in epidemiological methodology and nutritional epidemiology. Dr. Fuller will describe the results of a Mendelian randomization study exploring the association of specific circulating metabolites with prostate cancer risk.

For more information, please contact Leah Mechanic, Ph.D., M.P.H.

2026-01-27 14:30:00 Online Any Next Gen Sequencing (NGS) Methods Online Fei Chen PhD MA (Keck School of Medicine USC), Harriett Fuller PhD MS (Fred Hutchinson Cancer Center) NCI 0 Sequencing Strategies for Population and Cancer Epidemiology Studies SeqSPACE Webinar Series
1997
Organized By:
LBR
Description

Learn about Data Science from Dr. Mark Jensen, Director of Data Science, Center for Operations and Technical Support (CTOS), Bioinformatics and Computational Science (BACS), Leidos Biomedical Research, Inc.

2026-01-28 11:00:00 Online Beginner Data Online Mark Jensen (FNL) LBR 0 What is Data Science?
2007
Navigating Risk in Sharing Cancer Data: Sharing cancer data accelerates scientific discovery and promotes innovation across disciplines, but it also brings complex ethical, legal, and procedural challenges. This panel will share practical experiences, current frameworks and examples, and emerging tools for ethical, secure, and FAIR cancer data sharing to help shape the standards that will define responsible data sharing for years to come.

Confronting Challenges to Sharing Cancer Data: A discussion of resources, incentives, obstacles, and solutions to confront barriers to effective, efficient, and equitable data sharing, including difficult-to-share data and less common cancer data types.

2026-01-28 13:00:00 Online Any Cancer Online Joseph Dean PhD (Iovance Biotherapeutics), Peter Kraft PhD (NCI), Lucila Ohno-Machado MD PhD MBA (Yale School of Medicine), Kurt Roloff PhD (Duality Technologies), Christopher Amos PhD (UNM Cancer Center), James DuBois DSc PhD (WashU Medicine), Scarlett Gomez PhD MPH (UCSF School of Medicine), James Lacey PhD MPH (City of Hope Cancer Center), Richard Moser PhD (NCI) NCI Office of Data Sharing 0 Navigating Risk in Sharing Cancer Data and Confronting Challenges to Sharing Cancer Data
2014
Organized By:
NCI Center for Cancer Training
Description

In the final class of the Statistical Analysis of Research Data (SARD) series offered by NCI Center for Cancer Training, participants will learn about:

  • Multivariant data analysis
  • Regression
  • Non-parametric tools
  • Goodness of fit tests

This class will take place virtually over Webex and will be recorded. For questions, contact Terry Moody (moodyt@bprb.nci.nih.gov). Note that registration is limited to 100.

2026-01-29 13:00:00 Online Any Statistics Online Paul Thurman (Columbia University) NCI Center for Cancer Training 0 Multivariant Data Analysis
2000
Organized By:
NIH Library
Description

This one-hour online training provides researchers with an overview of online resources for locating research datasets, data repositories, and data publications for data sharing and re-use. Participants will learn search strategies for locating datasets through federated data search portals and generalist data repositories, including directories for locating discipline-specific and institutional data repositories. An overview of key issues to consider when re-using datasets or when locating a data repository for sharing and preservation purposes will be discussed. 

By the end of this training, attendees will be able to:  

  • Locate different types of data repositories and datasets
  • Identify issues to consider with data repositories
  • Discuss how data repositories can improve reproducibility
  • Identify issues to consider when re-using datasets
  • Describe guidelines and resources for citing datasets

Attendees are not expected to have any prior knowledge of these resources to be successful in this training. 

2026-02-09 11:00:00 Online Beginner Data Online Joelle Mornini (NIH Library) NIH Library 0 Resources for Finding and Sharing Research Data
1990
Organized By:
NIH Library
Description

This hour-and-a-half online training will examine how humans process and encode visual information and how visual attributes can be used to create effective visualizations. The training will focus on enhancing graphic literacy, exploring methods for making better visualizations, and using stakeholder needs to guide your design choices.

By the end of this training, attendees will be able to:

  • Analyze how different visual encodings affect the accuracy of data interpretation.
  • Use Gestalt principles and preattentive attributes to design visualizations that improve clarity, grouping, and rapid perception.
  • Evaluate the appropriateness of color scales.
  • Identify and correct common visualization pitfalls.

2026-02-09 13:00:00 Online Beginner Data Online NIH Library Staff NIH Library 0 Principles of Effective Data Visualization
1991
Organized By:
NIH Library
Description

This one-and-a-half-hour online training is part one of an introductory two-part series for those who want to learn about research data management and sharing, or for those who are interested in a refresher. The series provides detailed information on managing and sharing data from the first data planning stage, through the data life cycle, to data archiving, and finally to selecting an appropriate repository for data preservation.

By the end of part one of this training series, attendees will be able to:   

  • Understand data management best practices
  • Become familiar with data management tools
  • Have a solid knowledge of the resources that enable data sharing

During Part 2, attendees will learn about sharing and archiving data. You must register separately for Part 2 of this training. This training is introductory; no prior knowledge is required.

2026-02-10 14:00:00 Online Beginner Data Online Raisa Ionin (NIH Library) NIH Library 0 Data Management and Sharing, Part 1 of 2
1992
Organized By:
NIH Library
Description

This one-and-a-half-hour online training is part two of an introductory two-part series for those who want to learn about research data management and sharing, or for those who are interested in a refresher. The series provides detailed information on managing and sharing data from the first data planning stage, through the data life cycle, to data archiving, and finally to selecting an appropriate repository for data preservation.

By the end of part two of this training series, attendees will be able to:   

  • Have a solid knowledge of the resources that enable data sharing
  • Understand how data is archived and preserved

Part 1 of this training covers understanding research data, how to manage research data, and how to work with data. During Part 2, attendees learn about sharing and archiving data. This training is introductory; no prior knowledge is required.

2026-02-11 14:00:00 Online Beginner Data Online Raisa Ionin (NIH Library) NIH Library 0 Data Management and Sharing, Part 2 of 2
1993
Organized By:
NIH Library
Description

In partnership with the NIH Clinical Center's Biostatistics and Clinical Epidemiology Service (BCES), the NIH Library is offering several trainings that cover general concepts behind statistics and epidemiology. These trainings will help participants better understand and prepare data, interpret results and findings, design and prepare studies, and understand the results in published literature.

This four-hour online training will address fundamental statistical concepts including hypothesis testing, p-values and confidence intervals, types of data and their distributional importance, and bias and confounding. Time will be devoted to questions from attendees and references will be provided for in-depth self-study.    

By the end of this training, attendees will be able to:  

  • Describe key concepts in statistical procedures
  • Understand the steps involved in hypothesis testing
  • Define p-values and be familiar with their appropriate uses
  • Describe confidence intervals and their uses
  • Understand differences in types of data and how to summarize them
  • Describe bias and confounding

2026-02-12 13:00:00 Online Intermediate Statistics Online Ninet Sinaii Ph.D. MPH (Biostatistics and Clinical Epidemiology Branch NIH Clinical Center) NIH Library 0 Overview of Statistical Concepts
1994
Organized By:
NIH Library
Description

This one-hour online training will cover the fundamentals, applications, and ethical considerations of Artificial Intelligence (AI). Attendees will explore key topics such as machine learning, deep learning, data handling, and real-world AI applications across various industries. The session will also delve into the ethical implications of AI and provide insights on becoming AI literate. Whether you're a seasoned professional or just starting your AI journey, this session will equip you with essential knowledge to navigate the AI landscape effectively and make informed decisions in our data-driven world.

By the end of this training, attendees will be able to: 

  • Understand the core concepts of AI 
  • Recognize the significance of ethical considerations in AI 
  • Begin the journey toward AI literacy

Attendees are not expected to have any prior knowledge of AI to be successful in this training. 

2026-02-20 13:00:00 Online Beginner Artificial Intelligence (AI) Online NIH Library Staff NIH Library 0 AI Literacy: Navigating the World of Artificial Intelligence
1995
Organized By:
NIH Library
Description

This one-and-a-half-hour online training equips participants with powerful data wrangling techniques using R and the tidyverse, a cohesive ecosystem of R packages designed to make data science workflows more intuitive and efficient through consistent syntax and design principles. Designed for both beginners and those looking to refine their skills, this training addresses the challenges posed by messy datasets.

By the end of this training, attendees will be able to:

  • Diagnose and address common data quality issues in clinical datasets.
  • Apply systematic approaches to clean and standardize text, dates, and numerical values.
  • Transform messy data and handle missing values using tidyverse functions, including appropriate imputation strategies.
  • Design reproducible, automated data-cleaning workflows with tidyverse tools for transformation and aggregation.

Requirements 

Attendees are expected to have a basic understanding of R and RStudio. Before the training, attendees should have:

  • Installed R and RStudio.
  • Reviewed the R basics training on the NIH Data Services: On Demand Content YouTube Playlist, if new to R.

Title: Taming Messy Data: Practical R Wrangling with the Tidyverse
Start date: 2026-02-23 13:00:00
Venue: Online
Learning level: Intermediate
Topic: Programming
Delivery method: Online
Presenters: Doug Joubert (NIH Library)
Organizer: NIH Library
Seminar series: No
1996
Organized By:
NIH Library
Description

This one-and-a-half-hour online training will equip attendees with essential knowledge and skills for effective interactions with Large Language Model (LLM) AI chatbots. Explore the intricacies of prompt engineering and its pivotal role in optimizing the conversational capabilities of LLMs. Emphasizing best practices and practical applications, this training features live demonstrations and provides valuable skills for the effective use of LLMs.

By the end of this training, attendees will be able to:  

  • Define LLMs, prompt patterns, and prompt engineering
  • Identify potential uses and issues to consider when using LLMs in the biomedical research field
  • Use a selection of prompt patterns to improve generated output from LLMs
  • Identify resources for learning more about prompt engineering in LLMs 

Attendees are not expected to have any prior knowledge of AI chatbots to be successful in this training.

Title: Best Practices for Prompt Generation in AI Chatbots
Start date: 2026-02-27 13:00:00
Venue: Online
Learning level: Beginner
Topic: Artificial Intelligence (AI)
Delivery method: Online
Presenters: Alicia Lillich (NIH Library), Joelle Mornini (NIH Library)
Organizer: NIH Library
Seminar series: No
1941
Distinguished Speakers Seminar Series

Organized By:
BTEP
Description

In this talk, Dr. Carey will describe how Bioconductor approaches new challenges in supporting open method development and reproducible
analyses in genomic data science. He will discuss aspects of the project that bear on education in cancer epidemiology and
computational cancer genomics, and on emerging topics in software and data engineering for scalable omics analyses.

Title: Bioconductor Decade 3: Evolving an Open Ecosystem for Genomic Data Science
Start date: 2026-03-19 13:00:00
Venue: Online
Learning level: Any
Topic: Software
Delivery method: Online
Presenters: Vincent J. Carey (Brigham and Women's Hospital, Harvard Medical School)
Organizer: BTEP
Seminar series: Yes
1983
Organized By:
NCI
Description
Overview

This 3-day virtual workshop will explore how foundation models, a powerful class of advanced AI models, can transform cancer research and clinical care. We will focus on their potential to improve diagnosis, prognosis, and prediction of treatment response, with a strong emphasis on clinical translation and technology development.

Key Topics:
  1. Foundation Model Primer: A high-level introduction to foundation models.
  2. Multimodal Data: Combining pathology, radiology, omics, and patient data into unified models.
  3. Prediction: Predicting therapeutic response, resistance, and patient outcomes.
  4. Validation and Reproducibility: Ensuring model results are consistent and reliable for real-world clinical performance and use.
  5. Diagnostic Case Studies: Real-world applications for early detection and automated diagnostics.
  6. Federated Learning: Approaches to training robust models across multiple institutions without sharing sensitive patient data.
  7. Challenges, Risk, and Regulation: Addressing model interpretability and regulatory considerations for clinical adoption.

Agenda (https://events.cancer.gov/dctd/foundationmodel/agenda)

Title: Foundational Models for Cancer: Advancing Diagnosis, Prognosis, and Treatment Response
Start date: 2026-03-24 10:00:00
Venue: Online
Learning level: Any
Topic: Artificial Intelligence (AI)
Delivery method: Online
Presenters: Asif Rizwan (NCI)
Organizer: NCI
Seminar series: No
1920
Distinguished Speakers Seminar Series

Join Meeting
Organized By:
BTEP
Description

The ability to measure gene expression levels for individual cells (vs. pools of cells) and with spatial resolution is crucial to address many important biological and medical questions, such as the study of stem cell differentiation, the discovery of cellular subtypes in the brain, and cancer diagnosis and treatment. Single-cell transcriptome sequencing (RNA-Seq) allows the high-throughput measurement of gene expression levels for entire genomes at the resolution of single cells. Spatially resolved transcriptomics further allows the measurement of gene expression levels along with the location of the RNA molecules within a tissue. Transcriptomics exemplifies the range of issues one encounters in a data science workflow: the data are complex in a variety of ways, questions are not always clearly formulated, there are multiple analysis steps, and drawing on rigorous statistical principles and methods is essential to derive meaningful and reliable biological results.

In this talk, Dr. Dudoit will provide a survey of statistical questions related to the analysis of single-cell transcriptome sequencing data to investigate the differentiation of stem cells in the brain, including exploratory data analysis, expression quantitation, cluster analysis, and the inference of cellular lineages. She will also address differential expression analysis in spatial transcriptomics.

Title: Learning from Data in Single-Cell Transcriptomics
Start date: 2026-04-16 13:00:00
Venue: Online
Learning level: Any
Topic: Omics
Delivery method: Online
Presenters: Sandrine Dudoit (UC Berkeley)
Organizer: BTEP
Seminar series: Yes