Upcoming Classes & Events
January
Description
Alzheimer’s disease and related dementias (ADRD) remain a major health crisis with profound social and economic burdens. Innovative strategies are needed to identify genetic risk and protective factors, model disease mechanisms, and accelerate therapeutic discovery. Advances in AI and informatics now enable the integration of multimodal genetics, omics, imaging, and outcome data from large biobanks, creating powerful opportunities for biomarker and gene discovery beyond categorical diagnoses. At the same time, generative AI and large language models (LLMs) extend these capabilities to text-rich sources such as scientific literature, clinical notes, and caregiver narratives. When integrated with knowledge graphs, LLMs can dynamically retrieve and synthesize domain-specific knowledge, improving interpretability and advancing biomarker and drug discovery. Equally important, AI applied to conversational datasets and social media can help uncover caregiver needs and power novel mental health support tools. This talk will highlight how these approaches can advance both the science of ADRD and the care of older adults and their caregivers.
Organized by
NIH Library
Description
This one-hour online training offers an overview of the NIH-sponsored Generalist Repository Ecosystem Initiative (GREI) (Dataverse, Dryad, Figshare, Mendeley Data, Open Science Framework, Vivli, and Zenodo) and the role of these participating repositories in the NIH data repository landscape for intramural researchers. The session will highlight how these repositories support compliance with the NIH Data Management and Sharing Policy.
By the end of this training, attendees will be able to:
- Describe how generalist repositories fit into the NIH data repository landscape for intramural researchers.
- Understand how these repositories support compliance with the NIH Data Management and Sharing Policy.
- Learn about the resources developed by GREI repositories to support data sharing workflows, including a generalist repository comparison chart, a generalist repository selection flowchart, a data submission checklist, and a data management and sharing plan guide.
- Gain practical insights from real-world examples, demonstrating how researchers use generalist repositories for data sharing and reuse, and how these efforts contribute to the broader NIH data sharing ecosystem.
Attendees are not expected to have any prior knowledge of the NIH Data Repository Landscape.
Organized by
NCI Center for Cancer Training
Description
This is the first class of the Statistical Analysis of Research Data (SARD) series offered by NCI Center for Cancer Training. The focus for this session is univariate data analysis and participants will learn about:
- Descriptive statistics
- Random sampling
- Estimating a population mean
This class will take place virtually over Webex and will be recorded. For questions, contact Terry Moody (moodyt@bprb.nci.nih.gov). Note that registration is limited to 100.
Coding Club Seminar Series
Description
Organized by
CBIIT
Description
Attend this webinar to learn more about XNAT Scout—a new extension of the XNAT imaging informatics platform that’s designed to close the gap between artificial intelligence (AI) model development and clinical deployment.
Washington University’s Dr. Daniel Marcus will introduce XNAT Scout’s architecture, key capabilities, and early deployment experiences. XNAT Scout provides structured tools for assembling training cohorts, managing annotations, benchmarking models, and monitoring performance over time. Integrated with XNAT’s mature imaging workflows and governance frameworks, it enables reproducible validation, multi-site collaboration, and deployment pathways aligned with clinical interoperability and security requirements. By unifying data curation, evaluation, and operationalization in one platform, XNAT Scout accelerates translation and supports health systems in safely adopting AI at scale.
XNAT is a globally used, open-source imaging informatics platform funded by the NCI Informatics Technology for Cancer Research (ITCR) program.
Organized by
NCI Center for Cancer Training
Description
This is the second class of the Statistical Analysis of Research Data (SARD) series offered by NCI Center for Cancer Training. In this class participants will learn about:
- Inferential statistics
- Two-sample tests of means
- Small versus large sample considerations
This class will take place virtually over Webex and will be recorded. For questions, contact Terry Moody (moodyt@bprb.nci.nih.gov). Note that registration is limited to 100.
Organized by
NCI
Description
Join the NCI Cohort Consortium for a webinar on innovative approaches to improving data interoperability across cohort studies. The session will highlight efforts to apply the Observational Medical Outcomes Partnership Common Data Model to survey data and introduce tools, including Code Map capability, that enable mapping across data models based on Common Data Element semantic concepts. These approaches are designed to enhance data integration and support collaborative cancer cohort research.
Organized by
NCI Center for Cancer Training
Description
This is the third class of the Statistical Analysis of Research Data (SARD) series offered by NCI Center for Cancer Training. Here, participants will learn about:
- Bivariate statistics
- Linear regression
- Regression diagnostics
This class will take place virtually over Webex and will be recorded. For questions, contact Terry Moody (moodyt@bprb.nci.nih.gov). Note that registration is limited to 100.
Description
Learn about Data Science from Dr. Mark Jensen, Director of Data Science, Center for Operations and Technical Support (CTOS), Bioinformatics and Computational Science (BACS), Leidos Biomedical Research, Inc.
Organized by
NCI Center for Cancer Training
Description
In the final class of the Statistical Analysis of Research Data (SARD) series offered by NCI Center for Cancer Training, participants will learn about:
- Multivariate data analysis
- Regression
- Non-parametric tools
- Goodness of fit tests
This class will take place virtually over Webex and will be recorded. For questions, contact Terry Moody (moodyt@bprb.nci.nih.gov). Note that registration is limited to 100.
February
Organized by
NIH Library
Description
This one-hour online training provides researchers with an overview of online resources for locating research datasets, data repositories, and data publications for data sharing and re-use. Participants will learn search strategies for locating datasets through federated data search portals and generalist data repositories, including directories for locating discipline-specific and institutional data repositories. The training will also cover key issues to consider when re-using datasets or when locating a data repository for sharing and preservation purposes.
By the end of this training, attendees will be able to:
- Locate different types of data repositories and datasets
- Identify issues to consider with data repositories
- Discuss how data repositories can improve reproducibility
- Identify issues to consider when re-using datasets
- Describe guidelines and resources for citing datasets
Attendees are not expected to have any prior knowledge of these resources to be successful in this training.
Organized by
NIH Library
Description
This hour-and-a-half online training will examine how humans process and encode visual information and how visual attributes can be used to create effective visualizations. The session will focus on enhancing graphic literacy, exploring methods for making better visualizations, and using stakeholder needs to guide your design choices.
By the end of this training, attendees will be able to:
- Analyze how different visual encodings affect the accuracy of data interpretation.
- Use Gestalt principles and preattentive attributes to design visualizations that improve clarity, grouping, and rapid perception.
- Evaluate the appropriateness of color scales.
- Identify and correct common visualization pitfalls.
Organized by
NIH Library
Description
This one-and-a-half-hour online training is part one of an introductory two-part series for those who want to learn about research data management and sharing, or for those who are interested in a refresher. The series provides detailed information on managing and sharing data from the first data planning stage, through the data life cycle, to data archiving, and finally to selecting an appropriate repository for data preservation.
By the end of part one of this training series, attendees will be able to:
- Understand data management best practices
- Become familiar with data management tools
- Have a solid knowledge of the resources that enable data sharing
During Part 2, attendees will learn about sharing and archiving data. You must register separately for Part 2 of this training. This training is introductory; no prior knowledge is required.
Organized by
NIH Library
Description
This hour-and-a-half online training is part two of an introductory two-part series for those who want to learn about research data management and sharing, or for those who are interested in a refresher. The series provides detailed information on managing and sharing data from the first data planning stage, through the data life cycle, to data archiving, and finally to selecting an appropriate repository for data preservation.
By the end of part two of this training series, attendees will be able to:
- Have a solid knowledge of the resources that enable data sharing
- Understand how data is archived and preserved
Part 1 of this training covers understanding research data, how to manage research data, and how to work with data. During Part 2, attendees learn about sharing and archiving data. This training is introductory; no prior knowledge is required.
Organized by
NIH Library
Description
In partnership with the NIH Clinical Center's Biostatistics and Clinical Epidemiology Service (BCES), the NIH Library is offering several trainings that cover general concepts behind statistics and epidemiology. These trainings will help participants better understand and prepare data, interpret results and findings, design and prepare studies, and understand the results in published literature.
This four-hour online training will address fundamental statistical concepts including hypothesis testing, p-values and confidence intervals, types of data and their distributional importance, and bias and confounding. Time will be devoted to questions from attendees and references will be provided for in-depth self-study.
By the end of this training, attendees will be able to:
- Describe key concepts in statistical procedures
- Understand the steps involved in hypothesis testing
- Define p-values and be familiar with their appropriate uses
- Describe confidence intervals and their uses
- Understand differences in types of data and how to summarize them
- Describe bias and confounding
Organized by
NIH Library
Description
This one-and-a-half-hour online training equips participants with powerful data wrangling techniques using R and the tidyverse ecosystem. The tidyverse is a cohesive ecosystem of R packages designed to make data science workflows more intuitive and efficient through consistent syntax and design principles. Designed for both beginners and those looking to refine their skills, this training addresses the challenges posed by messy datasets.
By the end of this training, attendees will be able to:
- Diagnose and address common data quality issues in clinical datasets.
- Apply systematic approaches to clean and standardize text, dates, and numerical values.
- Transform messy data and handle missing values using tidyverse functions, including appropriate imputation strategies.
- Design reproducible, automated data-cleaning workflows with tidyverse tools for transformation and aggregation.
Requirements
Attendees are expected to have a basic understanding of R and RStudio. To proceed, attendees should have done the following:
Organized by
NIH Library
Description
This one-and-a-half-hour online training will equip attendees with essential knowledge and skills for effective interactions with Large Language Model (LLM) AI chatbots. Explore the intricacies of prompt engineering and its pivotal role in optimizing the conversational capabilities of LLMs. Emphasizing best practices and practical applications, this training features live demonstrations and provides valuable skills for the effective use of LLMs.
By the end of this training, attendees will be able to:
- Define LLMs, prompt patterns, and prompt engineering
- Identify potential uses and issues to consider when using LLMs in the biomedical research field
- Use a selection of prompt patterns to improve generated output from LLMs
- Identify resources for learning more about prompt engineering in LLMs
Attendees are not expected to have any prior knowledge of AI chatbots to be successful in this training.
March
Distinguished Speakers Seminar Series
Description
In this talk, Dr. Carey will describe how Bioconductor approaches new challenges in supporting open method development and reproducible analyses in genomic data science. He will discuss aspects of the project that bear on education in cancer epidemiology and computational cancer genomics, and on emerging topics in software and data engineering for scalable omics analyses.
Organized by
NCI
Description
This 3-day virtual workshop will explore how foundation models, a powerful class of advanced AI models, can transform cancer research and clinical care. We will focus on their potential to improve diagnosis, prognosis, and treatment response, with a strong emphasis on clinical translation and technology development.
Key Topics: Agenda (https://events.cancer.gov/dctd/foundationmodel/agenda)