Upcoming Classes & Events
February
Description
The NLM Division of Intramural Research (DIR) is pleased to welcome Tiarnán Keenan, MD, PhD, Stadtman Tenure-Track Investigator and Director of the Medical Retina Fellowship Program at the National Eye Institute, to present his lecture entitled "Diverse Applications of Computational Research and Artificial Intelligence in Ophthalmology".
Ophthalmology is ideally positioned to benefit from recent advances in computational data science and artificial intelligence. As a highly image-based specialty, it offers non-invasive, high-resolution views of the microvascular circulation and the central nervous system, creating rich opportunities for computational analysis with direct clinical relevance. This seminar will present diverse applications of advanced biostatistics, computational research, and machine learning techniques in ophthalmology, with a focus on age-related macular degeneration, the leading cause of blindness in industrialized countries, and cataract, the leading cause of blindness worldwide. Topics will include automated disease detection, quantitative severity classification, and prognostic prediction of disease progression from retinal imaging data, with and without the integration of genetic information. Methodological themes will span deep feature extraction, label transfer, and multi-modal, multi-task learning frameworks.
The NLM Colloquia on Biomedical Data Science and Computational Biology Research is a series of scientific lectures featuring experts from across the bioinformatics community who present their research and discuss how it contributes to advancing biomedical discovery. This series is presented by NLM’s DIR, a premier hub of innovation for computational biology and biomedical data science.
Organized by
NIH Library
Description
In partnership with the NIH Clinical Center's Biostatistics and Clinical Epidemiology Service (BCES), the NIH Library is offering several trainings that cover general concepts behind statistics and epidemiology. These trainings will help participants better understand and prepare data, interpret results and findings, design and prepare studies, and understand the results in published literature.
This four-hour online training will address fundamental statistical concepts including hypothesis testing, p-values and confidence intervals, types of data and their distributional importance, and bias and confounding. Time will be devoted to questions from attendees and references will be provided for in-depth self-study.
By the end of this training, attendees will be able to:
- Describe key concepts in statistical procedures
- Understand the steps involved in hypothesis testing
- Define p-values and be familiar with their appropriate uses
- Describe confidence intervals and their uses
- Understand differences in types of data and how to summarize them
- Describe bias and confounding
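As a rough illustration of the hypothesis testing, p-value, and confidence interval concepts listed above, the following Python sketch runs a two-sided one-sample z-test and builds a 95% confidence interval for a mean. The sample values, the null mean of 120, and the assumed known population standard deviation of 10 are all invented for illustration and are not part of the training materials.

```python
from math import sqrt
from statistics import NormalDist, mean

# Hypothetical systolic blood pressure readings (mmHg); invented for illustration.
sample = [118, 125, 130, 122, 127, 135, 121, 129, 124, 132]

null_mean = 120.0  # assumed null hypothesis: the population mean is 120
sigma = 10.0       # assumed known population standard deviation


def z_test_two_sided(values, mu0, sd):
    """Return (z statistic, two-sided p-value) for a one-sample z-test."""
    standard_error = sd / sqrt(len(values))
    z_stat = (mean(values) - mu0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
    return z_stat, p_value


def confidence_interval(values, sd, level=0.95):
    """Return (lower, upper) bounds for the mean when sd is known."""
    critical = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    margin = critical * sd / sqrt(len(values))
    centre = mean(values)
    return centre - margin, centre + margin


z, p = z_test_two_sided(sample, null_mean, sigma)
low, high = confidence_interval(sample, sigma)
print(f"z = {z:.2f}, p = {p:.4f}, 95% CI = ({low:.1f}, {high:.1f})")
```

In this setup the two-sided p-value falls below 0.05 exactly when the 95% interval excludes the null mean, so the two outputs always agree on whether to reject the null hypothesis.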
Description
Knowledge of the Unix command line is advantageous for scientists who are new to bioinformatics, as many tools are designed to run on Unix-like systems. High performance computing systems (e.g., NIH Biowulf) also require command line skills. Biowulf has around 1,000 software packages installed, including bioinformatics tools, and provides more compute power for bioinformatics analyses that are otherwise cumbersome to run on a personal computer. Commands learned in this class will enable novices to sign onto Biowulf, navigate through its directories, and work with files, which are essential steps for getting started with bioinformatics. This class is not hands-on.
Description
Proteins mediate the critical processes of life and beautifully solve the challenges faced during the evolution of modern organisms. Dr. Baker’s goal is to design a new generation of proteins that address current-day problems not faced during evolution. In contrast to traditional protein engineering efforts, which have focused on modifying naturally occurring proteins, he designs new proteins from scratch to optimally solve the problem at hand. Increasingly, he develops and uses deep learning methods to design amino acid sequences that are predicted to fold to desired structures and functions. His group then produces synthetic genes encoding these sequences and characterizes them experimentally. In this talk, he will describe several recent advances in protein design.
Learning Objectives:
- Understand current methods for computational protein design, including AI-based approaches.
- Survey diverse applications of protein design, such as its role in drug development, materials science, and bioremediation.
- Identify opportunities at the frontier of AI-driven life science research, including where new data, tools, and research projects are needed.
Organized by
CIT Technology Training Program
Description
(Recommended after “Getting Started with AI Productivity Double Feature”)
- Learn how to chat with M365 Copilot Chat to get answers and clarity fast
- Practice using Copilot for summarization, insights, and task automation
- Discover translation and other features that support day-to-day work
Organized by
CBIIT
Description
Join us for the latest Data Science Seminar Series to explore how AI, robotics, and predictive models could transform drug development.
Dr. Arvind Ramanathan, a computational science leader at Argonne National Laboratory, will share an approach his lab is developing that uses automated systems to generate and test hypotheses to design cancer therapies.
Join the webinar to explore how:
• autonomous AI agents can compete to generate and test hypotheses for cancer drug design.
• real world models can predict experimental outcomes and guide which experiments to run next in robotic laboratories.
• applications of this approach can target intrinsically disordered proteins (IDPs) to design therapies for the oncogenic driver WHSC1 and the metabolic regulator NMNAT2.
Description
Partek Flow is a point-and-click platform for building analysis workflows for next-generation sequencing (NGS) data, including DNA, bulk and single-cell RNA, spatial transcriptomics, ATAC, and ChIP, helping scientists avoid the steep learning curve of code-based NGS analysis. In this demonstration-only class, an Illumina scientist will show a bulk RNA-sequencing workflow starting from FASTQ files through differential expression, visualization, and pathway enrichment. No prior experience or access to Partek Flow is required. Attendance is limited to NIH staff.
Description
This session introduces NIH STRIDES and Cloud Lab resources for bioinformatics and generative AI workflows in the cloud. Participants will tour key tutorial libraries, including the STRIDES Cloud Lab GitHub repositories (AWS, GCP, Azure notebooks) and the NIGMS GitHub. The session concludes with a live cloud demonstration showing how to build a grounded chatbot using a Snakemake-based datastore and show responses based on the indexed sources.
Organized by
CBIIT
Description
Dr. John Quackenbush, a Harvard professor and NCI grantee, will describe the conceptual framework for these network tools, demonstrate some practical applications, and explain how they can help you tackle complex biological questions about gene regulation and network inference in your research.
Some of these tools include:
• PANDA (Passing Attributes between Networks for Data Assimilation): Estimate bipartite gene regulatory networks by combining multiple data sources.
• LIONESS (Linear Interpolation to Obtain Network Estimates for Single Samples): Decompose population-level network models into a collection of individual-specific networks.
• MONSTER (MOdeling Network State Transitions from Expression and Regulatory data): Identify key regulatory drivers that control transitions between biological states, such as health and disease.
These, and nearly 20 other tools, are maintained in a cohesive repository, The Network Zoo, with implementations in both R and Python, and NetBooks, online Jupyter Notebook tutorials.
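To make the LIONESS idea of decomposing a population-level network into single-sample networks more concrete, here is a minimal NumPy sketch of the linear-interpolation step. It uses Pearson correlation as a stand-in for the aggregate network model and does not use the Network Zoo packages or their API; the function name `lioness_networks` and the random expression matrix are hypothetical.

```python
import numpy as np


def lioness_networks(expression):
    """Estimate one network per sample by linear interpolation.

    expression: array of shape (n_genes, n_samples).
    Returns an array of shape (n_samples, n_genes, n_genes).
    The aggregate model is Pearson correlation, chosen only for illustration.
    """
    n_genes, n_samples = expression.shape
    net_all = np.corrcoef(expression)  # network built from every sample
    per_sample = np.empty((n_samples, n_genes, n_genes))
    for q in range(n_samples):
        without_q = np.delete(expression, q, axis=1)
        net_without_q = np.corrcoef(without_q)  # network with sample q left out
        # Scale the difference by the sample count to isolate sample q's contribution.
        per_sample[q] = n_samples * (net_all - net_without_q) + net_without_q
    return per_sample


rng = np.random.default_rng(0)
expression = rng.normal(size=(4, 12))  # hypothetical data: 4 genes, 12 samples
networks = lioness_networks(expression)
print(networks.shape)
```

Each slice `networks[q]` is the estimated network for sample `q`; any aggregate method that returns a gene-by-gene matrix could be substituted for `np.corrcoef` in this sketch.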
Organized by
NIH Library
Description
This one-and-a-half-hour online training equips participants with powerful data wrangling techniques using R and the tidyverse ecosystem. The tidyverse is a cohesive ecosystem of R packages designed to make data science workflows more intuitive and efficient through consistent syntax and design principles. Designed for both beginners and those looking to refine their skills, this training addresses the challenges posed by messy datasets.
By the end of this training, attendees will be able to:
- Diagnose and address common data quality issues in clinical datasets.
- Apply systematic approaches to clean and standardize text, dates, and numerical values.
- Transform messy data and handle missing values using tidyverse functions, including appropriate imputation strategies.
- Design reproducible, automated data-cleaning workflows with tidyverse tools for transformation and aggregation.
Requirements
Attendees are expected to have a basic understanding of R and RStudio. To proceed, attendees should have done the following:
Organized by
NIH GREI
Description
Streamlining Data Sharing: Practical Tools and Researcher Stories from the NIH GREI
A clear and comprehensive Data Management and Sharing (DMS) Plan is essential for meeting NIH policy requirements. This session introduces GREI’s guide to help you incorporate generalist repositories into your DMS Plan (https://doi.org/10.5281/zenodo.14278957), offering recommended language and concrete examples. Learn how to write a stronger, more compliant plan and hear stories from researchers benefiting from sharing data via GREI repositories.
Distinguished Speakers Seminar Series
Description
Scientific discovery is increasingly limited not by data availability, but by our ability to integrate evidence, generate hypotheses, and iteratively test them at scale. Recent advances in foundation models and large language models suggest a new paradigm: AI systems that not only model data, but actively participate in the scientific process as agents. In this talk, I will present a unified view of our recent work on foundation models and agentic systems that aim to make biomedical knowledge transferable, multi-scale, and scientifically testable.
First, I will discuss Universal Cell Embeddings (UCE), a self-supervised foundation model that produces robust, annotation-free cell representations that generalize across datasets and species, enabling zero-shot transfer for single-cell biology without per-dataset retraining. Building on this “universal” cell representation layer, I will introduce PULSAR, a multi-scale, multicellular architecture that explicitly propagates information from genes to cells to multicellular systems, yielding unified donor-level representations for tasks such as disease classification, biomarker prediction, and forecasting future clinical events in the human immune system.
Second, I will connect these models to the broader agenda of the AI Virtual Cell: high-fidelity, multi-scale neural simulators of cellular state and dynamics, and the key scientific and engineering priorities needed to make them real and useful for biology and medicine. Finally, I will move from models to agents. Biomni defines a general-purpose biomedical agent environment with a large, structured action space grounded in real biomedical tools, software, and databases—enabling LLM-based agents to do biomedical work, not just talk about it. To ensure that agent-generated claims can be validated rigorously, I will present Popper, an agentic hypothesis-validation framework inspired by falsification, combining LLM-driven experimental design with sequential statistical testing and explicit Type-I error control. Together, these systems suggest a path toward AI that learns universal biological representations, composes them across scales, and supports end-to-end discovery loops grounded in tools, data, and statistical rigor.
Organized by
CIT Technology Training Program
Description
If you use AI tools even occasionally, you’ve probably spent more time than you’d like rewriting prompts, tweaking outputs, or trying to remember “that one prompt that worked.” This live, hands-on class shows you how to stop starting over. You’ll learn how to turn your best prompts into reusable, high-quality assets—stored and shared using the Microsoft 365 tools you already work in every day. In under two hours, you’ll learn practical prompt design techniques that work across tools like ChatGPT, Claude, and CHiRP, and how to organize them in Teams, SharePoint, Word, Excel, and Loop so they’re easy to find, reuse, and improve. The focus is real NIH work, responsible AI use, and immediately applicable skills. You’ll leave with ready-to-use templates, example prompts, and a clear system you can apply the same day to save time, improve results, and make AI a reliable part of your workflow—not an experiment you have to rethink each time.
Organized by
NIH Library
Description
This one-and-a-half-hour online training will equip attendees with essential knowledge and skills for effective interactions with Large Language Model (LLM) AI chatbots. Explore the intricacies of prompt engineering and its pivotal role in optimizing the conversational capabilities of LLMs. Emphasizing best practices and practical applications, this training features live demonstrations and provides valuable skills for the effective use of LLMs.
By the end of this training, attendees will be able to:
- Define LLMs, prompt patterns, and prompt engineering
- Identify potential uses and issues to consider when using LLMs in the biomedical research field
- Use a selection of prompt patterns to improve generated output from LLMs
- Identify resources for learning more about prompt engineering in LLMs
Attendees are not expected to have any prior knowledge of AI chatbots to be successful in this training.
March
Description
Qlucore Omics Explorer is desktop-based point-and-click software with built-in machine learning capabilities. It enables RNA sequencing (bulk and single cell), proteomics, and metabolomics analysis. This software is available for NCI CCR scientists upon submitting a ticket at https://service.cancer.gov/ncisp. In this demonstration-only class, a Qlucore scientist will illustrate a bulk RNA sequencing analysis workflow, starting from expression table import and continuing through normalization, differential expression and pathway analysis, and visualization. Experience using or installation of this software is not required for attendance. Participation is restricted to NIH staff.
Organized by
NIH Library
Description
This 45-minute online Lunch and Learn training will help attendees develop their own customized strategy for responsibly incorporating generative artificial intelligence (AI) tools, such as ChatGPT, into their workflows.
By the end of this training, attendees will be able to:
- Assess appropriate use cases for generative AI tools within their specific research/work context
- Develop a customized generative AI usage strategy
- Document their approach for using generative AI tools
Attendees are not expected to have any prior knowledge of generative AI tools to be successful in this training.
Organized by
NIH Library
Description
This one-hour online training, the first of a two-part series, introduces participants to cleaning and exploring a patient health dataset using Python and pandas. Attendees will load tabular data, inspect structure and data types, summarize columns, and identify common data quality problems such as missing values, inconsistent formats, and duplicate records. They will then apply practical fixes, including standardizing height and weight units, parsing and normalizing dates of birth, splitting combined fields, and using Boolean masks to flag or correct implausible values.
By the end of this session, attendees will be able to:
- Import CSV data into pandas DataFrames and quickly understand column types, basic statistics, and overall data quality.
- Identify duplicate or repeated patient records and decide whether to keep, correct, or remove them.
- Detect and handle missing or inconsistent values using methods such as isna, fillna, filtering, and conditional replacement.
- Standardize mixed formats (for example, heights with and without units, date strings in different formats, and numeric values embedded in text).
- Create derived columns such as systolic and diastolic blood pressure, and use logical conditions to flag questionable or out-of-range values.
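The kinds of steps listed above might look like the following pandas sketch. The column names (`patient_id`, `height`, `bp`) and the small inline table are invented for illustration and are not the training dataset; the plausibility range for systolic pressure and the rule that bare numbers are centimetres are likewise assumptions.

```python
import pandas as pd

# Hypothetical patient records; every value here is made up.
raw = pd.DataFrame({
    "patient_id": [1, 2, 2, 3, 4],
    "height": ["170 cm", "165", "165", None, "1.80 m"],
    "bp": ["120/80", "135/85", "135/85", "300/90", None],
})

# Inspect column types and count missing values per column.
print(raw.dtypes)
print(raw.isna().sum())

# Drop exact duplicate rows (patient 2 appears twice).
clean = raw.drop_duplicates().reset_index(drop=True)


def height_to_cm(text):
    """Convert a height string to centimetres; bare numbers are assumed to be cm."""
    if pd.isna(text):
        return float("nan")
    text = str(text).strip().lower()
    if text.endswith("cm"):
        return float(text[:-2])
    if text.endswith("m"):
        return float(text[:-1]) * 100
    return float(text)


clean["height_cm"] = clean["height"].apply(height_to_cm)
# Fill the missing height with the median of the observed heights.
clean["height_cm"] = clean["height_cm"].fillna(clean["height_cm"].median())

# Split the combined blood pressure field into two numeric columns.
bp_parts = clean["bp"].str.split("/", expand=True)
clean["systolic"] = pd.to_numeric(bp_parts[0], errors="coerce")
clean["diastolic"] = pd.to_numeric(bp_parts[1], errors="coerce")

# Boolean mask flagging implausible systolic values (assumed range: 70 to 250).
clean["systolic_flag"] = ~clean["systolic"].between(70, 250) & clean["systolic"].notna()

print(clean[["patient_id", "height_cm", "systolic", "diastolic", "systolic_flag"]])
```

Median imputation is only one option; whether to fill, flag, or drop a missing value depends on the analysis, which is why the sketch leaves the missing blood pressure reading as missing rather than filling it.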
Attendees are expected to have:
- Basic Python coding knowledge
- Familiarity with an IDE (for example, Colab or Jupyter Notebooks) and with loading script and data files into it
Requirements:
- Participants will receive a script file and data files prior to the training. These should be loaded and ready to use before the training session begins.
You can register for Part 2 in this series via the link below:
https://www.nihlibrary.nih.gov/training/introduction-data-wrangling-using-python-part-2-2
Organized by
NIH Library
Description
This one-hour online training, the second session of the two-part series, focuses on reshaping and enriching the cleaned patient dataset to prepare it for analysis and reporting. Attendees will practice splitting and recombining columns (for example, separating full names into first and last names), converting columns to appropriate data types, and engineering new fields such as outlier indicators and blood pressure status labels. The session also covers merging multiple tables (patient details, contact information, and subsets of records) and filtering or subsetting data to answer specific analytical questions.
By the end of this training, attendees will be able to:
- Reshape and restructure data by splitting and combining columns, changing data types, and reordering or selecting relevant fields.
- Engineer clinically useful features, including z-score–based outlier flags, hypertension indicators, and combined status columns for downstream models or dashboards.
- Merge and join DataFrames using common keys (such as patient ID) to bring together core data with supplemental tables like contact information.
- Filter and subset records based on multiple conditions (for example, patients with diabetes and abnormal blood pressure) to create analysis-ready datasets.
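A compact pandas sketch of the reshaping, feature engineering, merging, and filtering steps above follows. The tables, column names, the z-score cutoff of 2, and the 130/80 mmHg hypertension threshold are illustrative assumptions, not values taken from the training.

```python
import pandas as pd

# Hypothetical cleaned patient table and a supplemental contact table.
patients = pd.DataFrame({
    "patient_id": [1, 2, 3, 4, 5, 6],
    "full_name": ["Ana Ruiz", "Ben Lee", "Cy Park", "Di Wong", "Ed Shaw", "Flo Diaz"],
    "weight_kg": [70.0, 72.0, 68.0, 71.0, 69.0, 150.0],
    "systolic": [118, 142, 125, 131, 115, 160],
    "diastolic": [76, 92, 79, 85, 70, 100],
    "diabetes": [False, True, False, True, False, True],
})
contacts = pd.DataFrame({
    "patient_id": [1, 2, 3, 5, 6],
    "email": ["ana@example.org", "ben@example.org", "cy@example.org",
              "ed@example.org", "flo@example.org"],
})

# Split one column into two.
patients[["first_name", "last_name"]] = patients["full_name"].str.split(" ", n=1, expand=True)

# z-score based outlier flag (assumed cutoff: |z| > 2).
z_scores = (patients["weight_kg"] - patients["weight_kg"].mean()) / patients["weight_kg"].std()
patients["weight_outlier"] = z_scores.abs() > 2

# Hypertension indicator (assumed threshold: systolic >= 130 or diastolic >= 80).
patients["hypertension"] = (patients["systolic"] >= 130) | (patients["diastolic"] >= 80)

# A left join keeps every patient, even those without contact details.
merged = patients.merge(contacts, on="patient_id", how="left")

# Filter on multiple conditions: diabetes and abnormal blood pressure.
subset = merged[merged["diabetes"] & merged["hypertension"]]
print(subset[["patient_id", "first_name", "email", "weight_outlier"]])
```

A left join is used so that a patient with no matching contact row is kept with a missing email instead of being silently dropped, which an inner join would do.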
Attendees are expected to have:
- Attended Intro to Data Wrangling Using Python - Part 1 of the series
- Basic Python coding knowledge
- Familiarity with an IDE (for example, Colab or Jupyter Notebooks) and with loading script and data files into it
Requirements:
- Participants will receive a script file and data files prior to the training. These should be loaded and ready to use before the training session begins.
You can register for Part 1 in this series via the link below:
https://www.nihlibrary.nih.gov/training/introduction-data-wrangling-using-python-part-1-2
Organized by
NIH Library
Description
In partnership with the NIH Clinical Center's Biostatistics and Clinical Epidemiology Service (BCES), the NIH Library is offering several trainings that cover general concepts behind statistics and epidemiology. These trainings will help participants better understand and prepare data, interpret results and findings, design and prepare studies, and understand the results in published literature.
This three-hour online training will provide a review of study designs in biomedical research. This training will also cover details related to case studies/series, ecological, cross-sectional, case-control, and cohort studies, clinical trials, and other study designs and considerations. Time will be devoted to questions from attendees and references will be provided for in-depth self-study.
By the end of this training, attendees will be able to:
- Describe two broad categories of study designs
- Provide examples of descriptive and analytic studies
- Explain the advantages and disadvantages of analytic studies
- Understand the differences between observational and experimental studies
- List other types of atypical study designs
Organized by
NIH Library
Description
This 45-minute online training provides a high-level overview of recent developments in artificial intelligence (AI). Each session highlights emerging trends, tools, and use cases in the evolving AI landscape, with an emphasis on practical relevance and responsible use. Whether you're just getting started or looking to stay current, this training offers timely insights in a concise format.
By the end of this training, attendees will be able to:
- Summarize key trends and developments in AI
- Identify new tools, capabilities, or applications relevant to their work
- Describe considerations for ethical and responsible use of AI technologies
Attendees are not expected to have any prior knowledge to be successful in this training.
Distinguished Speakers Seminar Series
Description
In this talk, Dr. Carey will describe how Bioconductor approaches new challenges in supporting open method development and reproducible analyses in genomic data science. He will discuss aspects of the project that bear on education in cancer epidemiology and computational cancer genomics, and on emerging topics in software and data engineering for scalable omics analyses.
Organized by
NCI
Description
This 3-day virtual workshop will explore how foundation models, a powerful class of advanced AI models, can transform cancer research and clinical care. We will focus on their potential to improve diagnosis, prognosis, and treatment response, with a strong emphasis on clinical translation and technology development.
Key Topics:
Agenda (https://events.cancer.gov/dctd/foundationmodel/agenda)
April
Description
Qlucore Omics Explorer is desktop-based, point-and-click software with built-in machine learning capabilities. It supports RNA sequencing (bulk and single-cell), proteomics, and metabolomics analysis. This software is available to NCI CCR scientists upon submitting a ticket at https://service.cancer.gov/ncisp. In this demonstration-only class, a Qlucore scientist will illustrate a proteomics analysis workflow, from data import through QC, visualization (e.g., PCA, heatmap, volcano, box, and violin plots), and GSEA. Prior experience with the software, or having it installed, is not required for attendance. Participation is restricted to NIH staff.
Organized by
NIH Library
Description
In partnership with the NIH Clinical Center's Biostatistics and Clinical Epidemiology Service (BCES), the NIH Library is offering several trainings that cover general concepts behind statistics and epidemiology. These trainings will help participants better understand and prepare data, interpret results and findings, design and prepare studies, and understand the results in published literature.
This six-hour online training will describe the basic concepts for using common statistical tests such as Chi-square, paired and two-sample t-tests, ANOVA, correlations, simple and multiple regression, logistic regression, and survival analysis. Time will be devoted to questions from attendees and references will be provided for in-depth self-study.
By the end of this training, attendees will be able to:
- Explain the importance of study design and hypothesis
- Describe types of data and their distributions
- List examples of statistical tests for analyzing continuous data
- List examples of statistical tests for analyzing dichotomous or categorical data
- Understand differences in regression methods
- Identify nonparametric tests and when to use them
The first part of the class will be 10:00 a.m. to 12:00 p.m. EST followed by a break from 12:00-1:00 p.m. The class resumes at 1:00 p.m. and concludes at 5:00 p.m.
Distinguished Speakers Seminar Series
Description
The ability to measure gene expression levels for individual cells (vs. pools of cells) and with spatial resolution is crucial to address many important biological and medical questions, such as the study of stem cell differentiation, the discovery of cellular subtypes in the brain, and cancer diagnosis and treatment. Single-cell transcriptome sequencing (RNA-Seq) allows the high-throughput measurement of gene expression levels for entire genomes at the resolution of single cells. Spatially-resolved transcriptomics further allows the measurement of gene expression levels along with the location of the RNA molecules within a tissue. Transcriptomics exemplifies the range of issues one encounters in a data science workflow, where the data are complex in a variety of ways, questions are not always clearly formulated, there are multiple analysis steps, and drawing on rigorous statistical principles and methods is essential to derive meaningful and reliable biological results.
In this talk, Dr. Dudoit will survey statistical questions that arise when analyzing single-cell transcriptome sequencing data to investigate the differentiation of stem cells in the brain, including exploratory data analysis, expression quantitation, cluster analysis, and the inference of cellular lineages. She will also address differential expression analysis in spatial transcriptomics.