Supported by CCR Office of Science and Technology Resources (OSTR)
ncibtep@nih.gov

Bioinformatics Training and Education Program

Upcoming Classes & Events

July

Organized by
BTEP
Description

This is the final lesson in the course Introductory R for Novices: Introduction to Data Wrangling. This lesson will show attendees how to join multiple data frames and transform and create new variables using dplyr.

Organized by
HPC Biowulf
Description

All Biowulf users, and all those interested in using the systems, are invited to call in to our Virtual Walk-in Consult to discuss problems and concerns, from scripting problems to node allocation, to strategies for a particular project, to anything that is affecting your use of the HPC systems. Users will be assigned to a breakout session with a member of the HPC staff to discuss the problem one-on-one. We'll try to address simpler issues on the spot and follow up on more complex questions after the session.

At the consult: You will initially join the main lobby and triage area. There, you can briefly describe your issue, and you will then be invited to join a one-on-one breakout room with a staff member.
Once you are finished with your focused consultation, you can return to the main meeting room if you have additional questions or topics to discuss. Please:
- mute when not speaking
- refrain from screen sharing until asked to do so in the breakout room
- screen share as you would in a public space, with the understanding that other NIH HPC staff may join and view what you are sharing (i.e., look over your shoulder)
- be prepared to wait your turn if staff are already helping other users

Please contact staff@hpc.nih.gov for the meeting link. 

Organized by
NIDDK
Description

NIDDK Biostats Seminar Series: From Research Study Design to Collecting, Managing, and Analyzing Data.

Learning Objectives

1. The learner should know the difference between observational studies, clinical trials (drug and non-drug studies), and secondary data (new data from stored samples, existing data) as defined for the NIH Clinical Center and how study development differs for each.

2. The learner should understand the development process, know the timeline, and know the resources available for successful protocol development.

3. The learner should understand the purpose and scope of ClinicalTrials.gov.

4. The learner should be able to identify and understand key data elements and each step of trial registration and reporting.

5. The learner should be able to understand the differences between a scientific hypothesis and a statistical hypothesis.

6. The learner should be able to translate scientific hypotheses into statistical design elements: study design, primary outcomes, statistical hypotheses, sample size calculation, and statistical analysis plan.

 

Tentative Webinar Outline:

2:30-3:00pm – Dr. Paige Studlack (Clinical Protocol Coordinator, NIDDK)

Research study types, timelines, and process for successful protocol development, IRB approval, and study initiation at the NIH, with particular emphasis on NIDDK resources and processes.

3:00-3:30pm – Dr. Elizabeth Wright (Mathematical Statistician, Biostatistics Program Office, NIDDK)

Understanding ClinicalTrials.gov elements and how they are used in trial registration and reporting for studies at the NIH.

3:30-4:00pm – Dr. Sungyoung Auh (Mathematical Statistician, Biostatistics Program Office, NIDDK)

Translating scientific questions to needed statistical design elements for research study planning, documentation, completion, and reporting. 

Organized by
NCI RNA Biology Initiative
Description

Professor Angela Brooks is a leader in the development and application of high-throughput genomic and computational approaches to investigate gene regulation. Her research focuses on characterizing mutations impacting RNA expression in cancer and pioneering methods to analyze alternative splicing using both short- and long-read sequencing technologies. Dr. Brooks has played leadership roles in major consortium efforts, including the Pan-Cancer Analysis of Whole Genomes project, which produced the most comprehensive catalog of RNA alterations across multiple cancer types, and the Long-read RNA-seq Genome Annotation Assessment Project, improving transcriptome annotation through long-read sequencing.

Organized by
NCI
Description

Protecting private data when training AI models is a topic of increasing concern. During this event, participants will discuss the use of synthetic data for privacy-preserving AI.

Organized by
NIH Library
Description

This one-and-a-half-hour online training covers the basic principles of FAIR (Findable, Accessible, Interoperable, Reusable) data and why it is important to make your data FAIR. This is an introductory-level training.

By the end of this training, attendees will be able to:

  • Define FAIR data
  • Explain what purpose FAIR data serves
  • Apply FAIR data principles to make data findable, accessible, interoperable, and reusable
Organized by
NIH Library
Description

This one-hour training, provided by a presenter from SAS, will demonstrate tips and tricks to make your SAS code run more efficiently. There are at least six ways to do most things in SAS, so understanding some coding guidelines can help to guide efficient decisions. Attendees are expected to have some working experience with SAS 9.4 or to have attended an introductory SAS class, such as SAS® Programming 1: Essentials.  

By the end of this training, attendees will be able to:

  • Measure performance of SAS code
  • Describe how to create readable code
  • Discuss tips for basic coding recommendations and developing code
Description

Harnessing the Power of Transcriptomics for Scientific Discovery

Organized by
Biobanking for Precision Medicine Seminar Series
Description

This presentation will explore the next emerging stage, in which biobanking is characterised by being part of a system: Biobanking 4.0. Biobanking has always been about data generation, with tissue specimens providing the biological and genomic information about a patient or donor. Within Biobanking 4.0, biospecimens are viewed as ‘packages’ of information that, once unpacked through ‘omics’ technologies, provide a complex amount of digital information that can be deciphered through novel computer science, including machine learning and AI. Using examples of our local biobanking activity, I will define the growing role biobanks have in feeding the informatics pipelines that now function as integrated translational and clinical research data ecosystems, enabled via data-driven strategies including federated learning. The emerging needs of these increasingly intricate data-driven systems are projected to have an impact on current biobanking practices, requiring more immediate and dynamic tissue handling in which sample storage is less prominent and a more distributed biobanking infrastructure will need to be established, utilizing novel enterprise architecture including blockchain. Finally, I will consider how new strategies for biospecimen management will challenge current dogma around patient engagement in research through their biospecimens, including an exploration of the principles around the need for informed consent, anonymization practices, data ownership and commodification, donor agency, and equity in data access for all researchers.

Organized by
NIH Library
Description

The "Data Visualization in R" series focuses on using ggplot and the broader tidyverse ecosystem to create insightful and customizable visualizations. It covers key principles of data visualization, from basic plots to advanced techniques, emphasizing the flexibility and power of ggplot within a tidy data workflow. By the end of the series, participants will be proficient in building plots using the tidyverse ecosystem. 

This hour-and-a-half online training will explore the topics of perception and cognition, and how these apply to data visualization. This training will also teach you how to visualize your data using ggplot2. We will start by creating a simple scatterplot and use that to introduce aesthetic mappings and geometric objects, the fundamental building blocks of ggplot2. You must have taken the Introduction to R and RStudio training to be successful in this training.

You can register for the other training in this series via the link below.

By the end of this training, participants should be able to: 

  • Describe how perception and cognition inform visualizations.
  • Distinguish between aesthetic mappings and geometric objects, the fundamental building blocks of ggplot.
  • Create a simple scatterplot.
  • Create a plot and save it in a high-resolution format.

Attendees are expected to have a basic understanding of R and RStudio. To proceed, attendees should have done the following:

  • Installed R and RStudio.
  • Gained a basic understanding of R and RStudio.
  • Reviewed our R basics training on the NIH Data Services: On Demand Content YouTube Playlist, if you are new to R.
Organized by
NIDDK
Description

NIDDK Biostats Seminar Series: From Research Study Design to Collecting, Managing, and Analyzing Data.

Learning Objectives:

1. To delineate features of REDCap to support project management for research studies.

2. To outline steps to create detailed data collection plans which fulfill regulatory requirements.

3. To identify principled approaches to data collection and management.

4. To explain the connections between research rigor and reproducibility.

 

Outline:

2:30-3:00pm – Matthew Breymaier (Informatics Specialist, Office of the Clinical Director, NIDDK), Sai Theja (Senior Data Analyst, Office of the Clinical Director, NIDDK)

REDCap: functionality and setup basics, and how different types of studies (e.g., longitudinal vs. cross-sectional) can be designed in REDCap, with emphasis on NIDDK REDCap.

3:00-4:00pm – Dr. Kenneth Wilkins (Mathematical Statistician, Biostatistics Program Office, NIDDK)

Document organization and access as part of study planning: regulatory, clinical, and case report forms 

Data Management and Sharing Plans

Data Management for Reproducibility 

Organized by
Rare Disease Informatics SIG
Description

Dr. Lee Lancashire is a data scientist with over 20 years of experience in machine learning and statistics, currently serving as Head of Data Science at Every Cure. He earned his PhD specializing in artificial neural network methodologies, developing some of the first applications of neural nets to large-scale bioinformatics datasets. Dr. Lancashire’s expertise lies in applying AI and neural network techniques to high-dimensional biomedical and 'omics' data for biomarker and drug target discovery. He has held senior leadership roles in both nonprofit and industry settings, notably as Chief Data & AI Officer at Cohen Veterans Bioscience, and as global practice lead for machine learning in life science at Thomson Reuters/Clarivate Analytics. He has authored over 40 publications and holds several patents related to neural network applications in biomedicine.

Organized by
NIH Library
Description

The "Data Visualization in R" series focuses on using ggplot and the broader tidyverse ecosystem to create insightful and customizable visualizations. It covers key principles of data visualization, from basic plots to advanced techniques, emphasizing the flexibility and power of ggplot within a tidy data workflow. By the end of the series, participants will be proficient in building plots using the tidyverse ecosystem. 

This one-and-a-half-hour online training builds on the topics covered in the Data Visualization in ggplot training. This training emphasizes advanced customization techniques in ggplot to create effective and clear visualizations. Participants will build on the foundational skills learned in Part 1 of the series and apply various customization options, such as faceting, labeling, themes, and color scales. You must have taken the Data Visualization in R: Introduction to ggplot: Part 1 of 2 training to be successful in this training.

By the end of this training, attendees should be able to:  

  • Create a scatterplot in ggplot
  • Facet a plot
  • Demonstrate options for customizing the title and axes
  • Apply different ggplot themes

Attendees are expected to have a basic understanding of R and RStudio. To proceed, attendees should have done the following:

  • Installed R and RStudio.
  • Gained a basic understanding of R and RStudio.
  • Reviewed our R basics training on the NIH Data Services: On Demand Content YouTube Playlist, if you are new to R.

You can register for the training in this series via the link below:

  • Data Visualization in R: Introduction to ggplot Part 1 of 2
Organized by
BTEP
Description

Globus is GUI-based software suitable for efficiently transferring large datasets, such as those generated from Next Generation Sequencing (NGS), to and from high-performance computing systems (e.g., NIH’s Biowulf). This demonstration-only class will show participants how to access and set up Globus for transferring data to Biowulf from a local computer as well as from a sequencing facility's data management environment (DME). Current Biowulf users who have no or limited experience with Globus will find this class extremely useful. This class is open only to NIH staff. The meeting link will be provided upon approval of registration.

Registration link: https://cbiit.webex.com/weblink/register/re8dc373ea594662a5e3b0e92a71582fb

Distinguished Speakers Seminar Series

Organized by
BTEP
Description

The role of computational science in biomedical research has typically been downstream of experiments, where it plays important roles in signal processing, data integration, pattern detection, and hypothesis testing. But this is changing, and predictive models are now being used to generate and test hypotheses in silico. In this talk, Dr. Pollard will share examples from human genetics, where they have built deep learning models of 3D chromatin interactions that take only sequence as input and then used them to interpret disease variants. This strategy leads to causal hypotheses and enables them to prioritize variants with predicted functional effects. Experiments designed using model outputs are accelerating the rate of discoveries, shedding light on genetic mechanisms in cancer and developmental disorders. This prediction-first strategy exemplifies Dr. Pollard's vision for a more proactive, rather than reactive, role for computational science in biomedical research.

Organized by
NIDDK
Description

NIDDK Biostats Seminar Series: From Research Study Design to Collecting, Managing, and Analyzing Data.

Learning Objectives:

Be able to identify, load, and use R resources/packages based upon needs and experience level with R.  

1. For beginners, know how to load R Commander, import data, and navigate the GUI.  

2. For those interested in learning more about coding/functions, how to use R Swirl to learn foundations for functions and coding higher level operations (loops, combining functions, and building new functions).  

3. For regular users of R, how to use tidyverse for data manipulation, organization, and preparation for analysis.  

4. For those using R for research work, how to utilize R Markdown for appropriate and thorough project documentation and management.  

 

Outline:

2:30-3:00pm – Beginner level (Dr. Wilkins, Mathematical Statistician, Biostatistics Program Office, NIDDK)

How to get the basics accomplished: load data, navigate the R Commander GUI, and export data.

3:00-3:30pm – Intermediate level (Dr. Leary, Chief, Biostatistics Program Office, NIDDK)

Data manipulation and organization for analysis with focus on tools for more complex coding and functionality.

3:30-4:00pm – Advanced topics (Dr. Leary)

Leveraging R Markdown and other resources for project management, documentation, and archiving. 

August

Organized by
NIH Library
Description

This one-hour online training will provide a high-level overview of Python coding concepts, as well as some of the integrated development environments (IDEs, such as Jupyter notebooks) used for Python coding. Python is a programming language used for data science, specifically data analysis, statistical analysis, and visualization of results. The training will feature the following IDEs: Google Colaboratory's Jupyter Notebook, and Anaconda’s Spyder, Jupyter Notebook, and JupyterLab. This overview training will demonstrate how these skills can boost productivity, rigor, and transparency in reporting research findings.

By the end of the training, attendees will be able to: 

  • Recognize four freely available IDEs for Python coding

  • Identify fundamental components of Python code

  • Understand how and why notebooks support rigor and transparency in analysis

Attendees are not expected to have any prior knowledge of Python coding or the IDEs to be successful in this training.

If you choose to follow along with Google Colab or Jupyter Notebooks, these IDEs should be installed and ready to go. Code will be provided during the training for this option. 

Organized by
BTEP
Description

This class benefits new staff or bioinformatics newcomers and will introduce educational resources, software (e.g., Unix, Biowulf, R, Python, commercial software, and cloud), and support communities for bioinformatics at NIH. Attendance is restricted to NIH staff. This class is a part of the BTEP Introduction to Bioinformatics Summer Series. The meeting link will be provided upon approval of registration.

 

Registration: https://cbiit.webex.com/weblink/register/rc4244992943cae758d346bb5ce6004ed

Organized by
BTEP
Description

Biowulf is the NIH Unix-based high-performance computer (HPC). Unix is a command-driven computer operating system. Biowulf has more computational power than the average laptop, which makes it ideal for analyzing Next Generation Sequencing (NGS) data. This class enables participants to start working on Biowulf by introducing essential Unix commands. A Biowulf account is not required for participation; Biowulf staff will provide 40 temporary student accounts for those who wish to follow along. Participants can also use a personal Biowulf account or obtain one at https://hpc.nih.gov/docs/accounts.html. Attendance is restricted to NIH staff, and the meeting link will be provided upon approval of registration. This session is a part of the BTEP Introduction to Bioinformatics Summer Series.

 

Registration: https://cbiit.webex.com/weblink/register/r33551db0a06dbcf7c6aeae6ff52f852f

Organized by
BTEP
Description

This class provides an overview of the R and Python programming languages and how each is used in bioinformatics research. Participants will learn the advantages of each language and how to choose which is most applicable to a data analysis. Learning resources for beginners will be provided and questions answered. Attendance is restricted to NIH staff. This class is not hands-on. The meeting link will be provided upon approval of registration.

 

Registration: https://cbiit.webex.com/weblink/register/rd1761836833d5b5790d978418e4eecf2

Organized by
BTEP
Description

This class will address best practices for managing, organizing, and sharing data so they become Findable, Accessible, Interoperable, and Reusable (FAIR), which is important for advancing science as well as meeting data sharing policy requirements. Participants will learn about recommended file structures and formats, storage methods (local versus cloud), and guidelines for sharing to make data abide by the FAIR principles. Attendance is restricted to NIH staff. The meeting link will be provided upon registration approval. This class is a part of the BTEP Introduction to Bioinformatics Summer Series.

 

Registration: https://cbiit.webex.com/weblink/register/rab14b78668a94345b80ddd139b3f6e88

September

Organized by
BTEP
Description

JupyterLab is a platform to organize code and analysis steps in one place, allowing users to easily keep track of all steps taken in an analysis, thereby facilitating collaboration and research presentation. This class is a demo and not hands-on. Participants will learn how to access JupyterLab and the steps involved in producing reproducible analysis reports using this software. Neither prior experience with JupyterLab nor an installation of it is needed to participate. Attendance is restricted to NIH staff. The meeting link will be provided upon approval of registration. This session is a part of the BTEP Introduction to Bioinformatics Summer Series.

 

Registration: https://cbiit.webex.com/weblink/register/re16ac11f4d295ca6f9c6cd061790316c

Organized by
NHLBI
Description

Join the National Heart, Lung, and Blood Institute (NHLBI) for a hybrid workshop to explore how medicine will be transformed by the current artificial intelligence (AI) revolution. Participants will engage with leading experts to learn the current state and visionary outlook of the field and to identify research gaps and opportunities in AI, focusing on clinical decision support.

The primary aim of this workshop is to explore how AI can be utilized to aid in diagnosing and treating heart, lung, blood, and sleep disorders (HLBS). AI is a computer science field focused on creating systems that perform tasks requiring human intelligence. These tasks include learning, reasoning, problem-solving, perception, language understanding, and interaction. AI includes subfields like machine learning, natural language processing, robotics, and computer vision. In biomedical research and health care, AI analyzes complex datasets, enhances diagnostic accuracy, personalizes treatment plans, and improves healthcare delivery. 

Additionally, the workshop aligns with the broader mission of NHLBI to promote the prevention and treatment of heart, lung, and blood diseases, and enhance the health of all individuals so that they can live longer and more fulfilling lives.

Organized by
NCI Office of Data Sharing
Description

Please use this link to access overview, registration, and other information:

https://events.cancer.gov/nci/ods-data-jamboree

Childhood cancer is a rare disease with ~15,000 cases diagnosed annually in the United States in individuals younger than 20 years. Despite extensive efforts made over the last two decades by programs such as the National Institutes of Health (NIH)'s Gabriella Miller Kids First Program and NCI's Therapeutically Applicable Research to Generate Effective Treatments (TARGET) and Childhood Cancer Data Initiative (CCDI) to generate, collect, and share the data, pediatric and AYA cancer datasets remain underutilized. Finding and accessing datasets, building specific pediatric cancer cohorts, and aggregating or linking datasets from various data systems still present tremendous challenges for the wider community. To overcome these barriers and raise awareness of existing childhood cancer data resources to inform better diagnosis and treatment options for children, this data jamboree will bring together researchers and citizen scientists with diverse expertise and experience to collaborate and explore scientific or other questions using childhood cancer data. The goals of the jamboree include:

  • Promoting access and reuse of pediatric cancer data and raising awareness about the availability of these datasets.
  • Promoting interdisciplinary collaborations to expand the size, technical, and scientific diversity of the pediatric cancer research community.
  • Promoting development of new methods and tools for data analysis.
  • Identifying gaps and limitations of existing data and resources including barriers to real time access to the data.