ncibtep@nih.gov

Bioinformatics Training and Education Program

BTEP Seminar Series

2024 Seminar Series

Version control using GitHub

  • When: February 21, 2024
  • Delivery: Online
  • Presented By: Joe Wu (BTEP), Nadim Rizk (CBIIT)
  • Versioning enables researchers to track changes in coding projects. This Coding Club session will introduce the versioning tool GitHub (https://github.com). At the end of this class, participants will

    • Become familiar with options available for using GitHub at NCI
    • Be able to use GitHub to
      • Create coding projects 
      • Track changes in code
      • Revert to a previous version of code
      • Collaborate with the project team

     

    Installation of software is not needed to participate.

    This class will be followed by one addressing versioning using Git on February 28, 2024 from 11 AM to 12 PM. See https://bioinformatics.ccr.cancer.gov/btep/classes/version-control-using-git for information and registration.

    Meeting information:

    Meeting link:
    https://cbiit.webex.com/cbiit/j.php?MTID=meadb08ed71552393fe486073a7a7ffc5 
    Meeting number:
    2308 646 3414
    Password:
    VRjdm9A5y$4

    Join by video system
    Dial 23086463414@cbiit.webex.com
    You can also dial 173.243.2.68 and enter your meeting number.

    Join by phone
    1-650-479-3207 Call-in number (US/Canada)
    Access code: 2308 646 3414

    Global call-in options
    https://cbiit.webex.com/webappng/sites/cbiit/meeting/info/e453fc36a706405db9991abd0f97f7bb#


Version control using Git

  • When: February 28, 2024
  • Delivery: Online
  • Presented By: Joe Wu (BTEP)
  • Versioning enables researchers to track changes in coding projects. This Coding Club session will introduce Git (https://git-scm.com), open-source software that performs versioning locally and enables users to upload code to web repositories such as GitHub. At the end of this class, participants will

    • Be able to describe Git
    • Be able to use Git to
      • Create coding projects 
      • Save and track changes to code
      • Upload code to GitHub
      • Revert to/view previous versions of code
      • Perform basic collaboration tasks

     

    Installation of software is not needed to participate.

    Meeting information:

    Meeting link:
    https://cbiit.webex.com/cbiit/j.php?MTID=meadb08ed71552393fe486073a7a7ffc5 
    Meeting number:
    2308 646 3414
    Password:
    VRjdm9A5y$4

    Join by video system
    Dial 23086463414@cbiit.webex.com
    You can also dial 173.243.2.68 and enter your meeting number.

    Join by phone
    1-650-479-3207 Call-in number (US/Canada)
    Access code: 2308 646 3414

    Global call-in options
    https://cbiit.webex.com/webappng/sites/cbiit/meeting/info/e453fc36a706405db9991abd0f97f7bb#


Artificial Intelligence in the Biomedical Sciences

  • When: February 29, 2024
  • Delivery: Online
  • Presented By: Brian Ondov, Ph.D. (NLM)
  • Artificial Intelligence (AI) is becoming increasingly ubiquitous in biomedical research, enabled by large datasets, new algorithms, and hardware improvements. In this session, Dr. Brian Ondov will introduce the basic principles of AI and describe how its various forms can help researchers in different ways, including image classification, sequence-based prediction, generative models, and language understanding. 

     


Explainable Artificial Intelligence (XAI) and Single Cell Genomics to Understand the Cellular Complexity of the Human Brain

  • When: April 4, 2024
  • Delivery: Online
  • Presented By: Richard Scheuermann, Ph.D. (NLM)
  • Explainable Artificial Intelligence (XAI) and Single Cell Genomics to Understand the Cellular Complexity of the Human Brain


Casey Greene

  • When: April 11, 2024
  • Delivery: Online
  • Presented By: Casey Greene, Ph.D. (CU Anschutz)
  • Dr. Greene's lab develops computational methods that integrate distinct large-scale datasets to extract the rich and intrinsic information embedded in such integrated data. This approach reveals underlying principles of an organism’s genetics, its environment, and its response to that environment. Extracting this key contextual information reveals where the data’s context doesn’t fit existing models and raises the questions that a complete collection of publicly available data indicates researchers should be asking. In addition to developing deep learning methods for extracting context, a core mission of Dr. Greene's lab is bringing these capabilities into every molecular biology lab.


The Future of Healthcare: How AI and ChatGPT are Changing the Game in Medicine

  • When: May 2, 2024
  • Delivery: Online
  • Presented By: Dr. Zhiyong Lu (NCBI)
  • The explosion of biomedical big data and information in the past decade or so has created new opportunities for discoveries to improve the treatment and prevention of human diseases. As such, the field of medicine is undergoing a paradigm shift driven by AI-powered analytical solutions. This talk delves into the convergence of AI and ChatGPT, highlighting their pivotal roles in revolutionizing biomedical discovery, patient care, diagnosis, treatment, and medical research. By demonstrating their uses in some real-world applications such as improving PubMed searches (Fiorini et al., Nature Biotechnology 2018), supporting precision medicine (Allot et al., Nature Genetics 2023), and assisting patient trial matching, we underscore the potential of AI and ChatGPT in enhancing clinical decision-making, personalizing patient experiences, and accelerating knowledge discovery.


Multimodal Data Integration: From Biomarkers to Mechanisms

  • When: May 23, 2024
  • Delivery: Online
  • Presented By: Caroline Uhler, Ph.D. (MIT)
  • An exciting opportunity at the intersection of the biomedical sciences and machine learning stems from the growing availability of large-scale multi-modal data (imaging-based and sequencing-based, observational and perturbational, at the single-cell level, tissue-level, and organism-level). Traditional representation learning methods, although often highly successful in predictive tasks, do not generally elucidate underlying causal mechanisms. Dr. Uhler will present initial ideas towards building a statistical and computational framework for causal representation learning and its applications towards identifying novel disease biomarkers as well as inferring gene regulation in health and disease.


Faraz Faghri

  • When: June 27, 2024
  • Delivery: Online
  • Presented By: Faraz Faghri, Ph.D. (CARD)
  • CARD is a collaborative initiative of the National Institute on Aging and the National Institute of Neurological Disorders and Stroke that supports basic, translational, and clinical research on Alzheimer’s disease and related dementias. CARD’s central mission is to initiate, stimulate, accelerate, and support research that will lead to the development of improved treatments and preventions for these diseases.


Genomes, Avatars and AI: The Future of Personalized Medicine

  • When: August 8, 2024
  • Delivery: Online
  • Presented By: Olivier Elemento, Ph.D. (Weill Cornell Medicine)
  • The Elemento lab combines Big Data analytics with experimentation to develop entirely new ways to help prevent, diagnose, understand, treat and ultimately cure disease. Our research involves routine use of ultrafast DNA sequencing, proteomics, high-performance computing, mathematical modeling, and artificial intelligence/machine learning. We’re revolutionizing healthcare by developing innovative approaches to better predict, diagnose, treat, and prevent disease to improve clinical care for every patient.  


Clinical and Computational Molecular Profiling in Pediatric Cancer Diagnostics

  • When: August 29, 2024
  • Delivery: Online
  • Presented By: Elaine Mardis, Ph.D. (Nationwide Children's Hospital)
  • Dr. Mardis is an internationally recognized expert in cancer genomics, with ongoing interests in the integrated characterization of cancer genomes, defining DNA-based somatic and germline interactions and RNA-based pathways, and immune microenvironments that lead to cancer onset and progression, specifically involving pediatric cancers. Most recently, her research has been oriented toward translational aspects of cancer genomics, specifically identifying how the cancer genome changes with treatment, including acquired resistance, the use of genomics in understanding immune therapy response, and the clinical benefit of cancer molecular profiling in the pediatric setting.


Seth Blackshaw

  • When: November 7, 2024
  • Delivery: Online
  • Presented By: Seth Blackshaw, Ph.D. (Johns Hopkins)
  • Dr. Blackshaw's work examines the molecular basis of neuronal and glial cell fate specification and survival, focusing on characterizing the network of genes that control specification of different cell types within the retina and hypothalamus, two structures that arise from the embryonic forebrain.  The ultimate goal is to use insights gained from learning how individual cell types are specified to understand how these cells contribute to the regulation of behavior, and how they can be replaced in neurodegenerative disease.


Pre-clinical Evaluation of Targeted Therapies for Pediatric Cancer

  • When: November 21, 2024
  • Delivery: Online
  • Presented By: Carol Bult, Ph.D. (The Jackson Lab)
  • The primary theme of Dr. Bult's personal research program is “bridging the digital biology divide,” reflecting the critical role that informatics and computational biology play in modern biomedical research. Dr. Bult is a Principal Investigator in the Mouse Genome Informatics (MGI) consortium that develops knowledge-bases to advance the laboratory mouse as a model system for research into the genetic and genomic basis of human biology and disease. Recent research initiatives in Dr. Bult's research group include computational prediction of gene function in the mouse and the use of the mouse to understand genetic pathways in normal lung development and disease.


Documenting Your Analysis with Quarto Archived

  • When: January 24, 2024
  • Delivery: Online
  • Presented By: Alex Emmons (BTEP)
  • Documenting your data analysis is a crucial step toward making your research reproducible. In this session of the BTEP Coding Club, we will learn how to get started using Quarto with RStudio for report generation. 


2023 Seminar Series

Creating R / Python templates for the NIH Integrated Data Analysis Platform (NIDAP) Archived

  • When: December 6, 2023
  • Delivery: Online
  • Presented By: Alexei Lobanov (CCBR)
  • NIDAP, the NIH Integrated Data Analysis Platform, is a cloud-based and collaborative data aggregation and analysis platform. The NIDAP platform hosts user-friendly bioinformatics workflows (Bulk RNA-Seq, scRNA-Seq, Digital Spatial Profiling) and other component analysis and visualization tools that have been created and maintained by the NCI developer community based on open-source tools.
     
    In this BTEP Coding Club session, Alexei Lobanov, bioinformatics analyst with CCBR, will demonstrate how to create NIDAP templates from source code (R or Python). Templates are GUI-like environments that allow users to run the same code on new datasets using a point-and-click approach.
     
    Why create a NIDAP template? 1) “Templatizing” your code is easy and allows users / collaborators with no coding skills to efficiently use your code. 2) Pre-made templates encourage efficiency and reproducibility. Templates allow the user to easily create custom workflows and pipelines that can be shared with collaborators and/or applied to future data sets.

Visualizing multi-dimensional omics data with circular plots in R package OmicCircos Archived

  • When: November 15, 2023
  • Delivery: Online
  • Presented By: Chunhua Yan (CBIIT CGBB), Ying Hu (CBIIT CGBB)
  • This session introduces two versions of the R/Bioconductor package OmicCircos to generate high-quality circular plots for visualizing multi-dimensional omics data:

    1. coding in the R environment for programmers;
    2. point-and-click OmicCircos R Shiny app on the Cancer Genomics Cloud (CGC) for non-programmers. 

     

    Meeting number: 2310 050 3184

    Password: 3sfNDMBq*66

    Join by phone

    1-650-479-3207 Call-in number (US/Canada)

    Access code: 2310 050 3184

Translating Single Cell Genomics for use in Patients after Blood and Marrow Transplantation Archived

  • When: November 2, 2023
  • Delivery: Online
  • Presented By: Scott Furlan (Fred Hutchinson Cancer Center)
  • In this seminar, Dr. Furlan will share data using single cell genomic technologies after hematopoietic cell transplantation including the molecular approaches and computational tools they have used and developed as they relate to this field.

    Meeting number:
    2302 366 1547
    Password:
    PpPs7MHM@52
    Join by video system
    Dial 23023661547@cbiit.webex.com
    You can also dial 173.243.2.68 and enter your meeting number.
    Join by phone
    1-650-479-3207 Call-in number (US/Canada)
    Access code: 2302 366 1547

Accessing data from and Submitting data to the Gene Expression Omnibus (GEO) Archived

  • When: October 18, 2023
  • Delivery: Online
  • Presented By: Joshua Meyer (CCBR)
  • This October session of the BTEP Coding Club will feature a tutorial on how to access data from GEO as well as how to submit data to GEO. 

     

     


CANCELLED EVENT: Precisely Practicing Medicine from 700 Trillion Points of Data Archived

  • When: October 5, 2023
  • Delivery: Online
  • Presented By: Atul Butte, MD (UCSF)
  • There is an urgent need to take what we have learned in our new data-driven era of medicine, and use it to create a new system of precision medicine, delivering the best, safest, cost-effective preventative or therapeutic intervention at the right time, for the right patients.  Dr. Butte's teams at the University of California build and apply tools that convert trillions of points of molecular, clinical, and epidemiological data -- measured by researchers and clinicians over the past decade and now commonly termed “big data” -- into diagnostics, therapeutics, and new insights into disease.  Dr. Butte, a computer scientist and pediatrician, will highlight his center’s recent work on integrating electronic health records data from over 8 million patients across the entire University of California, and how analytics on this “real world data” can lead to new evidence for drug efficacy, new savings from better medication choices, and new methods to teach intelligence – real and artificial – to more precisely practice medicine.

     
     

Whole Embryo Developmental Genetics at Single Cell Resolution Archived

  • When: September 28, 2023
  • Delivery: Online
  • Presented By: Cole Trapnell (Univ. of Washington)
  • The Trapnell Lab at the University of Washington's Department of Genome Sciences studies how genomes encode the program of vertebrate development and how that program goes awry in disease. We build new tools, technologies, and software for decoding this program from large-scale single-cell experiments.

    Meeting number:
    2305 942 7068
    Password:
    XUujpgh7@72
    Join by video system
    Dial 23059427068@cbiit.webex.com
    You can also dial 173.243.2.68 and enter your meeting number.
    Join by phone
    1-650-479-3207 Call-in number (US/Canada)
    Access code: 2305 942 7068

Using rMATS for differential alternative splicing detection Archived

  • When: September 20, 2023
  • Delivery: Online
  • Presented By: Alexei Lobanov (CCBR)
  • This session of the BTEP Coding Club will focus on the tool rMATS for differential alternative splicing event detection from RNA-Seq data. This 1-hour demo will provide a detailed overview of rMATS including why you may want to use it, how to use it, and how to interpret and further use resulting outputs. 

    https://rnaseq-mats.sourceforge.io/

    Multivariate Analysis of Transcript Splicing (MATS)

    MATS is a computational tool to detect differential alternative splicing events from RNA-Seq data. The statistical model of MATS calculates the P-value and false discovery rate that the difference in the isoform ratio of a gene between two conditions exceeds a given user-defined threshold. From the RNA-Seq data, MATS can automatically detect and analyze alternative splicing events corresponding to all major types of alternative splicing patterns. MATS handles replicate RNA-Seq data from both paired and unpaired study design.


Hematopoietic stem cell-intrinsic and -extrinsic contribution to aging and clonal hematopoiesis Archived

  • When: September 14, 2023
  • Delivery: Online
  • Presented By: Jennifer Trowbridge (The Jackson Lab)
  • While there is a positive correlation between cancer and aging, the mechanisms underlying this relationship remain unclear. Clonal hematopoiesis, a benign condition that is both associated with aging and predisposes to increased risk of development of blood cancers, presents an opportunity to understand the connection between cancer and aging. This seminar will discuss emerging discoveries of mechanisms acting within the hematopoietic stem cells as well as alterations in the bone marrow microenvironment that promote clonal hematopoiesis and transformation to blood cancers.


Using EnhancedVolcano and ComplexHeatmap to visualize -omics data Archived

  • When: August 16, 2023
  • Delivery: Online
  • Presented By: Joe Wu (BTEP)
  • Heatmaps and volcano plots are common data visualizations in bioinformatic analyses of genomic data, such as bulk RNA-seq. While both plot types can be used to visualize gene expression, heatmaps can be used to examine expression data across samples and, in combination with clustering techniques, reveal potential patterns in the data. Volcano plots show the direction, distribution, and statistical significance of gene expression changes between experimental conditions (for example, tumor vs. non-tumor, or drug treated vs. non-treated). In this coding club, we will demonstrate how to construct these plots using the R/Bioconductor tools ComplexHeatmap and EnhancedVolcano.
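
As a rough illustration of what a volcano plot encodes, the sketch below computes the two coordinates conventionally plotted for each gene: log2 fold change on the x-axis and -log10 of the p-value on the y-axis. It is written in plain Python rather than with the R/Bioconductor tools named above, and the gene names, values, and cutoffs are hypothetical, chosen only for illustration.

```python
import math

# Hypothetical differential expression results: (gene, fold change, p-value).
results = [
    ("GENE_A", 4.0, 0.001),
    ("GENE_B", 0.25, 0.0001),
    ("GENE_C", 1.1, 0.5),
    ("GENE_D", 8.0, 0.2),
]

# Illustrative cutoffs; these are assumptions, not recommended values.
LOG2FC_CUTOFF = 1.0
P_CUTOFF = 0.05


def volcano_point(fold_change, p_value):
    """Return the (x, y) position of one gene on a volcano plot."""
    return math.log2(fold_change), -math.log10(p_value)


def classify(fold_change, p_value):
    """Label a gene by direction of change and statistical significance."""
    x, _ = volcano_point(fold_change, p_value)
    if p_value < P_CUTOFF and x >= LOG2FC_CUTOFF:
        return "up"
    if p_value < P_CUTOFF and x <= -LOG2FC_CUTOFF:
        return "down"
    return "not significant"


labels = {gene: classify(fc, p) for gene, fc, p in results}
for gene, fc, p in results:
    x, y = volcano_point(fc, p)
    print(f"{gene}\tx={x:.2f}\ty={y:.2f}\t{labels[gene]}")
```

Genes that land far to the left or right and high on the plot are the ones with large, statistically supported changes. A gene such as the hypothetical GENE_D, with a large fold change but a weak p-value, stays low on the y-axis.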

     

     


A Beginner's Guide to Troubleshooting R Code Archived

  • When: July 19, 2023
  • Delivery: Online
  • Presented By: Alex Emmons (BTEP)
  • This session of the BTEP Coding Club will focus on strategies for overcoming errors, warnings, and other common problems with R code. In this 1-hour tutorial targeting beginner R users, we will discuss commonly observed errors, how to find help, and how to approach and debug R code. 


Single Cell Annotation with SingleR: Macrophage-fibroblast crosstalk in lung fibrosis Archived

  • When: June 22, 2023
  • Delivery: Online
  • Presented By: Mallar Bhattacharya, M.D. (UCSF)
  • The Bhattacharya Lab at the UCSF Parnassus Campus is focused on the functional role of monocyte-derived macrophages in the onset and persistence of fibrosis in the lung. We are addressing the following major questions, with a goal of discovering new targets for therapy for acute lung injury and fibrosis:

    • What molecules released by monocyte-derived macrophages and other immune cells signal to and activate pro-fibrotic programs in parenchymal cell types such as fibroblasts and epithelial cells?
    • What reciprocal signals derive from these parenchymal cells to modify the immune response?
    • How can this pathologic crosstalk be reversed to combat fibrosis and restore lung health?

BTEP Coding Club: Submitting Scripts to the Biowulf Batch System Archived

  • When: June 21, 2023
  • Delivery: Online
  • Presented By: Joe Wu (BTEP)
  • Biowulf is the high-performance computing (HPC) cluster at NIH. In addition to its vast compute power, Biowulf has hundreds of bioinformatics tools and databases for analyzing Next Generation Sequencing (NGS) data. This coding club will provide participants with the foundations for harnessing Biowulf’s computing power to analyze NGS data. Participants will learn to request computing resources on the Biowulf system and to submit scripts to it. This class is not hands-on, so there is no need to obtain a Biowulf account before attending.

     

    Meeting link:
    https://cbiit.webex.com/cbiit/j.php?MTID=m39e6aa973e1500fbac8d3516e23cfaf8


    Meeting number:
    2317 419 7733
    Password:
    yKZJuSQ*983
    Host key:
    520526

    Join by video system
    Dial 23174197733@cbiit.webex.com
    You can also dial 173.243.2.68 and enter your meeting number.


    Join by phone
    1-650-479-3207 Call-in number (US/Canada)
    Access code: 2317 419 7733
    Host PIN: 2784

    Global call-in numbers:
    https://cbiit.webex.com/webappng/sites/cbiit/meeting/info/431acd8d9e5f4ad79e425d4832178a31#


CellTypist v2.0: Automatic Cell Type Harmonization and Integration in Single Cell Data Archived

  • When: June 1, 2023
  • Delivery: Online
  • Presented By: Chuan Xu, Ph.D. (Teichmann Lab)
  • CellTypist was first developed as a platform for exploring tissue adaptation of cell types using scRNA-seq semi-automatic annotations. Now it's an open source tool for automated cell type annotations as well as a working group in charge of curating models and ontologies.


Learning and Transferring Cellular State in Single Cell Atlases Archived

  • When: May 25, 2023
  • Delivery: Online
  • Presented By: Fabian Theis (Helmholtz Munich)
  • Single-cell technologies, such as single-cell RNA sequencing (scRNA-seq), have increased the resolution achieved in the study of cellular phenotypes, allowing measurements of thousands of different genes in thousands of individual cells. This has created an opportunity to begin understanding the dynamics of the prime biological processes undergone by cells, while requiring unique computational tools. In our lab, we develop novel and innovative computational methods for single-cell data analysis. - Theis Lab


Functional Enrichment Analysis with clusterProfiler Archived

  • When: May 17, 2023
  • Delivery: Online
  • Presented By: Alex Emmons (BTEP)
  • Functional enrichment analysis is used to understand the biological context of gene lists or differential expression results. There are a multitude of tools available for this purpose. clusterProfiler is a popular R / Bioconductor package supporting over-representation analysis (ORA) and gene set enrichment analysis (GSEA) using up-to-date biological knowledge of genes and biological processes (GO and KEGG) and support for thousands of organisms. The latest version of clusterProfiler (v. 4.6.2) also provides a tidy interface for visualizing resulting output.

    This May 2023 session of the BTEP Coding Club will provide an overview and demo of many of the key features of the clusterProfiler R package. 


The Power of Connection: How the Cancer Research Data Commons enables researchers to connect data, computational tools, and collaborators to accelerate discovery Archived

  • When: May 4, 2023
  • Delivery: Online
  • Presented By: Brandi Davis-Dusenbery (Velsera)
  • The National Cancer Institute (NCI) Cancer Research Data Commons (CRDC) includes petabytes of genomic, proteomic, imaging and other data that can be immediately accessed and analyzed by approved users in a secure cloud environment. In this webinar, attendees will learn how the CRDC is transforming cancer research by streamlining collaboration, democratizing access to data and increasing accessibility of complex computational algorithms. We will include a live demonstration of the Seven Bridges Cancer Genomics Cloud as well as case studies of research performed in the CRDC. 
     

Documenting Data Analysis with Jupyter Lab Archived

  • When: April 19, 2023
  • Delivery: Online
  • Presented By: Joe Wu (BTEP)
  • This BTEP coding club will introduce beginners to Jupyter Notebook, a platform to organize code and analysis steps in one place. Jupyter Notebook can be easily installed or run in a web browser, and supports several languages such as R and Python. It provides a way to keep track of all steps in an analysis and a place for collaboration. Come learn what Jupyter Notebook can do for you. This class will not be hands-on, so there is no need to install anything to attend.


Rahul Satija: (Azimuth) Annotation of Cell Types in Single Cell Analysis of Cancer Archived

  • When: April 6, 2023
  • Delivery: Online
  • Presented By: Rahul Satija (NYU)
  • Azimuth is a web application that uses an annotated reference dataset to automate the processing, analysis, and interpretation of a new single-cell RNA-seq experiment. Azimuth leverages a 'reference-based mapping' pipeline that inputs a counts matrix of gene expression in single cells, and performs normalization, visualization, cell annotation, and differential expression (biomarker discovery). All results can be explored within the app, and easily downloaded for additional downstream analysis. - Satija Lab

    The development of Azimuth is led by the New York Genome Center Mapping Component as part of the NIH Human Biomolecular Atlas Project (HuBMAP).

    This webinar will be recorded and made available on the BTEP web site: https://bioinformatics.ccr.cancer.gov/btep/btep-video-archive-of-past-classes/ within 48 hours after the event ends. 

    Join information

    Meeting link:
    https://cbiit.webex.com/cbiit/j.php?MTID=m1ff4bc9a56dbdc18375eecaed1c280fb 
     
    Meeting number:
    2304 561 2241
    Password:
    JXrwyY4j85@
    Host key:
    183061
    Cohost:
    Alex Emmons; Amy Stonelake; Desiree Tillo; Peter Fitzgerald; Joe Wu; Carl McIntosh
    Join by video system
    Dial 23045612241@cbiit.webex.com
    You can also dial 173.243.2.68 and enter your meeting number.
    Join by phone
    1-650-479-3207 Call-in number (US/Canada)
    Access code: 2304 561 2241

     


AI Models of Cancer in Precision Medicine: Trey Ideker Archived

  • When: March 30, 2023
  • Delivery: Online
  • Presented By: Trey Ideker (UCSD)
  • AI Models of Cancer and Precision Medicine: Building a Mind for Cancer

    The long-term objective of the Ideker Lab is to create artificially intelligent, mechanistic models of cancer and neurodegenerative diseases for translation of patient data to precision diagnosis and treatment. We seek to advance this goal by addressing fundamental questions in the field: What are the genetic and molecular networks that promote disease, and how do we best chart these? How do we use knowledge of these networks in intelligent systems for predicting the effects of genotype on phenotype? – Ideker Lab, https://idekerlab.ucsd.edu/research/cancer/

    Meeting number:
    2301 489 7073
    Password:
    JVmmuxM*744
    Host key:
    809371
    Cohost:
    Alex Emmons; Amy Stonelake; Desiree Tillo; Peter Fitzgerald; Joe Wu; Carl McIntosh
    Join by video system
    Dial 23014897073@cbiit.webex.com
    You can also dial 173.243.2.68 and enter your meeting number.
    Join by phone
    1-650-479-3207 Call-in number (US/Canada)
    Access code: 2301 489 7073

    This webinar will be recorded and made available on the BTEP web site: https://bioinformatics.ccr.cancer.gov/btep/btep-video-archive-of-past-classes/ within 48 hours after the event ends. 


VLOOKUP in Excel and the R programming equivalent Archived

  • When: March 15, 2023
  • Delivery: Online
  • Presented By: Alex Emmons (BTEP)
  • Do you often use Excel's VLOOKUP function to merge tables or search for subsets of data in large NGS data files? If so, you may be interested in a more programmatic solution. Join us for a lesson on performing VLOOKUP in Excel followed by a more reproducible solution with R programming. Whether you are interested in merging a list of gene IDs with a table of functional annotations or searching for unique matches of known T-Cell Receptor sequences among output from a 10X TCR sequencing run, this tutorial will likely be useful to you.  

    This tutorial will kick off the BTEP Coding Club, which features monthly 1-hour tutorials of bioinformatics tools, software, or skills. Email us at ncibtep@nih.gov if you would like to see a topic featured by the BTEP Coding Club. 
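
The idea behind an exact-match VLOOKUP, merging a list of gene IDs with a table of functional annotations, can be sketched in a few lines of code. The session itself covers Excel and R; the example below uses plain Python only to illustrate the lookup logic. The gene IDs, annotation rows, column names, and the `vlookup` helper are all hypothetical.

```python
# Hypothetical input: gene IDs to annotate and an annotation table.
gene_ids = ["TP53", "BRCA1", "MYC", "GENE_X"]

annotations = [
    {"gene_id": "TP53", "function": "tumor suppressor"},
    {"gene_id": "BRCA1", "function": "DNA repair"},
    {"gene_id": "MYC", "function": "transcription factor"},
]


def vlookup(keys, table, key_column, value_column, default=None):
    """Return one (key, value) pair per key, like an exact-match VLOOKUP."""
    # Build the index once so each lookup is a single dictionary access.
    # setdefault keeps the first row seen for a repeated key.
    index = {}
    for row in table:
        index.setdefault(row[key_column], row[value_column])
    return [(key, index.get(key, default)) for key in keys]


merged = vlookup(gene_ids, annotations, "gene_id", "function", default="NA")
for gene, function in merged:
    print(f"{gene}\t{function}")
```

Because the lookup is a script rather than a formula in a spreadsheet cell, the same merge can be rerun unchanged on a new gene list, which is the reproducibility benefit the tutorial description points to. IDs with no match, such as the hypothetical GENE_X, receive the default value instead of an error.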

2022 Seminar Series

A 500 Year Plan for Genetics, Epigenetics and Cell Engineering Archived

  • When: September 22, 2022
  • Delivery: Online
  • Presented By: Christopher Mason (Weill Cornell Medicine)
  • The avalanche of easy-to-create genomics data has impacted almost all areas of medicine and science, from cancer patients and microbial diagnostics to molecular monitoring for astronauts in space. In this lecture, new discoveries from RNA- and DNA-sequencing with the FDA’s SEQC study show the ability of single-molecule methods to reveal rare alleles and provide more comprehensive epigenomics maps of patients and cancers. Also, recent technologies and algorithms from our laboratory and others demonstrate that an integrative, cross-kingdom view of patients (precision metagenomics) holds unprecedented biomedical potential to discern risk, improve diagnostic accuracy, and to map both genetic and epigenetic states, as well as clonal changes in mutations with clonal hematopoiesis. Finally, these methods and molecular tools work together to guide comprehensive, longitudinal, multi-omic views of human astronaut physiology and biology in the NASA Twins Study and several other missions with SpaceX and Axiom, which lay the foundation for future, long-duration spaceflight, including sequencing, quantifying, and engineering genomes to survive on other planets over the next 500 years (https://mitpress.mit.edu/books/next-500-years).


Decoding Breast Cancer Progression with Single Cell Genomics Archived

  • When: July 14, 2022
  • Delivery: Online
  • Presented By: Nicholas Navin (MD Anderson Cancer Center)
  • The efforts of our laboratory are split evenly between experimental and computational biology.  We develop new experimental methods to sequence single cells and isolate rare subpopulations and develop new analytical approaches to detect variants and apply statistical methods to these data sets.  We focus mainly on breast cancer to understand the role of clonal diversity in the evolution of invasion, metastasis and response to chemotherapy.  We are also using these tools to study rare tumor cell subpopulations including circulating tumor cells and cancer stem cells.  Our goal is to understand the role of clonal diversity in tumor evolution so that we can exploit this diversity for therapeutic vulnerabilities and improve diagnostic tools and the early detection of cancer.  We fully expect that applying these tools to human patients will lead to reduced morbidity in breast cancer.


Mapping the Human Body One Cell at a Time Archived

  • When: June 16, 2022
  • Delivery: Online
  • Presented By: Sarah Teichmann
  • Sarah Teichmann is co-founder and principal leader of the Human Cell Atlas (HCA) international consortium. The International Human Cell Atlas initiative aims to create comprehensive reference maps of all human cells to further understand health and disease.

    The 37 trillion cells of the human body have a remarkable array of specialized functions, and must cooperate and collaborate in time and space to construct a functioning human. In this talk I will describe my lab’s efforts to understand this cellular diversity through a programme of cell atlasing. Harnessing cutting edge single cell genomics, imaging and computational technologies, we investigate development, homeostasis and disease states, at scale and in 3D, with a particular focus on immunity. I will illustrate the relevance of cell atlasing for engineering organoids and regenerative medicine, and will share new results providing insights into pacemaker cells from the sinoatrial node of the heart. Overall I hope to illustrate the power of single cell approaches in unlocking fundamental knowledge about the human body.


Realizing Data Interoperability Across Basic Research, Clinical Care, and Patients Archived

  • When: April 21, 2022
  • Delivery: Online
  • Presented By: Melissa Haendel (CU Anschutz)
  • Making data reusable for discovery and shared analytics across domains is a laborious task requiring specific skills, one that most data providers do not have the resources, expertise, or perspective to perform. Equally challenged are the data re-users, who function in a landscape of bespoke schemas, formats, and coding – when they can get past understanding the licensing and access control issues. Making the most of our collective data requires partnerships between basic researchers, clinicians, patients, and informaticians, as well as sophisticated strategies to address a myriad of interoperability issues. This talk will review different communities' endeavors towards these ends from across the translational spectrum.


Integrated Analysis of Single Cell Data Across Technologies and Modalities Archived

  • When: February 17, 2022
  • Delivery: Online
  • Presented By: Rahul Satija (NYU)
  • Our goal is to understand how cellular heterogeneity encodes the molecular structure, function, and regulation of complex biological systems. Primarily using single cell genomics, we analyze systems by profiling their most fundamental units individually – a ‘bottom-up’ approach that allows us to study how diverse groups of cells work together to drive biological processes and behaviors. – Satija Lab

    Since Dr. Satija will be presenting unpublished work in this webinar, it will not be recorded or distributed.


2024 Seminar Series

Casey Greene

  • When: April 11, 2024
  • Delivery: Online
  • Presented By: Casey Greene, Ph.D. (CU Anschutz)
  • Dr. Greene's lab develops computational methods that integrate distinct large-scale datasets to extract the rich and intrinsic information embedded in such integrated data. This approach reveals underlying principles of an organism’s genetics, its environment, and its response to that environment. Extracting this key contextual information reveals where the data’s context doesn’t fit existing models and raises the questions that a complete collection of publicly available data indicates researchers should be asking. In addition to developing deep learning methods for extracting context, a core mission of Dr. Greene's lab is bringing these capabilities into every molecular biology lab.


Multimodal Data Integration: From Biomarkers to Mechanisms

  • When: May 23, 2024
  • Delivery: Online
  • Presented By: Caroline Uhler, Ph.D. (MIT)
  • An exciting opportunity at the intersection of the biomedical sciences and machine learning stems from the growing availability of large-scale multi-modal data (imaging-based and sequencing-based, observational and perturbational, at the single-cell level, tissue-level, and organism-level). Traditional representation learning methods, although often highly successful in predictive tasks, do not generally elucidate underlying causal mechanisms. Dr. Uhler will present initial ideas towards building a statistical and computational framework for causal representation learning and its applications towards identifying novel disease biomarkers as well as inferring gene regulation in health and disease.


Genomes, Avatars and AI: The Future of Personalized Medicine

  • When: August 8, 2024
  • Delivery: Online
  • Presented By: Olivier Elemento, Ph.D. (Weill Cornell Medicine)
  • The Elemento lab combines Big Data analytics with experimentation to develop entirely new ways to help prevent, diagnose, understand, treat and ultimately cure disease. Our research involves routine use of ultrafast DNA sequencing, proteomics, high-performance computing, mathematical modeling, and artificial intelligence/machine learning. We’re revolutionizing healthcare by developing innovative approaches to better predict, diagnose, treat, and prevent disease to improve clinical care for every patient.  


Clinical and Computational Molecular Profiling in Pediatric Cancer Diagnostics

  • When: August 29, 2024
  • Delivery: Online
  • Presented By: Elaine Mardis, Ph.D. (Nationwide Children's Hospital)
  • Dr. Mardis is an internationally recognized expert in cancer genomics, with ongoing interests in the integrated characterization of cancer genomes, defining DNA-based somatic and germline interactions and RNA-based pathways, and immune microenvironments that lead to cancer onset and progression, specifically involving pediatric cancers. Most recently, her research has been oriented toward translational aspects of cancer genomics, specifically identifying how the cancer genome changes with treatment, including acquired resistance, the use of genomics in understanding immune therapy response, and the clinical benefit of cancer molecular profiling in the pediatric setting.


Seth Blackshaw

  • When: November 7, 2024
  • Delivery: Online
  • Presented By: Seth Blackshaw, Ph.D. (Johns Hopkins)
  • Dr. Blackshaw's work examines the molecular basis of neuronal and glial cell fate specification and survival, focusing on characterizing the network of genes that control specification of different cell types within the retina and hypothalamus, two structures that arise from the embryonic forebrain.  The ultimate goal is to use insights gained from learning how individual cell types are specified to understand how these cells contribute to the regulation of behavior, and how they can be replaced in neurodegenerative disease.


Pre-clinical Evaluation of Targeted Therapies for Pediatric Cancer

  • When: November 21, 2024
  • Delivery: Online
  • Presented By: Carol Bult, Ph.D. (The Jackson Lab)
  • The primary theme of Dr. Bult's personal research program is “bridging the digital biology divide,” reflecting the critical role that informatics and computational biology play in modern biomedical research. Dr. Bult is a Principal Investigator in the Mouse Genome Informatics (MGI) consortium that develops knowledge-bases to advance the laboratory mouse as a model system for research into the genetic and genomic basis of human biology and disease. Recent research initiatives in Dr. Bult's research group include computational prediction of gene function in the mouse and the use of the mouse to understand genetic pathways in normal lung development and disease.


2023 Seminar Series

Translating Single Cell Genomics for use in Patients after Blood and Marrow Transplantation Archived

  • When: November 2, 2023
  • Delivery: Online
  • Presented By: Scott Furlan (Fred Hutchinson Cancer Center)
  • In this seminar, Dr. Furlan will share data generated with single cell genomic technologies after hematopoietic cell transplantation, including the molecular approaches and computational tools they have used and developed in this field.

    Meeting number:
    2302 366 1547
    Password:
    PpPs7MHM@52
    Join by video system
    Dial 23023661547@cbiit.webex.com
    You can also dial 173.243.2.68 and enter your meeting number.
    Join by phone
    1-650-479-3207 Call-in number (US/Canada)
    Access code: 2302 366 1547

CANCELLED EVENT: Precisely Practicing Medicine from 700 Trillion Points of Data Archived

  • When: October 5, 2023
  • Delivery: Online
  • Presented By: Atul Butte, MD (UCSF)
  • There is an urgent need to take what we have learned in our new data-driven era of medicine, and use it to create a new system of precision medicine, delivering the best, safest, cost-effective preventative or therapeutic intervention at the right time, for the right patients.  Dr. Butte's teams at the University of California build and apply tools that convert trillions of points of molecular, clinical, and epidemiological data -- measured by researchers and clinicians over the past decade and now commonly termed “big data” -- into diagnostics, therapeutics, and new insights into disease.  Dr. Butte, a computer scientist and pediatrician, will highlight his center’s recent work on integrating electronic health records data from over 8 million patients across the entire University of California, and how analytics on this “real world data” can lead to new evidence for drug efficacy, new savings from better medication choices, and new methods to teach intelligence – real and artificial – to more precisely practice medicine.

     
     

Hematopoietic stem cell-intrinsic and -extrinsic contribution to aging and clonal hematopoiesis Archived

  • When: September 14, 2023
  • Delivery: Online
  • Presented By: Jennifer Trowbridge (The Jackson Lab)
  • While there is a positive correlation between cancer and aging, the mechanisms underlying this relationship remain unclear. Clonal hematopoiesis, a benign condition that is both associated with aging and predisposes to increased risk of development of blood cancers, presents an opportunity to understand the connection between cancer and aging. This seminar will discuss emerging discoveries of mechanisms acting within the hematopoietic stem cells as well as alterations in the bone marrow microenvironment that promote clonal hematopoiesis and transformation to blood cancers.

speaker series image

The Power of Connection: How the Cancer Research Data Commons enables researchers to connect data, computational tools, and collaborators to accelerate discovery Archived

  • When: May 4, 2023
  • Delivery: Online
  • Presented By: Brandi Davis-Dusenbery (Velsera)
  • The National Cancer Institute (NCI) Cancer Research Data Commons (CRDC) includes petabytes of genomic, proteomic, imaging and other data that can be immediately accessed and analyzed by approved users in a secure cloud environment. In this webinar, attendees will learn how the CRDC is transforming cancer research by streamlining collaboration, democratizing access to data and increasing accessibility of complex computational algorithms. We will include a live demonstration of the Seven Bridges Cancer Genomics Cloud as well as case studies of research performed in the CRDC.
     

AI Models of Cancer in Precision Medicine: Trey Ideker Archived

  • When: March 30, 2023
  • Delivery: Online
  • Presented By: Trey Ideker (UCSD)
  • AI Models of Cancer and Precision Medicine: Building a Mind for Cancer

    The long-term objective of the Ideker Lab is to create artificially intelligent, mechanistic models of cancer and neurodegenerative diseases for translation of patient data to precision diagnosis and treatment. We seek to advance this goal by addressing fundamental questions in the field: What are the genetic and molecular networks that promote disease, and how do we best chart these? How do we use knowledge of these networks in intelligent systems for predicting the effects of genotype on phenotype? – Ideker Lab, https://idekerlab.ucsd.edu/research/cancer/

    Meeting number:
    2301 489 7073
    Password:
    JVmmuxM*744
    Host key:
    809371
    Cohost:
    Alex Emmons; Amy Stonelake; Desiree Tillo; Peter Fitzgerald; Joe Wu; Carl McIntosh
    Join by video system
    Dial 23014897073@cbiit.webex.com
    You can also dial 173.243.2.68 and enter your meeting number.
    Join by phone
    1-650-479-3207 Call-in number (US/Canada)
    Access code: 2301 489 7073

    This webinar will be recorded and made available on the BTEP web site: https://bioinformatics.ccr.cancer.gov/btep/btep-video-archive-of-past-classes/ within 48 hours after the event ends. 


2022 Seminar Series

A 500 Year Plan for Genetics, Epigenetics and Cell Engineering Archived

  • When: September 22, 2022
  • Delivery: Online
  • Presented By: Christopher Mason (Weill Cornell Medicine)
  • The avalanche of easy-to-create genomics data has impacted almost all areas of medicine and science, from cancer patients and microbial diagnostics to molecular monitoring for astronauts in space. In this lecture, new discoveries from RNA- and DNA-sequencing with the FDA’s SEQC study show the ability of single-molecule methods to reveal rare alleles and provide more comprehensive epigenomics maps of patients and cancers. Also, recent technologies and algorithms from our laboratory and others demonstrate that an integrative, cross-kingdom view of patients (precision metagenomics) holds unprecedented biomedical potential to discern risk, improve diagnostic accuracy, and map both genetic and epigenetic states, as well as clonal changes in mutations with clonal hematopoiesis. Finally, these methods and molecular tools work together to guide comprehensive, longitudinal, multi-omic views of human astronaut physiology and biology in the NASA Twins Study and several other missions with SpaceX and Axiom, which lay the foundation for future, long-duration spaceflight, including sequencing, quantifying, and engineering genomes to survive on other planets over the next 500 years (https://mitpress.mit.edu/books/next-500-years).



2023 Seminar Series

Whole Embryo Developmental Genetics at Single Cell Resolution Archived

  • When: September 28, 2023
  • Delivery: Online
  • Presented By: Cole Trapnell (Univ. of Washington)
  • The Trapnell Lab at the University of Washington's Department of Genome Sciences studies how genomes encode the program of vertebrate development and how that program goes awry in disease. We build new tools, technologies, and software for decoding this program from large-scale single-cell experiments.

    Meeting number:
    2305 942 7068
    Password:
    XUujpgh7@72
    Join by video system
    Dial 23059427068@cbiit.webex.com
    You can also dial 173.243.2.68 and enter your meeting number.
    Join by phone
    1-650-479-3207 Call-in number (US/Canada)
    Access code: 2305 942 7068

Single Cell Annotation with SingleR: Macrophage-fibroblast crosstalk in lung fibrosis Archived

  • When: June 22, 2023
  • Delivery: Online
  • Presented By: Mallar Bhattacharya, M.D. (UCSF)
  • The Bhattacharya Lab at the UCSF Parnassus Campus is focused on the functional role of monocyte-derived macrophages in the onset and persistence of fibrosis in the lung. We are addressing the following major questions, with a goal of discovering new targets for therapy for acute lung injury and fibrosis:

    • What molecules released by monocyte-derived macrophages and other immune cells signal to and activate pro-fibrotic programs in parenchymal cell types such as fibroblasts and epithelial cells?
    • What reciprocal signals derive from these parenchymal cells to modify the immune response?
    • How can this pathologic crosstalk be reversed to combat fibrosis and restore lung health?

CellTypist v2.0: Automatic Cell Type Harmonization and Integration in Single Cell Data Archived

  • When: June 1, 2023
  • Delivery: Online
  • Presented By: Chuan Xu, Ph.D. (Teichmann Lab)
  • CellTypist was first developed as a platform for exploring tissue adaptation of cell types using scRNA-seq semi-automatic annotations. Now it's an open source tool for automated cell type annotations as well as a working group in charge of curating models and ontologies.


Learning and Transferring Cellular State in Single Cell Atlases Archived

  • When: May 25, 2023
  • Delivery: Online
  • Presented By: Fabian Theis (Helmholtz Munich)
  • Single-cell technologies, such as single-cell RNA sequencing (scRNA-seq), have increased the resolution achieved in the study of cellular phenotypes, allowing measurements of thousands of different genes in thousands of individual cells. This has created an opportunity to begin understanding the dynamics of the prime biological processes undergone by cells, while requiring unique computational tools. In our lab, we develop novel and innovative computational methods for single-cell data analysis. - Theis Lab


Rahul Satija: (Azimuth) Annotation of Cell Types in Single Cell Analysis of Cancer Archived

  • When: April 6, 2023
  • Delivery: Online
  • Presented By: Rahul Satija (NYU)
  • Azimuth is a web application that uses an annotated reference dataset to automate the processing, analysis, and interpretation of a new single-cell RNA-seq experiment. Azimuth leverages a 'reference-based mapping' pipeline that inputs a counts matrix of gene expression in single cells, and performs normalization, visualization, cell annotation, and differential expression (biomarker discovery). All results can be explored within the app, and easily downloaded for additional downstream analysis. - Satija Lab

    The development of Azimuth is led by the New York Genome Center Mapping Component as part of the NIH Human BioMolecular Atlas Program (HuBMAP).

    This webinar will be recorded and made available on the BTEP web site: https://bioinformatics.ccr.cancer.gov/btep/btep-video-archive-of-past-classes/ within 48 hours after the event ends. 

    Join information

    Meeting link:
    https://cbiit.webex.com/cbiit/j.php?MTID=m1ff4bc9a56dbdc18375eecaed1c280fb 
     
    Meeting number:
    2304 561 2241
    Password:
    JXrwyY4j85@
    Host key:
    183061
    Cohost:
    Alex Emmons; Amy Stonelake; Desiree Tillo; Peter Fitzgerald; Joe Wu; Carl McIntosh
    Join by video system
    Dial 23045612241@cbiit.webex.com
    You can also dial 173.243.2.68 and enter your meeting number.
    Join by phone
    1-650-479-3207 Call-in number (US/Canada)
    Access code: 2304 561 2241

     


2024 Seminar Series


Documenting Your Analysis with Quarto Archived

  • When: January 24, 2024
  • Delivery: Online
  • Presented By: Alex Emmons (BTEP)
  • Documenting your data analysis is a crucial step toward making your research reproducible. In this session of the BTEP Coding Club, we will learn how to get started using Quarto with RStudio for report generation. 


2023 Seminar Series

Creating R / Python templates for the NIH Integrated Data Analysis Platform (NIDAP) Archived

  • When: December 6, 2023
  • Delivery: Online
  • Presented By: Alexei Lobanov (CCBR)
  • NIDAP, the NIH Integrated Data Analysis Platform, is a cloud-based and collaborative data aggregation and analysis platform. The NIDAP platform hosts user-friendly bioinformatics workflows (Bulk RNA-Seq, scRNA-Seq, Digital Spatial Profiling) and other component analysis and visualization tools that have been created and maintained by the NCI developer community based on open-source tools.
     
    In this BTEP Coding Club session, Alexei Lobanov, bioinformatics analyst with CCBR, will demonstrate how to create NIDAP templates from source code (R or Python). Templates are GUI-like environments that allow users to run the same code on new datasets using a point-and-click approach.
     
    Why create a NIDAP template? 1) “Templatizing” your code is easy and allows users / collaborators with no coding skills to efficiently use your code. 2) Pre-made templates encourage efficiency and reproducibility. Templates allow the user to easily create custom workflows and pipelines that can be shared with collaborators and/or applied to future data sets.

Visualizing multi-dimensional omics data with circular plots in R package OmicCircos Archived

  • When: November 15, 2023
  • Delivery: Online
  • Presented By: Chunhua Yan (CBIIT CGBB), Ying Hu (CBIIT CGBB)
  • This session introduces two versions of the R/Bioconductor package OmicCircos to generate high-quality circular plots for visualizing multi-dimensional omics data:

    1. coding in the R environment for programmers;
    2. point-and-click OmicCircos R Shiny app on the Cancer Genomics Cloud (CGC) for non-programmers. 

     

    Meeting number: 2310 050 3184

    Password: 3sfNDMBq*66

    Join by phone

    1-650-479-3207 Call-in number (US/Canada)

    Access code: 2310 050 3184

Accessing data from and Submitting data to the Gene Expression Omnibus (GEO) Archived

  • When: October 18, 2023
  • Delivery: Online
  • Presented By: Joshua Meyer (CCBR)
  • This October session of the BTEP Coding Club will feature a tutorial on how to access data from GEO as well as how to submit data to GEO.

     

     


Using rMATS for differential alternative splicing detection Archived

  • When: September 20, 2023
  • Delivery: Online
  • Presented By: Alexei Lobanov (CCBR)
  • This session of the BTEP Coding Club will focus on the tool rMATS for differential alternative splicing event detection from RNA-Seq data. This 1-hour demo will provide a detailed overview of rMATS including why you may want to use it, how to use it, and how to interpret and further use resulting outputs. 

    https://rnaseq-mats.sourceforge.io/

    Multivariate Analysis of Transcript Splicing (MATS)

    MATS is a computational tool to detect differential alternative splicing events from RNA-Seq data. The statistical model of MATS calculates the P-value and false discovery rate that the difference in the isoform ratio of a gene between two conditions exceeds a given user-defined threshold. From the RNA-Seq data, MATS can automatically detect and analyze alternative splicing events corresponding to all major types of alternative splicing patterns. MATS handles replicate RNA-Seq data from both paired and unpaired study designs.


Using EnhancedVolcano and ComplexHeatmap to visualize -omics data Archived

  • When: August 16, 2023
  • Delivery: Online
  • Presented By: Joe Wu (BTEP)
  • Heatmaps and volcano plots are common data visualizations in bioinformatic analyses of genomic data, such as bulk RNA-seq. Both plot types can be used to visualize gene expression. Heatmaps can be used to examine expression data across samples and, in combination with clustering techniques, reveal potential patterns in the data. Volcano plots show the direction, distribution, and statistical significance of gene expression changes between experimental conditions (for example, tumor vs. non-tumor, or drug treated vs. non-treated). In this coding club, we will demonstrate how to construct these plots using the R/Bioconductor tools ComplexHeatmap and EnhancedVolcano.

     

     


A Beginner's Guide to Troubleshooting R Code Archived

  • When: July 19, 2023
  • Delivery: Online
  • Presented By: Alex Emmons (BTEP)
  • This session of the BTEP Coding Club will focus on strategies for overcoming errors, warnings, and other common problems with R code. In this 1-hour tutorial targeting beginner R users, we will discuss commonly observed errors, how to find help, and how to approach and debug R code. 


BTEP Coding Club: Submitting Scripts to the Biowulf Batch System Archived

  • When: June 21, 2023
  • Delivery: Online
  • Presented By: Joe Wu (BTEP)
  • Biowulf is the high-performance computing (HPC) cluster at NIH. In addition to its vast compute power, Biowulf has hundreds of bioinformatics tools and databases for analyzing Next Generation Sequencing (NGS) data. This coding club will provide participants the foundations for harnessing Biowulf’s computing power to analyze NGS data. Participants will learn to request computing resources on the Biowulf system and to submit scripts to it. This class is not hands-on, so there is no need to obtain a Biowulf account prior to attending.

     

    Meeting link:
    https://cbiit.webex.com/cbiit/j.php?MTID=m39e6aa973e1500fbac8d3516e23cfaf8


    Meeting number:
    2317 419 7733
    Password:
    yKZJuSQ*983
    Host key:
    520526

    Join by video system
    Dial 23174197733@cbiit.webex.com
    You can also dial 173.243.2.68 and enter your meeting number.


    Join by phone
    1-650-479-3207 Call-in number (US/Canada)
    Access code: 2317 419 7733
    Host PIN: 2784

    Global call-in numbers:
    https://cbiit.webex.com/webappng/sites/cbiit/meeting/info/431acd8d9e5f4ad79e425d4832178a31#


Functional Enrichment Analysis with clusterProfiler Archived

  • When: May 17, 2023
  • Delivery: Online
  • Presented By: Alex Emmons (BTEP)
  • Functional enrichment analysis is used to understand the biological context of gene lists or differential expression results. There are a multitude of tools available for this purpose. clusterProfiler is a popular R/Bioconductor package supporting over-representation analysis (ORA) and gene set enrichment analysis (GSEA) using up-to-date biological knowledge of genes and biological processes (GO and KEGG), with support for thousands of organisms. The latest version of clusterProfiler (v. 4.6.2) also provides a tidy interface for visualizing resulting output.

    This May 2023 session of the BTEP Coding Club will provide an overview and demo of many of the key features of the clusterProfiler R package. 


Documenting Data Analysis with Jupyter Lab Archived

  • When: April 19, 2023
  • Delivery: Online
  • Presented By: Joe Wu (BTEP)
  • This BTEP coding club will introduce beginners to Jupyter Notebook, a platform to organize code and analysis steps in one place. Jupyter Notebook can be easily installed or run in a web browser, and supports several languages such as R and Python. It provides a way to keep track of all steps in an analysis and a place for collaboration. Come learn what Jupyter Notebook can do for you. This class will not be hands-on, so there is no need to install anything to attend.


VLOOKUP in Excel and the R programming equivalent Archived

  • When: March 15, 2023
  • Delivery: Online
  • Presented By: Alex Emmons (BTEP)
  • Do you often use Excel's VLOOKUP function to merge tables or search for subsets of data in large NGS data files? If so, you may be interested in a more programmatic solution. Join us for a lesson on performing VLOOKUP in Excel, followed by a more reproducible solution with R programming. Whether you are interested in merging a list of gene IDs with a table of functional annotations or searching for unique matches of known T-Cell Receptor sequences among output from a 10X TCR sequencing run, this tutorial will likely be useful to you.

    This tutorial will kick off the BTEP Coding Club, which features monthly 1-hour tutorials of bioinformatics tools, software, or skills. Email us at ncibtep@nih.gov if you would like to see a topic featured by the BTEP Coding Club. 

2024 Seminar Series

Artificial Intelligence in the Biomedical Sciences

  • When: February 29, 2024
  • Delivery: Online
  • Presented By: Brian Ondov, Ph.D. (NLM)
  • Artificial Intelligence (AI) is becoming increasingly ubiquitous in biomedical research, enabled by large datasets, new algorithms, and hardware improvements. In this session, Dr. Brian Ondov will introduce the basic principles of AI and describe how its various forms can help researchers in different ways, including image classification, sequence-based prediction, generative models, and language understanding. 

     


Explainable Artificial Intelligence (XAI) and Single Cell Genomics to Understand the Cellular Complexity of the Human Brain

  • When: April 4, 2024
  • Delivery: Online
  • Presented By: Richard Scheuermann, Ph.D. (NLM)


The Future of Healthcare: How AI and ChatGPT are Changing the Game in Medicine

  • When: May 2, 2024
  • Delivery: Online
  • Presented By: Dr. Zhiyong Lu (NCBI)
  • The explosion of biomedical big data and information in the past decade or so has created new opportunities for discoveries to improve the treatment and prevention of human diseases. As such, the field of medicine is undergoing a paradigm shift driven by AI-powered analytical solutions. This talk delves into the convergence of AI and ChatGPT, highlighting their pivotal roles in revolutionizing biomedical discovery, patient care, diagnosis, treatment, and medical research. By demonstrating their uses in some real-world applications such as improving PubMed searches (Fiorini et al., Nature Biotechnology 2018), supporting precision medicine (Allot et al., Nature Genetics 2023), and assisting patient trial matching, we underscore the potential of AI and ChatGPT in enhancing clinical decision-making, personalizing patient experiences, and accelerating knowledge discovery.


Faraz Faghri

  • When: June 27, 2024
  • Delivery: Online
  • Presented By: Faraz Faghri, Ph.D. (CARD)
  • The Center for Alzheimer’s and Related Dementias (CARD) is a collaborative initiative of the National Institute on Aging and the National Institute of Neurological Disorders and Stroke that supports basic, translational, and clinical research on Alzheimer’s disease and related dementias. CARD’s central mission is to initiate, stimulate, accelerate, and support research that will lead to the development of improved treatments and preventions for these diseases.
