ncibtep@nih.gov

Bioinformatics Training and Education Program

Sequence Read Archive: Leveraging this Petabyte-scale Database to Drive Biomedical Discovery

When: Mar. 13th, 2026 12:00 pm - 1:00 pm

Learning Level: Beginner

To Know

Where: Online
Organizer: Data Sharing and Reuse Seminar Series
Presented By: Derek Caetano-Anolles, PhD (NCBI)

About this Class

The Sequence Read Archive (SRA) is the largest publicly available repository of high-throughput sequencing data. With big data come big challenges, including keeping the SRA sustainable while making sure that its data is findable, accessible, interoperable, and reusable (FAIR). Following a brief introduction to the SRA and the expanse of data it holds, we will share best practices for accessing SRA data for your analyses and walk through the various formats you may encounter. Finally, we will describe the SRA Lite file format, which is faster to download and has the added advantage of shrinking the overall footprint of the SRA. We will demonstrate the use of the SRA Lite format in NCBI RNA-seq pipelines and related analyses, and point to NCBI resources where you can learn more and engage with us.