Lesson 6 Practice

The following exercises practice the skills learned in Lesson 6.

Find the data

Here is a paper examining the relationship between the oral microbiome and nasopharyngeal carcinoma. Where can you find the associated data?

Notice that the data was originally submitted to the ENA.

Download the data

Make a directory called Lesson6_practice and change into it.

Solution
mkdir Lesson6_practice  
cd Lesson6_practice  

Navigate to the NCBI website and grab the accession information for the associated BioProject.

  1. Download the first 10 samples using fastq-dump with parallel.

Solution

You can download the accession list directly from the SRA Run Selector.

From the command line:

esearch -db sra -query PRJEB37445 | efetch -format runinfo | cut -f 1 -d ',' | sort | grep "ERR" | head > accessions.txt

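cat accessions.txt | parallel -j 1 fastq-dump --split-3 {}

Or run prefetch first and then convert each downloaded run. prefetch does not write sequence data to standard output, so the two commands are chained with && rather than piped:

cat accessions.txt | parallel -j 1 'prefetch {} && fastq-dump --split-3 {}'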
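
The accession-extraction step above can also be wrapped in a small helper, which makes it easier to see what each stage of the pipe does. This is only a sketch: the function name extract_runs is hypothetical, and it assumes the run accession is the first comma-separated column of the runinfo output, as in the command above.

```shell
# Hypothetical helper (not part of EDirect or the SRA Toolkit).
# Reads runinfo CSV on stdin and prints up to N run accessions
# beginning with ERR. N defaults to 10.
extract_runs() {
    n="${1:-10}"
    # 1. keep only the first CSV column (the run accession)
    # 2. keep lines that look like ERR accessions; this drops the
    #    header row and any blank lines
    # 3. sort and de-duplicate
    # 4. keep the first N
    cut -d ',' -f 1 | grep -E '^ERR[0-9]+$' | sort -u | head -n "$n"
}
```

It could then be used as `esearch -db sra -query PRJEB37445 | efetch -format runinfo | extract_runs 10 > accessions.txt`.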

  2. Download five samples with the aid of the sra-explorer.

Navigate to the ENA. How might you go about downloading the data?
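
One possible approach, sketched below, is to ask the ENA Portal API for a file report listing the FASTQ locations of every run in the project, then pass those locations to a downloader such as wget. The function names ena_report_url and ena_fastq_urls are hypothetical. The sketch assumes the report returns its columns in the order requested (run_accession, then fastq_ftp) and that the fastq_ftp column holds semicolon-separated paths without a protocol prefix.

```shell
# Hypothetical helper: build the ENA Portal API file report URL
# for a project accession given as the first argument.
ena_report_url() {
    printf 'https://www.ebi.ac.uk/ena/portal/api/filereport?accession=%s&result=read_run&fields=run_accession,fastq_ftp&format=tsv\n' "$1"
}

# Hypothetical helper: read the TSV report on stdin and print one
# download URL per FASTQ file. Skips the header row and runs with no
# FASTQ files, splits the semicolon-separated list in column 2, and
# prepends ftp:// (assumed to be missing from the listed paths).
ena_fastq_urls() {
    awk -F '\t' 'NR > 1 && $2 != "" {
        n = split($2, files, ";")
        for (i = 1; i <= n; i++) print "ftp://" files[i]
    }'
}

# Example use (needs network access, so it is left commented out):
# curl -s "$(ena_report_url PRJEB37445)" | ena_fastq_urls | head -n 4 | xargs -n 1 wget -nc
```

The ENA browser page for the project also offers download links for the same files, so the command line is not the only route.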