
Lesson 5: Help session

Lesson recap

In this lesson, we learned how to request an interactive session to perform compute-intensive tasks on Biowulf. We also learned about the bioinformatics applications installed on Biowulf and explored tools used in high-throughput sequencing analysis.

Practice questions

Be sure to stay in your data directory for this exercise. In your data directory, create a folder called srr1553423_fastqc.

mkdir srr1553423_fastqc

We are going to download sequencing data for NCBI SRA run SRR1553423 using the SRA Toolkit and assess its quality using FastQC.

Question 1:

Can you request an interactive session with 5 GB of lscratch space?

Solution

sinteractive --gres=lscratch:5
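Once the session starts, it can help to confirm where the allocated scratch space lives. The sketch below assumes that Biowulf exposes lscratch at /lscratch/$SLURM_JOB_ID inside a job; scratch_dir is a hypothetical helper name, not a Biowulf command.

```shell
# Print the job-specific local scratch directory.
# Assumption: inside a Biowulf job, lscratch is exposed at /lscratch/$SLURM_JOB_ID.
# scratch_dir is a hypothetical helper, not a Biowulf command.
scratch_dir() {
    if [ -n "${SLURM_JOB_ID:-}" ]; then
        printf '/lscratch/%s\n' "$SLURM_JOB_ID"
    else
        # Outside a job, fall back to the usual temporary directory.
        printf '%s\n' "${TMPDIR:-/tmp}"
    fi
}

scratch_dir
```

Running df -h on the printed path inside the session should show roughly the space that was requested.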

After the interactive session has been granted, change into the srr1553423_fastqc directory.

cd srr1553423_fastqc

Question 2:

How do we load the sratoolkit and fastqc modules?

Solution

module load sratoolkit
module load fastqc
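A quick way to confirm that both modules loaded is to check that their programs are on the PATH. This is a minimal sketch; missing_tools is a hypothetical helper, and it assumes the modules provide the fastq-dump and fastqc commands used later in this exercise.

```shell
# missing_tools is a hypothetical helper: it prints each named command
# that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# After loading the modules, an empty result means both programs are available.
missing_tools fastq-dump fastqc
```

The module list command also shows which modules are currently loaded.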

Question 3:

Can you download the first 10,000 reads for SRR1553423? This is paired-end sequencing data.

Solution

fastq-dump --split-files -X 10000 SRR1553423

Question 4:

How many files were downloaded?

Solution

Two fastq files were downloaded.

ls
SRR1553423_1.fastq  SRR1553423_2.fastq
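To check that each file holds the number of reads we asked for, we can count the records. The sketch below assumes the standard four-line FASTQ layout (header, sequence, separator, quality) that fastq-dump writes by default; count_reads is a hypothetical helper.

```shell
# count_reads is a hypothetical helper: a FASTQ record spans four lines,
# so the read count is the line count divided by four.
count_reads() {
    lines=$(wc -l < "$1")
    echo $(( lines / 4 ))
}

# Report the read count for each mate file that exists in this directory.
for f in SRR1553423_1.fastq SRR1553423_2.fastq; do
    if [ -f "$f" ]; then
        printf '%s\t%s\n' "$f" "$(count_reads "$f")"
    fi
done
```

For paired-end data, the two files should report the same count, since each read in the first file has a mate in the second.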

Question 5:

How do we assess sequencing data quality?

Solution

fastqc SRR1553423_1.fastq  SRR1553423_2.fastq
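FastQC writes an HTML report and a zip archive for each input file. The sketch below lists the report names to look for afterwards; fastqc_reports is a hypothetical helper, and it only handles inputs ending in .fastq.

```shell
# fastqc_reports is a hypothetical helper: for each FASTQ file name it prints
# the HTML and zip report names FastQC is expected to write alongside it.
fastqc_reports() {
    for fq in "$@"; do
        base=${fq%.fastq}
        printf '%s_fastqc.html\n%s_fastqc.zip\n' "$base" "$base"
    done
}

fastqc_reports SRR1553423_1.fastq SRR1553423_2.fastq
```

Comparing this list against the output of ls is a simple way to confirm that FastQC finished for both files.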

Question 6:

We are done with our work in the interactive session. How do we terminate it?

Solution

exit