Lesson 5: Help session
Lesson recap
In this lesson, we learned how to request an interactive session to perform compute-intensive tasks on Biowulf. We also learned about bioinformatics applications that are installed on Biowulf and explored tools used in high-throughput sequencing analysis.
Practice questions
Be sure to stay in your data directory for this exercise. In your data directory, create a folder called srr1553423_fastqc.
mkdir srr1553423_fastqc
We are going to download sequencing data for NCBI SRA run SRR1553423 using the SRA Toolkit (sratoolkit) and assess its quality using fastqc.
Question 1:
Can you request an interactive session with 5 GB of lscratch space?
Solution
sinteractive --gres=lscratch:5
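As a rough sketch of what this request provides: the number after lscratch is the amount of local scratch space in GB, and on Biowulf that space is normally reachable at /lscratch/$SLURM_JOB_ID once the session starts. The helper name scratch_dir below is hypothetical and only illustrates how that path is built.

```shell
# Hypothetical helper: print the local scratch path for the current Slurm job.
# Assumes the Biowulf convention /lscratch/$SLURM_JOB_ID.
scratch_dir() {
    if [ -n "${SLURM_JOB_ID:-}" ]; then
        echo "/lscratch/${SLURM_JOB_ID}"
    else
        echo "not inside a Slurm job" >&2
        return 1
    fi
}

# Inside an interactive session this prints the scratch path;
# elsewhere it prints a notice and carries on.
scratch_dir || true
```

Because the helper only reads an environment variable, it is safe to run on a login node, where it simply reports that no job is active.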
After the interactive session has been granted, change into the srr1553423_fastqc directory.
cd srr1553423_fastqc
Question 2:
How do we load the sratoolkit and fastqc?
Solution
module load sratoolkit
module load fastqc
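One way to confirm that both modules loaded is to check that their commands are now on the PATH. The sketch below uses a hypothetical helper, check_tool, built on the standard command -v; it is not part of the module system itself.

```shell
# Hypothetical helper: report whether a command is available on the PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: not found"
    fi
}

# After the two module load commands, both lines should read "found".
check_tool fastq-dump
check_tool fastqc
```

Running module list is another way to see which modules are currently loaded.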
Question 3:
Can you download the first 10,000 reads for SRR1553423? This is paired-end sequencing data.
Solution
fastq-dump --split-files -X 10000 SRR1553423
Question 4:
How many files were downloaded?
Solution
Two fastq files were downloaded.
ls
SRR1553423_1.fastq SRR1553423_2.fastq
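A quick sanity check on the download can be sketched as follows, assuming each FASTQ record spans exactly four lines: dividing the line count by four gives the number of reads, and both files of a pair should agree. The helper count_reads and the temporary demo file are hypothetical and exist only so the sketch runs anywhere.

```shell
# Hypothetical helper: count reads in a FASTQ file, assuming 4 lines per record.
count_reads() {
    lines=$(wc -l < "$1")
    echo $(( lines / 4 ))
}

# Build a tiny two-read demo file so the sketch does not need the real data.
demo=$(mktemp)
printf '@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n' > "$demo"
count_reads "$demo"
rm -f "$demo"
```

On the downloaded data, count_reads SRR1553423_1.fastq and count_reads SRR1553423_2.fastq would each be expected to report 10000, given the -X 10000 limit used above.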
Question 5:
How do we assess sequencing data quality?
Solution
fastqc SRR1553423_1.fastq SRR1553423_2.fastq
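FastQC writes an HTML report and a zip archive for each input file, named after the input with a _fastqc suffix. The sketch below derives the HTML report names with a hypothetical helper, report_name, so the outputs can be checked; it assumes plain .fastq inputs and reports written to the current directory.

```shell
# Hypothetical helper: derive the FastQC HTML report name for a .fastq file.
report_name() {
    base=$(basename "$1" .fastq)
    echo "${base}_fastqc.html"
}

# Report whether each expected HTML file exists in the current directory.
for fq in SRR1553423_1.fastq SRR1553423_2.fastq; do
    report=$(report_name "$fq")
    if [ -f "$report" ]; then
        echo "$report: present"
    else
        echo "$report: missing"
    fi
done
```

The HTML reports can be copied to a local machine and opened in a web browser to review the quality metrics.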
Question 6:
We are done with our work in the interactive session. How do we terminate it?
Solution
exit