Lesson 6: Help session
Lesson recap
In this lesson, we learned to submit a short script that downloads sequencing data from NCBI SRA and then assesses its quality. We then transferred the QC results to our local computer for viewing.
Practice questions
The exercises below will help you build proficiency with the Unix nano editor. You will modify the script we used in the lesson so that it downloads and assesses the quality of sequencing data for NCBI SRA run SRR1553423. You will also become more comfortable with transferring data from Biowulf to your local computer for viewing.
Question 1:
What is the first step to editing a new or an existing file using nano?
Solution
Run nano followed by the file name, as in the command below. For a new file this opens a blank editor; for an existing file it opens the file's contents for editing.
nano filename
Question 2:
After you are done editing, what are the steps to exit nano and return to the prompt?
Solution
Press control x
- If you made edits, nano will ask whether you would like to save them.
- If you choose yes, nano will ask you to confirm the file name; press Enter to save and return to the prompt.
- If you choose no, you will return to the prompt without saving.
- If you did not make edits then you will just be returned to the prompt.
Question 3:
Create a new directory in your data folder called srr1553423_fastqc and then change into it.
Solution
Change into your data directory if you are not already in it (replace username with your Biowulf username)
cd /data/username
mkdir srr1553423_fastqc
cd srr1553423_fastqc
Question 4:
Stay in the srr1553423_fastqc folder. Copy SRR1553606_fastqc.sh from /data/classes/BTEP/unix_on_biowulf_2023_documents/SRR1553606_fastqc to the srr1553423_fastqc folder. Change the file name to SRR1553423_fastqc.sh.
Solution
cp /data/classes/BTEP/unix_on_biowulf_2023_documents/SRR1553606_fastqc/SRR1553606_fastqc.sh .
mv SRR1553606_fastqc.sh SRR1553423_fastqc.sh
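The copy and the rename can also be combined, because cp accepts a new file name as its destination. The sketch below shows this in a throwaway folder so it can be run anywhere; the src and work variables and the empty stand-in file are assumptions made for the demonstration, not part of the lesson.

```shell
# Stand-in for the class folder (hypothetical); on Biowulf, src would be
# /data/classes/BTEP/unix_on_biowulf_2023_documents/SRR1553606_fastqc
src=$(mktemp -d)
touch "$src/SRR1553606_fastqc.sh"

# Stand-in for /data/username/srr1553423_fastqc (hypothetical)
work=$(mktemp -d)
cd "$work"

# Copy and rename in one step: the second argument is the new file name
cp "$src/SRR1553606_fastqc.sh" SRR1553423_fastqc.sh

# Only the renamed copy should appear here
ls
```

Either approach leaves the original script in the class folder untouched.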
Question 5:
We need to make a few edits to SRR1553423_fastqc.sh before we can submit it as a batch job. Open the script with nano and make the necessary edits.
Solution
nano SRR1553423_fastqc.sh
Before submitting this script we need to
- change the job-name to SRR1553423_fastqc
- change the user email to your NIH email
- for those connecting to Biowulf using their own accounts, remove the line "#SBATCH --partition=student"
- change the name of the output log file to SRR1553423_fastqc_log
- in the first comment line after loading modules, replace SRR1553606 with SRR1553423
- replace SRR1553606 with SRR1553423 in the fastq-dump command
- replace SRR1553606 with SRR1553423 in the fastqc command
- change the output directory for the fastqc results to /data/$USER/srr1553423_fastqc
Save these changes and exit
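Since several of the edits above swap one accession for another, it may help to confirm that none were missed before submitting. The sketch below uses grep for this; check_edits is a hypothetical helper name introduced here for illustration and is not part of the lesson script.

```shell
# check_edits is a hypothetical helper, not part of the lesson.
# It reports any line that still mentions the old accession.
check_edits() {
    # Stop early if the file name was mistyped
    if [ ! -f "$1" ]; then
        echo "no such file: $1"
        return 2
    fi
    # grep -n prints each matching line with its line number
    if grep -n "SRR1553606" "$1"; then
        echo "the lines above still need editing"
        return 1
    fi
    echo "no SRR1553606 left in $1"
}
```

After pasting the function at the prompt, it could be run as check_edits SRR1553423_fastqc.sh from inside the srr1553423_fastqc folder. Any line it prints can then be fixed by reopening the script in nano.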
Question 6:
Now, submit SRR1553423_fastqc.sh as a batch job.
Solution
sbatch SRR1553423_fastqc.sh
Question 7:
List the contents of the srr1553423_fastqc folder. What are the names of the html fastqc reports?
Solution
Running ls in the srr1553423_fastqc folder shows, among other files, the two html fastqc reports generated for the sequencing data of NCBI SRA run SRR1553423.
SRR1553423_1_fastqc.html
SRR1553423_2_fastqc.html
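One way to pick the reports out of a longer listing is to combine ls with a wildcard. The sketch below runs in a throwaway folder filled with empty stand-in files, so the demo variable and the exact set of files are assumptions made for illustration; on Biowulf you would run the two ls commands inside /data/$USER/srr1553423_fastqc instead.

```shell
# Throwaway folder standing in for /data/$USER/srr1553423_fastqc (hypothetical)
demo=$(mktemp -d)
cd "$demo"

# Empty stand-in files; the real folder holds the actual job outputs
touch SRR1553423_1_fastqc.html SRR1553423_2_fastqc.html \
      SRR1553423_1_fastqc.zip SRR1553423_fastqc_log

# List everything in the folder
ls

# List only the files whose names end in .html
ls *.html
```

The shell expands *.html to every file name ending in .html before ls runs, so only the two reports are listed by the second command.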
Question 8:
Use scp to copy SRR1553423_1_fastqc.html and SRR1553423_2_fastqc.html to your local computer for viewing. Use either the Mac Terminal or the Windows Command Prompt for this, and save the files to your downloads folder.
Solution
Mac users: open the Terminal and change into your Downloads directory
cd ~/Downloads
Replace username below with the username you used to connect to Biowulf
scp username@helix.nih.gov:/data/username/srr1553423_fastqc/SRR1553423_1_fastqc.html .
scp username@helix.nih.gov:/data/username/srr1553423_fastqc/SRR1553423_2_fastqc.html .
Enter your password when prompted during the secure copy process.
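In the Mac Terminal, the two reports could likely be fetched with a single scp command by using a quoted wildcard, which should be expanded on the Helix side rather than on your computer. The sketch below only builds and prints that command so it can be checked before use; the user, remote_dir and cmd variables are hypothetical names, and username remains a placeholder.

```shell
# "username" is a placeholder; replace it with your Biowulf username
user="username"
remote_dir="/data/$user/srr1553423_fastqc"

# Build the command as text first so it can be inspected before running it.
# The inner quotes keep the local shell from expanding the * itself.
cmd="scp \"$user@helix.nih.gov:$remote_dir/*_fastqc.html\" ."

# Print the command; copy and paste it to actually run the transfer
echo "$cmd"
```

Running the printed command from ~/Downloads would prompt for the password once instead of twice. If the wildcard does not behave as expected on your system, the two separate scp commands above remain the safe choice.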
Windows users:
Refer to Figure 17 in the Lesson 6 documentation to change into your Windows downloads folder. Then, use the scp command to copy the html fastqc reports for SRR1553423 to your local computer.
Replace username below with the username you used to connect to Biowulf
scp username@helix.nih.gov:/data/username/srr1553423_fastqc/SRR1553423_1_fastqc.html .
scp username@helix.nih.gov:/data/username/srr1553423_fastqc/SRR1553423_2_fastqc.html .
Enter your password when prompted during the secure copy process.