Lesson 6: Help session

Lesson recap

In this lesson, we learned to submit a short script that downloads sequencing data from NCBI SRA and then assesses its quality. We then transferred the QC results to our local computer for viewing.

Practice questions

The exercises below will help you build proficiency with the Unix nano editor. You will modify the script we used in the lesson so that it downloads and assesses the quality of sequencing data for NCBI SRA run SRR1553423. You will also become more comfortable with transferring data from Biowulf to your local computer for viewing.

Question 1:

What is the first step in editing a new or an existing file using nano?

Solution

Run nano followed by the file name. If the file does not exist yet, the command below opens a blank editor; if it already exists, nano opens it with its current contents.

nano filename

Question 2:

After you are done editing, what are the steps to exit nano and return to the prompt?

Solution

Press Control-X

  • If you made edits, nano will ask if you would like to save.
    • If you choose yes to save, nano will ask you to confirm the file name. Press Enter to confirm and return to the prompt.
    • If you choose no, your edits are discarded and you will return to the prompt.
  • If you did not make edits, then you will just be returned to the prompt.

Question 3:

Create a new directory in your data folder called srr1553423_fastqc and then change into it.

Solution

Change into your data directory if you are not already in it (replace username with your Biowulf username)

cd /data/username
mkdir srr1553423_fastqc
cd srr1553423_fastqc
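
The two steps above can also be combined so that the sequence is safe to re-run. The sketch below is only an illustration: it uses a temporary directory as a hypothetical stand-in for /data/username so that it can be tried anywhere, and the variable name data_dir is an assumption, not something from the lesson.

```shell
# Hypothetical stand-in for /data/username so the sketch can run anywhere;
# on Biowulf you would use data_dir=/data/$USER instead.
data_dir="$(mktemp -d)"

# -p makes mkdir succeed even if the directory already exists;
# && runs cd only when mkdir succeeded.
mkdir -p "$data_dir/srr1553423_fastqc" && cd "$data_dir/srr1553423_fastqc"

# Confirm where we ended up.
pwd
```

Because of the -p option, running these lines a second time does not produce a "File exists" error.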

Question 4:

Stay in the srr1553423_fastqc folder. Copy SRR1553606_fastqc.sh from /data/classes/BTEP/unix_on_biowulf_2023_documents/SRR1553606_fastqc to the srr1553423_fastqc folder. Change the file name to SRR1553423_fastqc.sh.

Solution

cp /data/classes/BTEP/unix_on_biowulf_2023_documents/SRR1553606_fastqc/SRR1553606_fastqc.sh .
mv SRR1553606_fastqc.sh SRR1553423_fastqc.sh

Question 5:

We need to make a few edits to SRR1553423_fastqc.sh before we can submit it as a batch job. Open the script with nano and make the necessary edits.

Solution

nano SRR1553423_fastqc.sh

Before submitting this script, we need to

  • change the job-name to SRR1553423_fastqc
  • change the user email to your NIH email
  • for those connecting to Biowulf using their own accounts, remove the following line
    • "#SBATCH --partition=student"
  • change the name of the output log file to SRR1553423_fastqc_log
  • in the first comment line after loading modules, replace SRR1553606 with SRR1553423
  • replace SRR1553606 with SRR1553423 in the fastq-dump command
  • change the output directory for the fastqc results to /data/$USER/srr1553423_fastqc
  • replace SRR1553606 with SRR1553423 in the fastqc command

Save these changes and exit nano.
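
After editing, it can be useful to confirm that no mention of the old accession remains. The sketch below shows one possible check with grep, and also how sed could make the accession replacements in one pass as an alternative to editing by hand. The script contents created here are a hypothetical stand-in (the real SRR1553423_fastqc.sh differs), and everything happens in a scratch directory so that no real file is touched.

```shell
# Work in a scratch directory so nothing real is modified.
workdir="$(mktemp -d)"
cd "$workdir"

# Hypothetical stand-in for the copied script; the real file has different contents.
cat > SRR1553423_fastqc.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=SRR1553606_fastqc
#SBATCH --output=SRR1553606_fastqc_log
# download SRR1553606
fastq-dump SRR1553606
EOF

# Count the lines that mention the old accession before editing.
before=$(grep -c 'SRR1553606' SRR1553423_fastqc.sh)

# Replace every occurrence of the old accession; -i.bak keeps a backup copy.
sed -i.bak 's/SRR1553606/SRR1553423/g' SRR1553423_fastqc.sh

# grep -c prints 0 when nothing matches but exits non-zero, hence the || true.
after=$(grep -c 'SRR1553606' SRR1553423_fastqc.sh || true)
echo "lines with old accession: before=$before after=$after"
```

Note that a replacement like this only swaps the accession. The email address, the partition line, and the fastqc output directory (which uses a lowercase folder name) still need to be checked by hand in nano.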

Question 6:

Now, submit SRR1553423_fastqc.sh as a batch job.

Solution

sbatch SRR1553423_fastqc.sh
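
Once the job is submitted, you may want to see whether it is still queued or running before looking for results. The sketch below assumes the standard Slurm squeue command is available on the system you are logged into; the guard and the slurm_status variable are illustrative additions so that the lines do nothing harmful on a computer without Slurm.

```shell
# Only try to submit when the Slurm commands exist (for example, on Biowulf).
if command -v sbatch >/dev/null 2>&1; then
    # Submit the script, then list your own queued and running jobs.
    sbatch SRR1553423_fastqc.sh
    squeue -u "$USER"
    slurm_status="submitted"
else
    slurm_status="unavailable"
    echo "sbatch not found; run these commands on Biowulf."
fi
```

When the job no longer appears in the squeue listing, it has finished (or failed), and the log file named in the script is the place to look for messages.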

Question 7:

List the contents of the srr1553423_fastqc folder. What are the names of the html fastqc reports?

Solution

Once the job has finished, run ls in the srr1553423_fastqc folder. Below are the two html fastqc reports generated for the sequencing data belonging to NCBI SRA SRR1553423.

SRR1553423_1_fastqc.html
SRR1553423_2_fastqc.html

Question 8:

Use scp to copy SRR1553423_1_fastqc.html and SRR1553423_2_fastqc.html to your local computer for viewing. Use either the Mac Terminal or Windows Command Prompt for this and save the files to your Downloads folder.

Solution

Mac users: open the Terminal and change into your Downloads directory

cd ~/Downloads 

Replace username below with the username you used to connect to Biowulf

scp username@helix.nih.gov:/data/username/srr1553423_fastqc/SRR1553423_1_fastqc.html .
scp username@helix.nih.gov:/data/username/srr1553423_fastqc/SRR1553423_2_fastqc.html .

Enter your password when prompted during the secure copy process.
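
Both reports can also be fetched with a single scp command by using a wildcard in the remote path, so that the password is requested only once. The sketch below is a dry run: it only builds and prints the command rather than contacting Biowulf. The variable names user, remote, and scp_cmd are illustrative, and username is a placeholder for your own Biowulf username.

```shell
# Placeholder; replace with the username you use to connect to Biowulf.
user="username"

# Remote folder that holds the fastqc reports.
remote="${user}@helix.nih.gov:/data/${user}/srr1553423_fastqc"

# Quoting the remote path keeps the local shell from expanding the *,
# so the wildcard is matched against files on the remote side.
scp_cmd="scp \"${remote}/SRR1553423_*_fastqc.html\" ."

# Print the command so it can be reviewed, then copied and run by hand.
echo "$scp_cmd"
```

Run the printed command from your Downloads directory, as in the two-command version above.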

Windows users:

Refer to Figure 17 in the Lesson 6 documentation to change into your Windows Downloads folder. Then, use the scp command to copy the html fastqc reports for SRR1553423 to your local computer.

Replace username below with the username you used to connect to Biowulf

scp username@helix.nih.gov:/data/username/srr1553423_fastqc/SRR1553423_1_fastqc.html .
scp username@helix.nih.gov:/data/username/srr1553423_fastqc/SRR1553423_2_fastqc.html .

Enter your password when prompted during the secure copy process.