Lesson 4: Practice questions

Before starting these practice questions, check the present working directory and, if needed, change into the /data/username folder (where username is your student account ID).
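
The check described above can be sketched as follows. This is a minimal sketch, assuming the account ID is available in the shell's `$USER` variable so that the class folder is `/data/$USER`. The variable name `target` is hypothetical, and the fallback to `$HOME` is only there so the snippet still runs on a machine that has no `/data` area.

```shell
# Assumption: the student account ID is stored in $USER
# (id -un is used as a backup if $USER is not set).
target="/data/${USER:-$(id -un)}"

# Fallback (assumption): use the home directory when the /data folder
# does not exist, for example when trying this outside Biowulf.
[ -d "$target" ] || target="${HOME:-/tmp}"

# Print the present working directory.
pwd

# Change directory only if we are not already in the target folder.
if [ "$(pwd)" != "$target" ]; then
    cd "$target"
fi

# Confirm the new location.
pwd
```

On Biowulf the two commands that matter are simply `pwd` followed by `cd /data/username`.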

Question 1

Copy the lesson4_practice folder from /data/classes/BTEP/unix_on_biowulf_2024_practice_sessions to the present working directory, which should be /data/username.

Solution

cp -r /data/classes/BTEP/unix_on_biowulf_2024_practice_sessions/lesson4_practice .

Question 2

Change into the lesson4_practice folder.

Solution

cd lesson4_practice

Question 3

How many folders are in this directory, and what are their names?

Solution

ls -l

There is one folder called sample_sequence_data.
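
In the `ls -l` output, folders are the entries whose permission string starts with `d`. As a cross-check, the sketch below lists and counts only the directories using `find`. It builds a throwaway directory with `mktemp` so that it can be run anywhere; the names `demo`, `example_folder`, and `notes.txt` are hypothetical and are not part of the lesson data.

```shell
# Build a throwaway directory with one folder and one file (hypothetical names).
demo="$(mktemp -d)"
mkdir "$demo/example_folder"
touch "$demo/notes.txt"
cd "$demo"

# List only the directories directly inside the current folder.
find . -mindepth 1 -maxdepth 1 -type d

# Count them (tr removes padding that some versions of wc add).
n_dirs="$(find . -mindepth 1 -maxdepth 1 -type d | wc -l | tr -d ' ')"
echo "Number of folders: $n_dirs"
```

Inside lesson4_practice, only the two `find` commands are needed.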

Question 4

Change into the folder sample_sequence_data.

Solution

cd sample_sequence_data

Question 5

Request an interactive session with defaults.

Solution

sinteractive

Question 6

Load the seqkit package.

Solution

module load seqkit

Question 7

How many sequences are in the file HBR_1_R1.fq?

Solution

seqkit stats HBR_1_R1.fq
file         format  type  num_seqs     sum_len  min_len  avg_len  max_len
HBR_1_R1.fq  FASTQ   DNA    118,571  11,857,100      100      100      100
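
The `num_seqs` value can be cross-checked with standard Unix tools, because a FASTQ record normally spans four lines (header, sequence, `+` separator, and quality string). The sketch below assumes that four-line layout with no wrapped sequences. It uses a tiny made-up file, `toy.fq`, so it runs without the lesson data; for the real file, replace the path with `HBR_1_R1.fq`.

```shell
# Create a tiny two-record FASTQ file (made-up data for illustration).
workdir="$(mktemp -d)"
cat > "$workdir/toy.fq" <<'EOF'
@read1
ACGTACGT
+
IIIIIIII
@read2
TTGGCCAA
+
IIIIIIII
EOF

# Count the lines, then divide by four to get the number of records.
n_lines="$(wc -l < "$workdir/toy.fq" | tr -d ' ')"
n_seqs=$(( n_lines / 4 ))
echo "Number of sequences: $n_seqs"
```

If the line count of a FASTQ file is not a multiple of four, the file is likely truncated or uses wrapped sequences, and `seqkit stats` is the safer choice.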

Question 8

Is there an application called salmon installed on Biowulf?

Solution

module avail salmon
salmon/1.7.0    salmon/1.10.0    salmon/1.10.1 (D)    salmonte/0.4

  Where:
   D:  Default Module

Module defaults are chosen based on Find First Rules due to Name/Version/Version modules found in the module tree.
See https://lmod.readthedocs.io/en/latest/060_locating.html for details.

If the avail list is too long consider trying:

"module --default avail" or "ml -d av" to just list the default modules.
"module overview" or "ml ov" to display the number of modules for each name.

Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".

Question 9

What does the package salmon do?

Solution

module whatis salmon
salmon/1.10.1       : Estimating transcript-level expression from RNA-seq with quasi-mapping
salmon/1.10.1       : URL =>  http://combine-lab.github.io/salmon

Question 10

Sign on to Helix and download the first 1000 sequences for SRA accession SRR27044741. This sample was sequenced in paired-end mode.

Solution

ssh username@helix.nih.gov
module load sratoolkit
fastq-dump --split-files -X 1000 SRR27044741