Lesson 4: Practice questions
For these practice questions, check the present working directory and, if needed, change into the /data/username folder (where username is your student account ID).
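The check described above can be sketched as follows. This is a minimal example that assumes the account ID is available in the USER environment variable; the "username" fallback is only a placeholder, and the directory test keeps the snippet harmless on a machine where the /data folder does not exist.

```shell
# Show the present working directory.
pwd

# Build the expected data directory from the account ID.
# Assumption: $USER holds the student account ID; "username" is a placeholder fallback.
data_dir="/data/${USER:-username}"

# Change directory only if we are somewhere else and the folder exists.
if [ "$PWD" != "$data_dir" ] && [ -d "$data_dir" ]; then
    cd "$data_dir"
fi

# Confirm where we ended up.
pwd
```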
Question 1
Copy the lesson4_practice folder from /data/classes/BTEP/unix_on_biowulf_2024_practice_sessions to the present working directory, which should be /data/username.
Solution
cp -r /data/classes/BTEP/unix_on_biowulf_2024_practice_sessions/lesson4_practice .
Question 2
Change into the lesson4_practice folder.
Solution
cd lesson4_practice
Question 3
How many folders are in this directory, and what are they called?
Solution
ls -l
There is one folder called sample_sequence_data.
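Reading the long listing by eye works for a small directory. As a hedged sketch, the count can also be produced directly. The example below builds its own scratch directory (the file name notes.txt is made up for illustration) so that it can be run anywhere without touching the practice data.

```shell
# Build a scratch directory so the example can run anywhere.
scratch=$(mktemp -d)
mkdir "$scratch/sample_sequence_data"   # one folder
touch "$scratch/notes.txt"              # one regular file (hypothetical name)

# In "ls -l" output, lines for directories start with the letter d.
n_dirs=$(ls -l "$scratch" | grep -c '^d')
echo "folders: $n_dirs"

# find lists the folder names without parsing ls output.
find "$scratch" -mindepth 1 -maxdepth 1 -type d

# Clean up the scratch directory.
rm -r "$scratch"
```

In the lesson4_practice folder itself, the same idea would be ls -l | grep -c '^d' run from inside that folder.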
Question 4
Change into the folder sample_sequence_data.
Solution
cd sample_sequence_data
Question 5
Request an interactive session with defaults.
Solution
sinteractive
Question 6
Load the package seqkit.
Solution
module load seqkit
Question 7
How many sequences are in the file HBR_1_R1.fq?
Solution
seqkit stats HBR_1_R1.fq
file         format  type  num_seqs  sum_len     min_len  avg_len  max_len
HBR_1_R1.fq  FASTQ   DNA   118,571   11,857,100  100      100      100
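The num_seqs value can be sanity-checked without seqkit. The sketch below assumes a standard FASTQ file in which every record occupies exactly four lines (no line-wrapped sequences), so the number of sequences is the line count divided by four. It uses a tiny made-up file rather than HBR_1_R1.fq so that it runs anywhere, and it also checks that the sum_len and num_seqs values reported above agree with avg_len.

```shell
# Write a tiny two-record FASTQ file (made-up reads) to a scratch location.
fq=$(mktemp)
cat > "$fq" <<'EOF'
@read1
ACGTACGT
+
IIIIIIII
@read2
TTGGCCAA
+
IIIIIIII
EOF

# Each record is four lines, so sequences = total lines / 4.
n_lines=$(wc -l < "$fq")
n_seqs=$(( n_lines / 4 ))
echo "sequences: $n_seqs"

# Cross-check of the seqkit table: sum_len / num_seqs should equal avg_len.
avg_len=$(( 11857100 / 118571 ))
echo "average length: $avg_len"

# Clean up the scratch file.
rm "$fq"
```

For the practice file, the equivalent one-liner would be echo $(( $(wc -l < HBR_1_R1.fq) / 4 )).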
Question 8
Is there an application called salmon installed on Biowulf?
Solution
module avail salmon
salmon/1.7.0 salmon/1.10.0 salmon/1.10.1 (D) salmonte/0.4
Where:
D: Default Module
Module defaults are chosen based on Find First Rules due to Name/Version/Version modules found in the module tree.
See https://lmod.readthedocs.io/en/latest/060_locating.html for details.
If the avail list is too long consider trying:
"module --default avail" or "ml -d av" to just list the default modules.
"module overview" or "ml ov" to display the number of modules for each name.
Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".
Question 9
What does the package salmon do?
Solution
module whatis salmon
salmon/1.10.1 : Estimating transcript-level expression from RNA-seq with quasi-mapping
salmon/1.10.1 : URL => http://combine-lab.github.io/salmon
Question 10
Sign on to Helix and download the first 1000 sequences for SRA run SRR27044741. This run was sequenced in paired-end mode.
Solution
ssh username@helix.nih.gov
module load sratoolkit
fastq-dump --split-files -X 1000 SRR27044741
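With --split-files, fastq-dump writes the two mates of each pair to separate files named SRR27044741_1.fastq and SRR27044741_2.fastq. As a hedged sketch, the download can be checked by counting the reads in each file; both should report 1000. The helper name count_reads is made up for this example, and it assumes four lines per FASTQ record and that the files are in the current directory.

```shell
# Count records in a FASTQ file, assuming four lines per record.
# count_reads is a hypothetical helper defined only for this example.
count_reads() {
    echo $(( $(wc -l < "$1") / 4 ))
}

# Assumption: fastq-dump --split-files wrote these files to the current directory.
for fq in SRR27044741_1.fastq SRR27044741_2.fastq; do
    if [ -f "$fq" ]; then
        echo "$fq: $(count_reads "$fq") reads"
    else
        echo "$fq: not found in $(pwd)"
    fi
done
```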