Lesson 4: Help session
Lesson recap
In this lesson, we learned how to move files from one directory to another and how to rename files and folders. We also learned how to remove directories and how to use the rm command so that it asks for confirmation before removing anything.
Practice questions
For these exercises, copy the lesson4_practice_20230202 folder in the BTEP classes directory to your data directory by following the steps below.
First, sign in to Biowulf
ssh username@biowulf.nih.gov
After connecting to Biowulf, change into your data directory
cd /data/username
Then, copy over the lesson4_practice_20230202 folder to your data directory (which should be your present working directory - denoted by ".")
cp -r /data/classes/BTEP/unix_on_biowulf_2023_practice_sessions/lesson4_practice_20230202/ .
Question 1:
Change into the lesson4_practice_20230202 folder after you have copied it to your data directory. What is in this folder? How many subfolders does it have?
Solution
cd lesson4_practice_20230202/
ls -l
The lesson4_practice_20230202 folder has one subfolder (example_rna_seq_counts)
total 1
drwxr-s---. 2 student1 student1 4096 Jan 17 10:41 example_rna_seq_counts
Question 2:
Change into the example_rna_seq_counts folder. How many files are in this folder?
Solution
cd example_rna_seq_counts/
ls -l
There are three files in the example_rna_seq_counts folder. These are gene expression counts tables for NCBI SRA studies SRP025982, SRP048685, and SRP011233 obtained from recount2, which is a repository of RNA sequencing count tables.
total 360960
-rwxr-x---. 1 student1 student1 366932592 Jan 17 10:41 counts_gene_1.tsv
-rwxr-x---. 1 student1 student1 883910 Jan 17 10:41 counts_gene_2.tsv
-rwxr-x---. 1 student1 student1 1590821 Jan 17 10:41 counts_gene_3.tsv
Question 3:
Unfortunately, file names for gene expression counts data from recount2 are very generic (like what we see in the example_rna_seq_counts directory). Thus, we would like to rename these files to make them more informative. So knowing that
- counts_gene_1.tsv is derived from study SRP025982
- counts_gene_2.tsv is derived from study SRP048685
- counts_gene_3.tsv is derived from study SRP011233
Can you rename counts_gene_1.tsv to SRP025982.tsv, counts_gene_2.tsv to SRP048685.tsv, and counts_gene_3.tsv to SRP011233.tsv so that the file names tell us which study each file came from?
Solution
mv counts_gene_1.tsv SRP025982.tsv
mv counts_gene_2.tsv SRP048685.tsv
mv counts_gene_3.tsv SRP011233.tsv
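To rehearse the renaming without touching the practice data, the same mv pattern can be tried on placeholder files. This is only a minimal sketch: the scratch directory created with mktemp -d and the empty files created with touch are assumptions for illustration and are not part of the lesson data.

```shell
# Scratch directory with empty stand-in files (hypothetical, not the real counts tables)
scratch=$(mktemp -d)
cd "$scratch"
touch counts_gene_1.tsv counts_gene_2.tsv counts_gene_3.tsv

# mv with a new name in the same directory renames the file
mv counts_gene_1.tsv SRP025982.tsv
mv counts_gene_2.tsv SRP048685.tsv
mv counts_gene_3.tsv SRP011233.tsv

# List the directory to confirm that only the new names remain
ls
```

Running ls afterwards is a quick way to check that each old name is gone and each new name is present.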
Question 4:
Go back up one directory to the lesson4_practice_20230202 folder, make a new folder called rna_seq_recounts, and then move the expression counts tables into this folder.
Solution
cd ..
mkdir rna_seq_recounts
mv example_rna_seq_counts/SRP025982.tsv rna_seq_recounts
mv example_rna_seq_counts/SRP048685.tsv rna_seq_recounts
mv example_rna_seq_counts/SRP011233.tsv rna_seq_recounts
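The three mv commands above can also be collapsed into one, because mv accepts several source files when the last argument is a directory. The sketch below tries this with a wildcard on placeholder files. The scratch directory and the empty .tsv files are assumptions made for illustration, and the folder names simply mirror the ones used in this exercise.

```shell
# Scratch copy of the exercise layout with empty stand-in files (hypothetical data)
scratch=$(mktemp -d)
cd "$scratch"
mkdir example_rna_seq_counts rna_seq_recounts
touch example_rna_seq_counts/SRP025982.tsv \
      example_rna_seq_counts/SRP048685.tsv \
      example_rna_seq_counts/SRP011233.tsv

# The shell expands *.tsv to every matching file; mv moves them all into the target folder
mv example_rna_seq_counts/*.tsv rna_seq_recounts/

# Confirm the files arrived and the source folder is now empty
ls rna_seq_recounts
ls example_rna_seq_counts
```

The wildcard form is convenient for many files, while the one-file-per-command form makes it easier to see exactly what is being moved.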
Question 5:
Change into the rna_seq_recounts folder and delete SRP025982.tsv, but make sure that we are asked whether we really want to delete.
Solution
cd rna_seq_recounts
rm -i SRP025982.tsv
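With -i, rm prints a prompt and waits for an answer; the file is deleted only if the answer starts with y. The sketch below shows both outcomes on a placeholder file. The scratch directory and the empty file are assumptions for illustration, and the answers are piped in with echo only so that the example runs unattended. In an interactive session you would type the answer yourself.

```shell
# Scratch directory with one empty stand-in file (hypothetical data)
scratch=$(mktemp -d)
touch "$scratch/SRP025982.tsv"

# Answering "n" at the prompt keeps the file
echo n | rm -i "$scratch/SRP025982.tsv"
ls "$scratch"

# Answering "y" at the prompt deletes the file
echo y | rm -i "$scratch/SRP025982.tsv"
ls "$scratch"
```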
Question 6:
The example_rna_seq_counts folder in the lesson4_practice_20230202 directory should now be empty. How do you remove an empty directory in Unix?
Solution
Go back up one directory to lesson4_practice_20230202
cd ..
rmdir example_rna_seq_counts
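rmdir only works on empty directories, which makes it a safe default: if anything is still inside, it refuses and leaves the directory untouched. The sketch below contrasts the two cases. The scratch directory, the folder names empty_dir and full_dir, and the placeholder file are all hypothetical and exist only for illustration.

```shell
# Scratch directory with one empty and one non-empty folder (hypothetical names)
scratch=$(mktemp -d)
mkdir "$scratch/empty_dir" "$scratch/full_dir"
touch "$scratch/full_dir/placeholder.tsv"

# rmdir succeeds on an empty directory
rmdir "$scratch/empty_dir"

# rmdir refuses a directory that still has contents and leaves it in place
rmdir "$scratch/full_dir" || echo "rmdir refused: full_dir is not empty"

# A non-empty directory needs rm -r, which removes the folder and everything in it
rm -r "$scratch/full_dir"
```

Because rm -r deletes contents without asking, it is worth checking the folder with ls first, or adding -i to be prompted.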