Lesson 4: Help session

Lesson recap

In this lesson, we learned how to move files from one directory to another and how to rename files and folders. We also learned how to remove directories and how to use the rm command so that it asks for confirmation before removing anything.

Practice questions

For these exercises, copy the lesson4_practice_20230202 folder in the BTEP classes directory to your data directory by following the steps below.

First, sign into Biowulf

ssh username@biowulf.nih.gov

After connecting to Biowulf, change into your data directory

cd /data/username

Then, copy the lesson4_practice_20230202 folder into your data directory, which should be your present working directory (denoted by ".")

cp -r /data/classes/BTEP/unix_on_biowulf_2023_practice_sessions/lesson4_practice_20230202/ .
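
If you would like to see what cp -r does before running it on Biowulf, the sketch below repeats the same step in a throwaway directory. The names sandbox, classes, practice_folder, and my_data are made up for illustration and are not part of the class materials. It assumes a Linux system such as Biowulf; on some other systems (macOS, for example) a trailing slash on the source folder makes cp copy the folder's contents rather than the folder itself.

```shell
# Sandbox sketch: every name below is hypothetical and lives in a temporary directory.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/classes/practice_folder/subfolder"
touch "$sandbox/classes/practice_folder/subfolder/example.tsv"
mkdir "$sandbox/my_data"

# Change into the stand-in for our data directory.
cd "$sandbox/my_data"

# -r copies the folder and everything inside it; "." is the present working directory.
cp -r "$sandbox/classes/practice_folder/" .

# List the copy recursively to confirm the subfolder and file came along.
ls -R practice_folder
```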

Question 1:

Change into the lesson4_practice_20230202 folder after you copied it to your data directory. What is in this folder? How many subfolders does it have?

Solution

cd lesson4_practice_20230202/
ls -l

The lesson4_practice_20230202 folder has one subfolder (example_rna_seq_counts)

total 1
drwxr-s---. 2 student1 student1 4096 Jan 17 10:41 example_rna_seq_counts

Question 2:

Change into the example_rna_seq_counts folder. How many files are in this folder?

Solution

cd example_rna_seq_counts/
ls -l

There are three files in the example_rna_seq_counts folder. These are gene expression counts tables for NCBI SRA studies SRP025982, SRP048685, and SRP011233 obtained from recount2, which is a repository of RNA sequencing count tables.

total 360960
-rwxr-x---. 1 student1 student1 366932592 Jan 17 10:41 counts_gene_1.tsv
-rwxr-x---. 1 student1 student1    883910 Jan 17 10:41 counts_gene_2.tsv
-rwxr-x---. 1 student1 student1   1590821 Jan 17 10:41 counts_gene_3.tsv

Question 3:

Unfortunately, file names for gene expression counts data from recount2 are very generic (like those we see in the example_rna_seq_counts directory). We would therefore like to rename these files to make them more informative. Knowing that

  • counts_gene_1.tsv is derived from study SRP025982
  • counts_gene_2.tsv is derived from study SRP048685
  • counts_gene_3.tsv is derived from study SRP011233

Can you rename counts_gene_1.tsv to SRP025982.tsv, counts_gene_2.tsv to SRP048685.tsv, and counts_gene_3.tsv to SRP011233.tsv so that the file names tell us which study each file came from?

Solution

mv counts_gene_1.tsv SRP025982.tsv
mv counts_gene_2.tsv SRP048685.tsv
mv counts_gene_3.tsv SRP011233.tsv
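
One thing to keep in mind is that mv replaces an existing file of the same name without warning. As a cautious variant of the solution above, the sketch below adds the -i option, which makes mv ask before overwriting, much like rm -i asks before deleting. It runs on empty stand-in files in a temporary directory (the sandbox name is made up for illustration), so the real counts tables are not touched.

```shell
# Sandbox sketch: empty stand-in files in a temporary directory.
sandbox=$(mktemp -d)
cd "$sandbox"
touch counts_gene_1.tsv counts_gene_2.tsv counts_gene_3.tsv

# -i makes mv ask for confirmation if a file with the new name already exists.
# No file with the new names exists here, so no prompt appears.
mv -i counts_gene_1.tsv SRP025982.tsv
mv -i counts_gene_2.tsv SRP048685.tsv
mv -i counts_gene_3.tsv SRP011233.tsv

# Only the three renamed files should be listed.
ls
```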

Question 4:

Go back up one directory to the lesson4_practice_20230202 folder, make a new folder called rna_seq_recounts, and then move the expression counts tables into this folder.

Solution

cd ..
mkdir rna_seq_recounts
mv example_rna_seq_counts/SRP025982.tsv rna_seq_recounts
mv example_rna_seq_counts/SRP048685.tsv rna_seq_recounts
mv example_rna_seq_counts/SRP011233.tsv rna_seq_recounts
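
The three mv commands above can also be written as one, because mv accepts several source files followed by a destination folder. The sketch below uses the * wildcard to match every file ending in .tsv, on the assumption that all of the .tsv files in the folder should move. It runs on empty stand-in files in a temporary directory (the sandbox name is made up for illustration).

```shell
# Sandbox sketch: recreate the folder layout with empty stand-in files.
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir example_rna_seq_counts rna_seq_recounts
touch example_rna_seq_counts/SRP025982.tsv \
      example_rna_seq_counts/SRP048685.tsv \
      example_rna_seq_counts/SRP011233.tsv

# *.tsv expands to every file ending in .tsv, so one mv moves all three files.
mv example_rna_seq_counts/*.tsv rna_seq_recounts/

# The destination now holds the three tables; the source folder is empty.
ls rna_seq_recounts
ls example_rna_seq_counts
```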

Question 5:

Change into the rna_seq_recounts folder and delete SRP025982.tsv, but make sure that we are asked whether we really want to delete.

Solution

cd rna_seq_recounts
rm -i SRP025982.tsv
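
With -i, rm prints a prompt and waits for an answer: replying y removes the file, and replying n leaves it in place. Normally we type the answer ourselves. So that the sketch below can run without any typing, it feeds the answer to rm with echo. The file names keep_me.tsv and delete_me.tsv and the sandbox directory are made up for illustration.

```shell
# Sandbox sketch: two empty stand-in files in a temporary directory.
sandbox=$(mktemp -d)
cd "$sandbox"
touch keep_me.tsv delete_me.tsv

# Answering "n" at the prompt leaves the file where it is.
echo n | rm -i keep_me.tsv

# Answering "y" at the prompt removes the file.
echo y | rm -i delete_me.tsv

# Only keep_me.tsv should remain.
ls
```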

Question 6:

The example_rna_seq_counts folder in the lesson4_practice_20230202 directory should now be empty. How do you remove an empty directory in Unix?

Solution

Go back up one directory to lesson4_practice_20230202, then remove the empty folder with rmdir

cd ..
rmdir example_rna_seq_counts
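
rmdir only removes directories that are empty, which makes it a safe way to clean up: if a folder still contains files, rmdir reports an error and leaves the folder alone. The sketch below shows both cases in a temporary directory. The names sandbox, empty_dir, full_dir, and leftover.tsv are made up for illustration.

```shell
# Sandbox sketch: one empty folder and one folder that still holds a file.
sandbox=$(mktemp -d)
cd "$sandbox"
mkdir empty_dir full_dir
touch full_dir/leftover.tsv

# This succeeds because empty_dir has nothing in it.
rmdir empty_dir

# This fails with an error message because full_dir still contains a file.
if ! rmdir full_dir; then
    echo "full_dir was not removed because it is not empty"
fi

# Only full_dir should remain.
ls
```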