Introduction to Unix on Biowulf 2023
Welcome to this introductory course series on working with Unix on Biowulf. Biowulf is the high-performance computing cluster at NIH and runs Unix, a command-driven operating system. Most people are used to graphical operating systems such as Windows or Mac, so working in an entirely text- and command-driven environment can be daunting. In addition, most are not used to working on a high-performance computing system, where extensive computing resources are shared among many users, so there are rules of etiquette that users should follow. In this course series, we will walk through the basics of working at the Unix command line on Biowulf. The skills covered in this course are essential for bioinformatics work.
Below is an outline of the topics covered in this course series. We will meet Tuesdays and Thursdays from January 24 until February 14 from 1 to 2 pm, followed by an optional help session from 2 to 3 pm.
Lesson 1 (January 24, 2023) (Recording)
- Quick overview of Unix and Biowulf
- Discuss Biowulf accounts
- Use of student accounts
- Use of a personal account if the registrant already has one
- Signing onto Biowulf
Lesson 2 (January 26, 2023) (Recording)
- Overview of the Biowulf environment
- Login node
- Different directory/data storage spaces – home, data, scratch
- Unix directory path structure
- Getting help with Unix commands
- Navigating the Unix file systems (changing directories)
- Listing directory content
Lesson 3 (January 31, 2023) (Recording)
- Copying content from one directory to another
- File and directory permissions
- Modifying file and directory permissions
- Continue learning to navigate the Unix file system, including how to
- List directory content
- Remove files
Lesson 4 (February 2, 2023) (Recording)
- Working with files and directories in Unix
- Moving
- Renaming
- More in-depth coverage of removing files and directories
Lesson 5 (February 9, 2023) (Recording)
- Working with an interactive session on Biowulf
- Modules and applications installed on Biowulf
- View available applications
- Load/unload applications
- Change application version used
- Examples of using bioinformatics applications on Biowulf, including
- SRA Toolkit for downloading sequencing data from NCBI SRA
- FastQC for assessing sequencing data quality
Lesson 6 (February 14, 2023) (Recording)
- Creating a simple shell script that downloads data from SRA and runs FastQC on the downloaded data; the script will be submitted as a batch job on Biowulf
- Transferring data between Biowulf and a local machine, using the HTML-based FastQC reports as an example
Lesson 7 (February 16, 2023) (Recording)
- Data wrangling in Unix
- Downloading data from the web
- Viewing of files
- Pattern searching
Student account assignment
For those using student accounts, please click below to view your account assignment.