Course Overview
Biowulf is the Unix-based high-performance compute cluster at NIH and hosts over 900 software packages, including those used for bioinformatics analysis. Most people are used to working with point-and-click operating systems such as Windows or macOS, so working in a command-line environment such as Biowulf can be intimidating. This course series will help participants overcome the fear of working on high-performance computing clusters so that they can start taking advantage of the resources available for their bioinformatics and data science needs.
Course Expectations / Learning Objectives
After this course, participants will be able to:
- Log onto the NIH High Performance Compute Cluster known as Biowulf
- Navigate the directory (folder) and file structure on a Unix system
- Work with very large Next Generation Sequencing (NGS) files on a Unix system
- Find and load bioinformatics applications that are installed on Biowulf
- Run interactive, swarm and batch jobs on Biowulf
Course Schedule and Topical Outline
- Lesson 1 (September 7th, 2023):
  - Overview of Unix and Biowulf
  - Logging into Biowulf
  - Lesson 1 recording
- Lesson 2 (September 14th, 2023):
  - Navigating around the Biowulf directory structure
  - Lesson 2 recording
- Lesson 3 (September 21st, 2023):
  - Working with files and directories
  - Interactive sessions
  - Exploring Next Generation Sequencing data
  - Lesson 3 recording
- Lesson 4 (September 28th, 2023):
  - Bioinformatics applications on Biowulf
  - Submitting batch jobs
  - Swarm
  - Shell script
  - Lesson 4 recording