Introduction to Unix and Biowulf

Joe Wu, PhD
NCI CCR Bioinformatics Training and Education Program
August 14, 2025
ncibtep@nih.gov

Background on Unix

  • Unix is a computer operating system, like Windows and macOS.
  • Users interact with the computer by issuing commands in a terminal rather than through a graphical user interface (GUI).
  • There are many Unix-like operating systems, such as Linux. Even macOS is based on Unix.

Why learn Unix

  • Efficiently process large datasets.
  • Bioinformatics/NGS analysis software is typically written to run on Unix and Unix-like systems.
  • Bioinformatics analyses often require researchers to work on high performance computing systems such as Biowulf at NIH, and working on these systems requires command line knowledge.
  • Write scripts to ensure reproducibility and reusability.
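
As a rough illustration of the points above, the following is a minimal sketch of a reusable shell script that counts the sequences in a FASTA file. The file name, its contents, and the temporary working directory are made up for this example; only standard commands (mktemp, printf, grep, echo) are used.

```shell
#!/bin/bash
# Hypothetical example: count the sequences in a small FASTA file.
set -euo pipefail

workdir=$(mktemp -d)            # scratch directory, so no real data is touched
fasta="$workdir/example.fasta"  # made-up file name for illustration

# Create a tiny FASTA file with three made-up sequences.
printf '>seq1\nACGT\n>seq2\nGGCC\n>seq3\nTTAA\n' > "$fasta"

# Every FASTA record starts with ">", so counting those lines counts sequences.
n_seqs=$(grep -c '^>' "$fasta")
echo "Number of sequences: $n_seqs"
```

Because the steps are saved in a script, the same count can be repeated later, or on a much larger file, by changing only the path stored in fasta.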

Background on Biowulf

  • Biowulf is the Unix-based high performance computing system at NIH.
  • Has around 1000 scientific applications (including those for NGS analysis) installed.
  • Biowulf staff maintain the system and update software.
  • Biowulf HPC OnDemand is a feature that enables users to access Biowulf and graphical applications such as RStudio and JupyterLab from a web browser.

Learning objectives

  • This class is meant for novices.
  • It will not make participants experts, but the material covered forms the basis for learning advanced Unix skills.
  • After this class, participants should be familiar with Unix commands needed to get started with bioinformatics including:
    • Signing onto Biowulf
    • Navigating through Biowulf's directories
    • Working with files
    • Launching and working with software installed on Biowulf
    • Starting graphical packages on Biowulf HPC OnDemand
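
To give a flavor of the skills listed above, here is a short sketch of common navigation and file commands. It runs in a temporary directory with made-up file names so it can be tried on any Unix-like system. Signing in and launching software appear only as comments: the host name, the user name placeholder, and the module and application names are assumptions for illustration, not taken from this class.

```shell
#!/bin/bash
# Illustrative sketch; directory and file names are hypothetical.
set -euo pipefail

# Signing in is typically done with ssh (assumed host, placeholder user name):
#   ssh username@biowulf.nih.gov

project=$(mktemp -d)   # stand-in for a project directory
cd "$project"          # change into that directory
pwd                    # print the current working directory

mkdir data results     # create two subdirectories
printf 'sample\tcount\nA\t10\nB\t20\n' > data/counts.txt   # made-up table

ls data                          # list the contents of data
head -n 1 data/counts.txt        # show the first line (the header)
cp data/counts.txt results/counts_copy.txt                  # copy a file
mv results/counts_copy.txt results/counts_backup.txt        # rename the copy

n_lines=$(wc -l < data/counts.txt)   # count the lines in the file
echo "Lines in counts.txt: $n_lines"

# On clusters that use environment modules, installed software is usually
# made available with a command such as (module name is an assumption):
#   module load samtools
```

The commands that touch files run inside the directory created by mktemp, so nothing outside it is created or changed.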