Unix for Bioinformatics Beginners

by Joe Wu, PhD (BTEP)

Background on Unix

  • A computer operating system, like Windows and macOS.
  • Users typically interact with the computer by typing commands in a terminal rather than through a graphical user interface (GUI).
  • There are many Unix-like operating systems, such as Linux. Even macOS is based on Unix.

Why learn Unix

  • Efficiently process large datasets.
  • Much bioinformatics/NGS analysis software is written to run on Unix and Unix-like systems.
  • Bioinformatics analyses often require researchers to work on high-performance computing systems such as Biowulf at NIH, which require command-line knowledge.
  • Write scripts to ensure reproducibility and reusability.

Background on Biowulf

  • Biowulf is the Unix-based high-performance computing system at NIH.
  • Has around 1000 scientific applications (including those for NGS analysis) installed.
  • Biowulf staff maintain the system and update the software.

Learning objectives

  • This class is meant for novices.
  • It will not make participants experts, but the material learned forms the basis for learning more advanced Unix skills.
  • After this class, participants should be familiar with the Unix commands needed to get started with bioinformatics, including:
    • Signing onto Biowulf
    • Navigating through Biowulf’s directories
    • Working with files
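
As a preview of the three objectives above, here is a minimal sketch of what such a session might look like. It is illustrative only and not taken from the class materials: the login line is left as a comment because it needs an NIH account, "username" is a placeholder, the host name biowulf.nih.gov is an assumption, and the directory and file names (project, example.fasta, copy.fasta) are made up for the example.

```shell
# Signing on (kept as a comment: it needs an account; "username" is a
# placeholder and the host name is an assumption):
#   ssh username@biowulf.nih.gov

# Navigating directories
workdir=$(mktemp -d)   # scratch directory, so no real files are touched
cd "$workdir"          # change into the scratch directory
pwd                    # print the current working directory
mkdir project          # make a new directory (hypothetical name)
cd project             # move into it

# Working with files
printf '>seq1\nACGT\n' > example.fasta   # create a small FASTA-style file
cp example.fasta copy.fasta              # copy the file
ls                                       # list the files in this directory
cat example.fasta                        # print the file's contents
```

The commands after the login line can be tried on any Unix-like system, including a macOS terminal, before signing onto Biowulf.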