Biostars on Biowulf

To complement this course, a module is available on Biowulf that provides the programs used in the Biostar Handbook. During class, we will work on the command line on the GOLD system on DNAnexus; however, this system will not be available outside of class time. There are two options for catching up on class work or working on practice problems outside of class: (1) install a Biostars conda environment on your local computer (see the Biostar Handbook for instructions), or (2) use the Biowulf HPC cluster and the Biostars Biowulf module. Option 2 requires a Biowulf account.

How to use the Biostars Biowulf module

Loading the module

  1. Access the NIH network; this will require you to VPN if off campus.
  2. Connect to Biowulf

    ssh user_name@biowulf.nih.gov

    where user_name is your NIH username.

  3. Use sinteractive to work on an interactive node. By default, this allocates 4 GB of memory and 2 CPUs. Alternatively, you may write job scripts and submit them via sbatch (see Lesson 5).

    Note: If you plan to use the sratoolkit to download data from the SRA, you will need to allocate local scratch space when you start the session (for example, sinteractive --gres=lscratch:30 requests 30 GB).

  4. If using an interactive node, run the following:

    source /data/classes/BTEP/apps/biostars/1.0/run_biostars.sh  
    
    This will do the following:

    1. Runs a set of commands to set up the terminal environment.
    2. Creates a data directory environment variable ($DATA) that points to the course data.
    3. Runs module use /data/classes/BTEP/apps/modules
    4. Runs module load biostars
  5. If you want to use the biostars module for other purposes or you want to submit jobs via sbatch, skip Step 4. You can load the module with the following:

    1. module use /data/classes/BTEP/apps/modules
    2. module load biostars
    3. module help biostars
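
For the sbatch route mentioned in Step 5, a job script could look roughly like the sketch below. The file name biostars_job.sh, the resource values, and the final echo command are illustrative assumptions rather than course material; only the two module lines come from the steps above. The lscratch request is needed only if the job uses the sratoolkit.

```shell
# Write a hypothetical batch script (file name and resource values are assumptions).
cat > biostars_job.sh <<'EOF'
#!/bin/bash
#SBATCH --cpus-per-task=2
#SBATCH --mem=4g
#SBATCH --gres=lscratch:30

# Make the course module tree visible, then load the module (Step 5).
module use /data/classes/BTEP/apps/modules
module load biostars

# Placeholder: replace with the commands you actually want to run.
echo "biostars module loaded on $(hostname)"
EOF

# Submit only where Slurm is available (that is, on Biowulf).
if command -v sbatch >/dev/null 2>&1; then
    sbatch biostars_job.sh
else
    echo "sbatch not found; run this on Biowulf"
fi
```

Because the script skips Step 4, it should not rely on $DATA unless you set that variable yourself.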

Course files found on DNAnexus will be made accessible on Biowulf. The path to course files has been assigned to the environment variable $DATA, which is automatically set when you run Step 4.
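
A quick way to confirm that $DATA is available in your current session is a small check like the one below. This is a sketch: the function name check_course_data is hypothetical, and the check assumes that $DATA, when set, points to a directory.

```shell
# Hypothetical helper: report whether the course data directory is reachable.
check_course_data() {
    if [ -n "${DATA:-}" ] && [ -d "$DATA" ]; then
        echo "Course data directory: $DATA"
        ls "$DATA"
    else
        echo "DATA is not set; source run_biostars.sh first (Step 4)"
    fi
}

check_course_data
```

If the second message appears, run Step 4 in the same shell and try again.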

How to transfer data to and from Biowulf

There is extensive documentation on transferring data to and from Biowulf at hpc.nih.gov. If you simply would like to view .html files or other output, or you want to copy small files, consider mounting HPC system directories to your local computer.
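
As one possibility for small files, scp can copy a file over the same SSH connection used to log in. The sketch below is an assumption, not the method recommended by the course: the function name copy_from_biowulf and the remote path are hypothetical, and the documentation at hpc.nih.gov may point to a different host or tool for transfers. With DRY_RUN left at 1, the function only prints the command it would run.

```shell
NIH_USER="user_name"        # replace with your NIH username
DRY_RUN="${DRY_RUN:-1}"     # set to 0 to actually run scp

# Hypothetical helper: $1 = remote file on Biowulf, $2 = local destination.
copy_from_biowulf() {
    src="${NIH_USER}@biowulf.nih.gov:$1"
    if [ "$DRY_RUN" = "1" ]; then
        echo "scp $src $2"
    else
        scp "$src" "$2"
    fi
}

# The remote path below is a placeholder, not a real course file.
copy_from_biowulf /path/to/report.html .
```

Running it with DRY_RUN=0 requires the same network access as Step 1 (the NIH network or VPN).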