Lesson 1: Short introduction to Python, signing onto Biowulf, and starting Jupyter Lab

Learning objectives

After this lesson, participants will

  • Be able to describe Python and provide rationale for using it
  • Know how to start a Jupyter Lab session on Biowulf (Jupyter Lab will be used to interact with Python throughout this course)
  • Be familiar with places for getting Python packages
  • Become familiar with navigating the Jupyter Lab environment
  • Be able to describe Python command syntax
  • Know how to find help for Python commands
  • Become familiar with continuing and self-learning resources

What is Python and why use it?

  • Scripting language
    • Facilitates reuse and reproducibility
  • Can be used to analyze large datasets
  • Extensive external packages that can be used for
    • Data wrangling
    • Data visualization
    • Single cell RNA sequencing analysis
    • Working with biological sequences
    • Interfacing with bioinformatics databases
  • Strong support community
  • Easy to learn

Note

Python packages can be found at The Python Package Index.
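
Packages from The Python Package Index are typically installed with pip, and it can be useful to check whether a package is already available before installing it. The snippet below is a minimal sketch of such a check using the standard library; the helper name is_installed and the package name hypothetical_package_xyz are made up for illustration and are not part of this course.

```python
from importlib.util import find_spec


def is_installed(package_name):
    """Return True if Python can locate a top-level package or module with this name."""
    return find_spec(package_name) is not None


# "json" ships with Python; the second name is a made-up placeholder
# that is assumed not to exist on the system.
for name in ["json", "hypothetical_package_xyz"]:
    print(name, is_installed(name))
```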

Signing onto Biowulf

In this course series, participants will interact with Python through Jupyter Lab on Biowulf. Thus, the first step is to sign onto Biowulf using ssh. Replace username with the participant's own Biowulf username.

ssh username@biowulf.nih.gov
  • Mac: use ssh through the Terminal
  • Windows: use ssh through the command prompt

Change into Biowulf data directory

Use cd to change into the participant's data directory on Biowulf. Again, replace username with the participant's Biowulf username.

cd /data/username

Request an interactive session

Request an interactive session using sinteractive with the following options.

  • --gres=lscratch:5: to allocate 5 GB of local temporary/scratch storage space
  • --mem=2g: to request 2 GB of memory (RAM)
  • --tunnel: to open a channel of communication between the local machine and Biowulf, allowing interaction with applications like Jupyter Lab

sinteractive --gres=lscratch:5 --mem=2g --tunnel

After resources for the interactive session have been granted, users will see information similar to that shown in Figure 1.

Figure 1: After interactive session resources have been allocated, users will see an ssh command that looks like the one enclosed in the red rectangle. Open a new terminal (if working on a Mac) or command prompt (if working on a Windows computer) and then copy and paste this ssh command into the new terminal.

After copying and pasting the ssh command shown in Figure 1 into a new terminal or command prompt, press Enter and supply the Biowulf password when prompted. Logging in completes the tunnel.

Figure 2: Press Enter after copying and pasting the ssh command into a new terminal, then provide the password to log into Biowulf. This will complete the tunnel.

Figure 3: In the ssh command shown in Figure 1 and Figure 2, the numbers preceding and following "localhost" will differ depending on user. Also, the Biowulf username will differ for each user (wuz8 is the instructor's Biowulf username).

Load Jupyter

After the tunnel has been created, go back to the terminal (Mac) or command prompt (Windows) with the Biowulf interactive session and load the Jupyter module (see Figure 4).

module load jupyter

Figure 4: Go back to the terminal (Mac) or command prompt (Windows) with the interactive session (look for cn#### at the prompt). Do module load jupyter from here.

Start Jupyter Lab

Use the command below to start a Jupyter Lab session. Copy and paste either of the http links to a local browser to interact with Jupyter (see Figure 5).

jupyter lab --ip localhost --port $PORT1 --no-browser

Figure 5: Start a Jupyter lab session using jupyter lab --ip localhost --port $PORT1 --no-browser and copy and paste either one of the http links to a local browser.

Warning

The URLs change with each Jupyter Lab session, so please do not copy from the examples shown below. Copy from the URLs provided in the Biowulf interactive session terminal instead.

Jupyter Lab - file explorer and launcher

  • File explorer
  • Launcher for starting language-specific notebooks (for this course series, choose the python/3.10 notebook)

Jupyter Lab

Jupyter Notebook - cells

Jupyter Notebook

Python education resources

Visit the self learning resources page to request a Dataquest or Coursera license.

Python command syntax

A Python command is composed of the

  • Command
  • Argument, which is enclosed in the parentheses and is what the command will act on
  • Options, which are also enclosed in the parentheses and alter the way the command runs

command(argument, options)

Example of a Python command with and without options

print("Hello", "welcome to Python")
Hello welcome to Python

Include the sep option to place a comma and a space between "Hello" and "welcome to Python".

print("Hello", "welcome to Python", sep=", ")
Hello, welcome to Python
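
The same command(argument, options) pattern applies to other built-in commands. As a small illustrative sketch (the read_counts list is made-up example data, not part of the course datasets), sorted can be called with or without its reverse option, and print can take more than one option at a time.

```python
# Made-up example data for illustration only.
read_counts = [12, 5, 30]

# One argument, no options: sort in ascending order.
ascending = sorted(read_counts)
print(ascending)  # [5, 12, 30]

# Same argument plus the reverse option: sort in descending order.
descending = sorted(read_counts, reverse=True)
print(descending)  # [30, 12, 5]

# print accepts several options at once; end replaces the default newline.
print("Hello", "welcome to Python", sep=", ", end="!\n")  # Hello, welcome to Python!
```

Options are written as name=value, which is why their order does not matter as long as they come after the arguments.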

Finding help for Python commands

The help command can be used to view documentation for Python commands. It follows the Python command syntax. Insert the command for which help is needed into the parentheses.

help()

Example of using help

help(print)
Help on built-in function print in module builtins:

print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)

    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.
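
help works the same way for any other command. As a brief sketch, the lines below look up the documentation for len; the __doc__ attribute shown here is an alternative way to retrieve the short description as text, and the variable name len_doc is chosen only for illustration.

```python
# View the formatted help page for another built-in command.
help(len)

# The short description is also stored on the command itself as a string.
len_doc = len.__doc__
print(len_doc)
```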

Copy class data to data directory

The example datasets used for this course series reside in /data/classes/BTEP/pies_2023_data. Make a copy in your data directory.

cp -r /data/classes/BTEP/pies_2023_data ./pies_2023
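
Once the copy finishes, it can be checked from a Jupyter notebook. The following is a minimal sketch, assuming the notebook was started from the directory that contains pies_2023; the helper name list_entries is hypothetical, and the actual file names in the folder are not shown here because they depend on the course data.

```python
from pathlib import Path


def list_entries(directory):
    """Return the sorted names of files and folders directly inside directory."""
    return sorted(entry.name for entry in Path(directory).iterdir())


# Assumption: the copy was made as ./pies_2023 in the current working directory.
copied = Path("pies_2023")
if copied.is_dir():
    print(list_entries(copied))
else:
    print(f"{copied} not found; check the current working directory")
```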