ncibtep@nih.gov

Bioinformatics Training and Education Program

Getting Started with an NIH Anaconda Business License

Anaconda (https://www.anaconda.com/) is a package manager and distributor for a wide array of data science software. Package managers are ideal for scientists who conduct analysis on a personal computer because they reduce or eliminate problems with versioning, dependencies, and security during software installation.

NIH scientists have access to Anaconda Business and should use it rather than the free tier of Anaconda. Anaconda Business gives researchers access to Python, R, and Jupyter Lab/Notebook, as well as over 7500 Python and R packages (source: https://nih.sharepoint.com/sites/CIT-ApplicationRepository/SitePages/Anaconda.aspx). Users can also review any potential security issues for each package prior to installation. Below are examples of Python and R packages that can be found on Anaconda.

R packages:

  • Bioconductor, a repository of tools used for biological computing including DESeq2 and edgeR for bulk RNA sequencing differential expression analysis.
  • Seurat for analyzing single cell RNA sequencing data.
  • Tidyverse, a collection of tools for data wrangling and visualization.
  • Patchwork, a package for constructing multiple panel plots.

Python packages:

  • Pandas, which is used for working with and wrangling tabular data.
  • Seaborn, a popular package for creating data visualizations.
  • Biopython, a collection of software for molecular biology computing including sequence alignment.

Follow the steps below to get started with Anaconda Business.

  1. Submit a request at https://forms.office.com/g/CArrnuE4cD to join the NIH Anaconda Business license.
  2. Download the latest Anaconda Navigator at https://www.anaconda.com/download. Users should supply their NIH email prior to download. Windows users may need to contact their institutional computing help desk to install (https://service.cancer.gov/ncisp for NCI affiliates). The latest Python version comes with the Anaconda Navigator (Figure 1).
  3. Once the NIH Anaconda Business license has been approved, users will receive an email containing an authentication token, which will be used to access software available through the NIH license.

Figure 1: The Anaconda Navigator provides quick access to tools such as Jupyter Lab and enables scientists to manage Conda environments and software on a local computer.

Alternatively, scientists can install the command line version of Anaconda known as Miniconda (https://www.anaconda.com/docs/getting-started/miniconda/install). Miniconda includes Python but, unlike the full version of Anaconda, does not come with the Navigator. Users will need to install tools such as Jupyter Lab separately and can use the authentication token to access features that are available through the NIH license.
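
After a Miniconda install, it can help to confirm which command line tools are actually available. The short Python sketch below is one way to do that; the function name `find_tools` is hypothetical and only used for illustration. Under the assumptions in the paragraph above, `jupyter` would be reported as missing until Jupyter Lab has been installed separately (for example with `conda install jupyterlab`).

```python
import shutil


def find_tools(names):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # conda and python should be found after a Miniconda install;
    # jupyter is only found once Jupyter Lab has been installed.
    for name, path in find_tools(["conda", "python", "jupyter"]).items():
        print(f"{name}: {path or 'not found on PATH'}")
```

Run the script from a Terminal (Mac) or Anaconda Command Prompt (Windows) so that it sees the same PATH as Conda.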

After installing either the Anaconda Navigator or Miniconda, users may want to consider the following tips to make analysis easier.

  • If Jupyter Lab does not open in a browser after clicking “Launch” in the Anaconda Navigator, do the following:
    • At the command line, type `jupyter lab --generate-config` to generate a file named jupyter_lab_config.py, which is written to the .jupyter folder in the user’s home directory. Open jupyter_lab_config.py and set the first five options below to “True” and the last one to the preferred web browser.
      • c.ExtensionApp.open_browser=True
      • c.LabServerApp.open_browser=True
      • c.LabApp.expose_app_in_browser=True
      • c.LabApp.open_browser=True
      • c.ServerApp.open_browser=True
      • c.ServerApp.browser = preferred_web_browser (e.g., Google Chrome)
  • See https://docs.anaconda.com/working-with-conda/packages/using-r-language/ for details on installing R in Anaconda. Users will need to create an R environment prior to installing R and R packages. Users should also install the current version of R Studio from https://posit.co/download/rstudio-desktop/. To use R Studio with R installed via Anaconda, do the following:
    1. Open a Terminal (Mac) or Anaconda Command Prompt (Windows).
    2. At the command prompt, type `conda activate` followed by the name of the environment in which R was installed. For instance, if R was installed in an environment called r_environment, then the command would be `conda activate r_environment`.
    3. Finally:
      • For Macs, type `open -a rstudio` to open R Studio.
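      • For Windows, type `open rstudio` to open R Studio. If the Command Prompt does not recognize `open`, try `start rstudio`, since `start` is the built-in Command Prompt command for launching programs.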
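
For the Jupyter Lab tip above, the short Python sketch below is one way to check which of the listed options are still not set to True in jupyter_lab_config.py. It is an illustrative helper and not part of Jupyter: the function name `missing_options` is hypothetical, and the path assumes the default .jupyter folder in the home directory, which may differ if the Jupyter configuration directory has been changed.

```python
from pathlib import Path

# Options from the list above that should be set to True.
OPTIONS = [
    "c.ExtensionApp.open_browser",
    "c.LabServerApp.open_browser",
    "c.LabApp.expose_app_in_browser",
    "c.LabApp.open_browser",
    "c.ServerApp.open_browser",
]


def missing_options(config_text):
    """Return the options that no uncommented line sets to True."""
    enabled = set()
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#") or "=" not in stripped:
            continue  # skip comments and lines that are not assignments
        name, _, value = stripped.partition("=")
        # Ignore any trailing comment after the value.
        if value.split("#")[0].strip() == "True":
            enabled.add(name.strip())
    return [option for option in OPTIONS if option not in enabled]


if __name__ == "__main__":
    # Assumed default location of the generated config file.
    config_path = Path.home() / ".jupyter" / "jupyter_lab_config.py"
    if config_path.exists():
        for option in missing_options(config_path.read_text()):
            print(f"{option} is not set to True")
    else:
        print(f"{config_path} not found; run 'jupyter lab --generate-config' first")
```

The generated file ships with every option commented out, so a freshly generated config would be expected to report all five options until they are uncommented and set to True.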

Questions related to Anaconda can be sent to NIHAnaconda@mail.nih.gov. BTEP is also happy to assist; send questions to ncibtep@nih.gov.

–  Joe Wu, Ph.D. (BTEP)