Bioinformatics Training and Education Program

Integrated Development Environments (IDEs)

IDEs are indispensable applications for the bioinformatics beginner as well as most computer programmers, software developers, and web developers. But what are IDEs? Which IDEs are commonly seen in bioinformatics, and what should you consider when selecting one?

An IDE is an Integrated Development Environment: an application for writing, editing, running, and debugging code. Built-in features may include a console, file access, an environment / variable view, a data view, a plotting window, access to code history, autocomplete (e.g., IntelliSense), AI integration, debugging tools, and Markdown rendering. Using an IDE makes coding easier, increases productivity, and facilitates project management.

There are many different IDEs, and they vary in their features and uses (e.g., specialized vs. multi-language, full-featured vs. lightweight, online vs. offline, free vs. licensed). In the context of bioinformatics, you have likely heard of RStudio, VS Code, and JupyterLab. These are all great choices depending on what you are doing. Note that a simple Google search will turn up many additional options. Feel free to research these further.

RStudio

RStudio is an IDE for R, and now Python. RStudio includes a console, an editor, and tools for plotting, history, debugging, and workspace management. It provides a graphical user interface for working with R, thereby making R more user-friendly. RStudio is open-source and can be installed locally or used through a browser (RStudio Server or Posit Cloud); there are also licensed versions with additional features. RStudio was developed specifically with R in mind and is therefore a fantastic choice if you plan to work primarily in R.

Posit, the developer of RStudio, has recently released Positron. Positron is essentially a fork of VS Code (it is built on the open-source Code OSS base) pre-configured for use with Python and R. If you are interested in both R and Python, this is likely a good choice, as it has better integration with both languages than RStudio alone.

VS Code

Visual Studio Code (VS Code) is a free, open-source, lightweight, extensible code editor developed by Microsoft. Multi-language support and other features are primarily available through installable extensions, making this a fantastic tool for collaborative and complex projects spanning multiple programming languages. However, because VS Code is a general-purpose editor, it will likely not be as well tailored to a given language as an IDE developed specifically for that language. VS Code is a great choice for someone looking for a career in bioinformatics or software development, but it may be less useful for someone learning bioinformatics simply to analyze their own data.

JupyterLab

JupyterLab is a web-based IDE for notebooks, code, and data. Like VS Code, it allows users to incorporate new features and functionality via extensions, though these are more limited than those available for VS Code. While JupyterLab does have multi-language capabilities, it was designed and optimized for work with Python (and Jupyter Notebooks) and shines in this regard. It is particularly useful for data science applications. For more complex Python projects, users may prefer more heavyweight IDEs designed specifically for Python (e.g., PyCharm).

These are just three of many options available to programmers and researchers interested in data science and bioinformatics. Of course, you can use more than one IDE, depending on your experience and goals. Do you need multi-language support, features for collaboration, and AI integration? If so, make sure the IDE you choose includes such features. Ultimately, you should use whatever application helps you get the job done. Note that if you intend to use a high-performance computing (HPC) system for bioinformatics, all of these options are available through NIH HPC Open OnDemand. If you have questions, please contact us at ncibtep@nih.gov.

– Alex Emmons (BTEP)