R Crash Course

Modified

April 15, 2026

Accessing R

R can be accessed from the command line using R, which opens the R console, or via an integrated development environment (IDE) such as RStudio or VS Code. R commands can be submitted together in a script or interactively in a console.

You can install and use R locally or via an HPC such as the NIH HPC Biowulf.

R is case sensitive, so avoid typos (for example, mydata and myData are different objects). R is also space agnostic, meaning that, for the most part, it does not care about spaces.

How to navigate directories

  • setwd() Set working directory (equivalent to cd)
  • getwd() Get working directory (equivalent to pwd)
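As a minimal sketch of the two functions above, the following saves the current working directory, moves to another directory, and then moves back. The use of tempdir() is only an assumption made so that the target directory is guaranteed to exist; in practice you would pass the path to your own project folder.

```r
# Remember where we started (like pwd).
original_dir <- getwd()

# Move to another directory (like cd). tempdir() is used only because it always exists.
setwd(tempdir())
new_dir <- getwd()
print(new_dir)

# Return to the starting directory.
setwd(original_dir)
```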

Getting help

  • help() and ? “provide access to the documentation pages for R functions, data sets, and other objects”.

  • help.search() “allows for searching the help system for documentation matching a given character string in the (file) name, alias, title, concept or keyword entries (or any combination thereof)”; equivalent to ??pattern
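For illustration, the help functions above might be used as follows. The function round() and the search term "rounding" are arbitrary examples, not part of the original material.

```r
# Both lines open the documentation page for round().
help("round")
?round

# Search the installed documentation for a pattern.
# The shorthand ??rounding performs the same search.
help.search("rounding")
```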

See more on getting help here.

Installing and loading packages

To take full advantage of R, you need to install R packages. R packages are loadable extensions that contain code, data, documentation, and tests in a standardized, shareable format that can easily be installed by R users. The primary repository for R packages is the Comprehensive R Archive Network (CRAN). CRAN is a global network of servers that store identical versions of R code, packages, documentation, etc. (cran.r-project.org).

An R library is, effectively, a directory of installed R packages which can be loaded and used within an R session. —renv

  • install.packages() install packages from CRAN
  • library() load packages in R session
  • BiocManager::install() install packages from Bioconductor
  • devtools::install_github() install an R package from GitHub

.libPaths() reports the directory (or directories) where your installed R packages are located.
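A rough sketch of how these functions fit together is shown below. The installation lines are commented out because they need a network connection, and the package names (dplyr, DESeq2) are only examples chosen for illustration.

```r
# Installation (commented out; requires a network connection).
# install.packages("dplyr")                     # from CRAN
# BiocManager::install("DESeq2")                # from Bioconductor (needs BiocManager)
# devtools::install_github("tidyverse/dplyr")   # from GitHub (needs devtools)

# Load an installed package into the session; stats ships with R.
library(stats)

# Check whether a package is available without attaching it.
has_dplyr <- requireNamespace("dplyr", quietly = TRUE)
print(has_dplyr)

# Directories that R searches for installed packages.
lib_dirs <- .libPaths()
print(lib_dirs)
```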

Commenting

You can annotate your code by starting annotations with #. Comments to the right of # will be ignored by R.

End a comment with four or more dashes (e.g., # Section name ----) to create navigable code sections in RStudio.
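A short sketch of commented code is shown below. The section names and object names are made up for illustration.

```r
# Load data ----
# The header above ends in four dashes, which RStudio treats as a code section.

x <- 5  # an inline comment: everything to the right of # is ignored

# Summarize ----
y <- x * 2  # y is 10
print(y)
```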

For report generation, use R Markdown or Quarto.

Assignment operators

Anything that you want to keep in memory must be assigned to an R object.

<- the primary assignment operator, assigning values on the right to objects on the left.
= can also be used to assign values to objects, but is usually reserved for other purposes (e.g., function arguments)

Use ls() to list objects created in R. rm() can be used to remove an object from memory.

For R object names:

  • avoid spaces or special characters, excluding “_” and “.”.
  • do not begin with numbers or underscores.
  • do not use names with special meanings (?Reserved).
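The assignment and naming points above can be sketched as follows. The object names and values are hypothetical and chosen only for illustration.

```r
sample_count <- 12                      # <- is the preferred assignment operator
gene.name = "TP53"                      # = also assigns, but is usually kept for arguments
rounded <- round(3.14159, digits = 2)   # here = names a function argument

print(ls())                             # list the objects created so far

rm(gene.name)                           # remove gene.name from memory
still_there <- exists("gene.name")      # FALSE once the object is removed
print(still_there)
```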

Object data types

The base data type (e.g., numeric, character, logical, etc.) and the class (data frame, matrix, etc.) determine what you can do with an object. Learn more about an object with the following:

  • class() returns the class of an object or base data type
  • str() returns the structure of an object.

Note: dplyr::glimpse() is similar to str() but with much more succinct output.

Coercion is the conversion of an object from one type to another. It may produce warning messages and introduce missing values (NA), so always make sure the output matches expectations.
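As a small illustration of inspecting and coercing objects, consider the sketch below. The vectors are made-up examples.

```r
counts <- c(10, 25, 3)
class(counts)   # "numeric"
str(counts)     # compact description of the object's structure

# Coerce character values to numeric.
ids <- c("1", "2", "three")
as_numbers <- as.numeric(ids)   # warns: "three" cannot be converted, so it becomes NA
print(as_numbers)

# Check the result before using it.
anyNA(as_numbers)   # TRUE
```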

Importing and exporting data

Use the read functions to import data (e.g., read.csv, read.delim, etc.). Use write functions to export data (e.g., write.table).
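A minimal round trip, assuming a small made-up data frame and a temporary file in place of a real path, might look like this:

```r
# A small example data frame (hypothetical values).
scores <- data.frame(sample = c("a", "b", "c"), score = c(1.5, 2.0, 3.25))

# Export to a CSV file; tempfile() stands in for a real file path.
csv_path <- tempfile(fileext = ".csv")
write.csv(scores, csv_path, row.names = FALSE)

# Import the file again and inspect the result.
imported <- read.csv(csv_path)
print(imported)
```

Setting row.names = FALSE keeps R from writing the row numbers as an extra column.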

There are also specialized import functions for particular data types. For example, we will learn how to import scRNA-seq data using Seurat.

Using functions

R functions perform specific tasks. R has a ton of built-in functions and functions available through additional packages. You can also create your own functions.

The general syntax for a function is the name followed by parentheses, function_name() (e.g., round()).

To create a function:

function_name <- function(arg_1, arg_2, ...) {
  # function body: code that uses the arguments
}
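As a concrete instance of this template, the sketch below defines a hypothetical temperature conversion function. The function name and arguments are made up for illustration.

```r
# Convert Fahrenheit to Celsius, rounding to a chosen number of digits.
fahrenheit_to_celsius <- function(temp_f, digits = 1) {
  temp_c <- (temp_f - 32) * 5 / 9
  round(temp_c, digits)   # the last evaluated expression is returned
}

fahrenheit_to_celsius(212)           # 100
fahrenheit_to_celsius(c(32, 98.6))   # works on whole vectors too
```

Giving digits a default value makes the argument optional when the function is called.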

Vectors

A vector is a collection of values that are all of the same type (numbers, characters, etc.) — datacarpentry.org

  • c() - used to combine elements of a vector
  • length() - returns the number of elements in a vector

When you combine elements of different types in the same vector, they are forced into the same type via “coercion” (logical < numeric < character).

Use brackets to extract elements of a vector:

a <- 1:10
a[2]
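The sketch below illustrates combining, coercion, length, and a few more ways to index a vector. The values are arbitrary examples.

```r
# Mixing types forces everything to the most general type.
mixed <- c(TRUE, 2, "three")
class(mixed)        # "character"
class(c(TRUE, 2))   # "numeric"
length(mixed)       # 3

# Extract several elements at once, or those meeting a condition.
a <- 1:10
first_and_third <- a[c(1, 3)]   # 1 3
large_values <- a[a > 8]        # 9 10
print(first_and_third)
print(large_values)
```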

Lists

Unlike vectors, lists can hold values of different types.

list(1, "apple", 3)
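As a brief sketch of working with lists, the example below stores the list and extracts elements from it. The object and element names are hypothetical.

```r
fruit <- list(1, "apple", 3)

# Double brackets return the element itself.
second <- fruit[[2]]
class(second)      # "character"

# Single brackets return a shorter list.
class(fruit[2])    # "list"

# List elements can be named and accessed with $.
named <- list(id = 1, name = "apple")
print(named$name)
```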

Data frames

Data frames hold tabular data comprised of rows and columns; they can be created using data.frame().

To understand more about the structure of a data frame (or another object), consider the following functions:

  • str() displays the structure of an object, not just data frames
  • dplyr::glimpse() similar to str() but applies to data frames and produces cleaner output
  • summary() produces a summary of an object; for a data frame, it summarizes each column
  • ncol() returns the number of columns in a data frame
  • nrow() returns the number of rows in a data frame
  • dim() returns the number of rows and columns
  • unique() returns a vector with duplicates removed; also see dplyr::distinct()
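The base R functions in this list can be tried on a small made-up data frame, as in the sketch below. The object name samples and its columns are hypothetical.

```r
samples <- data.frame(
  id = c("s1", "s2", "s3", "s4"),
  group = c("control", "treated", "control", "treated"),
  count = c(10, 25, 3, 8)
)

str(samples)       # structure: column names, types, and first values
summary(samples)   # per-column summaries

n_rows <- nrow(samples)            # 4
n_cols <- ncol(samples)            # 3
dim(samples)                       # 4 3
groups <- unique(samples$group)    # "control" "treated"
print(groups)
```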

We can subset data frames using bracket notation df[row,column]:

df <- data.frame(Counts = seq(1, 5), animals = c("racoon", "squirrel", "bird", "dog", "cat"))
# to return just the animals column
df[, "animals"]

We can also use functions from dplyr such as filter() for subsetting by row and select() for subsetting by column.
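The sketch below shows a few more bracket subsets on the same data frame, followed by the dplyr equivalents. The dplyr lines run only if dplyr is installed, which is an assumption about your setup.

```r
df <- data.frame(Counts = seq(1, 5), animals = c("racoon", "squirrel", "bird", "dog", "cat"))

# Base R: rows before the comma, columns after it.
second_row <- df[2, ]                            # the whole second row
big_animals <- df[df$Counts > 3, "animals"]      # "dog" "cat"
print(second_row)
print(big_animals)

# dplyr equivalents, used only when the package is available.
if (requireNamespace("dplyr", quietly = TRUE)) {
  big_rows <- dplyr::filter(df, Counts > 3)   # subset by row
  print(dplyr::select(big_rows, animals))     # subset by column
}
```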

Plotting

There are three primary plotting systems in R: base R, ggplot2, and lattice. Data visualization functions from Seurat primarily use ggplot2 and can easily be customized by adding additional ggplot2 layers.
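As a minimal sketch of the base R plotting system only, the example below draws a scatter plot and saves it to a PDF. The data and the temporary output path are assumptions made for illustration.

```r
# Write the plot to a PDF; tempfile() stands in for a real output path.
plot_path <- tempfile(fileext = ".pdf")
pdf(plot_path)

plot(
  x = 1:10, y = (1:10)^2,
  xlab = "x", ylab = "x squared",
  main = "Base R scatter plot"
)

# Close the graphics device so the file is written to disk.
invisible(dev.off())
print(file.exists(plot_path))
```

In an interactive session you can skip pdf() and dev.off(), and the plot appears in the plotting window instead.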

Check out the R Graph Gallery for data visualization examples and code.

Conditionals and Looping

See the attached resources on conditionals and looping.
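For orientation only, a minimal sketch of an if/else statement inside a for loop is shown below. The threshold and object names are made up, and this is not taken from the attached resources.

```r
counts <- c(3, 12, 7)
levels_out <- character(length(counts))

# Loop over positions and label each count.
for (i in seq_along(counts)) {
  if (counts[i] >= 10) {
    levels_out[i] <- "high"
  } else {
    levels_out[i] <- "low"
  }
}
print(levels_out)   # "low" "high" "low"

# The vectorized ifelse() gives the same result without a loop.
vectorized <- ifelse(counts >= 10, "high", "low")
```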

Getting info on R Session

sessionInfo() prints version information about R, the OS, and attached or loaded packages. This is useful for reporting methods for publication. Consider using the package renv to track and share exact versions of packages used for any given R script / project.
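One possible way to record this information alongside an analysis is sketched below. The output file is a temporary file chosen for illustration; in a real project you would write to a path of your choice.

```r
# Capture and display the session details.
info <- sessionInfo()
print(info)

# Save the same report as plain text; tempfile() stands in for a real path.
report_path <- tempfile(fileext = ".txt")
writeLines(capture.output(print(info)), report_path)
```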