Creating and modifying scatter plots: PCA and Volcano
Objectives
-
Learn how to make and modify scatter plots using ggplot2.
-
Learn how to visualize PCA results.
-
Learn how to create a Volcano plot.
Load the libraries
library(ggplot2)
library(dplyr)
library(DESeq2)
Other R packages used in this tutorial include patchwork
, ggfortify
, EnhancedVolcano
, and ggrepel
.
What is ggplot2?
ggplot2
is a popular R graphics package associated with a family of packages known as the tidyverse. Tidyverse packages work effectively on data stored in data frames (or tibbles), which store variables in columns and observations in rows. In ggplot2, plots are built in layers, allowing one to incorporate multiple data sets and advanced features, thus allowing the generation of fairly complex plots.
To build a plot, we need:
- data to plot
- one or more geoms - the visual representation of the plot.
- mapping aesthetics - describe how the variables are mapped to the geoms
We can also include additional parameters:
- facets - subplots
- coordinate systems (default is Cartesian)
- plot layout and rendering options (customize legends, lines, text, and other features).
- statistical transformations / summarizations
The basic ggplot2 template:
ggplot(data = <DATA>) +
<GEOM_FUNCTION>(
mapping = aes(<MAPPINGS>),
stat = <STAT>
) +
<FACET_FUNCTION> +
<COORDINATE SYSTEM> +
<THEME>
Example data
Here we will use bulk RNA-Seq data available in the R package airway
, which is from an experiment published by Himes et al. (2014). These data, which are available in R as a RangedSummarizedExperiment
object, are from a bulk RNAseq experiment. In the experiment, the authors "characterized transcriptomic changes in four primary human ASM cell lines that were treated with dexamethasone," a common therapy for asthma. The airway package includes RNAseq count data from 8 airway smooth muscle cell samples. Each cell line includes a treated and untreated negative control.
We will use some normalized count data (rlog) and differential expression results processed according to the Bioconductor workflow, rnaseqGene
. The normalized data is stored within a DESeqTransform
object, and differential expression results are provided in a comma separated file. The data used in this tutorial is available for download here. See sessionInfo()
at the end of this tutorial for information related to package versions.
#load R object
air<- readRDS("./Data/normalized_air.rds")
#read differential expression results
dexp<- read.csv("./Data/deseq2_DEGs.csv",row.names=1)
dexp<- dexp %>% filter(!is.na(padj))
Scatter plots
Scatterplots are useful for visualizing treatment–response comparisons, associations between variables, or paired data (e.g., a disease biomarker in several patients before and after treatment). - Holmes and Huber, 2021
Because scatter plots involve mapping each data point, the geom function used is geom_point()
.
Let's check out the basic scatter. We can look at the relationship of our normalized count data from all genes between samples.
ggplot(data=assay(air))+ #DATA
geom_point(aes(x=SRR1039508,y=SRR1039512)) + #what's on x and y axis
coord_fixed(xlim=c(0,20),ylim=c(0,20))
Here, we include the data we want to plot (assay(air)
), a geom function to represent the plot (geom_point
for a scatter plot), and mapping aesthetics assigning the x-axis to the genes from one sample (SRR1039508
) and the y-axis to the genes from another sample (SRR1039512
).
Note
The loaded gene counts include normalized data via a regularized-logarithm transformation (rlog). Read more on this here. This transformation effectively stabilizes the variance across the mean and is recommended for smaller data sets.
Common scatter plots used in genomics (PCA and Volcano)
- PCA is mostly used in genomics for dimensionality reduction and uncovering patterns in highly dimensional data, including identifying technical and biological variation.
Examples of applications in genomics:
- RNA-Seq
- Batch correction.
- Identifying Co-expressed Genes.
- scRNA-Seq
- Dimensionality reduction.
- Evolutionary Genomics
- Examining population structure.
- Examining relationships between species.
Visualizing PCA also includes other types of plots
What we think of as a PCA plot, is usually a scatter plot of PC1 vs PC2 (sometimes called a biplot). However, PCA is multidimensional, and we may want to visualize other aspects (e.g., other PCs, degree of variance by PC (Scree plots), scores vs loadings (biplot)). For this, check out factoextra or PCAtools
.
- Volcano plots are used to visualize statistical significance versus magnitude of change (fold change) between treatments or conditions in large genomic data sets (e.g., RNA-Seq, ChIP-Seq). Volcano plots are great for identifying genes or other features for further exploration.
Note
There are many other scatter plots used in genomics. We are simply focusing on PCA and volcano in this tutorial.
What is PCA?
Principal component analysis (PCA) is an exploratory linear dimension reduction method applied to highly dimensional (multivariate) data. It is an usupervised learning technique that treats all variables equally. The goal of PCA is to reduce the dimensionality of the data by transforming the data in a way that maximizes the variance explained. Read more here and here.
There is an assumption that variables are highly correlated.
How does it work?
- Data standardization - variables should be on the same scale (e.g., z-score transformation)
- Calculate covariance- create a covariance matrix to understand the relationship between variables
- Calculate eigenvectors (direction of variation) and eigenvalues (amount of variance) from the covariance matrix
- Select the top k principal components
- Transform the original data - obtain a representation of the data in the newly transformed space
Key points:
- The PCs represent a linear combination of the original variables
- PCs are uncorrelated
- The number of PCs is equivalent to the number of original variables in your data set.
- PC1 accounts for the most variance, and each subsequent PC will explain less and less variance.
For a more detailed explanation of PCA, see the following:
- https://benthamopen.com/contents/pdf/TOBIOIJ/TOBIOIJ-7-19.pdf
- https://stats.stackexchange.com/questions/2691/making-sense-of-principal-component-analysis-eigenvectors-eigenvalues
- https://www.billconnelly.net/?p=697
- https://builtin.com/data-science/step-step-explanation-principal-component-analysis
Perform PCA
There are many packages and functions available in R programming for performing PCA. Some of the most popular functions are stats::prcomp()
, stats::princomp()
, FactoMineR::PCA()
, and ade4::dudi.pca()
. These functions largely differ in the method used for PCA and the default arguments; see here. We will focus on stats::prcomp()
for this tutorial. If you are using an alternative function, make sure you are aware of the function arguments (and defaults).
PCA is used frequently in -omics fields. Often than not, there will be package specific functions for PCA and plotting PCA for different -omics analyses.
We can use the function prcomp()
to run PCA on our rlog transformed count data. This function requires numeric data. Notice by default, the function does not scale the data. If you have not transformed your data and you are working with variables of different units, consider scaling the data.
Why is scaling important?
Scaling and centering typically occur prior to PCA. Scaling removes biases from variables with high variances. Centering shifts the data to the origin by subtracting the mean of each feature. This is important for finding linear subspaces and removing the affect of the mean.
Large variance in observed variable will contribute most to the overall variance of computed principal component. If the observed values of variables have very different ranges, then the data needs to be normalized/scaled. - Aniket Patil
Setting a scaling parameter to TRUE will standardize the data (i.e., perform a z-score transformation) so that the data has a mean of 0 and standard deviation of 1.
Now we run prcomp()
:
#run PCA
pca<-prcomp(t(assay(air)))
#get structure of df
str(pca)
List of 5
$ sdev : num [1:8] 23.28 17.78 14.84 12.08 7.12 ...
$ rotation: num [1:16139, 1:8] -0.00677 0.00359 0.00056 -0.00108 0.01033 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:16139] "ENSG00000000003" "ENSG00000000419" "ENSG00000000457" "ENSG00000000460" ...
.. ..$ : chr [1:8] "PC1" "PC2" "PC3" "PC4" ...
$ center : Named num [1:16139] 9.45 9.02 7.89 5.83 12.45 ...
..- attr(*, "names")= chr [1:16139] "ENSG00000000003" "ENSG00000000419" "ENSG00000000457" "ENSG00000000460" ...
$ scale : logi FALSE
$ x : num [1:8, 1:8] -25.8 14.1 -17.5 26.8 -22.4 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:8] "SRR1039508" "SRR1039509" "SRR1039512" "SRR1039513" ...
.. ..$ : chr [1:8] "PC1" "PC2" "PC3" "PC4" ...
- attr(*, "class")= chr "prcomp"
The object pca
is a list of 5: the standard deviations of the principal components, a matrix of variable loadings, the centering and scaling parameters used, and the data projected on the principal components.
Plot PCA
To plot the first two axes of variation along with species information, we will need to make a data frame with this information. The axes are in pca$x
.
#Build a data frame
pcaData <- as.data.frame(pca$x[, 1:3]) # extract first three PCs
# add the sample information
pcaData<- data.frame(colData(air)) |> select(cell, dex, Run) |> cbind(pcaData)
# modify column names
colnames(pcaData) <- c("Cell","Treatment","Sample","PC1","PC2","PC3")
#Plot
ggplot(pcaData, aes(PC1, PC2, color = Treatment, shape = Cell)) +
geom_point(size = 2) + # adding data points
coord_fixed() # fixing coordinates
coord_fixed()
Tip 6, here, discusses the importance of aspect ratio and using coord_fixed()
more in detail.
This is a decent plot showing us how our RNA-Seq samples are related based on gene expression. PC1 always explains the most variation followed by each successive PC. From this plot, we see that our samples cluster by treatment along PC1, while PC2 is capturing variation in the cell lines.
In the plot, the axes are missing the % explained variance. Let's add custom axes. We can do this with the xlab()
and ylab()
functions or the functions labs()
. But first we need to grab some information from our PCA analysis. Let's use summary(pca)
. This function provides a summary of results for a variety of model fitting functions and methods.
#Extract % Variance Explained
summary(pca)
Importance of components:
PC1 PC2 PC3 PC4 PC5 PC6 PC7
Standard deviation 23.2778 17.7783 14.8386 12.0827 7.11804 5.66206 4.9081
Proportion of Variance 0.4071 0.2375 0.1654 0.1097 0.03807 0.02409 0.0181
Cumulative Proportion 0.4071 0.6446 0.8101 0.9197 0.95781 0.98190 1.0000
PC8
Standard deviation 1.09e-13
Proportion of Variance 0.00e+00
Cumulative Proportion 1.00e+00
PC1 and PC2 combined account for 65% of variance in the data. We can add this information directly to our plot using custom axes labels.
Add custom axes labels
#Plot
ggplot(pcaData, aes(PC1, PC2, color = Treatment, shape = Cell)) +
geom_point(size = 2) +
coord_fixed() +
xlab("PC1: 41%")+ #x axis label text
ylab("PC2: 24%") # y axis label text
Automating the "Proportion of Variance"
If you want to automate the "Proportion of Variance", you should call it directly in the code. For example,
ggplot(pcaData, aes(PC1, PC2, color = Treatment, shape = Cell)) +
geom_point(size = 2) +
coord_fixed() +
labs(x=paste0("PC1: ",round(summary(pca)$importance[2,1]*100),"%"),
y=paste0("PC2: ",round(summary(pca)$importance[2,2]*100),"%"))
Add PC3
To add multiple principal components, you can build multiple plots
a<-ggplot(pcaData, aes(PC1, PC2, color = Treatment, shape = Cell)) +
geom_point(size = 2) +
coord_fixed() +
labs(x=paste0("PC1: ",round(summary(pca)$importance[2,1]*100),"%"),
y=paste0("PC2: ",round(summary(pca)$importance[2,2]*100),"%"))
b<-ggplot(pcaData, aes(PC3, PC2, color = Treatment, shape = Cell)) +
geom_point(size = 2) +
coord_fixed() +
labs(x=paste0("PC3: ",round(summary(pca)$importance[2,3]*100),"%"),
y=paste0("PC2: ",round(summary(pca)$importance[2,2]*100),"%"))
c<-ggplot(pcaData, aes(PC1, PC3, color = Treatment, shape = Cell)) +
geom_point(size = 2) +
coord_fixed() +
labs(x=paste0("PC1: ",round(summary(pca)$importance[2,1]*100),"%"),
y=paste0("PC3: ",round(summary(pca)$importance[2,3]*100),"%"))+
theme(legend.position="none")
library(patchwork)
Warning: package 'patchwork' was built under R version 4.4.1
a + b + c + plot_layout(nrow=2,guides = 'collect')
A great package for simplifying this is PCAtools
, using the function pairsplot()
. This package has other functionality that also may be of interest, and it plays nicely with ggplot2
functions for customization.
Add a stat to our plot with stat_ellipse()
.
In scatter plots, the raw data is the focus of the plot, but for many other plots, this is not the case. You may wish to overlay a stat on your PCA. For example, ellipses are often added to PCA ordinations to emphasize group clustering with confidence intervals. By default, stat_ellipse()
uses the bivariate t distribution, but this can be modified. Let's add ellipses with 95% confidence intervals to our plot.
ggplot(pcaData, aes(PC1, PC2, color = Treatment, shape = Cell)) +
geom_point(size = 2) +
coord_fixed() +
labs(x=paste0("PC1: ",round(summary(pca)$importance[2,1]*100),"%"),
y=paste0("PC2: ",round(summary(pca)$importance[2,2]*100),"%"))+
stat_ellipse(geom="polygon",aes(group=Treatment),
level=0.95, alpha=0.2) #adding a stat
Warning: The following aesthetics were dropped during statistical transformation: shape.
ℹ This can happen when ggplot fails to infer the correct grouping structure in
the data.
ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
variable into a factor?
Using ggfortify
ggfortify
is an excellent package to consider for easily generating PCA plots. ggfortify
provides "unified plotting tools for statistics commonly used, such as GLM, time series, PCA families, clustering and survival analysis. The package offers a single plotting interface for these analysis results and plots in a unified style using 'ggplot2'."
library(ggfortify)
autoplot(pca, data = colData(air), colour = 'dex',shape='cell',size=2) +
coord_fixed()
Since this is a ggplot2 object, this can easily be customized by adding ggplot2 customization layers.
Using DESeq2 function
DESeq2 has its own function for PCA analysis, plotPCA()
. If you check out the source code for this function (getMethod("plotPCA","DESeqTransform")
), you will see that the data is filtered prior to prcomp()
to only include the 500 top most variable genes by default.
If you are interesting in examining the PC loadings, these are difficult to plot using ggplot2
alone. I recommend PCAtools
or factoextra
.
Plot Customization: Using themes
To create a publication quality plot, you will need to make several modifications to your basic PCA biplot code. We have already seen how to modify the default coordinate system, how to add additional statistics (e.g., ellipses with confidence intervals), and how to modify the axes labels. There are many more features that can be customized to make this publishable or fit a desired style.Changing non-data elements (related to axes, titles subtitles, gridlines, legends, etc.) of our plot can be done with theme()
. GGplot2 has a definitive default style that falls under one of their precooked themes, theme_gray()
. theme_gray()
is one of eight complete themes provided by ggplot2.
We can also specify and build a theme within our plot code or develop a custom theme to be reused across multiple plots. The theme function is the bread and butter of plot customization. Check out ?ggplot2::theme()
for a list of available parameters. There are many.
Let's see how this works by changing the fonts and text sizes and dropping minor grid lines:
ggplot(pcaData, aes(PC1, PC2, color = Treatment, shape = Cell)) +
geom_point(size = 2) +
coord_fixed() +
labs(x=paste0("PC1: ",round(summary(pca)$importance[2,1]*100),"%"),
y=paste0("PC2: ",round(summary(pca)$importance[2,2]*100),"%"))+
stat_ellipse(geom="polygon",aes(group=Treatment), level=0.95, alpha=0.2)+
theme_bw() + #start with a custom theme
theme(axis.text=element_text(size=12,family="Times New Roman"),
axis.title = element_text(size=12,family="Times New Roman"),
legend.text = element_text(size=12,family="Times New Roman"),
legend.title = element_text(size=12,family="Times New Roman"),
panel.grid.minor = element_blank())
Warning: The following aesthetics were dropped during statistical transformation: shape.
ℹ This can happen when ggplot fails to infer the correct grouping structure in
the data.
ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
variable into a factor?
You may want to establish a custom theme for reuse with a number of plots. See this great tutorial by Madeline Pickens for steps on how to do that.
A helpful color trick
When you have a lot of colors and you want to keep these colors consistent, you can use the following convenient functions to set a name attribute for a vector of colors.
Let's do this for our asthma treatments.
#defining colors
gcolors<-setNames(c("black","orange"),levels(pcaData$Treatment))
#Now plot
ggplot(pcaData, aes(PC1, PC2, color = Treatment, shape = Cell)) +
geom_point(size = 2) +
scale_color_manual(values=gcolors)+ #Adding the color argument
coord_fixed() +
labs(x=paste0("PC1: ",round(summary(pca)$importance[2,1]*100),"%"),
y=paste0("PC2: ",round(summary(pca)$importance[2,2]*100),"%"))+
stat_ellipse(geom="polygon",aes(group=Treatment,fill=Treatment),
level=0.95, alpha=0.2)+
scale_fill_manual(values=gcolors)+
theme_bw() + #start with a custom theme
theme(axis.text=element_text(size=12,family="Times New Roman"),
axis.title = element_text(size=12,family="Times New Roman"),
legend.text = element_text(size=12,family="Times New Roman"),
legend.title = element_text(size=12,family="Times New Roman"),
panel.grid.minor = element_blank())
Warning: The following aesthetics were dropped during statistical transformation: shape.
ℹ This can happen when ggplot fails to infer the correct grouping structure in
the data.
ℹ Did you forget to specify a `group` aesthetic or to convert a numerical
variable into a factor?
We can use this color palette for all plots including these two treatments to keep our figures consistent throughout a presentation or publication.
Creating a publication ready volcano plot
Now that we know how to create a PCA biplot, let's use what we have learned to also make a volcano plot.
A volcano plot is a type of scatterplot that shows statistical significance (P value) versus magnitude of change (fold change). It enables quick visual identification of genes with large fold changes that are also statistically significant. These may be the most biologically significant genes. --- Maria Doyle, 2021
First, we need to do a bit of data wrangling. The tidyverse
packages work great for this task.
#Wrangle the data
#add gene names to differential expression results
gene_names<- data.frame(rowData(air)) |> select(gene_name) |> tibble::rownames_to_column("ID")
dexp<- dexp |> tibble::rownames_to_column("ID") |>
left_join(gene_names, by="ID")
# Call genes significantly differentially expressed if they have a
#p-value less than 0.05 and a logFC greater than or equal to 2.
dexp_sigtrnsc <- dexp |>
mutate(Significant = padj < 0.05 & abs(log2FoldChange) >= 2) |>
arrange(padj)
#extract top 6 genes to use for the labels
topgenes <- dexp_sigtrnsc$gene_name[c(1:6)]
Let's take a quick look at the wrangled data and our top genes for later labeling:
head(dexp_sigtrnsc)
ID baseMean log2FoldChange lfcSE pvalue
1 ENSG00000152583 997.9612 4.555676 0.1858858 7.129668e-136
2 ENSG00000165995 495.5675 3.276456 0.1326038 5.859407e-136
3 ENSG00000120129 3411.9661 2.935137 0.1227877 2.414560e-128
4 ENSG00000101347 12712.9456 3.754493 0.1579042 7.501584e-128
5 ENSG00000189221 2343.5733 3.337353 0.1432593 1.140346e-122
6 ENSG00000211445 12298.1966 3.711558 0.1693331 5.201708e-110
padj gene_name Significant
1 5.530127e-132 SPARCL1 TRUE
2 5.530127e-132 CACNB2 TRUE
3 1.248569e-124 DUSP1 TRUE
4 2.909302e-124 SAMHD1 TRUE
5 3.538038e-119 MAOA TRUE
6 1.344901e-106 GPX3 TRUE
topgenes
[1] "SPARCL1" "CACNB2" "DUSP1" "SAMHD1" "MAOA" "GPX3"
Significant differential expression was assigned based on an absolute log fold change greater than or equal to 2 and an FDR corrected p-value less than 0.05.
Enhanced Volcano
There is a dedicated package for creating volcano plots available on Bioconductor, EnhancedVolcano
. Plots created using this package can be customized using ggplot2 functions and syntax.
Using EnhancedVolcano:
#The default cut-off for log2FC is >|2|
#the default cut-off for P value is 10e-6
library(EnhancedVolcano)
Loading required package: ggrepel
Warning: package 'ggrepel' was built under R version 4.4.1
EnhancedVolcano(dexp,
title = "Enhanced Volcano with Airways",
lab = dexp_sigtrnsc$gene_name,
x = 'log2FoldChange',
y = 'padj')
If you are interested in using this package, check out this tutorial.
Let's start our plot with the <DATA>
, <GEOM_FUNCTION>
, and <MAPPING>
. We do not need to fix the coordinate system because we are working with two different values on the x and y and we don't need any special coordinate system modifications. Let's plot logFC on the x axis and the mutated column with our adjusted p-values on the y-axis and set the significant p-values off from the non-significant by color. We can also go ahead and customize the color scale, since we have learned how to do that.
ggplot(data=dexp_sigtrnsc) +
geom_point(aes(x = log2FoldChange, y = log10(padj), color = Significant,
alpha = Significant)) +
scale_color_manual(values = c("black", "#e11f28"))
Warning: Using alpha for a discrete variable is not advised.
This is not exactly what we want so let's keep working.
Immediately, you should notice that the figure is upside down compared to what we would expect from a volcano plot. There are two possible ways to fix this. We could transform the adjusted p-values by multiplying by -1 OR we could work with our axes scales. Aside from text modifications, we haven't yet changed the scaling of the axes. Let's see how we can modify the scale of the y-axis.
Changing axes scales
To change the y axis scale, we will need a specific function. These functions generally start with scale_y...
. In our case we want to reverse our axis so that increasingly negative is going in the positive direction rather than the negative direction. Luckily, there is a function to reverse our axis; see ?scale_y_reverse()
.
ggplot(data=dexp_sigtrnsc) +
geom_point(aes(x = log2FoldChange, y = log10(padj), color = Significant,
alpha = Significant)) +
scale_color_manual(values = c("black", "#e11f28")) +
scale_y_reverse(limits=c(0,-150))
Warning: Using alpha for a discrete variable is not advised.
This looks pretty good, but we can tidy it up more by working with our legend guides and our theme.
Modifying legends
We can modify many aspects of the figure legend using the function guide()
. Let's see how that works and go ahead and customize some theme arguments. Notice that the legend position is specified in theme()
.
ggplot(data=dexp_sigtrnsc) +
geom_point(aes(x = log2FoldChange, y = log10(padj), color = Significant,
alpha = Significant)) +
scale_color_manual(values = c("black", "#e11f28")) +
scale_y_reverse(limits=c(0,-150))+
guides(alpha= "none",
color= guide_legend(title="Significant DE")) +
theme_bw() +
theme(
panel.border = element_blank(),
axis.line = element_line(),
panel.grid.major = element_line(size = 0.2),
panel.grid.minor = element_line(size = 0.1),
text = element_text(size = 12),
legend.position = "bottom",
axis.text.x = element_text(angle = 30, hjust = 1, vjust = 1)
)
Warning: The `size` argument of `element_line()` is deprecated as of ggplot2 3.4.0.
ℹ Please use the `linewidth` argument instead.
Warning: Using alpha for a discrete variable is not advised.
Lastly, let's layer another geom function to label our top six differentially abundant genes based on significance. We can use geom_text_repel()
from library(ggrepel)
, which is a variation on geom_text()
.
ggplot(data=dexp_sigtrnsc) +
geom_point(aes(x = log2FoldChange, y = log10(padj), color = Significant,
alpha = Significant)) +
geom_text_repel(data=dexp_sigtrnsc %>%
filter(gene_name %in% topgenes),
aes(x = log2FoldChange, y = log10(padj),label=gene_name),
nudge_y=0.5,hjust=0.5,direction="both",
segment.color="gray")+
scale_color_manual(values = c("black", "#e11f28")) +
scale_y_reverse(limits=c(0,-150))+
guides(alpha= "none",
color= guide_legend(title="Significant DE")) +
theme_bw() +
theme(
panel.border = element_blank(),
axis.line = element_line(),
panel.grid.major = element_line(size = 0.2),
panel.grid.minor = element_line(size = 0.1),
text = element_text(size = 12),
legend.position = "bottom",
axis.text.x = element_text(angle = 30, hjust = 1, vjust = 1)
)
Warning: Using alpha for a discrete variable is not advised.
To save the file, use ggsave()
.
Session Info
sessionInfo()
R version 4.4.0 (2024-04-24)
Platform: aarch64-apple-darwin20
Running under: macOS Sonoma 14.7.1
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: Europe/Berlin
tzcode source: internal
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base
other attached packages:
[1] EnhancedVolcano_1.22.0 ggrepel_0.9.6
[3] ggfortify_0.4.17 patchwork_1.3.0
[5] DESeq2_1.44.0 SummarizedExperiment_1.34.0
[7] Biobase_2.64.0 MatrixGenerics_1.16.0
[9] matrixStats_1.4.1 GenomicRanges_1.56.2
[11] GenomeInfoDb_1.40.1 IRanges_2.38.1
[13] S4Vectors_0.42.1 BiocGenerics_0.50.0
[15] dplyr_1.1.4 ggplot2_3.5.1
loaded via a namespace (and not attached):
[1] gtable_0.3.6 xfun_0.49 htmlwidgets_1.6.4
[4] lattice_0.22-6 vctrs_0.6.5 tools_4.4.0
[7] generics_0.1.3 parallel_4.4.0 tibble_3.2.1
[10] pkgconfig_2.0.3 Matrix_1.7-1 lifecycle_1.0.4
[13] GenomeInfoDbData_1.2.12 compiler_4.4.0 farver_2.1.2
[16] stringr_1.5.1 munsell_0.5.1 codetools_0.2-20
[19] htmltools_0.5.8.1 yaml_2.3.10 pillar_1.10.0
[22] crayon_1.5.3 tidyr_1.3.1 MASS_7.3-61
[25] BiocParallel_1.38.0 DelayedArray_0.30.1 abind_1.4-8
[28] tidyselect_1.2.1 locfit_1.5-9.10 digest_0.6.37
[31] stringi_1.8.4 purrr_1.0.2 labeling_0.4.3
[34] fastmap_1.2.0 grid_4.4.0 colorspace_2.1-1
[37] cli_3.6.3 SparseArray_1.4.8 magrittr_2.0.3
[40] S4Arrays_1.4.1 withr_3.0.2 scales_1.3.0
[43] UCSC.utils_1.0.0 rmarkdown_2.29 XVector_0.44.0
[46] httr_1.4.7 gridExtra_2.3 evaluate_1.0.1
[49] knitr_1.49 rlang_1.1.4 Rcpp_1.0.13-1
[52] glue_1.8.0 rstudioapi_0.17.1 jsonlite_1.8.9
[55] R6_2.5.1 zlibbioc_1.50.0
References
Sources for content included R4DS and Holmes and Huber, 2021.