---
title: "Creating a Volcano Plot"
format: html
---
## Introduction
Here we will create a volcano plot from differential expression results.
A volcano plot is a scatter plot commonly used in RNA-Seq analysis to identify genes whose expression changes are both large and statistically significant. The log2 fold change in expression is plotted on the x-axis, and statistical significance, usually as the negative log10 of the p-value, is plotted on the y-axis. Learn more about volcano plots [here](https://training.galaxyproject.org/training-material/topics/transcriptomics/tutorials/rna-seq-viz-with-volcanoplot/tutorial.html){.external target="_blank"}.
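To make the y-axis convention concrete, here is a minimal sketch of the negative log10 transformation that volcano plots conventionally apply to p-values. The values in `toy_padj` are made up for illustration and do not come from the dataset used in this document.

```{r}
# Toy adjusted p-values (hypothetical, for illustration only).
toy_padj <- c(0.5, 0.05, 0.001, 1e-10)

# Volcano plots show -log10 of the p-value on the y-axis,
# so smaller p-values end up higher on the plot.
toy_y <- -log10(toy_padj)

print(data.frame(padj = toy_padj, neg_log10_padj = toy_y))
```

Because the transformation is monotonically decreasing, the most significant genes appear at the top of the plot.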
Note: In the following plot, labels are Ensembl IDs. For a more useful figure, consider adding an annotation step that maps these IDs to gene symbols.
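One possible shape for such an annotation step is sketched below using base R only. The lookup table `toy_annotation` is a hand-written stand-in; in a real analysis it would come from an annotation resource, which is not shown here. The third ID in `toy_ids` is a deliberately fake placeholder used to show the fallback behaviour.

```{r}
# Hypothetical lookup table standing in for a real annotation resource.
toy_annotation <- data.frame(
  ensembl_id = c("ENSG00000141510", "ENSG00000012048"),
  symbol     = c("TP53", "BRCA1"),
  stringsAsFactors = FALSE
)

# Toy IDs to annotate; the last one is a placeholder with no match.
toy_ids <- c("ENSG00000012048", "ENSG00000141510", "ENSG00000000000")

# match() returns NA for IDs that are missing from the lookup table.
toy_symbols <- toy_annotation$symbol[match(toy_ids, toy_annotation$ensembl_id)]

# Fall back to the Ensembl ID when no symbol is available.
toy_labels <- ifelse(is.na(toy_symbols), toy_ids, toy_symbols)
print(toy_labels)
```

Falling back to the original ID keeps every gene labelled even when the annotation is incomplete.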
## Create a Volcano Plot from DESeq2 differential expression results
### Load the libraries
```{r}
library(EnhancedVolcano)
library(dplyr)
```
### Load the data
Genes whose adjusted p-value (`padj`) is NA are removed; these are genes that `DESeq2` excluded as part of independent filtering.
```{r}
# Read the DESeq2 results (gene IDs become row names) and drop genes with NA padj
data <- read.csv("deseq2_DEGs.csv", row.names = 1) %>%
  filter(!is.na(padj))
```
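As a quick sanity check on the filtering step, the sketch below applies the same NA filter to a small made-up table and reports how many genes were dropped. It uses base R indexing rather than `dplyr` so that it runs on its own; `toy_res` and its gene names are hypothetical.

```{r}
# Toy results table (hypothetical values); two genes have NA padj.
toy_res <- data.frame(
  log2FoldChange = c(2.1, -0.3, 0.8, -1.7),
  padj           = c(0.001, NA, 0.2, NA),
  row.names      = c("geneA", "geneB", "geneC", "geneD")
)

n_before <- nrow(toy_res)

# Keep only rows with a non-missing adjusted p-value.
toy_kept <- toy_res[!is.na(toy_res$padj), , drop = FALSE]

n_removed <- n_before - nrow(toy_kept)
cat("Removed", n_removed, "of", n_before, "genes with NA padj\n")
```

Reporting the number of removed genes makes it easier to spot a results file in which an unexpectedly large share of genes was filtered out.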
### Plot
Select the subset of genes to label in the plot: here, the first five row names in the results table.
```{r}
labs <- head(row.names(data), 5)
```
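Taking the first five rows labels the most significant genes only if the results file happens to be sorted by significance, which this document does not state. As an alternative sketch, the labels can be chosen explicitly by smallest adjusted p-value. The table `toy_ranked_input` and the choice of three labels are illustrative assumptions.

```{r}
# Toy results table (hypothetical values), deliberately unsorted.
toy_ranked_input <- data.frame(
  log2FoldChange = c(0.4, 3.2, -2.5, 1.1, -0.2, 2.0),
  padj           = c(0.6, 1e-8, 1e-12, 0.03, 0.9, 1e-4),
  row.names      = paste0("gene", 1:6)
)

# Order rows from smallest to largest adjusted p-value.
toy_ranked <- toy_ranked_input[order(toy_ranked_input$padj), , drop = FALSE]

# Label the three most significant genes.
toy_labs <- head(rownames(toy_ranked), 3)
print(toy_labs)
```

The resulting character vector can be passed to `selectLab` in the same way as `labs`.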
Figure 1 shows which genes are both statistically significant and have large fold changes.
```{r}
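EnhancedVolcano(data,
  title = "Enhanced Volcano with Airways",
  lab = rownames(data),    # candidate labels: one per gene
  selectLab = labs,        # only these genes are actually labelled
  labSize = 3,
  drawConnectors = TRUE,   # draw lines from labels to their points
  x = "log2FoldChange",    # column plotted on the x-axis
  y = "padj")              # column plotted (as -log10) on the y-axis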
```
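The plot separates genes by a significance threshold and a fold-change threshold, which `EnhancedVolcano` exposes through its `pCutoff` and `FCcutoff` arguments. The sketch below reproduces that kind of classification by hand on a made-up table so the counts behind each group can be inspected. The cutoff values (`padj_cutoff = 0.05`, `lfc_cutoff = 1`) and the table `toy_de` are illustrative assumptions, not values taken from this analysis.

```{r}
# Toy results table (hypothetical values).
toy_de <- data.frame(
  log2FoldChange = c(2.4, -1.8, 0.3, 1.5, -0.1),
  padj           = c(1e-6, 0.002, 0.0001, 0.4, 0.8),
  row.names      = paste0("gene", 1:5)
)

# Illustrative thresholds (assumptions, adjust to your analysis).
padj_cutoff <- 0.05
lfc_cutoff  <- 1

is_sig <- toy_de$padj < padj_cutoff
is_big <- abs(toy_de$log2FoldChange) > lfc_cutoff

# Assign each gene to one of four groups.
toy_de$category <- ifelse(is_sig & is_big, "significant, large change",
                   ifelse(is_sig, "significant only",
                   ifelse(is_big, "large change only", "neither")))

print(table(toy_de$category))
```

Genes in the "significant, large change" group correspond to the upper-left and upper-right regions of a volcano plot drawn with the same thresholds.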