11. Advanced quality control with MultiQC
This page uses content directly from the Biostar Handbook by Istvan Albert.
Start by activating the bioinfo environment.
conda activate bioinfo
Create a new directory for the multiqc data.
mkdir multi
cd multi
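Download and unpack the example data.
curl http://data.biostarhandbook.com/data/sequencing-platform-data.tar.gz --output sequencing-platform-data.tar.gz
tar zxvf sequencing-platform-data.tar.gz
Run FastQC on both files. The --extract flag unpacks each report into its own directory.
fastqc --extract illumina.fq iontorrent.fq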
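A quick sanity check on the input files can be sketched as follows. The helper name `count_reads` is made up for this illustration, and it assumes every FASTQ record spans exactly four lines (no wrapped sequence lines). The numbers it prints should agree with the total sequence counts that FastQC reports for the same files.

```shell
# Count reads in a FASTQ file, assuming each record spans exactly four lines.
# count_reads is a hypothetical helper, not part of FastQC or MultiQC.
count_reads() {
    awk 'END { print NR / 4 }' "$1"
}

# Report the read count for each example file, if it is present.
for fq in illumina.fq iontorrent.fq; do
    if [ -f "$fq" ]; then
        printf '%s\t%s\n' "$fq" "$(count_reads "$fq")"
    else
        echo "$fq not found; run the download step first" >&2
    fi
done
```

If a file uses wrapped sequence lines, the four-line assumption breaks and the count will be wrong, so treat this only as a rough check.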
Now we can use MultiQC to combine the FastQC report directories into a single report.
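Try installing the MultiQC program with pip.
pip install multiqc
Alternatively, install MultiQC into a separate conda environment.
conda create -n qc python=3.7
conda activate qc
conda install multiqc
Check that the program runs.
multiqc --help
Run MultiQC on the two FastQC report directories.
multiqc illumina_fastqc iontorrent_fastqc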
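The MultiQC run can also be wrapped in a small guard that checks its inputs first. This is a minimal sketch, not the handbook's method: the function name `run_multiqc` is invented for this example, while `--force` (overwrite an existing report) and `--outdir` (where to write the report) are standard MultiQC options.

```shell
# Hypothetical wrapper: run MultiQC only when all inputs are in place.
# Usage: run_multiqc OUTDIR DIR [DIR...]
run_multiqc() {
    mqc_outdir=$1
    shift
    # Every input must be an existing FastQC report directory.
    for mqc_dir in "$@"; do
        if [ ! -d "$mqc_dir" ]; then
            echo "missing input directory: $mqc_dir" >&2
            return 1
        fi
    done
    # MultiQC itself must be on the PATH.
    if ! command -v multiqc >/dev/null 2>&1; then
        echo "multiqc not found on PATH" >&2
        return 1
    fi
    multiqc --force --outdir "$mqc_outdir" "$@"
}

# Write the report into the current directory, as the plain command does.
run_multiqc . illumina_fastqc iontorrent_fastqc || echo "MultiQC was not run" >&2
```

Passing a different first argument, for example `reports`, would place `multiqc_report.html` in that directory instead of the current one.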
Open the file "multiqc_report.html" in a web browser. By default, MultiQC writes it to the current directory. The report contains the following sections.
General Statistics
FASTQC: Sequence Counts
Sequence Quality Histograms
Per Sequence Quality Scores
Per Base Sequence Content
Per Sequence GC Content
FASTQC: Per Base N Content
Sequence Length Distribution
Sequence Duplication Levels
Overrepresented sequences
Adapter Content
Status Checks
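The per-module statuses can also be read on the command line. Because FastQC was run with --extract, each report directory should contain a tab-separated `summary.txt` with one PASS, WARN, or FAIL line per module. The sketch below prints those lines; `show_summary` is a name made up for this example.

```shell
# Hypothetical helper: print the status and module name from each
# FastQC report directory given as an argument.
show_summary() {
    for qc_dir in "$@"; do
        if [ -f "$qc_dir/summary.txt" ]; then
            echo "== $qc_dir"
            # Column 1 is the status, column 2 is the module name.
            cut -f 1,2 "$qc_dir/summary.txt"
        else
            echo "== $qc_dir: summary.txt not found" >&2
        fi
    done
}

show_summary illumina_fastqc iontorrent_fastqc
```

Comparing the two listings side by side gives a quick view of which modules differ between the two sequencing platforms.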