FastQC Aggregate Report provides a single-webpage view of a set of FastQC reports. A typical NGS study can produce tens to hundreds of files, which makes reviewing QC reports and navigating the computer's filesystem tedious. FastQC Aggregate Report lets you quickly identify problematic areas and drill down into the individual quality control reports. Color coding of the summarized results helps focus attention on potentially problematic data.
Feature Set:
- Provides a single-webpage view of a set of FastQC reports.
- Provides a table summarizing the summary statistics from all of the FastQC reports. Color coding suggests areas to focus on.
- Provides links to all of the individual detailed FastQC reports, so you need not drill down through your file system to find each report.
- Provides thumbnail views of all the graphics for each sample. Clicking a graphic gives a whole-page view.
- Supplements the MultiQC report.
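To illustrate the idea of a one-page view with links to each report, here is a minimal Python 3 sketch. It is not the tool's actual implementation: the function names `find_fastqc_reports` and `build_index_page` are hypothetical, and the only assumption taken from FastQC itself is that it names its HTML reports `<sample>_fastqc.html`.

```python
import glob
import html
import os

REPORT_SUFFIX = "_fastqc.html"  # FastQC's naming convention for HTML reports


def find_fastqc_reports(results_dir):
    """Return sorted paths of FastQC HTML reports found under results_dir."""
    pattern = os.path.join(results_dir, "**", "*" + REPORT_SUFFIX)
    return sorted(glob.glob(pattern, recursive=True))


def build_index_page(report_paths, title="FastQC reports"):
    """Build a single HTML page with one link per FastQC report."""
    items = []
    for path in report_paths:
        name = os.path.basename(path)
        # Use the sample name (file name without the suffix) as the link label.
        if name.endswith(REPORT_SUFFIX):
            sample = name[:-len(REPORT_SUFFIX)]
        else:
            sample = name
        items.append('<li><a href="{0}">{1}</a></li>'.format(
            html.escape(path, quote=True), html.escape(sample)))
    return (
        "<html><head><title>{0}</title></head>"
        "<body><h1>{0}</h1><ul>\n{1}\n</ul></body></html>"
    ).format(html.escape(title), "\n".join(items))


if __name__ == "__main__":
    # "fastqc_results" is a placeholder directory name.
    print(build_index_page(find_fastqc_reports("fastqc_results")))
```

Keeping the page builder separate from the directory scan means the same function can be used with a hand-picked list of reports.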
Software Language: Python 2
System Requirements: Web Browser
Contact:
- Peter FitzGerald, Head, Genome Analysis Unit, fitzgepe@helix.nih.gov
- Carl McIntosh, Bioinformatics Analyst and Engineer, carl.mcintosh@nih.gov
How to use FastQC Aggregate Report and FastQC
FastQC Aggregate Report is a dashboard for a set of fastq files that have been run through FastQC. It provides an efficient way to review quality control information and navigate to the detailed reports. Here are some factors that affect the quality of sequence data and downstream analysis:
- Input DNA quality – use non-degraded DNA with no contamination.
- Library construction – short libraries will result in “bleed in” of adapter sequence and reduce overall yield. Short libraries may be unavoidable because of the input material, as with small RNA data.
- The experience level of the SF (sequencing facility) in wet-bench techniques, and its use of good reagents.
- Sequencing platform, for example Illumina or PacBio.
We recommend running next-generation sequencing data through a rigorous quality control workflow. Here is a very basic workflow for a set of fastq files:
- Optional: Run FastQC Aggregate Report on a set of FastQC results prior to the trimming step.
- Run the trimming step on the fastq files. This includes adapter removal and quality trimming of reads using a program such as Trimmomatic.
- Run FastQC Aggregate Report on the FastQC results for the trimmed fastq files.
- Run FastQ Screen on the trimmed fastq files to determine whether they contain contamination from other organisms.
- Run MultiQC, which also summarizes FastQC and FastQ Screen results. FastQC Aggregate Report provides some features not found in MultiQC.
- Review the resulting QC information and determine whether the data is clean enough for downstream analysis. Long, high-quality reads free of contaminating genomes are the gold standard.
- Perform a post-alignment sanity check. Use an alignment viewer such as the Integrative Genomics Viewer (IGV) to make sure the alignments are good.
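The command-line part of this workflow can be sketched in Python as a list of commands for one single-end sample. This is only an illustration under several assumptions: the helper name `build_qc_commands`, the file names, and the jar name `trimmomatic.jar` are placeholders, and the Trimmomatic step values are example settings rather than recommendations. The FastQC Aggregate Report invocation is left out because its command line is not described here, and FastQ Screen also needs its own configuration of reference databases.

```python
import shlex


def build_qc_commands(fastq, trimmed_fastq, adapters_fa, qc_dir):
    """Return the QC commands for one single-end sample, in workflow order."""
    return [
        # FastQC on the raw reads (input for the optional pre-trimming report).
        ["fastqc", "--outdir", qc_dir, fastq],
        # Adapter removal and quality trimming; step values are examples only.
        ["java", "-jar", "trimmomatic.jar", "SE", fastq, trimmed_fastq,
         "ILLUMINACLIP:{}:2:30:10".format(adapters_fa),
         "SLIDINGWINDOW:4:15", "MINLEN:36"],
        # FastQC on the trimmed reads (input for the post-trimming report).
        ["fastqc", "--outdir", qc_dir, trimmed_fastq],
        # Contamination screen on the trimmed reads.
        ["fastq_screen", "--outdir", qc_dir, trimmed_fastq],
        # MultiQC summary over everything written to qc_dir.
        ["multiqc", "--outdir", qc_dir, qc_dir],
    ]


if __name__ == "__main__":
    commands = build_qc_commands(
        "sample1.fastq.gz", "sample1.trimmed.fastq.gz", "adapters.fa", "qc")
    for cmd in commands:
        # Print rather than execute; pass cmd to subprocess.run(cmd, check=True)
        # to actually run a step once the tools are installed.
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Building the commands as argument lists, instead of shell strings, avoids quoting problems when file names contain spaces.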
Comparing the pre- and post-trimming FastQC Aggregate Reports (steps 1 and 3) will give you a sense of how effective the trimming stage was. Below we discuss how to review FastQC Aggregate Report and FastQC reports.
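One simple way to quantify the effect of trimming is the fraction of reads retained. The sketch below reads the Basic Statistics module from the text of the `fastqc_data.txt` files that FastQC writes before and after trimming. The function names are hypothetical, and this is not necessarily how FastQC Aggregate Report itself makes the comparison.

```python
def parse_basic_statistics(fastqc_data_text):
    """Return the Basic Statistics module of fastqc_data.txt as a dict."""
    stats = {}
    in_module = False
    for line in fastqc_data_text.splitlines():
        if line.startswith(">>Basic Statistics"):
            in_module = True
        elif line.startswith(">>END_MODULE"):
            if in_module:
                break  # end of the Basic Statistics module
        elif in_module and not line.startswith("#"):
            # Each data line is "<measure><TAB><value>".
            key, _, value = line.partition("\t")
            stats[key] = value
    return stats


def trimming_retention(pre_text, post_text):
    """Return the fraction of reads that remain after trimming."""
    pre = int(parse_basic_statistics(pre_text)["Total Sequences"])
    post = int(parse_basic_statistics(post_text)["Total Sequences"])
    if pre == 0:
        raise ValueError("pre-trimming file contains no sequences")
    return float(post) / pre
```

A low retained fraction suggests that many reads fell below the minimum length filter or were dominated by adapter sequence.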
FastQC Aggregate Reports – Basic Statistics and QC Test Status Table
This table provides a very quick overview of the FastQC reports. Here is a column overview:
- Sample contains two links
- Total Sequences – the number of sequences in a fastq file. The SF (sequencing facility) should provide you with an expected range of read counts. Follow up with the SF if expectations are not met.
- Sequences flagged as poor quality – follow up with the SF if this value is high and the Total Sequences value is low.
- Sequence length – trimming produces a range of sequence (read) lengths. The trimming step bounds the lower value if you choose a minimum read length filter. Retaining short trimmed reads can cause reads to map to multiple locations in the reference genome.
- %GC – check whether this is within the expected range given the organism and genome regions.
- The last 11 columns in this table give an overview of the 12 sections in FastQC. They are color coded: green is for passing metrics, while yellow (warning) and red (failing) flag areas that need your additional attention.
- Review and decide whether these metrics are of concern.
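The color-coded status columns correspond to the PASS, WARN and FAIL results that FastQC records for each section. As a minimal sketch, the code below parses the text of a FastQC `summary.txt` file (tab-separated status, section name and file name) and lists the sections that need attention. The function names and the color mapping are illustrative assumptions that follow the description above, not the tool's own code.

```python
# Colors as described for the table: green passes, yellow warns, red fails.
STATUS_COLORS = {"PASS": "green", "WARN": "yellow", "FAIL": "red"}


def parse_summary(summary_text):
    """Return a list of (section, status) pairs from summary.txt text."""
    rows = []
    for line in summary_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Each line is "<STATUS><TAB><section name><TAB><file name>".
        status, section, _filename = line.split("\t", 2)
        rows.append((section, status))
    return rows


def flagged_sections(rows):
    """Return (section, color) pairs for every section that did not pass."""
    return [(section, STATUS_COLORS[status])
            for section, status in rows if status != "PASS"]
```

For example, a sample whose summary contains a WARN for "Per base sequence content" and a FAIL for "Adapter Content" would yield those two sections, tagged yellow and red.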
Babraham Bioinformatics, the developer of FastQC, provides the following useful resources for interpreting FastQC reports. The links are replicated below:
- YouTube Tutorial video overview
- Set of example reports
- Description and interpretation of each of the FastQC sections:
FastQC Aggregate Reports – Thumbnail graphics
FastQC Aggregate Reports