Practice Lesson 3

This practice lesson accompanies Lesson 3 of Microbiome Analysis with QIIME 2. In it, we will generate a feature table and representative sequences, continuing with the data from Zhang et al. 2022.

  1. Change directories to Practice (cd Practice).
  2. Check and trim primers and non-biological sequences. We will trim primers targeting V3-V4 (F: CCTACGGGNGGCWGCAG, R: GACTACHVGGGTATCTAATCC). You should output trimmed sequences to a directory called 02_trim.

    Note: Zhang et al. 2022 stated,

    The universal primers 515F 5′-GTGCCAGCMGCCGCGG-3′ and 907R 5′-CCGTCAATTCMTTTRAGTTT-3′ were applied to capture the V4–V5 region of 16S rDNA.

    However, this was clearly not the case. If you run q2-cutadapt with these primers, you will find that the forward primer sits toward the center or the ends of the reads, and the reverse primer cannot be located at all.

    Solution
    mkdir 02_trim
    qiime cutadapt trim-paired \
    --i-demultiplexed-sequences 01_import/import.qza \
    --p-front-f CCTACGGGNGGCWGCAG \
    --p-front-r GACTACHVGGGTATCTAATCC \
    --p-overlap 6 \
    --p-discard-untrimmed \
    --o-trimmed-sequences 02_trim/demux-trimmed.qza \
    --verbose | tee 02_trim/cutadaptresults.log  
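
One way to sanity-check a primer choice before trimming is to count how many reads actually begin with the primer. The following is a minimal sketch, not part of the lesson's solution: the helper names `iupac_to_regex` and `count_primer_starts` are hypothetical, and the three FASTQ records are made up for illustration. On real data you would point the function at an uncompressed FASTQ file from your own import.

```shell
# Expand IUPAC degenerate bases into a regular expression (hypothetical helper).
iupac_to_regex() {
  printf '%s' "$1" | sed \
    -e 's/R/[AG]/g' -e 's/Y/[CT]/g' -e 's/S/[CG]/g' -e 's/W/[AT]/g' \
    -e 's/K/[GT]/g' -e 's/M/[AC]/g' -e 's/B/[CGT]/g' -e 's/D/[AGT]/g' \
    -e 's/H/[ACT]/g' -e 's/V/[ACG]/g' -e 's/N/[ACGT]/g'
}

# Count reads in an uncompressed FASTQ file ($1) that start with primer ($2).
count_primer_starts() {
  # Sequence lines are every 4th line, starting at line 2.
  awk 'NR % 4 == 2' "$1" | grep -c -E "^$(iupac_to_regex "$2")"
}

# Made-up example reads: two start with the V3-V4 forward primer, one does not.
fastq=$(mktemp)
cat > "$fastq" <<'EOF'
@read1
CCTACGGGAGGCAGCAGTGG
+
IIIIIIIIIIIIIIIIIIII
@read2
CCTACGGGTGGCTGCAGTAG
+
IIIIIIIIIIIIIIIIIIII
@read3
GTGCCAGCAGCCGCGGTAAT
+
IIIIIIIIIIIIIIIIIIII
EOF

count=$(count_primer_starts "$fastq" CCTACGGGNGGCWGCAG)
echo "reads starting with primer: $count"
rm -f "$fastq"
```

If most reads begin with the primer, anchoring it at the 5' end (as `--p-front-f` does) is a sensible choice; a low count suggests the primer is wrong or sits elsewhere in the read.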
    

  3. Create a new summary table.

    Solution
    qiime demux summarize \
    --i-data 02_trim/demux-trimmed.qza \
    --o-visualization 02_trim/demux-trimmed-summary.qzv
    

  4. Generate a feature table using DADA2 and save to 03_denoise.

    Solution
    mkdir 03_denoise  
    # takes 30 minutes without multi-threading
    qiime dada2 denoise-paired \
    --i-demultiplexed-seqs 02_trim/demux-trimmed.qza \
    --p-trunc-len-f 0 \
    --p-trunc-len-r 0 \
    --p-n-threads 2 \
    --o-representative-sequences 03_denoise/asv-sequences.qza \
    --o-table 03_denoise/feature-table.qza \
    --o-denoising-stats 03_denoise/dada2-stats.qza   
    

  5. Summarize DADA2 stats using qiime metadata tabulate.

    Solution
    qiime metadata tabulate \
    --m-input-file 03_denoise/dada2-stats.qza \
    --o-visualization 03_denoise/dada2-stats-summ.qzv  
    

    What range of percentages of sequences was retained after quality filtering, denoising, merging, and removal of chimeric sequences? (Check the last column of your stats summary.)

    Solution
    66.58-87.64%
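
The range can also be read off on the command line rather than by eye. This is a rough sketch under two assumptions: that the stats artifact has been exported to a tab-separated file (for example with `qiime tools export`, which should write a `stats.tsv`), and that the last column holds the percentage of input reads that are non-chimeric. The helper name `stats_range` and the three sample rows are made up for illustration; they are not the lesson's results.

```shell
# Print the min-max range of the last column of a DADA2 stats TSV ($1).
# Skips the header row and any '#'-prefixed directive line (e.g. #q2:types).
stats_range() {
  awk -F'\t' 'NR > 1 && $1 !~ /^#/ {
      v = $NF + 0
      if (!seen || v < min) min = v
      if (!seen || v > max) max = v
      seen = 1
  }
  END { if (seen) printf "%.2f-%.2f%%\n", min, max }' "$1"
}

# Illustrative, made-up table with a reduced set of columns.
stats=$(mktemp)
printf 'sample-id\tinput\tnon-chimeric\tpercentage of input non-chimeric\n' > "$stats"
printf '#q2:types\tnumeric\tnumeric\tnumeric\n' >> "$stats"
printf 'S1\t1000\t715\t71.5\n'  >> "$stats"
printf 'S2\t2000\t1605\t80.25\n' >> "$stats"
printf 'S3\t1000\t900\t90\n'    >> "$stats"

range=$(stats_range "$stats")
echo "$range"
rm -f "$stats"
```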

  6. Summarize your feature table and representative sequences. The path to the sample information (sample metadata) is /data/practice/metadata.txt.

    Solution
    qiime feature-table summarize \
    --i-table 03_denoise/feature-table.qza \
    --m-sample-metadata-file /data/practice/metadata.txt \
    --o-visualization 03_denoise/feature-table-summ.qzv
    qiime feature-table tabulate-seqs \
    --i-data 03_denoise/asv-sequences.qza \
    --o-visualization 03_denoise/asv-sequences-summ.qzv  
    

    Follow-up questions:

    How many unique ASVs resulted?

    Solution
    6,881
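
The ASV count can be cross-checked outside the visualization by counting FASTA headers. This is a sketch only, and it assumes the representative sequences have been exported to a FASTA file (exporting `asv-sequences.qza` with `qiime tools export` should produce `dna-sequences.fasta`). The helper name `count_fasta_records` and the three-record file are hypothetical.

```shell
# Count records in a FASTA file ($1): one header line ('>') per sequence.
count_fasta_records() {
  grep -c '^>' "$1"
}

# Made-up FASTA with three records, standing in for exported ASV sequences.
fasta=$(mktemp)
cat > "$fasta" <<'EOF'
>asv1
TGGGGAATATTGCACAATGG
>asv2
TAGGGAATCTTCCGCAATGG
>asv3
TGAGGAATATTGGTCAATGG
EOF

n_asvs=$(count_fasta_records "$fasta")
echo "unique ASVs: $n_asvs"
rm -f "$fasta"
```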

    Did any interesting patterns emerge in read frequencies by sample?

    Solution
    There is an interesting bifurcation in the read frequencies between old versus young samples. This could have either an underlying biological or technical explanation.

    How can we grab a quick summary of our sample metadata?

    Hint
    Check out qiime tools --help
    Solution
    qiime tools inspect-metadata /data/practice/metadata.txt
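
To follow up on the old versus young pattern, it can help to see how many samples fall into each group. The sketch below tallies the values of one metadata column with awk. The column name `age_group`, the helper name `count_by_column`, and the sample rows are all assumptions for illustration; check the real column names in /data/practice/metadata.txt first.

```shell
# Tally the values of a named column ($2) in a tab-separated metadata file ($1).
count_by_column() {
  awk -F'\t' -v col="$2" '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) c = i; next }  # find column
    /^#/    { next }                                                  # skip directives
    c       { n[$c]++ }                                               # count values
    END     { for (k in n) print k "\t" n[k] }' "$1" | sort
}

# Made-up metadata with a hypothetical age_group column.
meta=$(mktemp)
printf 'sample-id\tage_group\n'    > "$meta"
printf '#q2:types\tcategorical\n' >> "$meta"
printf 'S1\told\n'                >> "$meta"
printf 'S2\tyoung\n'              >> "$meta"
printf 'S3\told\n'                >> "$meta"

counts=$(count_by_column "$meta" age_group)
echo "$counts"
rm -f "$meta"
```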