Practice Lesson 3

This practice lesson accompanies Lesson 3 of Microbiome Analysis with QIIME 2. Here we will generate a feature table and representative sequences, continuing with the data from Zhang et al. 2022.

Change directories to Practice (cd Practice).
Check and trim primers and non-biological sequences. We will trim primers targeting V3-V4 (F: CCTACGGGNGGCWGCAG, R: GACTACHVGGGTATCTAATCC). You should output trimmed sequences to a directory called 02_trim.

Note: Zhang et al. 2022 stated,

The universal primers 515F 5′-GTGCCAGCMGCCGCGG-3′ and 907R 5′-CCGTCAATTCMTTTRAGTTT-3′ were applied to capture the V4–V5 region of 16S rDNA.

However, this was clearly not the case. If you run q2-cutadapt with these primers, you will notice that the forward primer is found toward the center or ends of the reads, and the reverse primer cannot be located.
Solution
```
mkdir 02_trim
qiime cutadapt trim-paired \
--i-demultiplexed-sequences 01_import/import.qza \
--p-front-f CCTACGGGNGGCWGCAG \
--p-front-r GACTACHVGGGTATCTAATCC \
--p-overlap 6 \
--p-discard-untrimmed \
--o-trimmed-sequences 02_trim/demux-trimmed.qza \
--verbose | tee 02_trim/cutadaptresults.log  
```
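The kind of primer check described in the note above can also be sketched by hand. The following is a minimal illustration, not part of the course solution: `primer_to_regex` and `count_reads_with_primer_at_start` are hypothetical helpers, and the three-read FASTQ is made up for the example. It assumes an uncompressed FASTQ with four lines per record; reads stored inside a `.qza` would first need to be exported and decompressed.

```shell
# Hypothetical helper (not part of QIIME 2 or cutadapt): expand IUPAC
# degenerate bases in a primer into a regular expression.
primer_to_regex() {
  printf '%s\n' "$1" | sed \
    -e 's/R/[AG]/g' -e 's/Y/[CT]/g' -e 's/S/[CG]/g' -e 's/W/[AT]/g' \
    -e 's/K/[GT]/g' -e 's/M/[AC]/g' -e 's/B/[CGT]/g' -e 's/D/[AGT]/g' \
    -e 's/H/[ACT]/g' -e 's/V/[ACG]/g' -e 's/N/[ACGT]/g'
}

# Hypothetical helper: count reads in an uncompressed FASTQ ($2) whose
# sequence starts with the primer ($1). Sequence lines are every 4th line,
# starting at line 2.
count_reads_with_primer_at_start() {
  awk 'NR % 4 == 2' "$2" | grep -c -E "^$(primer_to_regex "$1")"
}

# Made-up three-read FASTQ for illustration only (not data from the study).
# read1 and read2 start with the primer; read3 carries it at the end.
toy_fastq=$(mktemp)
cat > "$toy_fastq" <<'EOF'
@read1
CCTACGGGAGGCAGCAGTGGGGAATATTG
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIII
@read2
CCTACGGGTGGCTGCAGTAGGGAATCTTC
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIII
@read3
TGGGGAATATTGCCTACGGGAGGCAGCAG
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIII
EOF

forward_primer=CCTACGGGNGGCWGCAG
primer_to_regex "$forward_primer"                               # prints CCTACGGG[ACGT]GGC[AT]GCAG
count_reads_with_primer_at_start "$forward_primer" "$toy_fastq" # prints 2
```

If most reads start with a primer, the count will be close to the total number of reads. A low count suggests the primer is either absent or located elsewhere in the read, which is the situation the note describes for the published 515F/907R pair.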

Create a new summary table.

Solution

qiime demux summarize \
--i-data 02_trim/demux-trimmed.qza \
--o-visualization 02_trim/demux-trimmed-summary.qzv

Generate a feature table using DADA2 and save to 03_denoise.

Solution

mkdir 03_denoise  
# takes 30 minutes without multi-threading
qiime dada2 denoise-paired \
--i-demultiplexed-seqs 02_trim/demux-trimmed.qza \
--p-trunc-len-f 0 \
--p-trunc-len-r 0 \
--p-n-threads 2 \
--o-representative-sequences 03_denoise/asv-sequences.qza \
--o-table 03_denoise/feature-table.qza \
--o-denoising-stats 03_denoise/dada2-stats.qza

Summarize DADA2 stats using qiime metadata tabulate.
Solution
```
qiime metadata tabulate \
--m-input-file 03_denoise/dada2-stats.qza \
--o-visualization 03_denoise/dada2-stats-summ.qzv  
```
What percentage range of sequences was retained after quality filtering, denoising, merging, and removal of chimeric sequences? (Check the last column of your stats summary.)

Solution

66.58-87.64%
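The range can also be computed rather than read off the visualization. The sketch below assumes the DADA2 stats have been exported to a tab-separated file (for example with `qiime tools export`) in which the last column holds the percentage of input reads that are non-chimeric, and in which header or type rows have a non-numeric last field. `percent_range` is a hypothetical helper, and the toy table uses made-up values and a reduced set of columns, not the results of this practice.

```shell
# Hypothetical helper: print "min-max%" for the last column of a
# tab-separated stats file, skipping rows whose last field is not a number
# (such as the column header and a "#q2:types" row).
percent_range() {
  awk -F'\t' '
    $NF ~ /^[0-9]+(\.[0-9]+)?$/ {
      v = $NF + 0
      if (!seen || v < min) min = v
      if (!seen || v > max) max = v
      seen = 1
    }
    END { if (seen) printf "%.2f-%.2f%%\n", min, max }
  ' "$1"
}

# Made-up stats table for illustration only (column layout is an assumption).
toy_stats=$(mktemp)
{
  printf 'sample-id\tinput\tnon-chimeric\tpercentage of input non-chimeric\n'
  printf '#q2:types\tnumeric\tnumeric\tnumeric\n'
  printf 'sampleA\t1000\t700\t70.00\n'
  printf 'sampleB\t2000\t1700\t85.00\n'
  printf 'sampleC\t500\t400\t80.00\n'
} > "$toy_stats"

percent_range "$toy_stats"   # prints 70.00-85.00%
```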
Summarize your feature table and representative sequences. The path to the sample information (sample metadata) is /data/practice/metadata.txt.
Solution
```
qiime feature-table summarize \
--i-table 03_denoise/feature-table.qza \
--m-sample-metadata-file /data/practice/metadata.txt \
--o-visualization 03_denoise/feature-table-summ.qzv
qiime feature-table tabulate-seqs \
--i-data 03_denoise/asv-sequences.qza \
--o-visualization 03_denoise/asv-sequences-summ.qzv  
```
Follow-up questions:

How many unique ASVs resulted?

Solution

6,881

Did any interesting patterns emerge in read frequencies by sample?

Solution

There is an interesting bifurcation in the read frequencies between old and young samples. This could have either an underlying biological or technical explanation.

How can we grab a quick summary of our sample metadata?

Hint

Check out qiime tools --help
Solution
```
qiime tools inspect-metadata /data/practice/metadata.txt  
```