Questions and Answers
This is a collection of questions and answers from the workshop.
-
What happens if the host DNA cannot be removed?
If you expect a lot of host contamination (some is always expected, more or less depending on sample type), you may need to sequence your sample more deeply to obtain enough reads to represent the diversity of the microbiome. Either way, you can remove host DNA computationally as long as you have a reference genome for your host or for a closely related species.
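To make the idea of computational host removal concrete, here is a minimal toy sketch in Python. It is not how Nephele or any production pipeline does it (real workflows typically align reads against the host reference with a dedicated alignment tool); it simply drops reads that share most of their k-mers with a host sequence. The function names, the k-mer size, and the 0.5 threshold are all illustrative assumptions, and reverse-complement matches are ignored for brevity.

```python
def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def remove_host_reads(reads: list[str], host_genome: str,
                      k: int = 8, max_host_fraction: float = 0.5) -> list[str]:
    """Keep reads whose fraction of host-matching k-mers is below the threshold.

    Toy illustration only: the threshold and k are assumptions, and
    reverse-complement matches are not considered.
    """
    host_kmers = kmers(host_genome, k)
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            # Read shorter than k: nothing to compare, so keep it.
            kept.append(read)
            continue
        host_fraction = len(read_kmers & host_kmers) / len(read_kmers)
        if host_fraction < max_host_fraction:
            kept.append(read)
    return kept


# Made-up sequences for illustration.
host = "ACGTACGGTCAAGCTTAGGCATCGATCGGATTACA"
host_read = host[5:25]                      # a read that came from the host
microbe_read = "TTTTGGGGCCCCAAAATTTTGGGG"   # a read unrelated to the host
print(remove_host_reads([host_read, microbe_read], host))
```

The reads that survive the filter are the ones carried forward as microbial signal, which is why heavy host contamination leaves fewer usable reads and may call for deeper sequencing.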
-
What is the difference between a read, an assembly, and a contig?
Short-read sequencers can only read small pieces of DNA, so they generate short reads representing the genomic content of your sample. These are generally about 150 bp for shotgun sequencing, but the size can vary. Assembly tools stitch the reads back together into longer pieces of DNA called contiguous sequences (contigs), which can be 100,000 bp or longer in metagenomes! Together, all the contigs make up the assembly ("assembled reads"). Assemblies give you a more complete picture: for example, you get much more information from a 100,000 bp contig than from a single 150 bp read!
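The relationship between reads, contigs, and an assembly can be sketched with a toy greedy overlap merger in Python. This is only an illustration of the "stitching" idea under simplifying assumptions (error-free reads, a single strand, tiny sequences); real metagenome assemblers use far more sophisticated methods. The function names and the minimum overlap of 3 bases are hypothetical choices for this example.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair of sequences with the largest overlap.

    Returns the remaining sequences (the contigs); together they are
    the assembly.
    """
    seqs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_k, best_i, best_j = 0, -1, -1
        for i, a in enumerate(seqs):
            for j, b in enumerate(seqs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return seqs  # nothing left to merge
        merged = seqs[best_i] + seqs[best_j][best_k:]
        seqs = [s for idx, s in enumerate(seqs)
                if idx not in (best_i, best_j)] + [merged]


# Three short overlapping "reads" from one made-up 15-base sequence.
reads = ["ATGGCGTA", "GCGTACGT", "TACGTTAGC"]
contigs = greedy_assemble(reads)
print(contigs)
```

Here three 8 to 9 base reads collapse into one longer contig; if some reads shared no overlap with the others, they would remain as separate contigs in the same assembly.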
-
What version of QIIME is Nephele using?
Nephele uses a QIIME2 environment and VSEARCH for clustering and generating OTUs, so you will see it referred to as qiime2/vsearch on the platform. However, there is also a separate DADA2 pipeline. Many users employ DADA2 to generate ASVs in the QIIME2 environment, and you can do the same thing by running the DADA2 pipeline in Nephele.
-
Do we need Biowulf to use Nephele?
You do not need Biowulf to use Nephele. It is available as a free web application open to all. However, you can transfer data from Biowulf or other HPCs to Nephele using Globus. See the Day 2 recording for a demo of this functionality.
-
What is the difference between a mapping file and data file during data upload with Nephele?
The mapping file describes the data files to be uploaded. It requires a unique identifier for each sample (SampleID) and the file name of each data file to be uploaded. The data files are the actual sequence files (FASTQ) that you want to upload. The mapping file can also include additional metadata about your samples. For more information, see the data upload section of the Nephele User Guide.
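As a rough illustration of how a mapping file relates to the data files, the Python sketch below checks a small tab-separated table for duplicate sample identifiers and for file names that have no matching uploaded file. Only the SampleID column comes from the answer above; the "FastqFile" and "TreatmentGroup" column names, the tab-separated layout, and the check_mapping function are assumptions made for this example, so consult the Nephele User Guide for the actual template.

```python
import csv
import io


def check_mapping(mapping_text: str, uploaded_files: set[str]) -> list[str]:
    """Return a list of problems found in a tab-separated mapping table."""
    problems = []
    seen_ids = set()
    reader = csv.DictReader(io.StringIO(mapping_text), delimiter="\t")
    for row in reader:
        sample_id = row["SampleID"]
        if sample_id in seen_ids:
            problems.append(f"duplicate SampleID: {sample_id}")
        seen_ids.add(sample_id)
        fastq = row["FastqFile"]  # hypothetical column name
        if fastq not in uploaded_files:
            problems.append(f"{sample_id}: missing data file {fastq}")
    return problems


# Hypothetical mapping file: one row per sample, plus one metadata column.
mapping = (
    "SampleID\tFastqFile\tTreatmentGroup\n"
    "S1\tS1.fastq.gz\tcontrol\n"
    "S2\tS2.fastq.gz\ttreated\n"
    "S2\tS3.fastq.gz\ttreated\n"
)
uploaded = {"S1.fastq.gz", "S2.fastq.gz"}
for problem in check_mapping(mapping, uploaded):
    print(problem)
```

In this example the third row is flagged twice: it reuses the identifier S2, and it names a file that was never uploaded.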
-
Does Nephele include a host decontamination database for non-human primates?
Yes, Nephele offers multiple host decontamination databases, including ones for non-human primates. See the "Host Decontamination DB (Options)" section of the WGSA2 pipeline options.
-
I have data from JAMS. What data can I use from JAMS for Nephele?
The data used to run JAMS can be used with Nephele. To learn more about JAMS, see the event recording and slides from a past event, Streamlining microbial shotgun analysis with JAMS - from fastqs to pdfs.
If you have additional questions, please contact the NCI BTEP at ncibtep@nih.gov. For questions specific to Nephele, you can also contact the Nephele team at Nephelesupport@nih.gov.