NCI Center for Cancer Research (CCR) scientists have many software options for analyzing Next Generation Sequencing (NGS) data. While some require expertise in programming, others provide a user-friendly, point-and-click interface. These options include programs for building full NGS analysis workflows, elucidating molecular pathways and networks influenced by changes in gene expression, building phylogenetic trees, and performing in silico molecular cloning. This topic spotlight provides a high-level overview of each point-and-click package.
Software Available to All NCI Scientists:
Partek Flow enables scientists to build analysis workflows for data derived from sequencing modalities such as DNA, bulk RNA, single cell RNA, spatial transcriptomics, CITE, ChIP, and ATAC. Partek Flow has a built-in genome browser, produces publication-quality visualizations (e.g., PCA, heatmap, and volcano plots), and provides insight into biological functions impacted by different conditions. In addition, Partek Flow enables analysis of microarray data from CEL (Affymetrix) and IDAT (Illumina arrays) files. The NIH High Performance Computing (HPC) cluster Biowulf hosts Partek Flow, providing investigators with abundant compute resources for analyzing big data while using a point-and-click interface. Check out BTEP’s documentation titled “Introducing Partek Flow: using Bulk RNA Sequencing Analysis as a Guide” to learn how to conduct NGS analysis using this software. Material from this class is broadly applicable to analyzing other NGS data using Partek Flow.
Qiagen’s CLC Genomics Workbench is a second application for building Next Generation Sequencing analysis workflows. Unlike Partek Flow at NCI, this application must be installed on a local computer, where compute resources may limit analysis of large genomic datasets. Despite this, CLC Genomics Workbench has a range of NGS analysis capabilities including DNA variant, bulk RNA, ChIP, and ATAC sequencing analysis. Users can perform reference-guided mapping or de novo assembly of NGS data, as well as analysis of long-read sequencing data. This software also supports traditional molecular biology tasks such as cloning, BLAST, identification of open reading frames, microarray analysis, and PCR primer design. Plug-ins for single cell RNA and ATAC sequencing analysis are also available. Please contact Qiagen support at ts-bioinformatics@qiagen.com to check whether these are included in the NCI CLC Genomics Workbench license.
Qiagen’s Ingenuity Pathway Analysis (IPA) helps researchers discover affected pathways, networks, diseases, regulatory mechanisms, biomarkers, and drug targets using data from studies that examined how gene, protein, or metabolite levels change under various biological conditions. Users supply a table in CSV or tab-delimited TXT format with columns listing identifiers, measured change, and statistical confidence of the measured change for genes, proteins, or metabolites of interest (e.g., a results table from an RNA sequencing differential gene expression analysis). Researchers can also perform keyword searches on IPA’s vast knowledgebase to learn how molecules and genes influence diseases or biological functions. In this way, the software can be used for discovery even when scientists do not have their own data.
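For readers who want to see what such an input table might look like, the following Python sketch writes a small CSV or tab-delimited table with one column each for identifier, measured change, and statistical confidence. The column names (`gene_id`, `log2_fold_change`, `adjusted_p_value`), the function names, and the example values are placeholders made up for illustration; they are not taken from IPA documentation, so check the headers and identifier types your IPA upload expects.

```python
import csv
import io

# Hypothetical differential expression results:
# (gene identifier, log2 fold change, adjusted p-value).
# The numbers are invented for illustration only.
RESULTS = [
    ("TP53", -1.8, 0.0004),
    ("MYC", 2.3, 0.002),
    ("GAPDH", 0.1, 0.82),
]


def write_upload_table(rows, handle, delimiter=","):
    """Write one row per gene: identifier, measured change, statistical confidence."""
    writer = csv.writer(handle, delimiter=delimiter, lineterminator="\n")
    # Header names are assumptions, not names required by any vendor.
    writer.writerow(["gene_id", "log2_fold_change", "adjusted_p_value"])
    for gene_id, log2_fc, padj in rows:
        writer.writerow([gene_id, log2_fc, padj])


def table_as_text(rows, delimiter=","):
    """Return the table as a string (handy for previewing before saving)."""
    buffer = io.StringIO()
    write_upload_table(rows, buffer, delimiter)
    return buffer.getvalue()


if __name__ == "__main__":
    # Preview the CSV version; pass delimiter="\t" for tab-delimited TXT.
    print(table_as_text(RESULTS))
```

To save a file instead of previewing it, open a file with `open("my_table.csv", "w", newline="")` and pass the handle to `write_upload_table`.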
Software for NCI CCR Researchers Only:
Qlucore Omics Explorer is visualization software for data derived from many platforms including bulk RNA-seq, single cell RNA-seq, ChIP-seq, ATAC-seq, proteomics, and metabolomics. Users start by supplying a data table containing measurements of gene or protein expression (e.g., an RNA sequencing gene expression counts table), metabolite levels, or the number of peaks detected per genomic region (ChIP-seq and ATAC-seq experiments). From there, Qlucore Omics Explorer’s suite of statistical, gene set enrichment, and ontology tools generates graphical results that help researchers capture biological insights such as differentially expressed genes in an RNA sequencing experiment. Built-in machine learning algorithms allow investigators to classify samples and cells based on expression of genes, proteins, or metabolites.
Geneious Prime is another desktop application for NGS analysis, although it is more limited because it handles only DNA and bulk RNA sequencing data. It can assemble sequences de novo or perform reference-guided alignment. Geneious Prime can also be used for molecular biology tasks such as BLAST, multiple sequence alignment, PCR primer design, prediction and identification of DNA elements, and phylogenetics.
Other software available to NCI scientists includes Biodiscovery Nexus Copy Number, Snap Gene, and Easy Panel. A description of each is provided below.
- Biodiscovery Nexus Copy Number is point-and-click software that runs on a local computer. It specializes in copy number variant (CNV) discovery. It accepts NGS alignment BAM files as well as array-derived data from Affymetrix (CEL), Illumina (IDAT), Agilent, and Roche NimbleGen.
- Snap Gene specializes in plasmid design, molecular cloning, and PCR primer design. It enables researchers to simulate agarose gels.
- Easy Panel can be used to design flow cytometry panels.
Note that these software packages share the ability to accept many input data types, allowing users to start their NGS analysis at any stage (for instance, starting with BAM files rather than FASTQ files). The software capability matrix will help users select a suitable package. The guide “Getting Started with Partek Flow” contains instructions for accessing Partek Flow. For all other software, submit a ticket at https://service.cancer.gov/ncisp to request installation on a local computer.
Training Opportunities:
BTEP frequently invites software vendors to conduct training. Visit our Video Archive to view recordings from past trainings (hint: use the search box to find trainings for a particular software package). Refer to the BTEP calendar to find upcoming trainings. Vendor websites also list upcoming trainings and host recordings of past ones. Researchers can sign up to receive periodic notifications about trainings that are not organized by BTEP (see below for links to vendor websites). Finally, do not hesitate to contact BTEP at ncibtep@nih.gov with questions or to request a consultation.
Websites for Vendor Organized Trainings and Technical Support Contact:
- Partek Flow:
  - Contact technical support: techsupport@illumina.com
  - Illumina hosted trainings: https://help.partek.illumina.com/partek-flow/live-training-event-recordings
- Qiagen CLC Genomics Workbench:
  - Contact technical support: ts-bioinformatics@qiagen.com
  - Qiagen hosted trainings: https://tv.qiagenbioinformatics.com/channel/61793068/qiagen-clc-genomics
- Qiagen IPA:
  - Contact technical support: ts-bioinformatics@qiagen.com
  - Qiagen hosted trainings: https://tv.qiagenbioinformatics.com/channel/61793047/qiagen-ipa
- Qlucore Omics Explorer:
  - Contact technical support: https://qlucore.com/contact
  - Qlucore hosted trainings: https://qlucore.com/webinars-and-events
- Geneious Prime:
  - Contact technical support: https://help.geneious.com/hc/en-us/requests/new
  - Geneious Prime hosted trainings: https://www.geneious.com/academy
- Snap Gene:
  - Contact technical support: https://support.snapgene.com/hc/en-us/requests/new
  - Snap Gene hosted trainings: https://www.snapgene.com/resources
- Biodiscovery Nexus Copy Number:
  - Contact technical support: support@biodiscovery.com
  - Biodiscovery hosted trainings: https://bionano.com/video/nexus-copy-number-software/
- Easy Panel:
  - Contact technical support: https://www.easypanel.ai/contact/
  - Easy Panel hosted trainings: https://www.easypanel.ai/resources/
–Alex Emmons, PhD (BTEP), Amy Stonelake, PhD (BTEP), Joe Wu, PhD (BTEP)