ncibtep@nih.gov

Bioinformatics Training and Education Program

Partek Flow Quick Start Guide

Partek Flow enables scientists to build comprehensive workflows for analyzing multi-omics, high-throughput sequencing data. Supported analyses include DNA sequencing and variant calling; bulk and single-cell RNA, ChIP, and ATAC sequencing; spatial transcriptomics; CITE sequencing; and immune cell receptor repertoire profiling. Because it is point-and-click software, Partek Flow is suitable for biologists who wish to avoid the steep learning curve of analyzing sequencing data at the command line or in code. This article explains how to access Partek Flow, how to transfer data to the NIH Partek Flow server, and where to find training opportunities.

NCI owns an institutional license for Partek Flow. To access this software, follow the steps below.

1. Obtain a Biowulf (the NIH High Performance Computing cluster) account.

2. Ensure that the “data” directory on Biowulf has enough disk space to hold Partek Flow files.

3. Request a Partek Flow account.

    • Please contact staff@hpc.nih.gov to acquire an account after steps 1 and 2 have been completed.
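For step 2, free space can be checked before any data are imported. The Python sketch below is one possible way to do this, not an official Biowulf or Partek Flow procedure. The directory and the 100 GiB requirement are hypothetical placeholders, and the function names are made up for illustration. The numbers come from the filesystem as a whole, so they may differ from a per-user quota.

```python
import shutil
from pathlib import Path


def free_gib(path: str) -> float:
    """Return the free space, in GiB, on the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / 1024**3


def has_enough_space(path: str, required_gib: float) -> bool:
    """Return True if at least `required_gib` GiB are free where `path` lives."""
    return free_gib(path) >= required_gib


if __name__ == "__main__":
    # Hypothetical location: replace with your own "data" directory on Biowulf.
    data_dir = Path.home()
    # Hypothetical requirement: roughly the size of the data you plan to import.
    needed_gib = 100.0
    print(f"{free_gib(str(data_dir)):.1f} GiB free under {data_dir}")
    print("Enough space:", has_enough_space(str(data_dir), needed_gib))
```

If the check reports too little space, free up or request more storage before asking for a Partek Flow account.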

Once the steps above are complete, open https://partekflow.cit.nih.gov/flow in a web browser to sign in to the NIH Partek Flow server and start using the software. Because Partek Flow is hosted on Biowulf, scientists have far more compute power available for analyzing large genomic datasets than a personal computer provides. Investigators do not need expertise in Biowulf to use this package. The essentials are 1) knowing that the analysis input and output are stored on Biowulf, 2) being able to sign in to Biowulf, and 3) being familiar with navigating the Biowulf directory structure and performing data management tasks (e.g., copying, deleting, and moving files and folders).
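The data management tasks mentioned above (copy, move, delete) can be done with ordinary shell commands or with a short script. The Python sketch below illustrates them using only the standard library. It runs entirely inside a temporary folder, and the file and function names are hypothetical examples, not paths or tools that Biowulf or Partek Flow require.

```python
import shutil
import tempfile
from pathlib import Path


def copy_item(src: Path, dst: Path) -> Path:
    """Copy a file, or a whole folder, from `src` to `dst`."""
    if src.is_dir():
        return Path(shutil.copytree(src, dst))
    return Path(shutil.copy2(src, dst))


def move_item(src: Path, dst: Path) -> Path:
    """Move a file or folder from `src` to `dst`."""
    return Path(shutil.move(str(src), str(dst)))


def delete_item(target: Path) -> None:
    """Permanently delete a file or folder. There is no undo."""
    if target.is_dir():
        shutil.rmtree(target)
    else:
        target.unlink()


if __name__ == "__main__":
    # Work in a throwaway folder so that no real data are touched.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        original = root / "sample.fastq"  # hypothetical file name
        original.write_text("@read1\n")

        copied = copy_item(original, root / "sample_copy.fastq")
        (root / "archive").mkdir()
        moved = move_item(copied, root / "archive" / "sample_copy.fastq")
        delete_item(root / "archive")

        print(sorted(p.name for p in root.iterdir()))
```

Deletion in this sketch is permanent, so it is worth confirming the target path before removing analysis input or output.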

The first step in an analysis is to transfer data to the NIH Partek Flow server. See https://partekflow.cit.nih.gov/ for the available transfer methods. Because many NCI CCR scientists use the NCI CCR Sequencing Facility, this article highlights transferring data from the sequencing facility’s Data Management Environment (DME) to the Partek Flow server using Globus, which involves the following steps.

1.  Install the Globus Desktop Client. See https://hpc.nih.gov/docs/globus/setup.php for instructions.
2.  Refer to https://bioinformatics.ccr.cancer.gov/docs/getting-started-with-partek-flow/data_transfer/ for instructions on transferring data from the NCI CCR Sequencing Facility’s DME to the Partek Flow server using Globus.

Throughout the year, BTEP hosts trainings taught by Partek scientists. Recordings from previous Partek Flow trainings can be found in the BTEP Video Archive. Past trainings have covered bulk RNA, single-cell RNA, spatial transcriptomics, ChIP, and ATAC sequencing analysis. Please check the BTEP calendar (https://bioinformatics.ccr.cancer.gov/btep/) for upcoming trainings. Partek regularly updates the Partek Flow documentation (https://documentation.partek.com/display/FLOWDOC/Partek+Flow+Documentation). For questions, contact a Partek scientist at support@partek.com, or email ncibtep@nih.gov and a BTEP member will connect you with a Partek scientist.

Note that NCI researchers should use the NCI Partek Flow license. Please contact BTEP (ncibtep@nih.gov) with questions. Other NIH institutes and centers (ICs) that hold licenses to Partek Flow include NHGRI (https://research.nhgri.nih.gov/bi-training.shtml). Please contact bioinformatics@nhgri.nih.gov for details on the NHGRI Partek Flow license. Investigators affiliated with ICs other than NCI and NHGRI can use Partek Flow through the NIH Library. Please contact Doug Joubert (douglas.joubert@nih.gov) or see https://www.nihlibrary.nih.gov/resources/tools/partek-flow for instructions on access.

Joe Wu (BTEP)