Importing Data to Project and Assigning Metadata
The next step is to import data into the project. Click the "Add data" button and select "Bulk". RNA sequencing is the default option, and since FASTQ files will be imported, leave the "fastq" radio button selected. Click "Next" when ready. On the next page, users can navigate the Partek Flow folder of their own Biowulf account to select the needed files. Specify that the data is mRNA and click "Finish" when ready. While the data is importing, users will see a rectangular task node. Once the data has been imported successfully, the rectangular task node will turn into a circular data node.
After the FASTQ files have been imported, the next task is to assign metadata to the files to keep track of which condition each file came from. To do this, click the "Metadata" tab on the project analysis page. On the "Metadata" page, click "Show data files" to see the two paired-end FASTQ files associated with each sample. Partek Flow uses the portion of the filename before "_R1.fq" and "_R2.fq" as the sample name. This class will assign metadata using the "Assign values from file" option, as this is more convenient.
The metadata are available in the tab-delimited file "hcc1395_phenotype.txt" in the instructor's ./PartekFlow/uploads/hcc1395 folder. The contents of the file are shown below. Samples that start with "n" are normal and those that start with "t" are tumor, so this dataset contains 3 normal and 3 tumor samples. Select "hcc1395_phenotype.txt" and click "Next" when finished. On the next page, check the import box for the appropriate "Attribute name" (the variable), which in this case is "disease_type"; the other column, "sample", already contains the sample names. Click "Import" when ready.
sample disease_type
n1 normal
n2 normal
n3 normal
t1 tumor
t2 tumor
t3 tumor
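
Before assigning values from the file, it can be useful to confirm that the sample names in the metadata file match the names Partek Flow will derive from the FASTQ files. The following is a minimal Python sketch of that check, run outside Partek Flow. It is not part of the Partek Flow workflow itself. The FASTQ file names (for example "n1_R1.fq") are assumed for illustration and may differ from the actual names in the uploads folder. The helper names sample_name and read_phenotypes are likewise hypothetical.

```python
import csv
import io
from collections import Counter

# Contents of hcc1395_phenotype.txt as listed above (tab-delimited).
PHENOTYPE_TEXT = (
    "sample\tdisease_type\n"
    "n1\tnormal\n"
    "n2\tnormal\n"
    "n3\tnormal\n"
    "t1\ttumor\n"
    "t2\ttumor\n"
    "t3\ttumor\n"
)

# ASSUMPTION: hypothetical FASTQ file names, one R1/R2 pair per sample.
FASTQ_FILES = [
    f"{sample}_R{read}.fq"
    for sample in ("n1", "n2", "n3", "t1", "t2", "t3")
    for read in (1, 2)
]


def sample_name(fastq_name):
    """Return the part of the file name before '_R1.fq' or '_R2.fq'."""
    for suffix in ("_R1.fq", "_R2.fq"):
        if fastq_name.endswith(suffix):
            return fastq_name[: -len(suffix)]
    raise ValueError(f"unexpected FASTQ file name: {fastq_name}")


def read_phenotypes(handle):
    """Map each sample name to its disease_type from a tab-delimited file."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["sample"]: row["disease_type"] for row in reader}


phenotypes = read_phenotypes(io.StringIO(PHENOTYPE_TEXT))
fastq_samples = {sample_name(name) for name in FASTQ_FILES}

# Samples with FASTQ files but no metadata row, and the reverse.
missing = sorted(fastq_samples - set(phenotypes))
unused = sorted(set(phenotypes) - fastq_samples)
counts = Counter(phenotypes.values())

print("samples without metadata:", missing)
print("metadata rows without FASTQ files:", unused)
print("samples per disease_type:", dict(counts))
```

To check a real file instead of the embedded text, open it and pass the handle to read_phenotypes, for example `with open("hcc1395_phenotype.txt", newline="") as fh: phenotypes = read_phenotypes(fh)`.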