Lesson 3 practice questions
Question 1
Import hcc1395_chr22_rna_seq_counts.csv and store it as hcc1395_chr22_counts.
Solution
import pandas
hcc1395_chr22_counts=pandas.read_csv("./hcc1395_chr22_rna_seq_counts.csv")
Question 2
How many rows and columns are in hcc1395_chr22_counts?
Solution
hcc1395_chr22_counts.shape
(1335, 7)
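Because .shape is a plain tuple of (rows, columns), it can be unpacked into two variables. The sketch below uses a small made-up data frame; the gene IDs and counts are hypothetical and are not taken from hcc1395_chr22_rna_seq_counts.csv.

```python
import pandas

# Hypothetical toy data standing in for the real counts table
toy_counts = pandas.DataFrame({
    "Geneid": ["CECR7", "U2", "CCT8L2"],
    "sample_1": [10, 0, 3],
})

# .shape is an attribute (no parentheses) holding (n_rows, n_columns)
n_rows, n_cols = toy_counts.shape
print(f"{n_rows} rows, {n_cols} columns")
```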
Question 3
What are the column names in hcc1395_chr22_counts, and how can you view the first 10 rows of this data set?
Solution
hcc1395_chr22_counts.head(10)
Alternatively, use hcc1395_chr22_counts.columns to get the column headings for this data frame.
Question 4
How many genes start with the letter "C" in hcc1395_chr22_counts?
Solution
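hcc1395_chr22_counts.loc[hcc1395_chr22_counts.loc[:,'Geneid'].str.startswith("C")]
This returns the matching rows; the number of rows in the result is the number of genes whose Geneid starts with "C".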
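To get the count directly as a number, one option is to sum the Boolean mask, since each True counts as 1. The following is a minimal sketch on hypothetical gene IDs, not the real data. Note that str.startswith is case-sensitive, so the lowercase entry is not counted.

```python
import pandas

# Hypothetical toy data; only the "Geneid" column name matches the real table
toy_counts = pandas.DataFrame({
    "Geneid": ["CECR7", "U2", "CCT8L2", "XKR3", "cecr2"],
    "sample_1": [10, 0, 3, 7, 1],
})

# Boolean Series: True where Geneid begins with an uppercase "C"
starts_with_c = toy_counts["Geneid"].str.startswith("C")

# True counts as 1 when summed, so the sum is the number of matching genes
n_c_genes = int(starts_with_c.sum())
print(n_c_genes)

# Equivalent: count the rows of the filtered data frame
n_c_genes_alt = toy_counts.loc[starts_with_c].shape[0]
```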
Question 5
Import hcc1395_deg_chr22.csv and store it as hcc1395_deg_chr22.
Solution
hcc1395_deg_chr22=pandas.read_csv("./hcc1395_deg_chr22.csv")
Question 6
Remove ".bam" from the column headers of hcc1395_deg_chr22.
Solution
hcc1395_deg_chr22.columns=hcc1395_deg_chr22.columns.str.replace(".bam", "", regex=False)
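The pattern ".bam" contains a dot, which acts as a wildcard if the pattern is interpreted as a regular expression. Versions of pandas before 2.0 treated the pattern as a regular expression by default, so passing regex=False explicitly keeps the match literal in any version. The sketch below compares the two behaviours on hypothetical column labels chosen to expose the difference.

```python
import pandas

# Hypothetical column labels; "rep1_bam.bam" also contains "_bam",
# which the regex pattern ".bam" matches because "." means "any character"
toy_columns = pandas.Index(["name", "rep1_bam.bam", "rep2.bam"])

# Literal match: only the exact text ".bam" is removed
literal = toy_columns.str.replace(".bam", "", regex=False)

# Regex match: any character followed by "bam" is removed
as_regex = toy_columns.str.replace(".bam", "", regex=True)

print(list(literal))
print(list(as_regex))
```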
Question 7
Subset out the following columns from hcc1395_deg_chr22 and store it as hcc1395_deg_chr22_1.
- name
- log2FoldChange
- PAdj
Solution
hcc1395_deg_chr22_1=hcc1395_deg_chr22.loc[:,["name", "log2FoldChange", "PAdj"]]
Use the .head method to check that the subsetting was done correctly.
hcc1395_deg_chr22_1.head()
Question 8
Add a column to hcc1395_deg_chr22_1 that contains the negative log10 of the PAdj value.
Solution
import numpy
hcc1395_deg_chr22_1["-log10PAdj"]=numpy.negative(numpy.log10(hcc1395_deg_chr22_1.loc[:,"PAdj"]))
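The same transformation can be written with the unary minus operator, which is equivalent to numpy.negative here. The sketch below applies it to a hypothetical three-gene table (the names and adjusted p-values are invented for illustration) so the result is easy to verify by hand.

```python
import numpy
import pandas

# Hypothetical adjusted p-values, chosen so the logs are round numbers
toy_deg = pandas.DataFrame({
    "name": ["geneA", "geneB", "geneC"],
    "PAdj": [0.1, 0.001, 1.0],
})

# -log10 turns small p-values into large positive scores
toy_deg["-log10PAdj"] = -numpy.log10(toy_deg["PAdj"])
print(toy_deg)
```

A PAdj of exactly 0 would give infinity, and a missing PAdj stays missing, so it may be worth checking for those values before plotting.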