Exercise 3: Lesson 4
Loading data
The data used in this practice exercise can be found here.
Q1. Import data from the sheet "iris_data_long" in the Excel workbook (file_path = "./data/iris_data.xlsx"). Make sure the column names are unique and do not contain spaces. Save the imported data to an object called iris_long.
Q1: Solution
iris_long <- readxl::read_excel("../data/iris_data.xlsx", sheet = "iris_data_long", .name_repair = "universal", skip = 3)
## New names:
## • `Iris ID` -> `Iris.ID`
## • `Measurement location` -> `Measurement.location`
iris_long
## # A tibble: 600 × 4
##    Iris.ID Species Measurement.location Measurement
##      <dbl> <chr>   <chr>                      <dbl>
##  1       1 setosa  Sepal.Length                 5.1
##  2       1 setosa  Sepal.Width                  3.5
##  3       1 setosa  Petal.Length                 1.4
##  4       1 setosa  Petal.Width                  0.2
##  5       2 setosa  Sepal.Length                 4.9
##  6       2 setosa  Sepal.Width                  3
##  7       2 setosa  Petal.Length                 1.4
##  8       2 setosa  Petal.Width                  0.2
##  9       3 setosa  Sepal.Length                 4.7
## 10       3 setosa  Sepal.Width                  3.2
## # ℹ 590 more rows
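To see what .name_repair = "universal" does to the two column names reported in the messages above, the sketch below calls vctrs::vec_as_names() directly. It assumes the vctrs package is installed (it is installed alongside the tidyverse) and only mimics the repair step; read_excel() performs the repair itself during import.

```r
# Column names as they appear in the sheet header (taken from the messages above)
raw_names <- c("Iris ID", "Species", "Measurement location", "Measurement")

# "universal" makes names unique and syntactic, so spaces are replaced by dots
repaired <- vctrs::vec_as_names(raw_names, repair = "universal", quiet = TRUE)
print(repaired)

# "unique" only enforces uniqueness, so the spaces would remain
kept <- vctrs::vec_as_names(raw_names, repair = "unique", quiet = TRUE)
print(kept)
```

Syntactic names can be used without backticks, for example iris_long$Iris.ID.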
Q2. Import a tab-delimited file (file_path = "./data/species_datacarpentry.txt"). Save the file to an object named species. The columns genus, species, and taxa should be converted to factors upon import.
Q2: Solution
species <- readr::read_delim("../data/species_datacarpentry.txt", col_types = "cfff")
species
## # A tibble: 54 × 4
##    species_id genus            species         taxa
##    <chr>      <fct>            <fct>           <fct>
##  1 AB         Amphispiza       bilineata       Bird
##  2 AH         Ammospermophilus harrisi         Rodent
##  3 AS         Ammodramus       savannarum      Bird
##  4 BA         Baiomys          taylori         Rodent
##  5 CB         Campylorhynchus  brunneicapillus Bird
##  6 CM         Calamospiza      melanocorys     Bird
##  7 CQ         Callipepla       squamata        Bird
##  8 CS         Crotalus         scutalatus      Reptile
##  9 CT         Cnemidophorus    tigris          Reptile
## 10 CU         Cnemidophorus    uniparens       Reptile
## # ℹ 44 more rows
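The compact string "cfff" assigns one type per column, in order: c for character and f for factor. The sketch below shows this on a small temporary file built from the first three rows printed above; the temporary file and the object name toy are illustrative assumptions, not part of the exercise data.

```r
library(readr)

# Write a three-row, tab-delimited stand-in for the species file
tmp <- tempfile(fileext = ".txt")
writeLines(c("species_id\tgenus\tspecies\ttaxa",
             "AB\tAmphispiza\tbilineata\tBird",
             "AH\tAmmospermophilus\tharrisi\tRodent",
             "AS\tAmmodramus\tsavannarum\tBird"), tmp)

# One letter per column: character, factor, factor, factor
toy <- read_delim(tmp, delim = "\t", col_types = "cfff")

# Confirm the resulting column classes and inspect the factor levels
print(sapply(toy, class))
print(levels(toy$taxa))
```

Because the column types are supplied up front, readr does not need to guess them and prints no column specification message.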
Q3. Load in a comma-separated file with row names present (file_path = "./data/countB.csv") and save it to an object named countB.
Q3: Solution
countB <- read.csv("../data/countB.csv", row.names = 1)
head(countB)
##        SampleA_1 SampleA_2 SampleA_3 SampleB_1 SampleB_2 SampleB_3
## Tspan6       703       567       867        71       970       242
## TNMD         490       482        18       342       935       469
## DPM1         921       797       622       661         8       500
## SCYL3        335       216       222       774       979       793
## FGR          574       574       515       584       941       344
## CFH          577       792       672       104       192       936
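The effect of row.names = 1 can be seen by reading the same text with and without it. This is a minimal base R sketch using an inline excerpt of the values shown above; the blank first header cell is an assumption about how the file is laid out.

```r
# Inline CSV excerpt; the first header cell is assumed to be blank
csv_text <- ",SampleA_1,SampleA_2\nTspan6,703,567\nTNMD,490,482\nDPM1,921,797"

# With row.names = 1, the first column becomes the row names
with_names <- read.csv(text = csv_text, row.names = 1)
print(rownames(with_names))

# Without it, the gene names stay in an ordinary column that R names "X"
without_names <- read.csv(text = csv_text)
print(names(without_names))

# Row names allow lookups by gene
print(with_names["TNMD", "SampleA_2"])
```

Keeping gene names as row names leaves an all-numeric data frame, which is convenient when a count matrix is needed later.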
Challenge data load
Q4. Load in a tab-delimited file (file_path = "./data/WebexSession_report.txt") using read_delim(). You will need to troubleshoot the error message and modify the function arguments as needed.
Q4: Solution
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.2 ✔ tibble 3.2.1
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.0.4
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
read_delim("../data/WebexSession_report.txt", delim = "\t", locale = locale(encoding = "UTF-16"), skip = 2) # via readr
## Rows: 10 Columns: 21
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: "\t"
## chr (7): Name, Date, Invited, Registered, Duration, Network joined from:, ...
## dbl (1): Participant
## lgl (11): Audio Type, Email, Company, Title, Phone Number, Address 1, Addre...
## time (2): Start time, End time
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
## # A tibble: 10 × 21
## Participant `Audio Type` Name Email Date Invited Registered `Start time`
## <dbl> <lgl> <chr> <lgl> <chr> <chr> <chr> <time>
## 1 1 NA Partici… NA 6/8/… No N/A 13:00
## 2 2 NA Partici… NA 6/9/… <NA> <NA> 13:00
## 3 3 NA Partici… NA 6/10… No N/A 12:57
## 4 4 NA Partici… NA 6/11… <NA> <NA> 12:57
## 5 5 NA Partici… NA 6/12… No N/A 12:55
## 6 6 NA Partici… NA 6/13… <NA> <NA> 12:55
## 7 7 NA Partici… NA 6/14… No N/A 12:32
## 8 8 NA Partici… NA 6/15… <NA> <NA> 12:32
## 9 9 NA Partici… NA 6/16… Yes N/A 12:42
## 10 NA NA <NA> NA <NA> <NA> <NA> NA
## # ℹ 13 more variables: `End time` <time>, Duration <chr>, Company <lgl>,
## # Title <lgl>, `Phone Number` <lgl>, `Address 1` <lgl>, `Address 2` <lgl>,
## # City <lgl>, `State/Province` <lgl>, `Zip/Postal Code` <lgl>,
## # `Country/region` <lgl>, `Network joined from:` <chr>,
## # `Internal Participant:` <chr>
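One way to arrive at the encoding and skip arguments is to inspect the raw file before importing it. The sketch below builds a small stand-in file with hypothetical content (two preamble lines, then a table) written as UTF-16LE, so it does not depend on the real report. The preamble text, the participant names, and the choice of "UTF-16LE" rather than "UTF-16" are all assumptions made for illustration.

```r
library(readr)

# Stand-in file: two preamble lines, then a tab-delimited table, saved as UTF-16LE
tmp <- tempfile(fileext = ".txt")
con <- file(tmp, open = "w", encoding = "UTF-16LE")
writeLines(c("Session detail report",
             "Exported by a hypothetical tool",
             "Participant\tName",
             "1\tParticipant 1",
             "2\tParticipant 2"), con)
close(con)

# Clue 1: zero bytes between characters suggest a two-byte encoding such as UTF-16
first_bytes <- readBin(tmp, what = "raw", n = 8)
print(first_bytes)

# Clue 2: the header sits on line 3, so the first two lines are skipped
toy <- read_delim(tmp, delim = "\t",
                  locale = locale(encoding = "UTF-16LE"),
                  skip = 2, show_col_types = FALSE)
print(toy)
```

For a real file, readr::guess_encoding() can also suggest candidate encodings, although its result is a guess and should be checked against the imported data.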