
Exercise 3: Lesson 4

Loading data

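The data used in this practice exercise can be found here. Note that the questions give file paths as "./data/...", whereas the solutions use "../data/..."; adjust the path to match your working directory.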

Q1. Import data from the sheet "iris_data_long" in the Excel workbook (file_path = "./data/iris_data.xlsx"). Make sure the column names are unique and do not contain spaces. Save the imported data to an object called iris_long.

Q1: Solution
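# skip = 3 skips the first three rows of the sheet; .name_repair = "universal" makes column names unique and syntactic
iris_long <- readxl::read_excel("../data/iris_data.xlsx", sheet = "iris_data_long", .name_repair = "universal", skip = 3)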
## New names:
## • `Iris ID` -> `Iris.ID`
## • `Measurement location` -> `Measurement.location`
iris_long
## # A tibble: 600 × 4
##    Iris.ID Species Measurement.location Measurement
##      <dbl> <chr>   <chr>                      <dbl>
##  1       1 setosa  Sepal.Length                 5.1
##  2       1 setosa  Sepal.Width                  3.5
##  3       1 setosa  Petal.Length                 1.4
##  4       1 setosa  Petal.Width                  0.2
##  5       2 setosa  Sepal.Length                 4.9
##  6       2 setosa  Sepal.Width                  3  
##  7       2 setosa  Petal.Length                 1.4
##  8       2 setosa  Petal.Width                  0.2
##  9       3 setosa  Sepal.Length                 4.7
## 10       3 setosa  Sepal.Width                  3.2
## # ℹ 590 more rows
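
The renaming reported in the "New names" message can be reproduced without the workbook. The following is a minimal sketch, assuming the vctrs package is installed; as far as I understand, readxl hands .name_repair to tibble, which relies on vctrs for the repair itself. The old_names vector is typed in by hand from the message above rather than read from the file.

```r
# Column names as they appear in the sheet (copied from the "New names" message)
old_names <- c("Iris ID", "Species", "Measurement location", "Measurement")

# "universal" repair makes names unique and syntactic, so spaces become dots
new_names <- vctrs::vec_as_names(old_names, repair = "universal", quiet = TRUE)
print(new_names)
# "Iris.ID" "Species" "Measurement.location" "Measurement"

# Syntactic names can be used with $ without backticks
all(make.names(new_names) == new_names)
```

Using "unique" instead of "universal" would only guarantee uniqueness, so the spaces in the column names would be kept.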

Q2. Import a tab-delimited file (file_path = "./data/species_datacarpentry.txt"). Save the file to an object named species. The genus, species, and taxa columns should be converted to factors upon import.

Q2: Solution
# col_types = "cfff": one letter per column (c = character, f = factor)
species <- readr::read_delim("../data/species_datacarpentry.txt", delim = "\t", col_types = "cfff")
species
## # A tibble: 54 × 4
##    species_id genus            species         taxa   
##    <chr>      <fct>            <fct>           <fct>  
##  1 AB         Amphispiza       bilineata       Bird   
##  2 AH         Ammospermophilus harrisi         Rodent 
##  3 AS         Ammodramus       savannarum      Bird   
##  4 BA         Baiomys          taylori         Rodent 
##  5 CB         Campylorhynchus  brunneicapillus Bird   
##  6 CM         Calamospiza      melanocorys     Bird   
##  7 CQ         Callipepla       squamata        Bird   
##  8 CS         Crotalus         scutalatus      Reptile
##  9 CT         Cnemidophorus    tigris          Reptile
## 10 CU         Cnemidophorus    uniparens       Reptile
## # ℹ 44 more rows
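
The compact col_types string can also be written out column by column, which is easier to read when a file has many columns. The sketch below assumes readr is installed and uses a small temporary file as a hypothetical stand-in for species_datacarpentry.txt (three rows copied from the output above).

```r
library(readr)

# Hypothetical three-row stand-in for the species file
tmp <- tempfile(fileext = ".txt")
writeLines(c("species_id\tgenus\tspecies\ttaxa",
             "AB\tAmphispiza\tbilineata\tBird",
             "AH\tAmmospermophilus\tharrisi\tRodent",
             "AS\tAmmodramus\tsavannarum\tBird"), tmp)

# Compact form: one letter per column, in column order
by_string <- read_delim(tmp, delim = "\t", col_types = "cfff")

# Named form: same column types, spelled out per column
by_name <- read_delim(tmp, delim = "\t",
                      col_types = cols(species_id = col_character(),
                                       genus = col_factor(),
                                       species = col_factor(),
                                       taxa = col_factor()))

# Both approaches give one character column and three factor columns
sapply(by_string, class)
sapply(by_name, class)
```

The named form does not depend on column order, which may be the safer choice if the layout of the input file is likely to change.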

Q3. Load a comma-separated file with row names present (file_path = "./data/countB.csv") and save it to an object named countB.

Q3: Solution
# row.names = 1 uses the first column as row names instead of as data
countB <- read.csv("../data/countB.csv", row.names = 1)
head(countB)
##        SampleA_1 SampleA_2 SampleA_3 SampleB_1 SampleB_2 SampleB_3
## Tspan6       703       567       867        71       970       242
## TNMD         490       482        18       342       935       469
## DPM1         921       797       622       661         8       500
## SCYL3        335       216       222       774       979       793
## FGR          574       574       515       584       941       344
## CFH          577       792       672       104       192       936
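
To see what row.names = 1 does, it can help to try it on a tiny file. This is a minimal base R sketch; the temporary file is a hypothetical two-gene, two-sample excerpt with values taken from the output above, not the real countB.csv.

```r
# Hypothetical excerpt: the first column holds gene names and has no header
tmp <- tempfile(fileext = ".csv")
writeLines(c(",SampleA_1,SampleB_1",
             "Tspan6,703,71",
             "TNMD,490,342"), tmp)

# Use the first column as row names rather than as a data column
counts <- read.csv(tmp, row.names = 1)

rownames(counts)
# "Tspan6" "TNMD"

# Rows can now be selected by gene name
counts["TNMD", "SampleB_1"]
# 342
```

Without row.names = 1, the gene names would be imported as an ordinary column, and the data frame would have default numeric row names.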

Challenge data load

Q4. Load a tab-delimited file (file_path = "./data/WebexSession_report.txt") using read_delim(). You will need to troubleshoot the error message and modify the function arguments as needed.

Q4: Solution
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.2     ✔ tibble    3.2.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.4     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
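# The file is UTF-16 encoded and has two lines above the header row, hence locale() and skip = 2
read_delim("../data/WebexSession_report.txt", delim = "\t", locale = locale(encoding = "UTF-16"), skip = 2) # via readr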
## Rows: 10 Columns: 21
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: "\t"
## chr   (7): Name, Date, Invited, Registered, Duration, Network joined from:, ...
## dbl   (1): Participant
## lgl  (11): Audio Type, Email, Company, Title, Phone Number, Address 1, Addre...
## time  (2): Start time, End time
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
## # A tibble: 10 × 21
##    Participant `Audio Type` Name     Email Date  Invited Registered `Start time`
##          <dbl> <lgl>        <chr>    <lgl> <chr> <chr>   <chr>      <time>      
##  1           1 NA           Partici… NA    6/8/… No      N/A        13:00       
##  2           2 NA           Partici… NA    6/9/… <NA>    <NA>       13:00       
##  3           3 NA           Partici… NA    6/10… No      N/A        12:57       
##  4           4 NA           Partici… NA    6/11… <NA>    <NA>       12:57       
##  5           5 NA           Partici… NA    6/12… No      N/A        12:55       
##  6           6 NA           Partici… NA    6/13… <NA>    <NA>       12:55       
##  7           7 NA           Partici… NA    6/14… No      N/A        12:32       
##  8           8 NA           Partici… NA    6/15… <NA>    <NA>       12:32       
##  9           9 NA           Partici… NA    6/16… Yes     N/A        12:42       
## 10          NA NA           <NA>     NA    <NA>  <NA>    <NA>          NA       
## # ℹ 13 more variables: `End time` <time>, Duration <chr>, Company <lgl>,
## #   Title <lgl>, `Phone Number` <lgl>, `Address 1` <lgl>, `Address 2` <lgl>,
## #   City <lgl>, `State/Province` <lgl>, `Zip/Postal Code` <lgl>,
## #   `Country/region` <lgl>, `Network joined from:` <chr>,
## #   `Internal Participant:` <chr>
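
One way to arrive at the encoding argument is to look at the first bytes of the file. The sketch below is an illustration under stated assumptions, not the exact troubleshooting path for the Webex report: it builds a small hypothetical UTF-16 file (two title lines followed by a tab-delimited table, with made-up contents), inspects the byte order mark, and then reads the file with the same arguments as the solution.

```r
library(readr)

# Hypothetical stand-in for the report: two title lines, then the table
lines <- c("Session report", "Generated by a hypothetical tool",
           "Participant\tName", "1\tParticipant A", "2\tParticipant B")
txt <- paste0(paste(lines, collapse = "\n"), "\n")

# Save as little-endian UTF-16, preceded by its byte order mark (ff fe)
body <- iconv(txt, from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
tmp <- tempfile(fileext = ".txt")
writeBin(c(as.raw(c(0xff, 0xfe)), body), tmp)

# Step 1: inspect the first two bytes; ff fe points to UTF-16
bom <- readBin(tmp, what = "raw", n = 2)
print(bom)

# Step 2: declare the encoding and skip the two lines above the header
report <- read_delim(tmp, delim = "\t",
                     locale = locale(encoding = "UTF-16"),
                     skip = 2, show_col_types = FALSE)
print(report)
```

For a real file, readr::guess_encoding() may also suggest a likely encoding, although its result is a guess and is worth confirming by inspecting the imported data.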