Data transformation and descriptive statistics

CVEN 5999 - Summer 2025

Lars Schöbitz

Data Organisation in Spreadsheets

Think, Pair, Share

Questions

  1. Why should you not leave a blank cell in a spreadsheet used for data collection?
  2. Which of the 12 rules for data organization was the least comprehensible to you?
  • Think for 2 minutes
  • Pair up in break-out rooms for 4 minutes
  • Share your answer with the class

Learning Objectives (for this week)

  1. Learners can apply ten functions from the dplyr R package to generate a subset of data for use in a table or plot.

Data wrangling with dplyr

A grammar of data wrangling…

… based on the concepts of functions as verbs that manipulate data frames

  • select: pick columns by name
  • arrange: reorder rows
  • slice: choose rows by position
  • filter: pick rows matching criteria
  • relocate: change the order of columns
  • mutate: add new variables
  • summarise: reduce variables to values
  • group_by: for grouped operations
  • … (many more)
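
The verbs listed above can be tried on a small example. The sketch below is only an illustration: the `sanitation` data frame, its column names, and all values are invented here and are not part of the course data. It assumes dplyr is installed and R is version 4.1 or later (for the native pipe `|>`).

```r
library(dplyr)

# Hypothetical data, invented for illustration only
sanitation <- data.frame(
  country = c("A", "B", "C", "D"),
  region  = c("North", "North", "South", "South"),
  year    = c(2020, 2020, 2020, 2020),
  pop_k   = c(100, 250, 80, 400),  # population in thousands
  access  = c(45, 60, 30, 75)      # percent of population with access
)

cols_picked   <- select(sanitation, country, access)  # pick columns by name
rows_sorted   <- arrange(sanitation, desc(access))    # reorder rows, highest access first
rows_by_pos   <- slice(sanitation, 1:2)               # choose rows 1 and 2 by position
rows_matching <- filter(sanitation, access > 50)      # pick rows matching a condition
cols_moved    <- relocate(sanitation, year)           # move year to the first column
with_new_col  <- mutate(sanitation, pop_access_k = pop_k * access / 100)  # add a variable

# group_by + summarise: one row per region with the mean of access
region_summary <- sanitation |>
  group_by(region) |>
  summarise(mean_access = mean(access))
```

Each call takes the data frame as its first argument, so the verbs can be chained with the pipe, as in the last step.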

dplyr rules

Rules of dplyr functions:

  • First argument is always a data frame
  • Subsequent arguments say what to do with that data frame
  • Always return a data frame
  • Don’t modify in place
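
A minimal sketch of what these rules mean in practice, assuming dplyr is installed. The `scores` data frame and its columns are hypothetical and exist only for this illustration.

```r
library(dplyr)

# Hypothetical data, invented for illustration only
scores <- data.frame(id = 1:3, score = c(10, 20, 30))

# First argument: the data frame. Subsequent arguments: what to do with it.
high <- filter(scores, score > 15)

# The result is itself a data frame
result_is_df <- is.data.frame(high)

# The input is not modified: scores still has all three rows
rows_before <- nrow(scores)

# mutate() returns a new data frame; scores keeps its original columns
scores_doubled <- mutate(scores, doubled = score * 2)
cols_after <- names(scores)
```

Because nothing is modified in place, a result only persists if it is assigned to a name, either a new one (as with `scores_doubled`) or the original one to overwrite it.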

Live Coding Exercise: SDG 6.2.1

live-data-transformation

Follow along on the screen

  1. Open the GitHub organisation for the course: https://github.com/cven5999-ss25
  2. You will find a repository titled: wk-04-USERNAME (with your GitHub Username)
  3. You will “clone” this repository to Posit Cloud

Break

10:00

Pair Programming Exercise

Pair Programming Exercises

  • Two learners work together in a break-out room
  • One person (the driver) shares the screen and does the typing
  • The other person (the navigator) offers comments and suggestions
  • The two switch roles during the exercise

hw-data-transformation

  1. Head over to posit.cloud
  2. Open the workspace for the course cven5999-ss25
  3. Open “Content”
  4. Open your project
  5. Open the hw-data-transformation.qmd file
  6. Start working through the exercises in the file together

Homework week 4

Homework due dates

  • All material on course website
  • Homework assignment & learning reflection due: 2025-06-27

Thanks! 🌻

Slides created via revealjs and Quarto: https://quarto.org/docs/presentations/revealjs/

Access slides as PDF on GitHub

All material is licensed under Creative Commons Attribution Share Alike 4.0 International.