Capstone Project

The Capstone Project report is the last assignment of the course and contributes 40% to the total grade (see Course Overview). The final two weeks are reserved for working on the report.

Learning Objectives

  1. Learners can apply the skills obtained during the course to write a short data analysis project report.

GitHub repository

A repository for each student is established during week 4 of the course (see HW 04b: Capstone Project Setup). This repository contains self-identified data that was shared in week 3 of the course. Some data cleaning steps are performed by the course instructor, so that each student has a tidy dataset to work with.

The repository further contains an empty Quarto file (report.qmd), which students must use to write their report. A list of graded items was prepared to guide students in writing the report.

GitHub issue tracker

The GitHub issue tracker of each student’s capstone project repository is used to communicate and ask questions about the Capstone Project report.

Submission due date

The due date for submission of the report is 2025-07-25. Commits made after the due date will not be considered when grading.

Graded items

Table 1 presents a detailed list of the items used to grade the capstone project report. The points sum to 100 and are split between technical and intellectual items, as summarised in Table 2.

Table 1: List of items and points for grading of the capstone project report.
| category | item | points |
|---|---|---|
| technical | The report renders without errors to HTML format and contains at least four chapters of heading level 1 that are named: Introduction, Methods, Results, Conclusions. | 10 |
| technical | YAML header of report has title, author, date, and table of contents that are correctly displayed in the compiled HTML output. | 8 |
| technical | Warnings are hidden from the compiled output, but code is shown in the compiled output. | 2 |
| technical | The report has at least two data visualisations. | 20 |
| technical | Each data visualisation has edited human-readable labels (e.g. axis labels, legend title). | 4 |
| technical | Each data visualisation applies at least one scaling function (e.g. color/fill, axes). | 4 |
| technical | Each data visualisation has a label defined in the code-chunk options. | 4 |
| technical | Each data visualisation has a caption defined in the code-chunk options. | 4 |
| technical | Each data visualisation has an alternative text that describes what type of visualisation it is and trends or learnings from the visualisation. | 4 |
| technical | Each data visualisation is cross-referenced in the narrative using the defined label from the code-chunk options. | 4 |
| technical | The report has at least one table with summary statistics (e.g. count, mean, median, standard deviation). | 8 |
| technical | Each table is formatted in the rendered output using a function taught during the course (e.g. kable() function or gt() function). | 2 |
| technical | Each table has a label defined in the code-chunk options. | 2 |
| technical | Each table has a caption defined in the code-chunk options. | 2 |
| technical | Each table is cross-referenced in the narrative using the defined label from the code-chunk options. | 2 |
| intellectual | Introduction section with 3 to 5 sentences introduces the context within which the data was created. | 5 |
| intellectual | Methods section describes in 3 to 5 sentences how the data was obtained. | 5 |
| intellectual | Figures and tables in Results section are interpreted with 2 to 3 sentences each. | 5 |
| intellectual | Conclusions concisely summarize findings in a bullet point format. | 5 |

Table 2: Sum of points for technical and intellectual part of report.
| category | description | points |
|---|---|---|
| intellectual | This item is part of the intellectual framing of the capstone project report. | 20 |
| technical | This item is a technical part of the capstone project report. | 80 |
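
Several of the technical items in Table 1 concern the YAML header and code-chunk options. The following is a minimal sketch of how a report.qmd could cover them, not a required template. It assumes the built-in mtcars dataset as a stand-in for your own data and the ggplot2, dplyr, and knitr packages. The title, author, labels, captions, and alternative text are placeholders. Cross-references only resolve when figure labels start with fig- and table labels start with tbl-.

````markdown
---
title: "Capstone Project Report"   # placeholder title
author: "Your Name"                # placeholder author
date: today
format: html
toc: true
execute:
  echo: true      # show code in the rendered output
  warning: false  # hide warnings in the rendered output
---

# Introduction

# Methods

# Results

@fig-mpg-weight shows how fuel economy relates to car weight, and
@tbl-mpg-summary summarises fuel economy by number of cylinders.

```{r}
#| label: fig-mpg-weight
#| fig-cap: "Fuel economy against car weight, coloured by number of cylinders."
#| fig-alt: "Scatter plot of miles per gallon against car weight. Heavier cars tend to have lower fuel economy."

library(ggplot2)

# mtcars is a placeholder dataset (assumption); replace it with your own data
ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  scale_color_brewer(palette = "Dark2") +   # scaling function
  labs(
    x = "Weight (1000 lbs)",                # human-readable labels
    y = "Miles per gallon",
    color = "Cylinders"
  )
```

```{r}
#| label: tbl-mpg-summary
#| tbl-cap: "Summary statistics of fuel economy by number of cylinders."

library(dplyr)
library(knitr)

# Summary statistics per group, formatted with kable()
mtcars |>
  group_by(cyl) |>
  summarise(
    count = n(),
    mean_mpg = mean(mpg),
    median_mpg = median(mpg),
    sd_mpg = sd(mpg)
  ) |>
  kable(digits = 1)
```

# Conclusions
````

The sketch contains only one figure, whereas the graded items ask for at least two data visualisations, so a second figure chunk with its own label, caption, and alternative text would still be needed.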

Directory structure

Once your capstone project repository is set up (completed in HW 04b), organize your project with the following directory structure:

Required folders

  • data/ - Store your datasets and related files
  • docs/ - Documentation and supplementary materials

Data folder organization

Your data/ folder should be well-documented to ensure reproducibility. Create a README.md file in the data/ folder that includes:

  • Data source information: Where the data comes from
  • Data collection methods: How the data was collected
  • Variable descriptions: What each column/variable represents
  • Data processing notes: Any cleaning or transformation steps
  • File descriptions: What each data file contains
  • Date information: When the data was collected/last updated

Use the template from: https://raw.githubusercontent.com/rbtl-dev/metadata-readme-template/main/README.md
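
If you prefer to create the folders and fetch the template from the R console rather than through the Files pane, the steps above could be sketched as follows. The helper name setup_capstone_dirs() and its arguments are hypothetical and not part of the course material. The sketch assumes it is run from the root of your capstone project repository.

```r
# URL of the metadata README template given above
template_url <- "https://raw.githubusercontent.com/rbtl-dev/metadata-readme-template/main/README.md"

# Hypothetical helper (assumption): creates data/ and docs/ under `root`
# and, if requested, downloads the README template into data/.
setup_capstone_dirs <- function(root = ".", fetch_template = TRUE) {
  folders <- file.path(root, c("data", "docs"))

  # Create each folder; existing folders are left untouched
  for (folder in folders) {
    dir.create(folder, recursive = TRUE, showWarnings = FALSE)
  }

  # Download the template only if no README.md exists yet,
  # so that an edited README is never overwritten
  readme <- file.path(root, "data", "README.md")
  if (fetch_template && !file.exists(readme)) {
    download.file(template_url, destfile = readme)
  }

  invisible(folders)
}

# Usage: run once from the repository root
# setup_capstone_dirs()
```

After running it, open data/README.md and replace the template text with the information for your own dataset.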

Getting started

  1. Access your capstone project repository on Posit Cloud
  2. Ensure you have completed the directory structure setup from HW 04b
  3. Begin working on your report.qmd file using the graded items above as guidance