Acknowledgements
DIDACTICS · CONTENT · TOOLING · INSPIRATION
The people whose work made this possible
Nothing on this site is original in isolation. Every choice — how the labs are paced, how the prose is written, which methods got a full lab and which got a paragraph — comes from somewhere, and that somewhere deserves to be named.
Didactics and pedagogy
The shape of the labs — same five-step template, same “plot before the formula” rhythm, same reluctance to ask a reader to memorise before they can experiment — is drawn from a small number of teachers whose public materials have set the standard for contemporary data-science and statistics teaching.
Teach the teacher
Mine Çetinkaya-Rundel — Data Science in a Box
The modern canon on how to teach data science openly. Every lab in this curriculum carries some Mine DNA: the discipline of writing learning objectives first, the habit of simulating before modelling, and the expectation that any serious teaching artefact is itself reproducible from source.
Opinionated biostats teaching
Biostats-R (Bergen)
A living ecosystem of openly licensed biosciences / biostatistics courses from the University of Bergen. Their approach to lab modularity — each session self-contained, each concept introduced once and referenced thereafter — informs the structure of every course here.
Community-driven R teaching
eR-Biostat
A volunteer-maintained set of biostatistics teaching materials. Their commitment to freely redistributable content, built from plain-text sources, is the norm this site tries to honour.
Undergraduate entry point
UBC Okanagan — BIOL202
One of the clearest first-pass introductions to biostatistics on the open web. Course 1 borrows its instinct to name the workflow stages explicitly before teaching any of them.
Short, opinionated primer
Sheffield — Introductory Biostatistics with R
Short, opinionated, and almost completely free of jargon. The tone of Course 1’s backgrounds owes it a visible debt.
Publishing system
Quarto
The reason every lab here renders twice from one source, the reason the slides and the article never drift out of sync, and the reason this site can be built by anyone with quarto render. Quarto’s team has raised the ceiling of what an openly published curriculum can look like.
Free courseware at scale
Posit Education
Posit’s education team maintains a body of high-quality, openly licensed teaching resources whose scale is hard to overstate. The patterns for running renv, setting up a Quarto project, and wiring a CI render came from their materials first.
Structural inspiration
Chi Zhang & OCBE — MF9130E
Chi Zhang and the OCBE team at the University of Oslo run an open-source Introductory Course in Statistics with a public GitHub repository. Their site provided the layout pattern this one starts from: course landing pages, linked schedules, per-session labs, a GitHub Pages build.
Content depth
Which topics got a full lab, which methods are taught the “Harrell way” versus the “Hastie way”, which trade-offs are named on every slide — this bucket collects the texts and courses that shaped the what of the curriculum rather than the how.
Applied biostatistics
Frank Harrell — Biostatistics for Biomedical Research (BBR)
The densest, most opinionated, and most useful applied-statistics text in medicine. Much of what Course 2 and Course 4 teach about calibration, decision analysis, and prediction is a simplified retelling of BBR chapters.
Regression modelling
Frank Harrell — Regression Modeling Strategies
The spline-fitting, calibration, and validation discipline of Course 2 W3–W4 is straight RMS.
Statistical learning
Hastie, Tibshirani, Witten, James — ISLR2
Course 4’s structure (regularisation → trees → interpretability → modern methods) is an homage to ISLR2. Their commitment to teaching the intuition before the linear algebra made our lives easier at every step.
Causal inference
Hernán & Robins — Causal Inference: What If
The DAG, propensity-score, and g-methods labs in Course 3 are restatements of What If at a pace that fits a single evening of reading. The original remains non-optional for anyone making causal claims in earnest.
Cochrane methodology
Cochrane Handbook
Course 3 W4 on systematic reviews and meta-analysis follows Cochrane conventions directly.
Tooling
Every lab on this site exists because free, high-quality open-source software exists. We cite the tools we lean on hardest:
- R Core Team and the CRAN maintainers, for the language and its package archive.
- Bioconductor, for the genomics tooling referenced in Course 4 W4.
- The tidyverse team (Hadley Wickham and colleagues) for dplyr, ggplot2, tidyr, purrr, readr, and tibble.
- The authors of lme4, brms, mgcv, survival, glmnet, metafor, dagitty, mice, DESeq2, Seurat, and tidymodels: every one of them gets a citation via citation() in the session-info block at the foot of every lab.
- Posit (formerly RStudio) for RStudio, Quarto, and Shiny.
- Pandoc and the Reveal.js project, for making the dual-format render of every lab possible.
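For readers who want to reproduce the citation step in their own material, here is a minimal sketch of what a session-info block that calls citation() might look like. The package vector is a hypothetical stand-in, not the list any particular lab uses, and the labs' actual block may be organised differently.

```r
# Hypothetical package list for illustration; swap in whatever a lab loads.
pkgs <- c("stats", "utils", "survival")

# Keep only the packages that are installed, so the block never errors
# on a machine that lacks one of them.
installed <- pkgs[vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]

# Print a plain-text citation for each available package.
for (pkg in installed) {
  print(citation(pkg), style = "text")
}

# Record the R version and attached packages alongside the citations.
sessionInfo()
```

Guarding with requireNamespace() is a design choice for portability: a reader rendering the lab without every package installed still gets citations for the ones that are present.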
Personal
- Colleagues and students who read early drafts of individual labs and flagged everything from typos to circular arguments.
- Everyone who opened an issue or a pull request against this repository.
- The anonymous reviewers whose criticism, over the years, shaped the definition of “useful” that this curriculum now tries to meet.
Any errors here are our own. If you spot one, please open an issue.
Licence
This curriculum is released under the MIT licence. You are free to use, adapt, and redistribute it, including in commercial or institutional teaching contexts, as long as the copyright notice and these acknowledgements travel with it.