General description
This module offers an introduction to linear modeling in R, i.e. the unified approach underlying t-tests, ANOVA, and linear regression. Each lesson combines theory with hands-on exercises in RStudio.
By the end of the course, you will be able to:
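To illustrate what "unified approach" means here, the following minimal sketch fits the same two-group comparison as a t-test, as a linear model, and as a one-way ANOVA. The data are simulated and the names `dat`, `group`, and `score` are hypothetical, chosen only for this example; it is not taken from the course materials.

```r
# Simulated data for illustration only (hypothetical variable names).
set.seed(1)
dat <- data.frame(
  group = factor(rep(c("a", "b"), each = 20)),
  score = c(rnorm(20, mean = 10), rnorm(20, mean = 11))
)

# Classical two-sample t-test, assuming equal variances
tt <- t.test(score ~ group, data = dat, var.equal = TRUE)

# The same comparison as a linear model with one categorical predictor
fit <- lm(score ~ group, data = dat)
lm_t <- coef(summary(fit))["groupb", "t value"]
lm_p <- coef(summary(fit))["groupb", "Pr(>|t|)"]

# One-way ANOVA on the same model: the F statistic equals t squared
f_val <- anova(fit)["group", "F value"]

# The t statistics agree up to sign (t.test compares a - b, lm estimates b - a)
all.equal(abs(unname(tt$statistic)), abs(lm_t))
all.equal(tt$p.value, lm_p)
all.equal(f_val, lm_t^2)
```

The equal-variance setting (`var.equal = TRUE`) matters: the default Welch test in `t.test()` does not correspond exactly to the ordinary least-squares fit from `lm()`.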
- analyze continuous data using linear models (includes assumption checks, data transformation, contrast coding, and interactions)
- interpret analyses in a meaningful way
- communicate results using quantitative summaries and data visualizations
- conduct open and reproducible research
Topics:
- introduction to linear modeling with one predictor (includes examples with one categorical and one continuous predictor)
- multiple linear regression (includes examples with multiple categorical predictors, multiple continuous predictors, and categorical + continuous predictors)
- summarizing and visualizing predictor effects
- open research
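As a rough idea of what the regression topics involve, the sketch below fits a model with one categorical and one continuous predictor plus their interaction, switches the contrast coding, and summarizes the predictor effects as model predictions. It uses the `mtcars` dataset that ships with R and base R functions only; both choices are assumptions made for this illustration and not necessarily what the course uses.

```r
# mtcars is a built-in dataset, used here only as an illustrative assumption.
# In mtcars, am is coded 0 = automatic, 1 = manual.
cars <- transform(mtcars, am = factor(am, labels = c("automatic", "manual")))

# Categorical + continuous predictor, including their interaction
fit <- lm(mpg ~ wt * am, data = cars)
coef(fit)

# Sum (deviation) coding instead of R's default treatment coding
contrasts(cars$am) <- contr.sum(2)
fit_sum <- lm(mpg ~ wt * am, data = cars)
coef(fit_sum)

# Both codings describe the same model, so the fitted values are identical;
# only the meaning of the individual coefficients changes
all.equal(fitted(fit), fitted(fit_sum))

# Summarize the predictor effects: predicted mpg per transmission type
# at the mean car weight
newdat <- data.frame(
  wt = mean(cars$wt),
  am = factor(c("automatic", "manual"))
)
pred <- predict(fit, newdata = newdat)
pred
```

Calling `plot(fit)` afterwards shows the standard base R diagnostic plots, which is one common starting point for assumption checks.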
From 2024 onwards, this module can be combined with module 10, Multivariate data analysis with R, to form a complete Expert track in this Summer School.
Target audience
Anyone interested in analyzing continuous data with linear models (e.g. ANOVA, linear regression).
Course prerequisites
A good working knowledge of both R and basic statistics. Ideally, you have already followed courses similar to module 1, Introduction to R, and module 12, Introduction to statistics with R, which are part of the Novice track in this Summer School.
A computer with R (at least version 4.5) and RStudio installed, as well as the R packages tidyverse and lme4.
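One possible way to check these software requirements before the course starts is sketched below. The object names `r_ok`, `pkgs`, and `missing_pkgs` are hypothetical, and the installation step is left commented out so that nothing is installed without your say-so.

```r
# Is the installed R version at least 4.5?
r_ok <- getRversion() >= "4.5.0"
r_ok

# Which of the required packages are not yet installed?
pkgs <- c("tidyverse", "lme4")
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]
missing_pkgs

# Uncomment to install whatever is missing:
# if (length(missing_pkgs) > 0) install.packages(missing_pkgs)
```

If `r_ok` is `FALSE`, R itself needs to be updated; RStudio is installed separately and is not covered by this check.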
Course materials
All course materials will be provided online and include slides, exercises and solutions, R Markdown files, datasets, and further reading.
Recommended but optional handbook: Winter, Bodo (2020). Statistics for Linguists: An Introduction using R. New York (USA): Routledge.
Teacher bio
Vinicius Macuch Silva is a post-doctoral researcher associated with the Institute of Linguistics of the Goethe University Frankfurt, Germany. His research deals with various aspects of linguistic meaning and how people use language in different communicative settings (e.g. in face-to-face interaction, on the web, etc.). In particular, he is interested in how people produce and interpret polysemous and polyfunctional linguistic forms in context. He primarily uses quantitative empirical methods, including controlled experimentation as well as statistical and corpus-analytic methods, to investigate questions related to pragmatic inferencing, strategic communication, argumentation and stance-taking, as well as expressive and affective meaning.
Schedule
- Wednesday 15/07/2026, 14:00-15:30 & 16:00-17:30
- Thursday 16/07/2026, 9:00-10:30 & 11:00-12:30 & 14:00-15:30 & 16:00-17:30
- Friday 17/07/2026, 9:00-10:30 & 11:00-12:30 & 14:00-15:30 & 16:00-17:30
In addition to these contact hours, this module requires some self-study time, depending on each student's background and individual needs.