Participants select either one module or two modules, provided that the timeslots do not overlap (see Programme).
There is also the option to follow a whole “track” of modules as a full curriculum. Depending on your background knowledge and interests, you can select the track that best suits you. As of 2026, the Summer School offers two tracks for statistical training (introductory-level and advanced) and a new, third track for NLP:
- Novice track = Module 1 Introduction to R + Module 12 Introduction to statistics with R
- Expert track = Module 10 Multivariate data analysis with R + Module 2 Regression analysis with R
- NLP track = Module 3 Natural Language Processing + Module 13 Introduction to Python
Module 1: Introduction to R
R is a widely used programming language for statistical data analysis. This beginner-friendly module aims to provide participants with a solid foundation in R, empowering them to explore, analyze, and visualize data efficiently. Emphasis is placed on hands-on practical exercises and real-world examples, enabling students to immediately apply their knowledge.
Module 2: Regression analysis with R
This module will offer an introduction to linear modeling in R, i.e. the unified approach underlying t-tests, ANOVA and linear regression.
Module 3: Natural Language Processing
This module introduces students to the core principles and methods of natural language processing (NLP) and data-driven AI.
Module 4: PRAAT
This module will introduce Praat scripting. Scripts make it much easier to replicate your analyses on speech files and to communicate with others about what you have done and how you have done it.
Module 5: ELAN and FLEx
This module is not offered in 2026.
Module 6: Eye-tracking
Over the past decades, eye tracking has become a widespread technique for understanding how people process and learn language. This module offers a basic introduction to eye-tracking techniques in language science.
Module 7: Survey design
This module provides a comprehensive introduction to survey research, covering the entire process from designing questionnaires to preparing data for analysis.
Module 8: Linguistic ethnography
This module is not offered in 2026.
Module 9: Bayesian data analysis
This module is not offered in 2026.
Module 10: Multivariate data analysis with R
This module offers an overview of the most important techniques for analyzing multivariate data, i.e. data involving several (correlated) variables. Such multivariate data often arise in studies involving language, e.g. research into language attitudes, reaction times to stimuli or co-occurrence frequencies in corpora. In addition, word embeddings in NLP share various ideas with multivariate statistical techniques, so these similarities will also be touched upon.
Module 11: Advanced predictive modeling
This module introduces aspects and applications of predictive modeling that go beyond what is normally encountered in statistical modeling in linguistics: multiple post-hoc testing, a priori orthogonal contrasts and general linear hypothesis tests, non-linearities, Poisson regression and tree-based approaches.
Module 12: Introduction to statistics with R
In this module, you will learn how to analyse linguistic data with R and RStudio. Topics include descriptive statistics and distributions, the sampling distribution and inferential statistics (interval estimation and significance testing, t-tests, chi-squared tests, linear regression).
Module 13: Introduction to Python
This module provides a beginner-friendly introduction to programming with Python, with an emphasis on automatic text processing.