Courses

Participants select either one module, or two modules provided that their timeslots do not overlap (see Programme).

There is also the option to follow a whole “track” of modules as a full curriculum. Depending on your background knowledge and interests, you can select the track that best suits you. As of 2026, the Summer School offers two tracks for statistical training (introductory and advanced) and a new, third track for NLP.

Module 1: Introduction to R

R is a widely used programming language for statistical data analysis. This beginner-friendly module aims to provide participants with a solid foundation in R, empowering them to explore, analyze, and visualize data efficiently. Emphasis is placed on hands-on practical exercises and real-world examples, enabling students to immediately apply their knowledge.

Module 2: Regression analysis with R

This module will offer an introduction to linear modeling in R, i.e. the unified approach underlying t-tests, ANOVA and linear regression.

Module 3: Natural Language Processing

This module introduces students to the core principles and methods of natural language processing (NLP) and data-driven AI.

Module 4: PRAAT

This module introduces Praat scripting. Scripts make it much easier to replicate your analyses of speech files and to communicate to others what you have done and how you did it.

Module 5: ELAN and FLEx

This module is not offered in 2026.

Module 6: Eye-tracking

Over the last few decades, eye tracking has become a widespread technique for understanding how people process and learn language. This module gives a basic introduction to eye-tracking techniques in language science.

Module 7: Survey design

This module provides a comprehensive introduction to survey research, covering the entire process from designing questionnaires to preparing data for analysis.

Module 8: Linguistic ethnography

This module is not offered in 2026.

Module 9: Bayesian data analysis

This module is not offered in 2026.

Module 10: Multivariate data analysis with R

This module offers an overview of the most important techniques for analyzing multivariate data, i.e. data involving several (correlated) variables. Such multivariate data often arise in studies involving language, e.g. research into language attitudes, reaction times to stimuli or co-occurrence frequencies in corpora. In addition, word embeddings in NLP share various ideas with multivariate statistical techniques, so these similarities will also be touched upon.

Module 11: Advanced predictive modeling

This module introduces aspects and applications of predictive modeling that go beyond what is normally encountered in statistical modeling in linguistics: multiple post-hoc testing, a priori orthogonal contrasts and general linear hypothesis tests, non-linearities, Poisson regression and tree-based approaches.

Module 12: Introduction to statistics with R

In this module, you will learn how to analyze linguistic data with R and RStudio. Topics include descriptive statistics and distributions, the sampling distribution, and inferential statistics (interval estimation and significance testing, t-tests, chi-squared tests, linear regression).

Module 13: Introduction to Python

This module provides a beginner-friendly introduction to programming with Python, with an emphasis on automatic text processing.
