Corpus-based historical linguistics

September 5, 2013, HU Berlin, Germany
Dorotheenstr. 24, 1.308

Historical linguistics is in a way the most fundamental form of corpus linguistics, because the linguistic data that can be studied is available only as handed-down text. A historical linguist, not unlike a corpus linguist, therefore has to analyze a corpus of texts in detail in order to carry out a linguistic study.

One of the ubiquitous tasks in historical linguistics is to make and consult such corpora. At first, these corpora were philological jewels and took the form of editions, dictionaries and grammars. Over the last few decades, a shift -- or perhaps a fork? -- in the medium of these corpora has taken place. Nowadays, historical corpora, like corpora in general, are available in a machine-readable format, which facilitates search and multi-level linguistic annotation.

Together with this change in how data is preserved, the research methodology has shifted as well. The fact that searching in texts has become so much easier has not only yielded new theoretical insights but also allows for a more quantitative approach to diachronic processes.

The workshop aims to give a good overview of the more innovative theory-driven research questions that are being asked today. There will be ample time to discuss the topics brought up by the speakers.


For the workshop, six leading scholars have been invited to present their work and to discuss the work of others.


The workshop takes place on September 5, 2013 in room 1.308 at Dorotheenstraße 24.

09:00 - 09:30 Welcome coffee and introduction
09:30 - 10:30 Benedikt Szmrecsanyi
Exploring cross-constructional variation and change
10:30 - 11:30 Katrin Axel-Tober
Recent advances in the syntax of German subordination – the mutual usefulness of synchronic and corpus-based diachronic approaches
11:30 - 12:30 Freek Van de Velde
Degeneracy as an adaptive strategy in language change
12:30 - 14:30 Lunch break, with the opportunity for one-on-one interaction between audience and speakers
14:30 - 15:30 Martin Hilpert
Using motion charts for hypothesis-testing and the exploration of language change
15:30 - 16:30 Cerstin Mahlow
Retrieval and Annotation of German Phrasemes in Heterogeneous Diachronic TEI Corpora
16:30 - 17:30 Hendrik De Smet
How gradual change progresses. The expansion of ‑ing-clauses with begin through time and across individuals


The workshop is open to anyone interested. No registration is necessary, but a quick notification of attendance is useful for planning purposes.

The lunch break takes place in Cum Laude, just across from the university building where the workshop is held.

The workshop is organized by Dr. Tom Ruette, Humboldt University of Berlin, with funding from the faculty. For questions about the workshop, please email tom (dot) ruette (at) hu-berlin (dot) de.

Other information

On September 6, 2013, the Third International Workshop on Systems and Frameworks for Computational Morphology takes place at the Humboldt University of Berlin.

On September 12-14, 2013, the fifth installment of the conference Quantitative Investigations in Theoretical Linguistics (QITL-5) takes place at the University of Leuven, Belgium.