Data competencies in DaF/DaZ: Exploring language technology approaches to the analysis of L2 acquisition levels in learner corpora

DAKODA is a CATALPA project.

How can the language of learners of German be analyzed automatically? DAKODA lays the groundwork for this. Various learner corpora are merged into one large data set with search and filter functions. At the same time, young researchers in the field of German as a foreign language (DaF) and German as a second language (DaZ) are trained in handling large data sets.
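As a rough illustration of what merging learner corpora into one searchable data set could look like, here is a minimal Python sketch. It is not the DAKODA implementation: the `LearnerText` record, its metadata fields, the corpus names `corpus_a` and `corpus_b`, and the example sentences are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class LearnerText:
    """One learner text with harmonized metadata (hypothetical fields)."""
    corpus: str      # name of the source corpus
    text_id: str     # identifier within the source corpus
    l1: str          # learner's first language
    cefr_level: str  # proficiency level, e.g. "A2" or "B1"
    text: str        # the learner's text


def merge_corpora(*corpora: Iterable[LearnerText]) -> List[LearnerText]:
    """Concatenate corpora, skipping repeated (corpus, text_id) pairs."""
    seen = set()
    merged = []
    for corpus in corpora:
        for item in corpus:
            key = (item.corpus, item.text_id)
            if key not in seen:
                seen.add(key)
                merged.append(item)
    return merged


def search(
    dataset: Iterable[LearnerText],
    query: Optional[str] = None,
    l1: Optional[str] = None,
    cefr_level: Optional[str] = None,
) -> List[LearnerText]:
    """Return texts matching every given criterion; None means 'no filter'."""
    results = []
    for item in dataset:
        # Case-insensitive substring search in the learner text.
        if query is not None and query.lower() not in item.text.lower():
            continue
        if l1 is not None and item.l1 != l1:
            continue
        if cefr_level is not None and item.cefr_level != cefr_level:
            continue
        results.append(item)
    return results


# Invented example data, not taken from any real corpus.
corpus_a = [
    LearnerText("corpus_a", "a1", "Turkish", "A2",
                "Ich gestern habe ein Buch gelesen."),
    LearnerText("corpus_a", "a2", "English", "B1",
                "Weil ich müde war, bin ich früh ins Bett gegangen."),
]
corpus_b = [
    LearnerText("corpus_b", "b1", "English", "A2",
                "Gestern ich bin ins Kino gegangen."),
]

dataset = merge_corpora(corpus_a, corpus_b)
hits = search(dataset, l1="English", cefr_level="A2")
print([hit.text_id for hit in hits])
```

The key design point in this sketch is that all source corpora are first mapped onto one shared metadata schema, so a single filter function can work across them. In practice, harmonizing metadata from differently annotated corpora is likely the harder part.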

Project goals and research questions

The BMBF project "Data Competencies in DaF/DaZ: Exploration of Language Technology Approaches for the Analysis of L2 Acquisition Levels in Learner Corpora" (DAKODA) has been running since 1 October 2022. It belongs to the funding line "Projects for Strengthening the Data Competencies of Young Scientists", which brings together young scientists who want to acquire data skills with experts in data-intensive methods. The FernUniversität in Hagen (FUH) takes on the role of data expert, with the project led from the research professorship "Computational Linguistics".

  • Prof. Dr.-Ing. Torsten Zesch

  • The project is supported by the German Federal Ministry of Education and Research (BMBF) as part of the federal-state initiative "Promotion of Artificial Intelligence in Higher Education" and funded by the European Union (NextGenerationEU):

    Funding announcement: https://www.bmbf.de/bmbf/shareddocs/bekanntmachungen/de/2021/09/2021-09-06-Bekanntmachung-Datenkompetenzen.html

    Funding code: 16DKWN035B

  • Otto-Friedrich-Universität Bamberg

  • October 2022 - September 2025

  • 2024

    • Ruppenhofer, J., Schwendemann, M., Portmann, A., Wisniewski, K., & Zesch, T. (2024). Every verb in its right place? A roadmap for operationalizing developmental stages in the acquisition of L2 German. In N. Calzolari, M.-Y. Kan, V. Hoste, A. Lenci, S. Sakti, & N. Xue (Eds.), Proceedings of the 2024 joint international conference on computational linguistics, language resources and evaluation (LREC-COLING 2024) (pp. 6655–6670). ELRA; ICCL. https://aclanthology.org/2024.lrec-main.589

  • 2023

    • Wisniewski, K., Zesch, T., Schwendemann, M., Ruppenhofer, J., & Portmann, A. (2023). Automatische Analysen von Erwerbsstufen in einer großen Lernerkorpus-Datenbank für DaF/DaZ. Das Forschungsprojekt DAKODA. Korpora Deutsch als Fremdsprache, 3(2), 179–224. https://doi.org/10.48694/kordaf.3845