Welcome!

This website is dedicated to spoken corpora, mainly for Albanian, Bosnian and Serbian. It also hosts other corpora and further resources for these languages.

I am currently working on making the corpora available in TEI XML and on integrating NoSketchEngine and SpoCo into the website so that all corpora can be searched online. Some corpora already exist in SketchEngine and can be shared on request.

The following spoken corpora are available at the moment:

  • BosCo – Spoken Corpus of Bosnian
  • CALD – Dialect Corpus of Albanian
  • CRONUS – A corpus for the analysis of Serbian spoken narratives
  • SrMaCo – Spoken Corpus of the Serbian minority in Hungary

The following written corpora are available or under construction:

Furthermore, the following resources are provided here: