Donate Speech: Collecting and Sharing a Large-Scale Speech Database for Social Sciences, Humanities and Artificial Intelligence Research and Innovation

Krister Lindén, Tommi Jauhiainen, Mietta Lennes, Mikko Kurimo, Aleksi Rossi, Tommi Kurki, Olli Pitkänen

Research output: Chapter in Book/Report/Conference proceeding › Chapter › Scientific › peer-review

Abstract

The Donate Speech campaign aimed to collect 10 000 hours of ordinary, casual Finnish speech to be used for studying language as well as for developing technology and services that can be readily used in the languages spoken in Finland. In this project, particular attention has been paid to allowing for both academic and commercial use of the material. Even though the ambitious target currently seems to evade us, the Donate Speech campaign has managed to collect an extensive resource of more than 3500 h of Finnish colloquial speech with more than 200 000 speech recordings by roughly 50 000 speakers from all over Finland in just a few months.
Original language: English
Title of host publication: CLARIN : the infrastructure for language resources
Publisher: De Gruyter
Number of pages: 30
ISBN (Electronic): 978-3-11-076737-7
ISBN (Print): 978-3-11-076734-6
DOIs
Publication status: Published - Oct 2022
MoE publication type: A3 Book section, Chapters in research books

Publication series

Name: Digital Linguistics
Volume: 1
ISSN (Electronic): 2751-1278

Keywords

  • speech resources
  • colloquial speech
  • large-scale data collection
  • academic and commercial use
