Extraction of Entities and Concepts from Finnish Texts.

Minna Tamper

Research output: Master's thesis

Abstract

Keywords are used in many document databases to improve search. The process of assigning keywords from controlled vocabularies to a document is called subject indexing. If the controlled vocabulary used for indexing is an ontology, with semantic relations and descriptions of concepts, the process is also called semantic annotation. In this thesis, an automatic annotation tool was created to provide documents with semantic annotations. The application links entities found in the texts to ontologies defined by the user. It is highly configurable and can be used with different Finnish texts. The application was developed as part of the WarSampo and Semantic Finlex projects and tested using Kansa Taisteli magazine articles and consolidated Finnish legislation. The quality of the automatic annotation was evaluated by measuring precision and recall against existing manual annotations. The results showed that the quality of the input text, as well as the selection and configuration of the ontologies, affected the results.
Original language: English
Qualification: Master's degree
Awarding institution
  • School of Science
Supervisor/Advisor
  • Mäkelä, Eetu, Supervising professor
  • Tuominen, Jouni, Supervising professor
  • Hyvönen, Eero, Supervising professor
Status: Published - Dec 2016
MoE publication type: G2 Master's thesis, polytechnic Master's thesis
