Automatic requirements extraction, analysis, and graph representation using an approach derived from computational linguistics

Research output: Contribution to journal, Article


Research units

  • Selko Technologies Oy
  • Tampere University of Technology
  • Institut National Polytechnique de Grenoble


The quality of requirements is fundamental in engineering projects. Requirements are usually expressed partly or entirely in natural language (NL) and come from different documents. Their quality is difficult to analyze manually, especially when hundreds of thousands of them have to be considered, so the assistance of software tools is becoming a necessity. In this article, the goal was to develop a set of metrics, supported by NL processing (NLP) methods, that enable different types of requirements analysis, especially analysis of the dependencies between requirements. An NLP approach is used to extract requirements from text; to analyze their quality, links, similarities, and contradictions; and to cluster them automatically. The analysis framework includes different combinations of methods such as cosine similarity, singular value decomposition, and K-means clustering. One objective is to assess the possible combinations and their impact on detection in order to establish optimal metrics. Three case studies exemplify and support the validation of the work. Graphs are used to represent the automatically clustered requirements, as well as similarities and contradictions. A new contradiction analysis process that includes a rule-based approach is proposed. Finally, the combined results are presented as graphs, which reveal the semantic relationships between requirements. Subsection 4.8 compares the results provided by the tool with those obtained from experts. The proposed methodology and network presentation not only support the understanding of the semantics of the requirements but also help requirements engineers review the interconnections and consistency of requirements systems and manage traceability. The approach is valuable during the early phases of projects, when requirements are evolving dynamically and rapidly.
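
As an illustration of one of the methods named in the abstract, the following is a minimal sketch of how cosine similarity between requirement texts could be turned into graph edges. It is not the authors' implementation: the raw term-count vectors, the function names (tokenize, cosine_similarity, similarity_graph), the 0.5 threshold, and the sample requirements are all assumptions made for this example. Singular value decomposition and K-means clustering, which the abstract also mentions, are not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a requirement and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty requirement is similar to nothing
    return dot / (norm_a * norm_b)


def similarity_graph(requirements, threshold=0.5):
    """Return edges (i, j, score) for pairs scoring at or above threshold.

    The threshold value is an assumption for this sketch, not a value
    taken from the article.
    """
    vectors = [Counter(tokenize(r)) for r in requirements]
    edges = []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            score = cosine_similarity(vectors[i], vectors[j])
            if score >= threshold:
                edges.append((i, j, score))
    return edges


# Hypothetical requirements, invented for illustration only.
requirements = [
    "The pump shall start within five seconds",
    "The pump shall stop within five seconds",
    "The enclosure must resist corrosion",
]
edges = similarity_graph(requirements)
```

With these sample sentences, only the first two requirements are linked: they share six of their seven words, whereas the third shares only "the" with either of them. A practical pipeline would likely weight terms (for example with TF-IDF) and remove stop words before comparing.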


Original language: English
Pages (from-to): 555-575
Number of pages: 21
Issue number: 6
Publication status: Published - Nov 2018
MoE publication type: A1 Journal article-refereed

    Research areas

  • contradictions analysis, network representation, requirements management, similarity, MODELS

ID: 30466084