Algorithms for Order-Preserving Matching

Tamanna Chhabra

Research output: Thesis › Doctoral Thesis › Collection of Articles

Abstract

String matching is a widely studied problem in computer science, and the field continues to see new developments. One problem that has recently attracted attention is order-preserving matching (OPM). The task is to find all substrings of the text that have the same length and the same relative order as the pattern, where relative order means the numerical order of the numbers in a string. The problem has applications in areas involving time series or other series of numbers. More specifically, it is useful when the relative order of the pattern matters and not the actual values. For example, stock market analysts can use it to study price movements. In addition to the OPM problem, we studied its approximate variant. In approximate order-preserving matching, we search for substrings of the text whose relative order is similar to that of the pattern, i.e., whose relative order matches that of the pattern with at most k mismatches. For applications of order-preserving matching, approximate search is more meaningful than exact search. We developed several solutions for the problem and its variant, with special emphasis on practical efficiency. In particular, we introduced a simple solution for the OPM problem based on filtration. We showed experimentally that our method is effective and faster than earlier solutions for the problem. In addition, we combined the Single Instruction Multiple Data (SIMD) instruction set architecture with filtration to develop solutions that were faster than our previous solution. Moreover, we proposed another efficient solution that uses the SIMD architecture without filtration. We also presented an offline solution based on the FM-index. Furthermore, we proposed practical solutions for the approximate order-preserving matching problem, one of which was the first solution for the problem that is sublinear on average.
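To make the problem definition concrete, the following is a minimal Python sketch, not code from the thesis. It shows a naive matcher that compares rank signatures window by window, and an illustrative filter-then-verify variant that encodes each sequence as a string of up/down bits, finds candidates with ordinary exact string matching, and verifies each candidate. The function names (`rank_signature`, `opm_naive`, `up_down`, `opm_filtered`) and the specific bit encoding are assumptions made for illustration. The thesis's actual filtration method, its SIMD variants, and its FM-index solution may differ in detail.

```python
from typing import List, Sequence


def rank_signature(values: Sequence[float]) -> List[int]:
    """Replace each value by its rank (0 = smallest); equal values share a rank."""
    order = {v: r for r, v in enumerate(sorted(set(values)))}
    return [order[v] for v in values]


def opm_naive(text: Sequence[float], pattern: Sequence[float]) -> List[int]:
    """Return every start index whose window has the pattern's relative order."""
    m = len(pattern)
    if m == 0 or m > len(text):
        return []
    target = rank_signature(pattern)
    return [
        i
        for i in range(len(text) - m + 1)
        if rank_signature(text[i:i + m]) == target
    ]


def up_down(values: Sequence[float]) -> str:
    """Encode consecutive comparisons: '1' for an increase, '0' otherwise."""
    return "".join("1" if b > a else "0" for a, b in zip(values, values[1:]))


def opm_filtered(text: Sequence[float], pattern: Sequence[float]) -> List[int]:
    """Filter with exact matching on the up/down strings, then verify candidates.

    Assumption: this encoding is one simple filter; it can yield false
    positives (removed by verification) but never misses a true match.
    """
    m = len(pattern)
    if m == 0 or m > len(text):
        return []
    target = rank_signature(pattern)
    t_bits, p_bits = up_down(text), up_down(pattern)
    hits = []
    start = t_bits.find(p_bits)
    while start != -1:
        # Verification step: the candidate window must match the full order.
        if rank_signature(text[start:start + m]) == target:
            hits.append(start)
        start = t_bits.find(p_bits, start + 1)
    return hits


if __name__ == "__main__":
    prices = [10, 22, 15, 30, 20, 18, 27]
    shape = [1, 3, 2]  # low, high, middle
    print(opm_naive(prices, shape))     # [0, 2]
    print(opm_filtered(prices, shape))  # [0, 2]
```

In this example the windows `[10, 22, 15]` and `[15, 30, 20]` both follow the low, high, middle shape of the pattern, even though their values differ. A window such as `[20, 30, 10]` passes the up/down filter but fails verification, which is why the verification step is needed.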
Translated title of the contribution: Algorithms for Order-Preserving Matching
Original language: English
Qualification: Doctor's degree
Awarding Institution
  • Aalto University
Supervisors/Advisors
  • Tarhio, Jorma, Supervising Professor
  • Tarhio, Jorma, Thesis Advisor
Publisher
Print ISBNs: 978-952-60-6828-2
Electronic ISBNs: 978-952-60-6829-9
Publication status: Published - 2016
MoE publication type: G5 Doctoral dissertation (article)

Keywords

  • string matching
  • indexing
  • SIMD
  • filtration
