Abstract
Combinatorial problems are everywhere, starting from operations-research problems such as finding connectivity for efficient network design, placement of facilities such that the travelling distance from clients to facilities is minimized, detecting patterns in database transactions and genetic material, contact tracing in epidemic- propagation networks, and efficient advertising in social media, among others. Many of these aforementioned problems are known to be computationally intractable, meaning that, no efficient solutions are known to solve these problems. Despite being intractable, these problems have enormous impact on the society and our day-to-day activities, which prompts us to find effective solutions that can practically solve large instances of these real-world problem scenarios.
A broad topic of this thesis is to design algorithms that can practically solve a set of computationally-intractable problems under restrictive settings, using mathematical abstractions and computational tools. In particular, the problems of our interest include pattern detection in graphs and data clustering with fairness constraints. Pattern detection in graphs is the problem of deciding the existence of a smaller subgraph in a usually larger graph. The problem has applications in understanding complex biological and metabolic systems, discovering controversies in social media discussions, studying epidemic propagation in contact networks, among others. Clustering with fairness constraints is the problem of grouping similar objects together in to clusters, while simultaneously taking a fairness notion into consideration. The fairness notion arise from different societal norms, including, avoiding discrimination based on gender, race, ethnicity and financial status, among others.
This thesis is organized around many algorithmic techniques and computational tools for the design of algorithms that can practically solve large instances of intractable problems. These computational tools include, parameterized algorithms, approximation algorithms, heuristics, and implementation engineering. We introduce novel-problem formulations in the field of pattern detection and data clustering, analyze the computational complexity of problems that we introduce, and present algorithmic solutions that can solve these problems under restrictive settings, that is, when certain parameters of the problem are small. In cases where it is not possible to design efficient algorithmic solutions, we present careful validations in the form of computational-complexity results, lower bound on the running time, and lower bound on the approximation factor. Finally, we present carefully engineered implementation of our proposed algorithmic solutions, and validate our scalability claims using both synthetic and real-world datasets.
Translated title of the contribution | Scalable algorithm designs for mining massive datasets |
---|---|
Original language | English |
Qualification | Doctor's degree |
Awarding Institution |
|
Supervisors/Advisors |
|
Publisher | |
Print ISBNs | 978-952-64-0942-9 |
Electronic ISBNs | 978-952-64-0943-6 |
Publication status | Published - 2022 |
MoE publication type | G5 Doctoral dissertation (article) |
Keywords
- algorithmic bias
- algorithmic fairness
- algebraic fingerprinting
- combinatorial optimization
- parameterized algorithms
- scalable algorithms