Abstract
Minimum flow decomposition (MFD) is a common problem across various fields of Computer Science, where a flow is decomposed into a minimum set of weighted paths. However, in Bioinformatics applications, such as RNA transcript or quasi-species assembly, the flow is erroneous since it is obtained from noisy read coverages. Typical generalizations of the MFD problem to handle errors are based on least-squares formulations or modelling the erroneous flow values as ranges. All of these are thus focused on error handling at the level of individual edges. In this paper, we interpret the flow decomposition problem as a robust optimization problem and lift error-handling from individual edges to solution paths. As such, we introduce a new minimum path-error flow decomposition problem, for which we give an Integer Linear Programming formulation. Our experimental results reveal that our formulation can account for errors significantly better, by lowering the inaccuracy rate by 30-50% compared to previous error-handling formulations, with computational requirements that remain practical.
Original language | English |
---|---|
Pages (from-to) | 1-14 |
Number of pages | 14 |
Journal | IEEE/ACM Transactions on Computational Biology and Bioinformatics |
DOIs | |
Publication status | E-pub ahead of print - 2024 |
MoE publication type | A1 Journal article-refereed |
Keywords
- Flow decomposition
- integer programming
- network flow
- uncertainty