Abstract
Suffix trees provide for efficient indexing of numerous sequence processing problems in biological databases. We address the pivotal issue of improving the search efficiency of disk-resident suffix trees by improving the storage layout from a statistical learning viewpoint. In particular, we make the following contributions: we (a) introduce the Q-Optimal Disk Layout(Q-OptDL) problem in the context of suffix trees and prove it to be NP-Hard, and (b) propose an algorithm for improving the layout of suffix trees that is guaranteed to perform asymptotically no worse than twice the optimal disk layout.
Original language | English |
---|---|
Title of host publication | Simulated Evolution and Learning - 8th International Conference, SEAL 2010, Proceedings |
Pages | 711-715 |
Number of pages | 5 |
DOIs | |
Publication status | Published - 2010 |
MoE publication type | A4 Conference publication |
Event | International Conference on Simulated Evolution and Learning - Kanpur, India Duration: 1 Dec 2010 → 4 Dec 2010 Conference number: 8 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 6457 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | International Conference on Simulated Evolution and Learning |
---|---|
Abbreviated title | SEAL |
Country/Territory | India |
City | Kanpur |
Period | 01/12/2010 → 04/12/2010 |
Keywords
- 0/1 Knapsack
- Statistical Learning
- Suffix Trees