Toward optimal disk layout of genome scale suffix trees

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

Abstract

Suffix trees provide efficient indexing for numerous sequence processing problems in biological databases. We address the pivotal issue of improving the search efficiency of disk-resident suffix trees by improving their storage layout from a statistical learning viewpoint. In particular, we make the following contributions: we (a) introduce the Q-Optimal Disk Layout (Q-OptDL) problem in the context of suffix trees and prove it to be NP-hard, and (b) propose an algorithm for improving the layout of suffix trees that is guaranteed to perform asymptotically no worse than twice the optimal disk layout.

Original language: English
Title of host publication: Simulated Evolution and Learning - 8th International Conference, SEAL 2010, Proceedings
Pages: 711-715
Number of pages: 5
Publication status: Published - 2010
MoE publication type: A4 Conference publication
Event: International Conference on Simulated Evolution and Learning - Kanpur, India
Duration: 1 Dec 2010 to 4 Dec 2010
Conference number: 8

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6457 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: International Conference on Simulated Evolution and Learning
Abbreviated title: SEAL
Country/Territory: India
City: Kanpur
Period: 01/12/2010 to 04/12/2010

Keywords

  • 0/1 Knapsack
  • Statistical Learning
  • Suffix Trees
