Efficient Methods on Reducing Data Redundancy in the Internet

Research output: Thesis / Doctoral Thesis / Collection of Articles


  • Sumanta Saha


The transformation of the Internet from a client-server paradigm to a content-based one has left many fundamental network designs outdated. The growth of user-generated content, instant sharing, flash popularity, and similar trends calls for an Internet that is designed for them and can meet the needs of small-scale content providers. Today's Internet carries and stores a large amount of duplicate, redundant data, primarily because it lacks duplicate detection mechanisms and caching principles. This redundancy costs the network in several ways: it consumes energy in the network elements that must process the extra data; it makes network caches store duplicate data, which causes the tail of the data distribution to be evicted from the caches; and it increases the load on content servers, which must always serve the less popular content themselves. In this dissertation, we have analyzed these phenomena and proposed several methods to reduce redundancy in the network at low cost. The proposals take different approaches, including data-chunk-level redundancy detection and elimination, rerouting-based caching mechanisms in information-centric networks, and energy-aware content distribution techniques. Using these approaches, we have demonstrated that redundancy elimination can be performed with low overhead and low processing power. We have also demonstrated that local or global cooperation methods can increase the storage efficiency of existing caches many-fold. In addition, this work shows that collaborative content download mechanisms can remove a sizable amount of traffic from the core network while simultaneously reducing the energy consumption of client devices.


Original language: English
Qualification: Doctor's degree
Awarding Institution
  • Aalto University
Print ISBNs: 978-952-60-6421-5
Electronic ISBNs: 978-952-60-6422-2
Publication status: Published - 2015
MoE publication type: G5 Doctoral dissertation (article)

Research areas

  • cache, redundancy, energy, ICN, Internet

ID: 18374884