Exploiting data reduction principles in cloud-based data management for cryo-image data

Kashish Ara Shakil, Mansaf Alam, Shabih Shakeel, Ari Ora, Samiya Khan

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Scientific; peer-review

2 Citations (Scopus)


Cloud computing is a cost-effective way for start-up life sciences laboratories to store and manage their data. However, in many instances the data stored in the cloud could be redundant, which makes cloud-based data management inefficient and costly because one has to pay for every byte stored. Here, we tested efficient management of data generated by an electron cryo-microscopy (cryoEM) lab in a cloud-based environment. The test data were obtained from the cryoEM repository EMPIAR. All the images were subjected to an in-house parallelized version of principal component analysis (PCA), with an efficient cloud-based MapReduce modality used for parallelization. We showed that large data sets on the order of terabytes could be efficiently reduced to their minimal essential form in a cost-effective, scalable manner. Furthermore, using a spot instance on Amazon EC2 was shown to reduce costs by about 27 percent. This approach could be scaled to large data of any volume and type.
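
The idea of parallelizing PCA with a map and a reduce step can be illustrated with a minimal sketch. This is not the authors' in-house implementation, which is not described in this record. It assumes each image has been flattened to a short numeric vector, uses plain Python lists in place of a real MapReduce cluster, and all function names (map_chunk, combine, covariance, leading_component) are hypothetical. Each map call summarizes one chunk of images as a count, a sum, and a sum of outer products. Because these statistics are additive, the reduce step can merge chunks in any order before the covariance matrix and its leading principal direction are computed.

```python
from functools import reduce
import math


def map_chunk(rows):
    """Map step (hypothetical): summarize one chunk of flattened images
    as (count, per-feature sum, sum of outer products)."""
    d = len(rows[0])
    total = [0.0] * d
    outer = [[0.0] * d for _ in range(d)]
    for x in rows:
        for i in range(d):
            total[i] += x[i]
            for j in range(d):
                outer[i][j] += x[i] * x[j]
    return len(rows), total, outer


def combine(a, b):
    """Reduce step: the partial statistics are additive, so two
    summaries merge by element-wise addition."""
    n1, s1, o1 = a
    n2, s2, o2 = b
    s = [u + v for u, v in zip(s1, s2)]
    o = [[u + v for u, v in zip(r1, r2)] for r1, r2 in zip(o1, o2)]
    return n1 + n2, s, o


def covariance(stats):
    """Sample covariance matrix from the merged statistics."""
    n, s, o = stats
    d = len(s)
    mean = [v / n for v in s]
    return [[(o[i][j] - n * mean[i] * mean[j]) / (n - 1) for j in range(d)]
            for i in range(d)]


def leading_component(cov, iters=200):
    """Power iteration for the top principal direction (unit vector).
    Assumes the uniform start vector is not orthogonal to that direction."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


# Toy data: two chunks of 2-feature "images" lying on the line y = 2x.
chunks = [[[1.0, 2.0], [2.0, 4.0]], [[3.0, 6.0], [4.0, 8.0]]]
stats = reduce(combine, map(map_chunk, chunks))
pc1 = leading_component(covariance(stats))
```

In this toy example the leading direction is proportional to (1, 2), so projecting each point onto pc1 keeps all of the variance in a single coordinate. For real image data the outer-product matrix grows with the square of the feature count, so a practical pipeline would need a different summary or a prior dimensionality cut; that design choice is outside what this record states.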

Original language: English
Title of host publication: Proceedings of the 2018 International Conference on Computers in Management and Business, ICCMB 2018
Number of pages: 6
ISBN (Print): 9781450364232
Publication status: Published - 25 May 2018
MoE publication type: A4 Article in a conference publication
Event: International Conference on Computers in Management and Business - Oxford, United Kingdom
Duration: 25 May 2018 to 27 May 2018


Conference: International Conference on Computers in Management and Business
Abbreviated title: ICCMB
Country: United Kingdom


  • Big Data
  • Cloud Computing
  • Cryo-image data
  • Data reduction
  • PCA
