Highly scalable parallel collaborative filtering algorithm

Ankur Narang*, Raj Gupta, Anupam Joshi, Vikas K. Garg

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Scientific › peer-review

12 Citations (Scopus)


Collaborative filtering (CF) based recommender systems have gained wide popularity in Internet companies like Amazon, Netflix, Google News, and others. These systems make automatic predictions about the interests of a user by inferring from information about like-minded users. Real-time CF on highly sparse massive datasets, while achieving a high prediction accuracy, is a computationally challenging problem. In this paper, we present the design of a soft real-time (around 1 min.) parallel CF algorithm based on the Concept Decomposition technique [1]. Our parallel algorithm has been optimized for multicore/many-core architectures while maintaining a prediction accuracy of 0.84 RMSE. Using the Netflix dataset, we demonstrate the performance and scalability of our algorithm (in both batch mode and online mode) on a 32-core Power6 based SMP system. Our parallel algorithm delivered a training time of 64 s on the full Netflix dataset and a prediction time of 4.5 s on 1.4M ratings (3.2 μs per rating prediction). This is 12.6x better than the best known sequential training time and around 33x better than the best known sequential prediction time [2], along with high accuracy (0.84 RMSE). To the best of our knowledge, this is also the best known parallel performance at such high accuracy.
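The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' code: the function names are hypothetical, the implied sequential times are derived here from the stated speedup ratios rather than reported in the abstract, and the small rating lists passed to `rmse` are made-up values that only show how the accuracy metric is computed.

```python
import math


def per_rating_latency_us(total_seconds, num_ratings):
    """Average prediction latency in microseconds per rating."""
    return total_seconds / num_ratings * 1e6


def implied_sequential_seconds(parallel_seconds, speedup):
    """Sequential time implied by a parallel time and a speedup ratio.

    Derived quantity (assumption): the abstract gives only the ratios.
    """
    return parallel_seconds * speedup


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(predicted))


# 4.5 s for 1.4M predictions, as stated in the abstract.
latency_us = per_rating_latency_us(4.5, 1.4e6)

# Sequential baselines implied by the 12.6x and (approximately) 33x ratios.
seq_training_s = implied_sequential_seconds(64, 12.6)
seq_prediction_s = implied_sequential_seconds(4.5, 33)

# Toy ratings (made up) to illustrate the RMSE metric itself.
toy_rmse = rmse([1.0, 2.0], [2.0, 3.0])

print(f"latency per rating: {latency_us:.2f} us")
print(f"implied sequential training time: {seq_training_s:.1f} s")
print(f"implied sequential prediction time: {seq_prediction_s:.1f} s")
print(f"toy RMSE: {toy_rmse:.2f}")
```

The per-rating latency works out to roughly 3.2 μs, which agrees with the value in the abstract. Because the prediction speedup is given as "around 33x", the implied sequential prediction time is an estimate.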

Original language: English
Title of host publication: 17th International Conference on High Performance Computing, HiPC 2010
Publication status: Published - 2010
MoE publication type: A4 Conference publication
Event: International Conference on High Performance Computing - Goa, India
Duration: 19 Dec 2010 to 22 Dec 2010
Conference number: 17


Conference: International Conference on High Performance Computing
Abbreviated title: HiPC

