Exact diagonalization of quantum lattice models on coprocessors

Research output: Journal article, peer-reviewed

Researchers

  • T. Siro
  • A. Harju

Description

We implement the Lanczos algorithm on an Intel Xeon Phi coprocessor and compare its performance to a multi-core Intel Xeon CPU and an NVIDIA graphics processor. The Xeon and Xeon Phi implementations are parallelized with OpenMP, and the graphics processor is programmed with CUDA. Performance is evaluated by measuring the execution time of a single step of the Lanczos algorithm. We study two quantum lattice models with different particle numbers and conclude that for small systems the multi-core CPU is the fastest platform, while for large systems the graphics processor is the clear winner, reaching speedups of up to 7.6 compared to the CPU. The Xeon Phi outperforms the CPU for sufficiently large particle numbers, reaching a speedup of 2.5.
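
Since the benchmarked unit is a single Lanczos step, a minimal sketch of one such step may help make it concrete. This is not the authors' OpenMP or CUDA code: it is a serial, pure-Python illustration, and the names `hop_matvec` and `lanczos_step`, as well as the choice of a single particle hopping on an open chain as the lattice model, are assumptions made here for illustration only.

```python
import math


def hop_matvec(v, t=1.0):
    """Hypothetical example Hamiltonian: one particle on an open chain,
    H = -t * sum_i (|i><i+1| + h.c.), applied to the vector v."""
    n = len(v)
    out = [0.0] * n
    for i in range(n):
        if i > 0:
            out[i] -= t * v[i - 1]
        if i < n - 1:
            out[i] -= t * v[i + 1]
    return out


def lanczos_step(matvec, v_prev, v_curr, beta_prev):
    """One step of the standard three-term Lanczos recurrence.

    Returns (alpha, beta, v_next), where alpha and beta are the new
    diagonal and off-diagonal entries of the tridiagonal matrix.
    """
    # The matrix-vector product is the dominant cost of a step.
    w = matvec(v_curr)
    alpha = sum(a * b for a, b in zip(v_curr, w))
    # Orthogonalize against the two most recent Lanczos vectors.
    w = [wi - alpha * ci - beta_prev * pi
         for wi, ci, pi in zip(w, v_curr, v_prev)]
    beta = math.sqrt(sum(x * x for x in w))
    # beta == 0 means the Krylov space is exhausted; return w unnormalized.
    v_next = [x / beta for x in w] if beta > 0.0 else w
    return alpha, beta, v_next


# First step on a 4-site chain, starting from a particle on site 0.
n = 4
v0 = [0.0] * n                 # conventional zero vector before the start
v1 = [1.0, 0.0, 0.0, 0.0]      # normalized starting vector
alpha, beta, v2 = lanczos_step(hop_matvec, v0, v1, 0.0)
```

In a parallel implementation the matrix-vector product and the vector reductions inside such a step are the natural places to distribute work across threads, which is why timing one step is a reasonable proxy for the whole algorithm.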

Details

Original language: English
Pages: 274-281
Number of pages: 8
Journal: Computer Physics Communications
Volume: 207
Status: Published - 1 October 2016
OKM publication type: A1 Published article, refereed

ID: 9102840