Computationally efficient joint species distribution modeling of big spatial data

Gleb Tikhonov*, Li Duan, Nerea Abrego, Graeme Newell, Matt White, David Dunson, Otso Ovaskainen

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review



The ongoing global change and the increased interest in macroecological processes call for the analysis of spatially extensive data on species communities to understand and forecast distributional changes of biodiversity. Recently developed joint species distribution models can deal with numerous species efficiently, while explicitly accounting for spatial structure in the data. However, their applicability is generally limited to relatively small spatial data sets because of their severe computational scaling as the number of spatial locations increases. In this work, we propose a practical alleviation of this scalability constraint for joint species modeling by exploiting two spatial-statistics techniques that facilitate the analysis of large spatial data sets: Gaussian predictive process and nearest-neighbor Gaussian process. We devised an efficient Gibbs posterior sampling algorithm for Bayesian model fitting that allows us to analyze community data sets consisting of hundreds of species sampled from up to hundreds of thousands of spatial units. The performance of these methods is demonstrated using an extensive plant data set of 30,955 spatial units as a case study. We provide an implementation of the presented methods as an extension to the hierarchical modeling of species communities framework.
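To make the scalability idea concrete, the following is a minimal sketch of a Gaussian predictive process for a single spatial latent factor. It is an illustration under stated assumptions, not the authors' implementation: the exponential covariance function, the knot grid, the spatial scale `alpha`, the jitter term, and the function names `exp_cov` and `gpp_sample` are all hypothetical choices made for this example, and the plain low-rank form shown here omits any variance correction. The point is only that the expensive factorization involves the m knots rather than all n sampling units.

```python
import numpy as np


def exp_cov(a, b, alpha):
    """Exponential covariance exp(-d / alpha) between two sets of 2-D points."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return np.exp(-d / alpha)


def gpp_sample(locs, knots, alpha, rng, jitter=1e-8):
    """Draw one latent field at `locs` from a predictive-process approximation."""
    m = len(knots)
    k_mm = exp_cov(knots, knots, alpha) + jitter * np.eye(m)
    k_nm = exp_cov(locs, knots, alpha)
    # Sample the field at the m knots only: an m x m Cholesky factorization
    # replaces the n x n factorization of the full Gaussian process.
    w_knots = np.linalg.cholesky(k_mm) @ rng.standard_normal(m)
    # Project the knot values to all n locations (conditional mean given knots).
    return k_nm @ np.linalg.solve(k_mm, w_knots)


rng = np.random.default_rng(0)
# Synthetic data: n = 2000 sampling units in the unit square (assumption).
locs = rng.uniform(0.0, 1.0, size=(2000, 2))
# m = 25 knots on a regular 5 x 5 grid (assumption).
gx, gy = np.meshgrid(np.linspace(0.0, 1.0, 5), np.linspace(0.0, 1.0, 5))
knots = np.column_stack([gx.ravel(), gy.ravel()])

eta = gpp_sample(locs, knots, alpha=0.3, rng=rng)
print(eta.shape)
```

With n locations and m knots, the dominant cost in this sketch is the m x m factorization plus an n x m projection, so it stays cheap as n grows provided m is kept small. The nearest-neighbor Gaussian process takes a different route, conditioning each location on a small set of neighbors, and is not shown here.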

Original language: English
Article number: 02929
Number of pages: 8
Issue number: 2
Early online date: 2019
Publication status: Published - 1 Feb 2020
MoE publication type: A1 Journal article-refereed


  • community modeling
  • ecological communities
  • Gaussian process
  • hierarchical modeling of species communities
  • joint species distribution model
  • latent factors
  • spatial statistics

