TY - JOUR
T1 - Structure discovery in Bayesian networks by sampling partial orders
AU - Niinimäki, Teppo
AU - Parviainen, Pekka
AU - Koivisto, Mikko
PY - 2016/4/1
Y1 - 2016/4/1
N2 - We present methods based on Metropolis-coupled Markov chain Monte Carlo (MC3) and annealed importance sampling (AIS) for estimating the posterior distribution of Bayesian networks. The methods draw samples from an appropriate distribution of partial orders on the nodes, continued by sampling directed acyclic graphs (DAGs) conditionally on the sampled partial orders. We show that the computations needed for the sampling algorithms are feasible as long as the encountered partial orders have relatively few down-sets. While the algorithms assume suitable modularity properties of the priors, arbitrary priors can be handled by dividing the importance weight of each sampled DAG by the number of topological sorts it has - we give a practical dynamic programming algorithm to compute these numbers. Our empirical results demonstrate that the presented partial-order-based samplers are superior to previous Markov chain Monte Carlo methods, which sample DAGs either directly or via linear orders on the nodes. The results also suggest that the convergence rate of the estimators based on AIS are competitive to those of MC3. Thus AIS is the preferred method, as it enables easier large-scale parallelization and, in addition, supplies good probabilistic lower bound guarantees for the marginal likelihood of the model.
AB - We present methods based on Metropolis-coupled Markov chain Monte Carlo (MC3) and annealed importance sampling (AIS) for estimating the posterior distribution of Bayesian networks. The methods draw samples from an appropriate distribution of partial orders on the nodes, continued by sampling directed acyclic graphs (DAGs) conditionally on the sampled partial orders. We show that the computations needed for the sampling algorithms are feasible as long as the encountered partial orders have relatively few down-sets. While the algorithms assume suitable modularity properties of the priors, arbitrary priors can be handled by dividing the importance weight of each sampled DAG by the number of topological sorts it has - we give a practical dynamic programming algorithm to compute these numbers. Our empirical results demonstrate that the presented partial-order-based samplers are superior to previous Markov chain Monte Carlo methods, which sample DAGs either directly or via linear orders on the nodes. The results also suggest that the convergence rate of the estimators based on AIS are competitive to those of MC3. Thus AIS is the preferred method, as it enables easier large-scale parallelization and, in addition, supplies good probabilistic lower bound guarantees for the marginal likelihood of the model.
KW - Annealed importance sampling
KW - Directed acyclic graph
KW - Fast zeta transform
KW - Linear extension
KW - Markov chain Monte Carlo
UR - http://www.scopus.com/inward/record.url?scp=84979920501&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:84979920501
SN - 1532-4435
VL - 17
JO - Journal of Machine Learning Research
JF - Journal of Machine Learning Research
ER -