Abstract
Biclustering is the unsupervised learning task of mining a data matrix for useful submatrices, for instance groups of genes that are co-expressed under particular biological conditions. As these submatrices are expected to partly overlap, a significant challenge in biclustering is to develop methods that are able to detect overlapping biclusters. The authors propose a probabilistic mixture modelling framework for biclustering biological data that lends itself to various data types and allows biclusters to overlap. Their framework is akin to the latent feature and mixture-of-experts model families, with inference and parameter estimation being performed via a variational expectation-maximization algorithm. The model compares favorably with competing approaches, both in a binary DNA copy number variation data set and in a miRNA expression data set, indicating that it may potentially be used as a general-problem solving tool in biclustering.
Original language | English |
---|---|
Journal | International Journal of Knowledge Discovery in Bioinformatics |
Volume | 6 |
Issue number | 2 |
DOIs | |
Publication status | Published - 2016 |
MoE publication type | A1 Journal article-refereed |