TY - JOUR
T1 - CopyMix: Mixture model based single-cell clustering and copy number profiling using variational inference
AU - Safinianaini, Negar
AU - De Souza, Camila P.E.
AU - Roth, Andrew
AU - Koptagel, Hazal
AU - Toosi, Hosein
AU - Lagergren, Jens
N1 - Publisher Copyright: © 2024 The Authors
PY - 2024/12
Y1 - 2024/12
AB - Investigating tumor heterogeneity with single-cell sequencing technologies is essential to understanding how tumors evolve, since each cell subpopulation harbors a unique set of genomic features that yields a unique phenotype, which is bound to have clinical relevance. Clustering cells based on copy number data obtained from single-cell DNA sequencing provides an opportunity to identify different tumor cell subpopulations. Accordingly, computational methods have emerged for single-cell copy number profiling and clustering; however, these two tasks have been handled sequentially, with various ad hoc pre- and post-processing steps, which makes the procedure vulnerable to clustering artifacts. Our method, CopyMix, avoids these artifacts by applying variational inference to a novel mixture model that jointly infers cell clusters and their underlying copy number profiles. Our probabilistic graphical model is an improved version of the mixture of hidden Markov models, designed specifically for single-cell copy number profiling and clustering. For the evaluation, we used the likelihood-ratio test, the CH index, the Silhouette score, the V-measure, and total variation scores. CopyMix performs well on both biological and simulated data. These favorable results indicate considerable potential for clinical impact from using CopyMix in studies of tumor heterogeneity in cancer.
KW - Cancer
KW - Copy number profiling
KW - Mixture models
KW - Single-cell
KW - Tumor clonal decomposition
KW - Variational inference
UR - http://www.scopus.com/inward/record.url?scp=85208042394&partnerID=8YFLogxK
U2 - 10.1016/j.compbiolchem.2024.108257
DO - 10.1016/j.compbiolchem.2024.108257
M3 - Article
AN - SCOPUS:85208042394
SN - 1476-9271
VL - 113
SP - 1
EP - 17
JO - Computational Biology and Chemistry
JF - Computational Biology and Chemistry
M1 - 108257
ER -