Identification of DNA N4-methylcytosine sites via multi-view kernel sparse representation model

Chengwei Ai, Prayag Tiwari*, Hongpeng Yang, Yijie Ding, Jijun Tang, Fei Guo

*Corresponding author for this work

Research output: Contribution to journal › Article › Scientific › peer-review

1 Citation (Scopus)


Identifying DNA N4-methylcytosine (4mC) sites is of great significance for biological research into chromatin structure, DNA stability, DNA-protein interaction, and the control of gene expression. However, identifying 4mC sites with traditional sequencing technology is very time-consuming. To detect 4mC sites more effectively, we develop a multi-view learning method that merges multiple feature spaces. Furthermore, we examine whether multi-view learning can improve cross-species classification by fusing data from multiple species. In our study, we propose a multi-view Laplacian kernel sparse representation-based classifier, called MvLapKSRC-HSIC. First, we use three feature extraction methods (PSTNP, NCP, DPP) to extract DNA sequence features. MvLapKSRC-HSIC uses a kernel sparse representation-based classifier with graph regularization. To maintain independence between the views, we add a multi-view regularization term constructed from the Hilbert-Schmidt independence criterion (HSIC). In the experiments, MvLapKSRC-HSIC is applied to six datasets and compared with other popular methods in single-species and cross-species settings. All experimental results show that MvLapKSRC-HSIC outperforms the other methods in both single-species and cross-species experiments. Importantly, MvLapKSRC-HSIC can identify a series of potential DNA 4mC sites in multiple species that have not yet been experimentally evaluated and merit further research.
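
The abstract names HSIC as the basis of the multi-view regularization term but does not give its formula. As a rough illustration only, the sketch below computes the standard biased empirical HSIC estimator, trace(K H L H) / (n - 1)^2 with H = I - (1/n) 1 1^T, between two Gram matrices. The function names (`rbf_kernel`, `center`, `hsic`), the RBF kernel choice, and the toy data are assumptions made for this example; they are not taken from the paper, and the way MvLapKSRC-HSIC incorporates the term into its objective may differ.

```python
import math


def rbf_kernel(X, gamma=1.0):
    """Gram matrix K[i][j] = exp(-gamma * ||x_i - x_j||^2) (kernel choice is an assumption)."""
    n = len(X)
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(X[i], X[j])))
         for j in range(n)]
        for i in range(n)
    ]


def center(K):
    """Double-center a Gram matrix, i.e. compute H K H with H = I - (1/n) 1 1^T."""
    n = len(K)
    row_mean = [sum(r) / n for r in K]
    col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_mean) / n
    return [
        [K[i][j] - row_mean[i] - col_mean[j] + total_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(K, L):
    """Biased empirical HSIC between two n x n Gram matrices."""
    n = len(K)
    Kc, Lc = center(K), center(L)
    # trace(Kc @ Lc) equals trace(K H L H) because H is idempotent.
    trace = sum(Kc[i][j] * Lc[j][i] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2


# Toy "views" of the same four samples (hypothetical data, not from the paper).
view_a = [[0.0], [1.0], [2.0], [3.0]]
view_b = [[0.1], [0.9], [2.2], [2.8]]
dependence = hsic(rbf_kernel(view_a), rbf_kernel(view_b))
```

A larger value indicates stronger statistical dependence between the two views, and a value near zero indicates near independence under the chosen kernels. A regularizer built on this quantity would therefore penalize views that carry redundant information.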
Original language: English
Article number: 9809784
Number of pages: 10
Journal: IEEE Transactions on Artificial Intelligence
Publication status: E-pub ahead of print - Jan 2022
MoE publication type: A1 Journal article-refereed


  • DNA
  • Kernel
  • Feature extraction
  • Training
  • Laplace equations
  • Learning systems
  • Support vector machines

