Abstract
Identifying DNA N4-methylcytosine (4mC) sites is of great significance in biological research on topics such as chromatin structure, DNA stability, DNA-protein interaction, and the control of gene expression. However, identifying 4mC sites with traditional sequencing technology is very time-consuming. To detect 4mC sites more effectively, we develop a multi-view learning method that merges multiple feature spaces. Furthermore, we examine whether multi-view learning can improve cross-species classification by fusing data from multiple species. In our study, we propose a multi-view Laplacian kernel sparse representation-based classifier, called MvLapKSRC-HSIC. First, we use three feature extraction methods (PSTNP, NCP, DPP) to extract DNA sequence features. MvLapKSRC-HSIC uses a kernel sparse representation-based classifier with graph regularization. To maintain independence between the views, we add a multi-view regularization term constructed from the Hilbert-Schmidt independence criterion (HSIC). In the experiments, MvLapKSRC-HSIC is applied to six datasets and compared with other popular methods in single-species and cross-species settings. All experimental results show that MvLapKSRC-HSIC outperforms the other methods in both single-species and cross-species experiments. Importantly, MvLapKSRC-HSIC can identify a series of potential DNA 4mC sites that have not yet been experimentally evaluated on multiple species and merit further research.
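The abstract names the Hilbert-Schmidt independence criterion as the basis of the multi-view regularization term. As an illustration only, the following minimal sketch computes the standard biased empirical HSIC estimate, tr(KHLH)/(n-1)^2, between the Gram matrices of two views. The function names (`rbf_kernel`, `center`, `hsic`), the RBF kernel, and the toy data are assumptions made for this example; how MvLapKSRC-HSIC builds its regularizer from HSIC is not described in this record.

```python
import math


def rbf_kernel(X, gamma=1.0):
    """Gram matrix with K[i][j] = exp(-gamma * ||x_i - x_j||^2).

    The RBF kernel is an illustrative choice, not taken from the paper.
    """
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, xj))) for xj in X]
        for xi in X
    ]


def center(K):
    """Double-center a Gram matrix: H K H with H = I - (1/n) 1 1^T."""
    n = len(K)
    row_mean = [sum(r) / n for r in K]
    col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_mean) / n
    return [
        [K[i][j] - row_mean[i] - col_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(K, L):
    """Biased empirical HSIC: tr(K H L H) / (n - 1)^2.

    Because H is idempotent, tr(K H L H) = tr((HKH)(HLH)), which for
    symmetric matrices equals the sum of element-wise products.
    """
    n = len(K)
    Kc, Lc = center(K), center(L)
    trace = sum(Kc[i][j] * Lc[i][j] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2


# Toy data: two hypothetical "views" of the same four samples.
view_a = [[0.0], [1.0], [2.0], [3.0]]
view_b = [[0.0], [2.0], [4.0], [6.0]]
view_const = [[5.0], [5.0], [5.0], [5.0]]  # carries no information

K_a, K_b, K_c = rbf_kernel(view_a), rbf_kernel(view_b), rbf_kernel(view_const)
dependent = hsic(K_a, K_b)    # positive: the views are related
independent = hsic(K_a, K_c)  # zero: a constant view centers to all zeros
```

A larger value indicates stronger statistical dependence between the two views, so a regularizer of this kind can be used to compare how much information different views share.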
Original language | English |
---|---|
Article number | 9809784 |
Pages (from-to) | 1236-1245 |
Number of pages | 10 |
Journal | IEEE Transactions on Artificial Intelligence |
Volume | 4 |
Issue number | 5 |
Early online date | Jan 2022 |
DOIs | |
Publication status | Published - Oct 2023 |
MoE publication type | A1 Journal article-refereed |
Keywords
- DNA
- Kernel
- Feature extraction
- Training
- Laplace equations
- Learning systems
- Support vector machines
Fingerprint
Dive into the research topics of 'Identification of DNA N4-methylcytosine sites via multi-view kernel sparse representation model'. Together they form a unique fingerprint.
- INTERVENE: International consortium for integrative genomics prediction
  01/01/2021 → 31/12/2025
  Project: EU: Framework programmes funding
- DATALIT: Data Literacy for Responsible Decision-Making
  Marttinen, P., Ji, S., Gröhn, T., Honkamaa, J., Kumar, Y., Pöllänen, A., Ojala, F., Raj, V. & Tiwari, P.
  01/10/2020 → 30/09/2023
  Project: Academy of Finland: Strategic research funding
- eMOM: CleverHealth Network: eMOM GDM -Project
  Marttinen, P., Alizadeh Ashrafi, R., Hizli, C. & Zhang, G.
  05/02/2018 → 31/01/2023
  Project: Business Finland: Other research funding