Abstract
DNA-binding proteins (DBPs) play a central role in many basic cellular processes. Experimental methods for identifying DBPs are costly and time-consuming. To handle large-scale DBP identification tasks, a variety of computational methods have been developed. Inspired by previous work, we propose a multiple Laplacian regularized support vector machine with local behavior similarity (MLapSVM-LBS) to predict DBPs. We serially combine three features extracted from protein sequences (PsePSSM, GE and NMBAC) and feed them into MLapSVM-LBS. Based on human behavior learning theory, MLapSVM-LBS can better represent the relationship between samples through local behavior similarity. We introduce a new edge weight calculation method that takes label information into consideration. In addition, a local distribution parameter reflecting the underlying probability distribution of a sample's neighborhood is also employed. To further improve the robustness of the model, we use multiple Laplacian regularization to build a multigraph model in which five Laplacian graphs are constructed with local behavior similarity by varying the neighborhood size. To evaluate the performance of our model, MLapSVM-LBS is trained and tested on the PDB186, PDB1075, PDB2272 and PDB14189 datasets. On two independent test sets (PDB186 and PDB2272), our method reaches accuracies of 0.887 and 0.712, respectively. The good results on both datasets demonstrate the reliable performance of our model.
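The multigraph idea in the abstract (serially combined features, several neighborhood graphs of different sizes, one combined Laplacian regularizer) can be illustrated with a minimal sketch. It is not the authors' implementation: the edge weights here are plain heat-kernel weights on a k-nearest-neighbor graph, standing in for the paper's local behavior similarity, and the random feature blocks, the neighborhood sizes `ks`, the uniform graph weights, and the function names are all assumptions made for illustration.

```python
import numpy as np


def knn_laplacian(X, k):
    """Unnormalized Laplacian L = D - W of a symmetrized k-NN graph.

    Assumption: heat-kernel edge weights, NOT the paper's
    label-aware local behavior similarity.
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    np.fill_diagonal(d2, np.inf)  # a sample is never its own neighbor
    # Kernel bandwidth: median of the off-diagonal squared distances.
    sigma2 = np.median(d2[np.isfinite(d2)])
    # Indices of the k nearest neighbors of every sample.
    nn = np.argsort(d2, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = nn.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / (2.0 * sigma2))
    W = np.maximum(W, W.T)  # make the graph undirected
    return np.diag(W.sum(axis=1)) - W


def multi_graph_laplacian(X, ks=(3, 5, 7, 9, 11), weights=None):
    """Weighted sum of k-NN Laplacians, one per neighborhood size."""
    laps = [knn_laplacian(X, k) for k in ks]
    if weights is None:
        # Assumption: every graph contributes equally.
        weights = np.full(len(laps), 1.0 / len(laps))
    return sum(w * L for w, L in zip(weights, laps))


# Placeholder feature blocks standing in for PsePSSM, GE and NMBAC.
rng = np.random.default_rng(0)
f1 = rng.normal(size=(30, 4))
f2 = rng.normal(size=(30, 3))
f3 = rng.normal(size=(30, 5))
X = np.hstack([f1, f2, f3])  # serial (column-wise) combination

L = multi_graph_laplacian(X)  # five graphs, one combined Laplacian
```

In a Laplacian-regularized SVM, a matrix like `L` enters the objective through a manifold term of the form f^T L f, which penalizes decision values that differ between strongly connected samples. Combining several neighborhood sizes makes that penalty less sensitive to any single choice of k.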
Original language | English |
---|---|
Article number | 109174 |
Pages (from-to) | 1-8 |
Number of pages | 8 |
Journal | KNOWLEDGE-BASED SYSTEMS |
Volume | 250 |
Publication status | Published - 17 Aug 2022 |
MoE publication type | A1 Journal article-refereed |
Keywords
- DNA-binding proteins
- Laplacian support vector machine
- Multiple view
- Protein feature extraction
- Sequence classification
Fingerprint
Dive into the research topics of 'MLapSVM-LBS: Predicting DNA-binding proteins via a multiple Laplacian regularized support vector machine with local behavior similarity'. Together they form a unique fingerprint.
Projects
- INTERVENE: International consortium for integrative genomics prediction
  01/01/2021 → 31/12/2025
  Project: EU: Framework programmes funding
- DATALIT: Data Literacy for Responsible Decision-Making
  Marttinen, P., Ji, S., Gröhn, T., Honkamaa, J., Kumar, Y., Pöllänen, A., Ojala, F., Raj, V. & Tiwari, P.
  01/10/2020 → 30/09/2023
  Project: Academy of Finland: Strategic research funding
- eMOM: CleverHealth Network: eMOM GDM -Project
  Marttinen, P., Alizadeh Ashrafi, R., Hizli, C. & Zhang, G.
  05/02/2018 → 31/01/2023
  Project: Business Finland: Other research funding