Abstract
Identifying the genes associated with Parkinson's disease (PD) plays an important role in improving the diagnosis and treatment of the disease. A number of machine learning methods have been proposed to identify disease-related genes, but only a few of them have been applied to PD. This work puts forth a novel neural network-based ensemble (n-semble) method to identify Parkinson's disease genes. The artificial neural network is trained in a unique way to ensemble the predictions of multiple models. The proposed n-semble method is composed of four parts: (1) feature vectors are constructed from protein sequences using the physicochemical properties of amino acids; (2) dimensionality reduction is achieved using the t-Distributed Stochastic Neighbor Embedding (t-SNE) method; (3) the Jaccard method is applied to find likely negative samples among unknown (candidate) genes; and (4) gene prediction is performed with the n-semble method. The proposed method has been compared with Smalter's, ProDiGe, PUDI and EPU methods using various evaluation metrics. It outperforms these existing gene identification methods, achieving significantly higher precision, recall and F-score of 88.9%, 90.9% and 89.8%, respectively. The obtained results confirm the effectiveness and validity of the proposed framework.
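
Step (3) can be illustrated with a minimal sketch. The abstract does not specify how the Jaccard method is applied, so the following rests on assumptions: each gene is represented as a set of discrete features, and the candidates whose best Jaccard similarity to any known disease gene is lowest are treated as likely negatives. The function names, the parameter `k`, and the toy feature sets are hypothetical and are not taken from the paper.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of intersection over size of union.

    Defined here as 0.0 when both sets are empty.
    """
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def likely_negatives(positives: dict, candidates: dict, k: int) -> list:
    """Return the k candidate names least similar to the known positives.

    positives:  name -> feature set for known disease genes (assumed input)
    candidates: name -> feature set for unlabelled genes (assumed input)
    """
    # Score each candidate by its highest similarity to any positive gene.
    scores = {
        name: max((jaccard(feats, pos) for pos in positives.values()), default=0.0)
        for name, feats in candidates.items()
    }
    # Lowest best-match similarity first; ties broken by name for determinism.
    ranked = sorted(scores, key=lambda name: (scores[name], name))
    return ranked[:k]


# Toy, made-up feature sets (not real gene data).
positives = {"P1": {"f1", "f2", "f3"}, "P2": {"f2", "f3", "f4"}}
candidates = {
    "C1": {"f1", "f2", "f3"},  # identical to P1
    "C2": {"f3", "f9"},        # shares one feature with each positive
    "C3": {"f7", "f8"},        # shares nothing with any positive
}
result = likely_negatives(positives, candidates, k=2)
print(result)
```

In this toy run the two candidates with the least overlap with the positives are selected. A real pipeline would derive the feature sets from the reduced feature vectors of step (2) and would need a principled choice of `k` or a similarity threshold.
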
Original language | English |
---|---|
Pages (from-to) | 23829–23839 |
Number of pages | 11 |
Journal | Neural Computing & Applications |
Volume | 35 |
Issue number | 33 |
Early online date | 24 Apr 2021 |
DOIs | |
Publication status | Published - Nov 2023 |
MoE publication type | A1 Journal article-refereed |
Keywords
- Parkinson’s disease
- Machine learning methods
- Healthcare
- Physicochemical properties of amino acid
- Neural networks
- PROTEIN-PROTEIN INTERACTIONS
- TOPOLOGICAL FEATURES
- NEURAL-NETWORK
- PREDICTION
- IDENTIFICATION
- AUTOCORRELATION
- CLASSIFICATION
- SIMILARITY
- SURFACE