Abstract
Lately, cross-modal retrieval has attracted considerable attention owing to the enormous volume of multi-modal data generated every day in the form of audio, video, image, and text. One vital requirement of cross-modal retrieval is to reduce the heterogeneity gap among the various modalities so that results in one modality can be efficiently retrieved using a query from another. This paper therefore proposes a novel unsupervised cross-modal retrieval framework based on associative learning, in which two traditional SOMs are trained separately on images and their collateral text and are then associated through a Hebbian learning network to facilitate cross-modal retrieval. Experimental results on the popular Wikipedia dataset demonstrate that the presented technique outperforms several existing state-of-the-art approaches.
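For illustration only, the sketch below conveys the overall idea under simplifying assumptions: two small SOMs, one per modality, are trained on toy feature vectors and then linked by a Hebbian co-activation matrix that records which image unit and text unit fire together for paired samples. The map sizes, learning schedule, feature dimensions, and random data are illustrative choices, not the configuration reported in the paper.

```python
# Minimal sketch (not the authors' implementation): two modality-specific SOMs
# associated by a Hebbian weight matrix for cross-modal lookup.
import numpy as np

class SOM:
    def __init__(self, rows, cols, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(size=(rows * cols, dim))                 # codebook vectors
        self.grid = np.array([(r, c) for r in range(rows) for c in range(cols)])

    def bmu(self, x):
        # index of the best-matching unit for input vector x
        return int(np.argmin(np.linalg.norm(self.w - x, axis=1)))

    def train(self, X, epochs=20, lr0=0.5, sigma0=2.0):
        for t in range(epochs):
            lr = lr0 * (1 - t / epochs)                              # decaying learning rate
            sigma = sigma0 * (1 - t / epochs) + 1e-3                 # shrinking neighbourhood
            for x in X:
                b = self.bmu(x)
                d = np.linalg.norm(self.grid - self.grid[b], axis=1)
                h = np.exp(-(d ** 2) / (2 * sigma ** 2))             # neighbourhood function
                self.w += lr * h[:, None] * (x - self.w)

# Toy paired features: in the paper, image features include Zernike moments and
# text features come from the collateral text; random vectors stand in here.
rng = np.random.default_rng(1)
img_feats, txt_feats = rng.normal(size=(200, 32)), rng.normal(size=(200, 16))

img_som, txt_som = SOM(5, 5, 32), SOM(5, 5, 16)
img_som.train(img_feats)
txt_som.train(txt_feats)

# Hebbian association: strengthen the link between the two BMUs that co-fire
# for each co-occurring image/text pair.
assoc = np.zeros((25, 25))
for xi, xt in zip(img_feats, txt_feats):
    assoc[img_som.bmu(xi), txt_som.bmu(xt)] += 1.0

# Cross-modal query: map an image to its most strongly associated text unit,
# whose cluster members would then be returned as retrieval candidates.
query_img = img_feats[0]
text_unit = int(np.argmax(assoc[img_som.bmu(query_img)]))
print("retrieved text-SOM unit:", text_unit)
```

In this toy setup the association matrix is simply a co-occurrence count between winning units; retrieval in the reverse direction (text to image) would read the matrix column-wise in the same way.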
Original language | English |
---|---|
Article number | 108014 |
Pages (from-to) | 1-18 |
Number of pages | 18 |
Journal | Knowledge-Based Systems |
Volume | 239 |
DOIs | |
Publication status | Published - 5 Mar 2022 |
MoE publication type | A1 Journal article-refereed |
Keywords
- Cross-modal retrieval
- Hebbian learning
- Machine learning
- Self organizing maps
- Zernike moments