Abstract
We introduce eCLIP, an enhanced version of the CLIP model that integrates expert annotations in the form of radiologist eye-gaze heatmaps. It tackles key challenges in contrastive multi-modal medical imaging analysis, notably data scarcity and the "modality gap" -- a significant disparity between image and text embeddings that diminishes the quality of representations and hampers cross-modal interoperability. eCLIP integrates a heatmap processor and leverages mixup augmentation to efficiently utilize the scarce expert annotations, thus boosting the model's learning effectiveness. eCLIP is designed to be generally applicable to any variant of CLIP without requiring any modifications of the core architecture. Through detailed evaluations across several tasks, including zero-shot inference, linear probing, cross-modal retrieval, and Retrieval Augmented Generation (RAG) of radiology reports using a frozen Large Language Model, eCLIP showcases consistent improvements in embedding quality. The outcomes reveal enhanced alignment and uniformity, affirming eCLIP's capability to harness high-quality annotations for enriched multi-modal analysis in the medical imaging domain.
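The abstract refers to alignment, uniformity, and the modality gap, but this record does not give their formulas. The following is a minimal sketch that assumes the definitions commonly used in the contrastive-learning literature: alignment as the mean squared distance between paired embeddings, uniformity as the log of the mean Gaussian potential over distinct pairs, and the modality gap as the distance between the two modality centroids. The function names, the temperature `t=2.0`, and the toy embeddings are illustrative assumptions, not taken from the paper.

```python
import math
from itertools import combinations


def normalize(v):
    """Scale a vector to unit L2 norm, as is usual for CLIP-style embeddings."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def alignment(image_embs, text_embs):
    """Mean squared distance between paired image/text embeddings (lower is better)."""
    pairs = list(zip(image_embs, text_embs))
    return sum(sq_dist(i, t) for i, t in pairs) / len(pairs)


def uniformity(embs, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs (lower is more uniform)."""
    pairs = list(combinations(embs, 2))
    return math.log(sum(math.exp(-t * sq_dist(a, b)) for a, b in pairs) / len(pairs))


def modality_gap(image_embs, text_embs):
    """Euclidean distance between the centroids of the two modalities."""
    dim = len(image_embs[0])
    c_img = [sum(v[d] for v in image_embs) / len(image_embs) for d in range(dim)]
    c_txt = [sum(v[d] for v in text_embs) / len(text_embs) for d in range(dim)]
    return math.sqrt(sq_dist(c_img, c_txt))


# Toy 2-D embeddings (made up): the text embeddings are the image embeddings
# in swapped order, so the pairs are misaligned although the centroids coincide.
image_embs = [normalize(v) for v in [[1.0, 0.0], [0.0, 1.0]]]
text_embs = [normalize(v) for v in [[0.0, 1.0], [1.0, 0.0]]]

print("alignment:", alignment(image_embs, text_embs))
print("uniformity (images):", uniformity(image_embs))
print("modality gap:", modality_gap(image_embs, text_embs))
```

In the toy data the centroid distance is zero while the paired distance is not, which illustrates why the three quantities are usually reported separately.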
Original language | English |
---|---|
Title of host publication | Computer Vision – ECCV 2024 |
Subtitle of host publication | 18th European Conference, Milan, Italy, September 29–October 4, 2024, Proceedings, Part XX |
Editors | Aleš Leonardis, Elisa Ricci, Stefan Roth, Olga Russakovsky, Torsten Sattler, Gül Varol |
Publisher | Springer |
Pages | 468-486 |
ISBN (Electronic) | 978-3-031-72661-3 |
ISBN (Print) | 978-3-031-72660-6 |
Publication status | Published - 2025 |
MoE publication type | A4 Conference publication |
Event | European Conference on Computer Vision, Milano, Italy (29 Sept 2024 → 4 Oct 2024; conference number 18) |
Publication series
Name | Lecture Notes in Computer Science |
---|---|
Publisher | Springer |
Volume | 15078 |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | European Conference on Computer Vision |
---|---|
Abbreviated title | ECCV |
Country/Territory | Italy |
City | Milano |
Period | 29/09/2024 → 04/10/2024 |
Keywords
- Contrastive Learning
- Deep Neural Networks
- Large Language Models (LLM)
- Medical Imaging
- Zero-shot Inference
Fingerprint
Dive into the research topics of 'Improving Medical Multi-modal Contrastive Learning with Expert Annotations'. Together they form a unique fingerprint.
Projects
- CLISHEAT/Marttinen: Green and digital healthcare
  Marttinen, P. (Principal investigator)
  EU The Recovery and Resilience Facility (RRF)
  01/01/2023 → 31/12/2025
  Project: RCF Academy Project targeted call
- INTERVENE: International consortium for integrative genomics prediction
  Kaski, S. (Principal investigator)
  01/01/2021 → 31/12/2025
  Project: EU H2020 Framework program
- Finnish Center for Artificial Intelligence
  Kaski, S. (Principal investigator)
  01/01/2019 → 31/12/2022
  Project: Academy of Finland: Other research funding