Abstract
Efficient reasoning about the spatial and temporal structure of the environment is crucial for perception in autonomous driving, particularly in an end-to-end approach. Although different sensor modalities are employed to capture the complex nature of the environment, each has its limitations. For example, frame-based RGB cameras are susceptible to variations in illumination. These sensor-level limitations can, however, be addressed through sensor fusion, enabling the learning of efficient feature representations for end-to-end autonomous perception. In this study, we address the end-to-end perception problem by fusing a frame-based RGB camera with an event camera to improve the learned representation for predicting lateral control. To achieve this, we propose a convolutional encoder–decoder architecture called DRFuser. DRFuser encodes the features from both sensor modalities and leverages self-attention in the encoder to fuse the frame-based RGB and event-camera features. The decoder unrolls the learned features to predict lateral control, specifically the steering angle. We extensively evaluate the proposed method on three datasets: our own collected dataset, the Davis Driving Dataset, and the simulated EventScape dataset. The results demonstrate the generalization capability of our method on both real-world and simulated data. We observe qualitative and quantitative improvements in predicting lateral control when the event camera is fused with the frame-based RGB camera. Notably, our method outperforms state-of-the-art techniques on the Davis Driving Dataset, achieving a 5.6% improvement in root mean square error (RMSE).
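The abstract describes DRFuser's core mechanism: per-modality convolutional encoders whose feature maps are fused with self-attention before a decoder regresses the steering angle. As a rough illustration of that fusion step only, the PyTorch sketch below flattens the two feature maps into one token sequence and applies multi-head self-attention across both modalities. The channel count, head count, token construction, and the module name `AttentionFusion` are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn


class AttentionFusion(nn.Module):
    # Minimal sketch of self-attention fusion over frame-based RGB and
    # event-camera feature maps, in the spirit of DRFuser's encoder.
    # Layer sizes and token layout are assumptions, not the paper's design.
    def __init__(self, channels: int = 256, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, rgb_feat: torch.Tensor, event_feat: torch.Tensor) -> torch.Tensor:
        # rgb_feat, event_feat: (B, C, H, W) maps from two CNN encoder branches.
        b, c, h, w = rgb_feat.shape
        # Flatten spatial positions of both modalities into a single token
        # sequence so attention can exchange information across sensors.
        tokens = torch.cat(
            [rgb_feat.flatten(2), event_feat.flatten(2)], dim=2
        ).transpose(1, 2)  # (B, 2*H*W, C)
        attended, _ = self.attn(tokens, tokens, tokens)
        fused = self.norm(tokens + attended)  # residual connection + LayerNorm
        # Split the sequence back per modality and merge into one fused map
        # that a decoder could unroll into a steering-angle prediction.
        rgb_tok, ev_tok = fused.chunk(2, dim=1)
        return (rgb_tok + ev_tok).transpose(1, 2).reshape(b, c, h, w)


if __name__ == "__main__":
    fusion = AttentionFusion(channels=256, num_heads=4)
    rgb = torch.randn(2, 256, 8, 8)     # encoded RGB frames
    events = torch.randn(2, 256, 8, 8)  # encoded event-camera representation
    print(fusion(rgb, events).shape)    # torch.Size([2, 256, 8, 8])
```

Concatenating both modalities into one token sequence lets every spatial location attend to both sensors at once; summing the two streams afterward is one simple way to collapse them back into a single feature map for the decoder.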
| Original language | English |
| --- | --- |
| Article number | 107087 |
| Number of pages | 16 |
| Journal | Engineering Applications of Artificial Intelligence |
| Volume | 126 |
| DOI / permanent links | |
| Status | Published - Nov 2023 |
| OKM publication type | A1 Original article in a scientific journal |
Fingerprint
Dive into the research topics of 'Multimodal fusion for sensorimotor control in steering angle prediction'. Together they form a unique fingerprint.

Projects
- 1 Finished
- FCAI: Finnish Center for Artificial Intelligence
  Kyrki, V., Azam, S. & Luck, K. S.
  01/01/2022 → 31/12/2022
  Project: Academy of Finland: Other research funding