Abstract
Proposal generation is a fundamental yet challenging task for two-stage temporal action detection pipelines. The task aims to predict the starting and ending boundaries of segments in realistic video sequences; action recognition methods cannot be applied directly to such videos because they are untrimmed. Most state-of-the-art models rely on temporal convolutional neural networks with pre-defined anchor segments. By eliminating anchors, we propose a lighter, end-to-end trainable Anchor-Free Multiscale Transformer-based Generator (AMTG) model that uses local clues from video snippets. To improve effectiveness for temporal evaluation, we apply multiscale Transformer encoders with a bi-directional mask extension to the sequences, simultaneously predicting boundary distances with uncertainties and various snippet-based local scores. The model then integrates these local predictions to generate proposal candidates using the proposed scoring function. Experiments on the THUMOS14 and ActivityNet-1.3 benchmarks demonstrate the effectiveness of AMTG for the temporal proposal generation task.
Original language | English |
---|---|
Title of host publication | Proceedings - 2023 IEEE International Conference on Multimedia and Expo, ICME 2023 |
Publisher | IEEE |
Pages | 1853-1858 |
Number of pages | 6 |
ISBN (Electronic) | 978-1-6654-6891-6 |
Publication status | Published - 2023 |
MoE publication type | A4 Conference publication |
Event | IEEE International Conference on Multimedia and Expo - Brisbane, Australia. Duration: 10 Jul 2023 → 14 Jul 2023 |
Publication series
Name | Proceedings - IEEE International Conference on Multimedia and Expo |
---|---|
Publisher | IEEE |
Volume | 2023-July |
ISSN (Print) | 1945-7871 |
ISSN (Electronic) | 1945-788X |
Conference
Conference | IEEE International Conference on Multimedia and Expo |
---|---|
Abbreviated title | ICME |
Country/Territory | Australia |
City | Brisbane |
Period | 10/07/2023 → 14/07/2023 |
Keywords
- anchor-free
- multiscale transformer network
- temporal action proposals
- two-stage detectors
Fingerprint
Dive into the research topics of 'Anchor-Free Action Proposal Network with Uncertainty Estimation'. Together they form a unique fingerprint.
Projects
2 Finished
- USSEE: Understanding speech and scene with ears and eyes (USSEE)
  Laaksonen, J. (Principal investigator)
  01/01/2022 → 31/12/2024
  Project: RCF Academy Project targeted call
- Movie Making Finland: Finnish fiction films as audiovisual big data, 1907-2017
  Laaksonen, J. (Principal investigator)
  01/01/2020 → 31/12/2022
  Project: Academy of Finland: Other research funding