Rao Anwer: conference articles in proceedings, 2014–2025

2025

  • All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages

    Vayani, A., Dissanayake, D., Watawana, H., Ahsan, N., Sasikumar, N., Thawakar, O., Ademtew, H. B., Hmaiti, Y., Kumar, A., Kuckreja, K., Maslych, M., Ghallabi, W. A., Qin, C., Shaker, A. M., Zhang, M., Ihsani, M. K., Esplana, A., Gokani, M., Mirkin, S., Singh, H., Srivastava, A., Hamerlik, E., Izzati, F. A., Maani, F. A., Cavada, S., Chim, J., Gupta, R., Manjunath, S., Zhumakhanova, K., Rabevohitra, F. H., Amirudin, A., Ridzuan, M., Kareem, D., More, K., Li, K., Shakya, P., Saad, M., Ghasemaghaei, A., Djanibekov, A., Azizov, D., Jankovic, B., Bhatia, N., Obando-Ceron, J., Otieno, O., Farestam, F., Rabbani, M., Baliah, S., Sanjeev, S., Shtanchaev, A., Fatima, M., Nguyen, T., Kareem, A., Aremu, T., Xavier, N., Bhatkal, A., Toyin, H., Chadha, A., Cholakkal, H., Anwer, R. M., Felsberg, M., Laaksonen, J., Solorio, T., Choudhury, M., Laptev, I., Shah, M., Khan, S. & Khan, F., 10 Jun 2025, (Accepted/In press) The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2025. United States: IEEE, 25 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    Open Access
  • Continual Learning and Unknown Object Discovery in 3D Scenes via Self-distillation

    Boudjoghra, M. E. A., Lahoud, J., Cholakkal, H., Anwer, R. M., Khan, S. & Khan, F. S., 2025, Computer Vision – ECCV 2024 - 18th European Conference, Proceedings. Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T. & Varol, G. (eds.). Springer, p. 416-431 16 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 15131 LNCS).

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    Open Access
  • Palo: A Polyglot Large Multimodal Model for 5B People

    Rasheed, H., Maaz, M., Shaker, A., Khan, S., Cholakkal, H., Anwer, R. M., Baldwin, T., Felsberg, M. & Khan, F. S., 2025, Proceedings - 2025 IEEE Winter Conference on Applications of Computer Vision, WACV 2025. IEEE, p. 1745-1754 10 p. (IEEE Workshop on Applications of Computer Vision).

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    Open Access
2024

  • 3D Indoor Instance Segmentation in an Open-World

    El Amine Boudjoghra, M., Al Khatib, S. K., Lahoud, J., Cholakkal, H., Anwer, R. M., Khan, S. & Khan, F. S., 2024, Advances in Neural Information Processing Systems 36 (NeurIPS 2023). Curran Associates Inc., 21 p. (Advances in Neural Information Processing Systems; vol. 36).

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    Open Access
  • Composed Video Retrieval via Enriched Context and Discriminative Embeddings

    Thawakar, O., Naseer, M., Anwer, R. M., Khan, S., Felsberg, M., Shah, M. & Khan, F. S., 2024, Proceedings - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024. IEEE, p. 26886-26896 11 p. (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition).

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    Open Access
    2 Citations (Scopus)
  • GLaMM: Pixel Grounding Large Multimodal Model

    Rasheed, H., Maaz, M., Shaji, S., Shaker, A., Khan, S., Cholakkal, H., Anwer, R. M., Xing, E., Yang, M. H. & Khan, F. S., 2024, Proceedings - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024. IEEE, p. 13009-13018 10 p. (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition).

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    Open Access
    36 Citations (Scopus)
  • Long-Tailed 3D Semantic Segmentation with Adaptive Weight Constraint and Sampling

    Lahoud, J., Khan, F. S., Cholakkal, H., Anwer, R. M. & Khan, S., 2024, 2024 IEEE International Conference on Robotics and Automation, ICRA 2024. IEEE, p. 5037-5044 8 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

  • XrayGPT: Chest Radiographs Summarization using Large Medical Vision-Language Models

    Thawakar, O., Shaker, A., Mullappilly, S. S., Cholakkal, H., Anwer, R. M., Khan, S., Laaksonen, J. & Khan, F. S., 2024, BioNLP 2024 - 23rd Meeting of the ACL Special Interest Group on Biomedical Natural Language Processing, Proceedings of the Workshop and Shared Tasks. Demner-Fushman, D., Ananiadou, S., Miwa, M., Roberts, K. & Tsujii, J. (eds.). Association for Computational Linguistics, p. 440-448 9 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    Open Access
    14 Citations (Scopus)
2023

  • 3D Mitochondria Instance Segmentation with Spatio-Temporal Transformers

    Thawakar, O., Anwer, R. M., Laaksonen, J., Reiner, O., Shah, M. & Khan, F. S., 2023, Medical Image Computing and Computer Assisted Intervention – MICCAI 2023: Proceedings of 26th International Conference. Greenspan, H., Madabhushi, A., Mousavi, P., Salcudean, S., Duncan, J., Syeda-Mahmood, T. & Taylor, R. (eds.). Springer, p. 613-623 11 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 14227 LNCS).

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    Open Access
    File
    2 Citations (Scopus)
    8 Downloads (Pure)
  • Cross-Modulated Few-Shot Image Generation for Colorectal Tissue Classification

    Kumar, A., Bhunia, A. K., Narayan, S., Cholakkal, H., Anwer, R. M., Laaksonen, J. & Khan, F. S., 2023, Medical Image Computing and Computer Assisted Intervention – MICCAI 2023: Proceedings of 26th International Conference. Greenspan, H., Madabhushi, A., Mousavi, P., Salcudean, S., Duncan, J., Syeda-Mahmood, T. & Taylor, R. (eds.). Springer, p. 128-137 10 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 14222 LNCS).

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    Open Access
    3 Citations (Scopus)
  • EdgeNeXt: Efficiently Amalgamated CNN-Transformer Architecture for Mobile Vision Applications

    Maaz, M., Shaker, A., Cholakkal, H., Khan, S., Zamir, S. W., Anwer, R. M. & Shahbaz Khan, F., 2023, Computer Vision – ECCV 2022 Workshops: Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part VII. Karlinsky, L., Michaeli, T. & Nishino, K. (eds.). Springer, p. 3-20 18 p. (Lecture Notes in Computer Science; vol. 13807 LNCS).

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    Open Access
    129 Citations (Scopus)
  • Generative Multiplane Neural Radiance for 3D-Aware Image Generation

    Kumar, A., Bhunia, A. K., Narayan, S., Cholakkal, H., Anwer, R. M., Khan, S., Yang, M.-H. & Khan, F. S., 2023, 2023 IEEE/CVF International Conference on Computer Vision (ICCV). IEEE. (IEEE International Conference on Computer Vision).

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    Open Access
  • Person Image Synthesis via Denoising Diffusion Model

    Kumar Bhunia, A., Khan, S., Cholakkal, H., Anwer, R. M., Laaksonen, J., Shah, M. & Khan, F. S., 2023, Proceedings - 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2023. IEEE, p. 5968-5976 9 p. (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition; vol. 2023-June).

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    Open Access
    104 Citations (Scopus)
2022

  • Class-Agnostic Object Detection with Multi-modal Transformer

    Maaz, M., Rasheed, H., Khan, S., Khan, F. S., Anwer, R. M. & Yang, M. H., 2022, Computer Vision – ECCV 2022 - 17th European Conference, Proceedings. Avidan, S., Brostow, G., Cissé, M., Farinella, G. M. & Hassner, T. (eds.). Springer, p. 512-531 20 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 13670 LNCS).

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    Open Access
    53 Citations (Scopus)
  • DoodleFormer: Creative Sketch Drawing with Transformers

    Bhunia, A. K., Khan, S., Cholakkal, H., Anwer, R. M., Khan, F. S., Laaksonen, J. & Felsberg, M., 2022, Computer Vision – ECCV 2022 - 17th European Conference, Proceedings. Avidan, S., Brostow, G., Cissé, M., Farinella, G. M. & Hassner, T. (eds.). Springer, p. 338-355 18 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 13677 LNCS).

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    Open Access
    13 Citations (Scopus)
  • Energy-based Latent Aligner for Incremental Learning

    Joseph, K. J., Khan, S., Khan, F. S., Anwer, R. M. & Balasubramanian, V. N., 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, p. 7442-7451 10 p. (IEEE Computer Society Conference on Computer Vision and Pattern Recognition).

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    Open Access
    32 Citations (Scopus)
  • Learning a Dynamic Cross-Modal Network for Multispectral Pedestrian Detection

    Xie, J., Anwer, R., Cholakkal, H., Nie, J., Cao, J., Laaksonen, J. & Khan, F. S., 10 Oct 2022, MM '22: Proceedings of the 30th ACM International Conference on Multimedia. ACM, p. 4043-4052

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

  • Spatio-temporal Relation Modeling for Few-shot Action Recognition

    Thatipelli, A., Narayan, S., Khan, S., Anwer, R. M., Khan, F. S. & Ghanem, B., 2022, Proceedings - 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022. IEEE, p. 19926-19935 10 p. (IEEE Computer Society Conference on Computer Vision and Pattern Recognition).

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    Open Access
    98 Citations (Scopus)
2020

  • Deep Contextual Attention for Human-Object Interaction Detection

    Wang, T., Anwer, R. M., Khan, M. H., Khan, F. S., Pang, Y., Shao, L. & Laaksonen, J., Feb 2020, Proceedings of the International Conference on Computer Vision (ICCV2019). IEEE, p. 5693-5701 9 p. 9008846. (Proceedings of the IEEE International Conference on Computer Vision; vol. 2019-October).

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    Open Access
    File
    118 Citations (Scopus)
    241 Downloads (Pure)
2019

  • Multi-stream Convolutional Networks for Indoor Scene Recognition

    Anwer, R. M., Khan, F. S., Laaksonen, J. & Zaki, N., 1 Jan 2019, Computer Analysis of Images and Patterns - 18th International Conference, CAIP 2019, Proceedings. Vento, M. & Percannella, G. (eds.). Springer, p. 196-208 13 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 11678 LNCS).

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    1 Citation (Scopus)
2018

  • Bottom-Up Attention Guidance for Recurrent Image Recognition

    Rezazadegan Tavakoli, H., Borji, A., Anwer, R., Rahtu, E. & Kannala, J., 2018, 2018 IEEE International Conference on Image Processing, ICIP 2018 - Proceedings. IEEE, p. 3004-3008 5 p. 8451537

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    4 Citations (Scopus)
  • Two-stream part-based deep representation for human attribute recognition

    Anwer, R. M., Khan, F. S. & Laaksonen, J., 13 Jul 2018, Proceedings - 2018 International Conference on Biometrics, ICB 2018. IEEE, p. 90-97 8 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    Open Access
    File
    1 Citation (Scopus)
    260 Downloads (Pure)
2017

  • TEX-Nets: Binary patterns encoded convolutional neural networks for texture recognition

    Anwer, R. M., Khan, F. S., van de Weijer, J. & Laaksonen, J., 6 Jun 2017, ICMR 2017 - Proceedings of the 2017 ACM International Conference on Multimedia Retrieval. ACM, p. 125-132 8 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    10 Citations (Scopus)
  • Top-down deep appearance attention for action recognition

    Anwer, R. M., Khan, F. S., van de Weijer, J. & Laaksonen, J., 2017, Image Analysis - 20th Scandinavian Conference, SCIA 2017, Proceedings. Springer, p. 297-309 13 p. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); vol. 10269 LNCS).

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    1 Citation (Scopus)
2016

  • Combining holistic and part-based deep representations for computational painting categorization

    Anwer, R. M., Khan, F. S., Van De Weijer, J. & Laaksonen, J., 6 Jun 2016, ICMR 2016 - Proceedings of the 2016 ACM International Conference on Multimedia Retrieval. ACM, p. 339-342 4 p.

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    26 Citations (Scopus)
2015

  • Deep Semantic Pyramids for Human Attributes and Action Recognition

    Khan, F. S., Anwer, R. M., van de Weijer, J., Felsberg, M. & Laaksonen, J., 2015, 19th Scandinavian Conference on Image Analysis (SCIA 2015), Copenhagen, Denmark, June 2015. Paulsen, R. R. & Pedersen, K. S. (eds.). Switzerland: Springer, p. 341-353

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Scientific › peer-review

    8 Citations (Scopus)
  • PicSOM Experiments in TRECVID 2015

    Ishikawa, S., Anwer, R. M., Koskela, M. & Laaksonen, J., 2015, TRECVID 2015 Workshop. Gaithersburg, MD, USA: National Institute of Standards and Technology (NIST)

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Professional

    2 Citations (Scopus)
2014

  • PicSOM Experiments in TRECVID 2014

    Ishikawa, S., Koskela, M., Sjöberg, M., Anwer, R. M., Laaksonen, J. & Oja, E., 2014, TRECVID workshop, TRECVID, Orlando, USA, November 10-12 2014.

    Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceedings › Professional
