Redundancy Removing Aggregation Network with Distance Calibration for Video Face Recognition

Zhonghong Ou, Yucheng Hu, Meina Song, Zheng Yan, Hui Pan

    Research output: Contribution to journal › Article › Scientific › peer-review

    10 Citations (Scopus)
    210 Downloads (Pure)

    Abstract

    Attention-based techniques have been successfully used for rating image quality and have been widely employed for set-based face recognition. Nevertheless, for video face recognition, where the base convolutional neural network (CNN) trained on large-scale data already provides discriminative features, fusing features using only predicted quality scores to generate a representation is likely to cause the duplicate-sample-dominance problem and degrade performance accordingly. To resolve this problem, we propose a redundancy removing aggregation network (RRAN) for video face recognition. Compared with other quality-aware aggregation schemes, RRAN can take advantage of similarity information to tackle the noise introduced by redundant video frames. By leveraging metric learning, RRAN introduces a distance calibration scheme to align the distance distributions of negative pairs of different video representations, which improves accuracy under a uniform threshold. A series of experiments is conducted on multiple realistic data sets, including YouTube Faces, IJB-A, and IJB-C, to evaluate the performance of RRAN. In comprehensive experiments, we demonstrate that our method can diminish the overall influence of poor-quality components that make up a large proportion of the video and further improve overall recognition performance in the presence of individual differences. Specifically, RRAN achieves 96.84% accuracy on YouTube Faces, outperforming all existing aggregation schemes.
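
    The duplicate-sample-dominance problem described above can be illustrated with a toy sketch. This is not the RRAN architecture: the function names, the 2-D features, the quality scores, and the density-based down-weighting heuristic are all assumptions chosen for illustration. The sketch only shows how many near-identical low-quality frames can outweigh a single good frame under quality-only fusion, and how pairwise similarity information could counteract that.

```python
import math

def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def aggregate(features, weights):
    """Weighted mean of frame features, with weights normalized to sum to 1."""
    total = sum(weights)
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) / total
            for d in range(dim)]

def redundancy_aware_weights(features, qualities):
    """Hypothetical heuristic (not from the paper): divide each frame's
    quality score by its 'density', the summed non-negative cosine
    similarity to all frames in the set, including itself."""
    weights = []
    for f_i, q_i in zip(features, qualities):
        density = sum(max(0.0, cosine(f_i, f_j)) for f_j in features)
        weights.append(q_i / max(density, 1.0))
    return weights

# Toy video: four duplicate low-quality frames and one high-quality frame.
features = [[1.0, 0.0]] * 4 + [[0.0, 1.0]]
qualities = [0.3, 0.3, 0.3, 0.3, 0.9]

quality_only = aggregate(features, qualities)
redundancy_aware = aggregate(features, redundancy_aware_weights(features, qualities))

print("quality-only:    ", quality_only)
print("redundancy-aware:", redundancy_aware)
```

    In this toy case the quality-only representation leans toward the duplicated frames (roughly 0.57 versus 0.43), while the redundancy-aware weighting shifts it toward the single high-quality frame (roughly 0.25 versus 0.75).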

    Original language: English
    Pages (from-to): 7279-7287
    Number of pages: 9
    Journal: IEEE Internet of Things Journal
    Volume: 8
    Issue number: 9
    Early online date: 2020
    Publication status: Published - 1 May 2021
    MoE publication type: A1 Journal article-refereed

    Keywords

    • Aggregates
    • Calibration
    • Computer architecture
    • Convolutional Neural Networks
    • Face recognition
    • Feature Aggregation
    • Feature extraction
    • Metric Learning
    • Redundancy
    • Training
    • Video-based Face Recognition
