Revisiting the Challenges and Opportunities in Software Plagiarism Detection

Xi Xu, Ming Fan, Ang Jia, Yin Wang, Zheng Yan, Qinghua Zheng, Ting Liu

    Research output: Chapter in Book/Report/Conference proceeding; Conference article in proceedings; Scientific; peer-review

    Abstract

    Software plagiarism seriously impedes the healthy development of open source software. To counter code obfuscation and the inherent non-determinism of thread scheduling, both of which undermine software plagiarism detection, we proposed a new dynamic birthmark called DYnamic Key Instruction Sequence (DYKIS), a framework called Thread-oblivious dynamic Birthmark (TOB) for reviving existing birthmarks, and a thread-aware dynamic birthmark called Thread-related System call Birthmark (TreSB). Although many approaches have been proposed for software plagiarism detection, they still fall short of the following highly desired requirements: the applicability to handle binaries, the capability to detect partial plagiarism, the resilience to code obfuscation, the interpretability of detection results, and the scalability to process large-scale software. In this position paper, we discuss and outline the research opportunities and challenges in the field of software plagiarism detection in order to stimulate innovation and direct our future research efforts.

    Original language: English
    Title of host publication: SANER 2020 - Proceedings of the 2020 IEEE 27th International Conference on Software Analysis, Evolution, and Reengineering
    Editors: Kostas Kontogiannis, Foutse Khomh, Alexander Chatzigeorgiou, Marios-Eleftherios Fokaefs, Minghui Zhou
    Publisher: IEEE
    Pages: 537-541
    Number of pages: 5
    ISBN (Electronic): 9781728151434
    Publication status: Published - Feb 2020
    MoE publication type: A4 Conference publication
    Event: IEEE International Conference on Software Analysis, Evolution, and Reengineering - London, Canada
    Duration: 18 Feb 2020 to 21 Feb 2020
    Conference number: 27

    Conference

    Conference: IEEE International Conference on Software Analysis, Evolution, and Reengineering
    Abbreviated title: SANER
    Country/Territory: Canada
    City: London
    Period: 18/02/2020 to 21/02/2020

    Keywords

    • binary code similarity
    • software birthmark
    • software plagiarism detection
    • source code similarity
