Abstract
Software plagiarism seriously impedes the healthy development of open source software. To counter code obfuscation and the inherent non-determinism of thread scheduling, both of which undermine software plagiarism detection, we proposed three techniques: a new dynamic birthmark called DYnamic Key Instruction Sequence (DYKIS); a framework called Thread-oblivious dynamic Birthmark (TOB), which revives existing birthmarks; and a thread-aware dynamic birthmark called Thread-related System call Birthmark (TreSB). Although many approaches to software plagiarism detection have been proposed, they still fall short of the following highly desired requirements: applicability to binary code, the capability to detect partial plagiarism, resilience to code obfuscation, interpretability of detection results, and scalability to large-scale software. In this position paper, we outline the research opportunities and challenges in software plagiarism detection in order to stimulate innovation and direct our future research efforts.
Original language | English |
---|---|
Title of host publication | SANER 2020 - Proceedings of the 2020 IEEE 27th International Conference on Software Analysis, Evolution, and Reengineering |
Editors | Kostas Kontogiannis, Foutse Khomh, Alexander Chatzigeorgiou, Marios-Eleftherios Fokaefs, Minghui Zhou |
Publisher | IEEE |
Pages | 537-541 |
Number of pages | 5 |
ISBN (Electronic) | 9781728151434 |
Publication status | Published - Feb 2020 |
MoE publication type | A4 Conference publication |
Event | IEEE International Conference on Software Analysis, Evolution, and Reengineering - London, Canada. Duration: 18 Feb 2020 → 21 Feb 2020. Conference number: 27 |
Conference
Conference | IEEE International Conference on Software Analysis, Evolution, and Reengineering |
---|---|
Abbreviated title | SANER |
Country/Territory | Canada |
City | London |
Period | 18/02/2020 → 21/02/2020 |
Keywords
- binary code similarity
- software birthmark
- software plagiarism detection
- source code similarity