Abstract
Automatically generated fake restaurant reviews are a threat to online review systems. Recent research has shown that users have difficulty detecting machine-generated fake reviews hidden among real restaurant reviews. The method used in that work (char-LSTM) has one drawback: it has difficulty staying in context, i.e. when it generates a review for a specific target entity, the resulting review may contain phrases that are unrelated to the target, thus increasing its detectability. In this work, we present and evaluate a more sophisticated technique based on neural machine translation (NMT) with which we can generate reviews that stay on-topic. We test multiple variants of our technique using native English speakers on Amazon Mechanical Turk. We demonstrate that reviews generated by the best variant have almost optimal undetectability (class-averaged F-score 47%). We conduct a user study with skeptical users and show that our method evades detection more frequently than the state of the art (average evasion 3.2/4 vs 1.5/4), with statistical significance at level α = 1% (Section 4.3). We develop very effective detection tools and reach an average F-score of 97% in classifying these reviews. Although fake reviews are very effective in fooling people, effective automatic detection is still feasible.
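The abstract reports detectability as a class-averaged F-score. As a minimal sketch, assuming this means the usual macro-average (the per-class F-score computed separately for the "real" and "fake" classes, then averaged), the computation could look as follows. The function names, labels, and toy judgments are hypothetical illustrations, not data or code from the paper.

```python
def f_score(tp, fp, fn):
    """Harmonic mean of precision and recall for one class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def class_averaged_f_score(y_true, y_pred, labels=("real", "fake")):
    """Average the per-class F-scores (assumed macro-averaging)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        scores.append(f_score(tp, fp, fn))
    return sum(scores) / len(scores)


# Hypothetical toy data: four real and four fake reviews.
y_true = ["real"] * 4 + ["fake"] * 4

# A judge who is right exactly half the time in each class.
chance_like = ["real", "real", "fake", "fake", "real", "real", "fake", "fake"]

# A judge who labels every review correctly.
perfect = list(y_true)

chance_score = class_averaged_f_score(y_true, chance_like)
perfect_score = class_averaged_f_score(y_true, perfect)
```

Under this reading, a judge who is right half the time on a balanced set scores 0.5, which is presumably why a value close to 50% is described as almost optimal undetectability, while a detector that separates the classes well approaches 1.0.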
Original language | English |
---|---|
Title of host publication | Computer Security - 23rd European Symposium on Research in Computer Security, ESORICS 2018, Proceedings |
Publisher | Springer |
Pages | 132-151 |
Number of pages | 20 |
ISBN (Print) | 9783319990729 |
Publication status | Published - 2018 |
MoE publication type | A4 Conference publication |
Event | European Symposium on Research in Computer Security, Barcelona, Spain; Duration: 3 Sept 2018 → 7 Sept 2018; Conference number: 23; https://esorics2018.upc.edu/program.do; https://esorics2018.upc.edu/ |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Publisher | Springer |
Volume | 11098 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | European Symposium on Research in Computer Security |
---|---|
Abbreviated title | ESORICS |
Country/Territory | Spain |
City | Barcelona |
Period | 03/09/2018 → 07/09/2018 |
Keywords
- security
- machine learning
- fraud detection
- neural machine translation
- social media
- natural language