Understanding the demographics of app users is crucial, for example, for app developers who wish to target their advertisements more effectively. Our work addresses this need by studying how well user demographics can be predicted from the list of a user's apps, which is readily available to many app developers. We extend previous work on the problem on three fronts: (1) We predict new demographics (age, race, and income) and analyze the most informative apps for the four demographic attributes included in our analysis. The most predictable attribute is gender (82.3% accuracy), whereas the hardest to predict is income (60.3% accuracy). (2) We compare several dimensionality reduction methods for high-dimensional app data and find that an unsupervised method yields better results than aggregating the apps at the app category level, but that the best results are obtained simply with the raw list of apps. (3) We examine the effect of the training set size and the number of apps on predictability and show that both factors have a large impact on prediction accuracy. Predictability increases, or in other words a user's privacy decreases, the more apps the user has used; somewhat surprisingly, however, prediction accuracy starts to decrease after 100 apps.
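To make the two feature representations contrasted in point (2) concrete, the following is a minimal sketch, not the authors' implementation. It builds a binary "raw list of apps" vector and a category-level aggregate for each user. The app names, category mapping, and function names are all hypothetical and chosen only for illustration; the abstract does not specify how the features or the classifier were implemented.

```python
from collections import Counter


def app_vocabulary(user_apps):
    """Sorted list of all distinct apps seen across users."""
    return sorted({app for apps in user_apps.values() for app in apps})


def raw_app_features(apps, vocabulary):
    """Binary indicator vector: 1 if the user has the app, else 0."""
    owned = set(apps)
    return [1 if app in owned else 0 for app in vocabulary]


def category_features(apps, app_to_category, categories):
    """Number of the user's apps in each category (a lower-dimensional view)."""
    counts = Counter(app_to_category[app] for app in set(apps))
    return [counts.get(category, 0) for category in categories]


# Hypothetical toy data: two users and four made-up apps.
user_apps = {
    "u1": ["chat_a", "game_x", "game_y"],
    "u2": ["chat_a", "bank_z"],
}
app_to_category = {
    "chat_a": "social",
    "game_x": "games",
    "game_y": "games",
    "bank_z": "finance",
}

vocabulary = app_vocabulary(user_apps)
categories = sorted(set(app_to_category.values()))

raw = {u: raw_app_features(a, vocabulary) for u, a in user_apps.items()}
by_category = {
    u: category_features(a, app_to_category, categories)
    for u, a in user_apps.items()
}
```

In this toy setup the raw representation has one dimension per app, while the category representation merges `game_x` and `game_y` into a single count, which illustrates the kind of information that category-level aggregation discards.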
|Title of host publication||Proceedings of the 10th International Conference on Web and Social Media, ICWSM 2016|
|Status||Published - 2016|
|MoEC publication type||A4 Article in a conference publication|
|Event||International AAAI Conference on Web and Social Media - Cologne, Germany|
Duration: 17 May 2016 → 20 May 2016
|Conference||International AAAI Conference on Web and Social Media|
|Period||17/05/2016 → 20/05/2016|