TY - JOUR
T1 - Detecting country of residence from social media data: a comparison of methods
AU - Heikinheimo, V.
AU - Järv, O.
AU - Tenkanen, H.
AU - Hiippala, T.
AU - Toivonen, T.
N1 - Funding Information:
All authors thank the Kone Foundation for supporting the research through the Social Media Data for Conservation Science project (grant number 86878) and South African National Parks for the visitor data. O.J. thanks the Kone Foundation (grant number 201608739) and the Academy of Finland (grant number 331549) for support. All authors also thank Emil Ehnström for participating in the data analysis.
Publisher Copyright:
© 2022 Informa UK Limited, trading as Taylor & Francis Group.
PY - 2022/3/7
Y1 - 2022/3/7
N2 - Identifying users’ place of residence is an important step in many social media analysis workflows. Various techniques for detecting home locations from social media data have been proposed, but their reliability has rarely been validated using ground truth data. In this article, we compared commonly used spatial and spatio-temporal methods to determine social media users’ country of residence. We applied diverse methods to a global data set of publicly shared geo-located Instagram posts from visitors to the Kruger National Park in South Africa. We evaluated the performance of each method using both individual-level expert assessment for a sample of users and aggregate-level official visitor statistics. Based on the individual-level assessment, a simple spatio-temporal approach performed best in detecting the country of residence. Results show why aggregate-level official statistics are not the best indicators for evaluating method performance. We also show how social media usage, such as the number of countries visited and posting activity over time, affects the performance of methods. In addition to a methodological contribution, this work contributes to the discussion about spatial and temporal biases in mobile big data.
AB - Identifying users’ place of residence is an important step in many social media analysis workflows. Various techniques for detecting home locations from social media data have been proposed, but their reliability has rarely been validated using ground truth data. In this article, we compared commonly used spatial and spatio-temporal methods to determine social media users’ country of residence. We applied diverse methods to a global data set of publicly shared geo-located Instagram posts from visitors to the Kruger National Park in South Africa. We evaluated the performance of each method using both individual-level expert assessment for a sample of users and aggregate-level official visitor statistics. Based on the individual-level assessment, a simple spatio-temporal approach performed best in detecting the country of residence. Results show why aggregate-level official statistics are not the best indicators for evaluating method performance. We also show how social media usage, such as the number of countries visited and posting activity over time, affects the performance of methods. In addition to a methodological contribution, this work contributes to the discussion about spatial and temporal biases in mobile big data.
KW - home location
KW - human mobility
KW - social media
KW - spatio-temporal analysis
KW - tourism
UR - http://www.scopus.com/inward/record.url?scp=85126124470&partnerID=8YFLogxK
U2 - 10.1080/13658816.2022.2044484
DO - 10.1080/13658816.2022.2044484
M3 - Article
AN - SCOPUS:85126124470
JO - International Journal of Geographical Information Science
JF - International Journal of Geographical Information Science
SN - 1365-8816
ER -