These files contain the data used in the experiments described in: Bruno Ordozgoiti and Aristides Gionis. 2019. Reconciliation k-median: Clustering with Non-Polarized Representatives. In Proceedings of the 2019 World Wide Web Conference (WWW’19), May 13–17, 2019, San Francisco, CA, USA.
Twitter IDs have been anonymized.
domain_mentions.txt: Each line contains a domain name, a user ID and the number of times this user has mentioned this domain name in a tweet.
format: domain_name <TAB> user_id <TAB> mention_count
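A minimal Python sketch of reading this format, under stated assumptions: the file has no header line, user IDs are kept as opaque strings, and repeated (user, domain) pairs are summed. The function name parse_domain_mentions and the sample lines are hypothetical and not part of the data set.

```python
from collections import defaultdict


def parse_domain_mentions(lines):
    """Return {user_id: {domain_name: mention_count}} from tab-separated lines."""
    mentions = defaultdict(dict)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        domain, user_id, count = line.split("\t")
        # Assumption: if a (user, domain) pair appears more than once, add the counts.
        mentions[user_id][domain] = mentions[user_id].get(domain, 0) + int(count)
    return dict(mentions)


# Made-up sample lines for illustration only (not taken from the data set).
sample = [
    "example.com\tu1\t3\n",
    "example.org\tu1\t1\n",
    "example.com\tu2\t7\n",
]
mentions = parse_domain_mentions(sample)
```

To read the real file, pass an open file object instead of the sample list, for example open("domain_mentions.txt", encoding="utf-8"); the encoding is an assumption.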
domains_ideology_score.txt: Domain names and their ideology scores, estimated as described in (Lahoti et al. WSDM 2018). Note: missing scores can be retrieved from the supplementary data at https://doi.org/10.1093/poq/nfw006
format: domain_name <TAB> ideology_score
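Since some scores are missing, a reader may want to separate scored domains from unscored ones. The sketch below does this in Python. How a missing score is encoded in the file is not stated here, so the code assumes it shows up as an absent field, an empty field, or a non-numeric token; the function name and sample lines are hypothetical.

```python
import math


def parse_ideology_scores(lines):
    """Return (scores, missing): a {domain: score} dict and a list of unscored domains."""
    scores = {}
    missing = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        domain = fields[0].strip()
        if not domain:
            continue  # skip blank lines
        raw = fields[1].strip() if len(fields) > 1 else ""
        try:
            value = float(raw)
        except ValueError:
            # Assumption: an empty or non-numeric field means the score is missing.
            missing.append(domain)
            continue
        if math.isnan(value):
            missing.append(domain)  # treat a literal "nan" as missing too
        else:
            scores[domain] = value
    return scores, missing


# Made-up sample lines for illustration only (not taken from the data set).
sample = ["example.com\t0.25\n", "example.org\t\n", "example.net\n"]
scores, missing = parse_ideology_scores(sample)
```

The domains collected in missing are the ones to look up in the supplementary data mentioned in the note above.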
follow_graph.txt: The Twitter follower graph. Each line contains a user ID and the user ID of one of that user's followers.
format: user_id <TAB> follower_user_id
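Because each line lists a user first and a follower second, the edge direction is easy to get backwards. The following Python sketch builds both views of the graph. It assumes no header line and treats IDs as strings; the function name load_follow_graph and the sample lines are hypothetical.

```python
from collections import defaultdict


def load_follow_graph(lines):
    """Return (followers, following) adjacency maps built from 'user<TAB>follower' lines."""
    followers = defaultdict(set)  # user_id -> set of users who follow user_id
    following = defaultdict(set)  # user_id -> set of users that user_id follows
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        user_id, follower_id = line.split("\t")
        followers[user_id].add(follower_id)
        following[follower_id].add(user_id)
    return followers, following


# Made-up sample lines for illustration only (not taken from the data set).
sample = ["u1\tu2\n", "u1\tu3\n", "u2\tu3\n"]
followers, following = load_follow_graph(sample)
```

Here followers["u1"] holds the users who follow u1, and following["u3"] holds the users that u3 follows.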
representatives.txt: US Congress representatives, each with a Twitter handle and a polarity score computed using Barbera's method (Barbera, 2015).
format: rep_name <TAB> website_url <TAB> district <TAB> twitter_handle <TAB> party <TAB> barbera_polarity_score
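With six columns, reading each row into a dictionary keyed by the field names above is less error-prone than indexing by position. This is a sketch using Python's standard csv module. It assumes there is no header line, that fields are not quoted, and that every row carries a numeric score. The sample rows are made-up placeholders, so the real district, handle, and party encodings may differ.

```python
import csv

# Column names taken from the format line above.
FIELDS = [
    "rep_name",
    "website_url",
    "district",
    "twitter_handle",
    "party",
    "barbera_polarity_score",
]


def parse_representatives(lines):
    """Return a list of dicts, one per representative, with the score as a float."""
    reader = csv.DictReader(
        lines, fieldnames=FIELDS, delimiter="\t", quoting=csv.QUOTE_NONE
    )
    reps = []
    for row in reader:
        # Assumption: the last column always holds a parseable number.
        row["barbera_polarity_score"] = float(row["barbera_polarity_score"])
        reps.append(row)
    return reps


# Made-up placeholder rows for illustration only (not taken from the data set).
sample = [
    "Jane Doe\thttps://example.com/doe\tXX-01\tjanedoe_example\tPartyA\t-0.5\n",
    "John Roe\thttps://example.com/roe\tYY-02\tjohnroe_example\tPartyB\t1.25\n",
]
reps = parse_representatives(sample)
```

For the real file, open it with newline="" as the csv module documentation recommends and pass the file object to parse_representatives.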
user_polarity.txt: User IDs and their polarity scores, computed using Barbera's method (Barbera, 2015).
format: user_id <TAB> barbera_polarity_score