Replication data for: Reconciliation k-median: Clustering with non-polarized representatives

Dataset

# Description
These files contain the data employed in the experiments described in Bruno Ordozgoiti and Aristides Gionis. 2019. Reconciliation k-median: Clustering with Non-Polarized Representatives. In Proceedings of the 2019 World Wide Web Conference (WWW’19), May 13–17, 2019, San Francisco, CA, USA.

Twitter IDs have been anonymized.

# Contents
domain_mentions.txt: Each line contains a domain name, a user ID and the number of times this user has mentioned this domain name in a tweet.
format: domain_name <TAB> user_id <TAB> mention_count
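
A minimal sketch of reading this format, assuming each line holds exactly three tab-separated fields and that the mention count is an integer. The function name `parse_domain_mentions` and the sample rows are hypothetical and are not part of the dataset. User IDs are kept as strings because the form of the anonymized IDs is not stated.

```python
from collections import defaultdict
from typing import Dict, Iterable


def parse_domain_mentions(lines: Iterable[str]) -> Dict[str, Dict[str, int]]:
    """Map each user ID to a {domain_name: mention_count} dictionary."""
    mentions: Dict[str, Dict[str, int]] = defaultdict(dict)
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Assumed layout: domain_name <TAB> user_id <TAB> mention_count
        domain, user_id, count = line.split("\t")
        mentions[user_id][domain] = int(count)
    return dict(mentions)


# Hypothetical sample rows, not taken from the dataset.
sample = ["example.com\t17\t3\n", "example.org\t17\t1\n", "example.com\t42\t5\n"]
mentions = parse_domain_mentions(sample)
print(mentions["17"])
```

To read the real file, an open file handle can be passed in place of `sample`, for example `parse_domain_mentions(open("domain_mentions.txt", encoding="utf-8"))`.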

domains_ideology_score.txt: Domain names and their ideology scores, estimated as described in (Lahoti et al. WSDM 2018). Note: missing scores can be retrieved from the supplementary data at https://doi.org/10.1093/poq/nfw006
format: domain_name <TAB> ideology_score
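
Since some scores are missing, it can help to list the domains that still need one before any analysis. The sketch below is one possible check and rests on assumptions: how a missing score is encoded is not stated here, so rows with an absent, empty, or non-numeric score field are all treated as missing. The function names and sample values are hypothetical.

```python
from typing import Dict, Iterable, List


def parse_ideology_scores(lines: Iterable[str]) -> Dict[str, float]:
    """Return {domain_name: ideology_score} for rows with a numeric score."""
    scores: Dict[str, float] = {}
    for raw in lines:
        fields = raw.rstrip("\n").split("\t")
        if len(fields) != 2:
            continue  # assumed: a row without a score field counts as missing
        try:
            scores[fields[0]] = float(fields[1])
        except ValueError:
            continue  # assumed: empty or non-numeric score counts as missing
    return scores


def domains_missing_scores(domains: Iterable[str], scores: Dict[str, float]) -> List[str]:
    """Return, in sorted order, the domains that have no ideology score."""
    return sorted(set(domains) - set(scores))


# Hypothetical sample rows, not taken from the dataset.
score_lines = ["example.com\t-0.5\n", "example.org\t\n"]
scores = parse_ideology_scores(score_lines)
missing = domains_missing_scores(["example.com", "example.org", "example.net"], scores)
print(missing)
```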

follow_graph.txt: The Twitter follower graph. Each line contains a user ID and the user ID of one of that user's followers.
format: user_id <TAB> follower_user_id
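
The column order matters here: the first field is the account being followed and the second is the follower. A minimal sketch of loading the graph in both directions, assuming two tab-separated fields per line; the function names and sample edges are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def load_followers(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each user ID to the set of user IDs that follow it."""
    followers: Dict[str, Set[str]] = defaultdict(set)
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Assumed layout: user_id <TAB> follower_user_id
        user_id, follower_id = line.split("\t")
        followers[user_id].add(follower_id)
    return dict(followers)


def invert(followers: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each follower to the set of user IDs it follows."""
    following: Dict[str, Set[str]] = defaultdict(set)
    for user_id, follower_ids in followers.items():
        for follower_id in follower_ids:
            following[follower_id].add(user_id)
    return dict(following)


# Hypothetical sample edges: users 2 and 3 follow 1, and user 3 follows 2.
sample = ["1\t2\n", "1\t3\n", "2\t3\n"]
followers = load_followers(sample)
following = invert(followers)
print(sorted(followers["1"]), sorted(following["3"]))
```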

representatives.txt: US Congress representatives, each with a Twitter handle and a polarity score computed using Barbera's method (Barbera, 2015).
format: rep_name <TAB> website_url <TAB> district <TAB> twitter_handle <TAB> party <TAB> barbera_polarity_score
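
With six fields per row, a typed record keeps the columns from being mixed up. The following is a sketch under the assumption that every row has exactly six tab-separated fields and a numeric score; the `Representative` type, the function name, and the sample row (including how the district, handle, and party are written) are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Representative(NamedTuple):
    name: str
    website_url: str
    district: str
    twitter_handle: str
    party: str
    polarity: float  # barbera_polarity_score column


def parse_representatives(lines: Iterable[str]) -> List[Representative]:
    """Parse rows of representatives.txt into Representative records."""
    reps: List[Representative] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue  # skip blank lines
        name, url, district, handle, party, score = line.split("\t")
        reps.append(Representative(name, url, district, handle, party, float(score)))
    return reps


# Hypothetical sample row, not taken from the dataset.
sample = ["Jane Doe\thttps://example.gov/doe\tXX-01\tjanedoe\tD\t-1.25\n"]
reps = parse_representatives(sample)
print(reps[0].twitter_handle, reps[0].polarity)
```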

user_polarity.txt: User IDs and their polarity scores, computed using Barbera's method (Barbera, 2015).
format: user_id <TAB> barbera_polarity_score
Date made available: 2019
