Big data open up unprecedented opportunities for investigating complex systems, including society. In particular, communication data serve as major sources for computational social sciences, but they have to be cleaned and filtered as they may contain spurious information due to recording errors as well as interactions, like commercial and marketing activities, not directly related to the social network. The network constructed from communication data can only be considered as a proxy for the network of social relationships. Here we apply a systematic method, based on multiple-hypothesis testing, to statistically validate the links and then construct the corresponding Bonferroni network, generalized to the directed case. We study two large datasets of mobile phone records, one from Europe and the other from China. For both datasets we compare the raw data networks with the corresponding Bonferroni networks and point out significant differences in the structures and in the basic network measures. We show evidence that the Bonferroni network provides a better proxy for the network of social interactions than the original one. Using the filtered networks, we investigated the statistics and temporal evolution of small directed 3-motifs and concluded that closed communication triads have a formation time scale, which is quite fast and typically intraday. We also find that open communication triads preferentially evolve into other open triads with a higher fraction of reciprocated calls. These stylized facts were observed for both datasets.
- complex networks
- mobile call records
- social systems
- statistically validated networks