A1 Refereed original research article in a scientific journal
Precision as a measure of predictability of missing links in real networks
Authors: Guillermo García-Pérez, Roya Aliakbarisani, Abdorasoul Ghasemi, M. Angeles Serrano
Publisher: AMER PHYSICAL SOC
Publication year: 2020
Journal: Physical review E
Journal name in source: PHYSICAL REVIEW E
Journal acronym: PHYS REV E
Article number: ARTN 052318
Volume: 101
Issue: 5
Number of pages: 11
ISSN: 1539-3755
eISSN: 2470-0053
DOI: https://doi.org/10.1103/PhysRevE.101.052318
Abstract
Predicting missing links in real networks is an important open problem in network science to which considerable efforts have been devoted, giving as a result a vast plethora of link prediction methods in the literature. In this work, we take a different point of view on the problem and focus on predictability instead of prediction. By considering ensembles defined by well-known network models, we prove analytically that even the best possible link prediction method, given by the ensemble connection probabilities, yields a limited precision that depends quantitatively on the topological properties-such as degree heterogeneity, clustering, and community structure-of the ensemble. This suggests an absolute limitation to the predictability of missing links in real networks, due to the irreducible uncertainty arising from the random nature of link formation processes. We show that a predictability limit can be estimated in real networks, and we propose a method to approximate such a bound from real-world networks with missing links. The predictability limit gives a benchmark to gauge the quality of link prediction methods in real networks.
Predicting missing links in real networks is an important open problem in network science to which considerable efforts have been devoted, giving as a result a vast plethora of link prediction methods in the literature. In this work, we take a different point of view on the problem and focus on predictability instead of prediction. By considering ensembles defined by well-known network models, we prove analytically that even the best possible link prediction method, given by the ensemble connection probabilities, yields a limited precision that depends quantitatively on the topological properties-such as degree heterogeneity, clustering, and community structure-of the ensemble. This suggests an absolute limitation to the predictability of missing links in real networks, due to the irreducible uncertainty arising from the random nature of link formation processes. We show that a predictability limit can be estimated in real networks, and we propose a method to approximate such a bound from real-world networks with missing links. The predictability limit gives a benchmark to gauge the quality of link prediction methods in real networks.