Edge Formation and Its Influence on Machine Learning

Online Event
Lisette Espín-Noboa
Monday, January 18, 2021, 2:00 pm – 3:00 pm

Please register for this event using the link provided on the right. We will send the event link to registered attendees one hour before the event starts.

ABSTRACT / Social networks are ubiquitous structures that we generate and enrich every day while connecting with people through social media platforms, emails, and any other type of interaction. While these structures are intangible to us, they are very important carriers of information. For instance, the political leaning of our friends can be a proxy to identify our own political preferences. This explanatory power is being leveraged in public policy, business decision-making and scientific research because it helps machine learning techniques to make accurate predictions. However, these generalizations often benefit the majority of people who shape the general structure of the network and put under-represented groups at a disadvantage; as a consequence, these groups get unfair treatment such as segregation and discrimination. Therefore, it becomes crucial to first understand how social networks form and then to verify to what extent the way we connect to others helps to reinforce social inequalities as a feedback loop mechanism in machine learning.

To this end, in the first part of my thesis, I propose HopRank and Janus, two methods to characterize the mechanisms of edge formation in real-world social networks. HopRank is a biased random walker whose key concept is a model of information foraging on networks based on transition probabilities between k-hop neighborhoods. Janus is a Bayesian framework that allows us to identify and rank plausible hypotheses of edge formation in cases where nodes carry additional metadata. In the second part of my thesis, I investigate the implications of these mechanisms on machine learning. Specifically, I study the influence of homophily, preferential attachment, edge density, fraction of minorities, and the directionality of links on both the performance and bias of collective classification, and on the visibility of minorities in top-k ranks. My findings demonstrate a strong correlation between network structure and machine learning outcomes. This suggests that algorithmic bias on networks can be: (i) anticipated by the type of network, and (ii) mitigated by connecting strategically with certain people.
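
As a rough illustration of what a random walker biased by k-hop neighborhoods could look like, the Python sketch below first draws a hop distance k from a set of transition probabilities and then moves to a uniformly chosen node at exactly that distance. This is not the HopRank implementation: the function names, the `hop_probs` dictionary, the uniform choice within a k-hop ring, and the toy path graph are all assumptions made for this example.

```python
import random
from collections import deque


def hop_distances(adj, source):
    """Breadth-first search: shortest-path distance from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def biased_step(adj, current, hop_probs, rng):
    """One step: draw a hop distance k (weighted by hop_probs), then a uniform node at distance k."""
    # Group reachable nodes into "rings" by their distance from the current node.
    rings = {}
    for node, d in hop_distances(adj, current).items():
        if d > 0:
            rings.setdefault(d, []).append(node)
    # Keep only hop distances that exist around this node and have positive weight.
    hops = [k for k in sorted(rings) if hop_probs.get(k, 0) > 0]
    if not hops:
        return current  # nowhere to go: stay put (an assumption of this sketch)
    weights = [hop_probs[k] for k in hops]
    k = rng.choices(hops, weights=weights)[0]
    return rng.choice(rings[k])


def walk(adj, start, hop_probs, steps, seed=0):
    """Run the biased walker for a fixed number of steps and return the visited nodes."""
    rng = random.Random(seed)
    path = [start]
    for _ in range(steps):
        path.append(biased_step(adj, path[-1], hop_probs, rng))
    return path


# Toy undirected path graph 0 - 1 - 2 - 3 - 4 (hypothetical data).
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
# Hypothetical transition probabilities per hop distance: mostly local moves.
hop_probs = {1: 0.7, 2: 0.2, 3: 0.1}
path = walk(adj, start=0, hop_probs=hop_probs, steps=10)
```

Setting `hop_probs = {1: 1.0}` reduces this sketch to an ordinary unbiased random walk over direct neighbors, which makes the effect of the k-hop bias easy to compare.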

BIO / Lisette is a postdoctoral researcher at the Department of Network and Data Science. She will be working together with János Kertész and Márton Karsai. Her research interests lie at the intersection of Computational Social Science, Network Science and Machine Learning. In particular, she focuses on the understanding of edge formation and on the effects of network structure on ranking, inference and human navigation.