Continuing the beginner’s guide to using Python’s NetworkX library to conduct social network analysis
In Part 1, we explored link analysis and, more specifically, how social network analysis helps us investigate and understand relationships between individuals and entities. We introduced social network analysis (SNA), a specialized form of link analysis that focuses on people, groups, and their relationships. We reviewed the basic concepts of SNA, including nodes (representing individuals) and edges (representing connections between individuals). Then we discussed how SNA can be used to understand social influence, group formation, and information flow using metrics such as degree centrality and betweenness centrality, with Billy Corgan and his relationships to the founding members of Smashing Pumpkins as a simple example.
In that example, we kept the network small and simple. In this tutorial, we will continue to use Python and NetworkX to examine Billy Corgan’s sphere of influence. We will also expand Billy Corgan’s network to make it more complex and deepen our understanding of degree centrality and betweenness centrality. As we work through this example, we will discuss context and how domain knowledge is crucial to getting the most out of social network analysis.
Social Network Analysis in Context
Domain knowledge and research are essential components of social network analysis because they supply the necessary context, theoretical framework, and understanding of the social and cultural factors that shape social networks. Without this understanding, you risk producing misleading or incorrect findings that fail to capture the complexity and nuance of social network data.
Before you begin…
- Do you have basic knowledge of Python? If not, start here.
- Are you familiar with basic concepts in social network analysis, like nodes and edges, or metrics like centrality? If not, start here.
Gathering Data to Analyze Social Networks
So what kind of information do we need to start investigating Billy Corgan’s sphere of influence? Let’s start with all of his bandmates from the Smashing Pumpkins, current and former.
Using Wikipedia, we can get a reasonably reliable list of all the musicians who have played in the Smashing Pumpkins since 1988. By the way, did you know that Billy Corgan (briefly) had another band named Zwan in the early aughts? Spoiler alert: it didn’t end well. Let’s make a list of its members too.
Then, open up your favorite IDE, import the relevant libraries, and make two lists: one for Smashing Pumpkins and one for Zwan.
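As a minimal sketch of this step, the setup might look like the following. The variable names `smashing_pumpkins` and `zwan` are my own choices for illustration, and the members and their order are taken from the relationship output shown later in this tutorial.

```python
from itertools import combinations  # used later to pair up band members

import networkx as nx  # used later to build the graph and compute centrality

# Smashing Pumpkins members, current and former (names are illustrative
# variable choices; the members match the edge list in this tutorial).
smashing_pumpkins = [
    "Billy Corgan", "James Iha", "Jimmy Chamberlin", "Katie Cole",
    "D'arcy Wretzky", "Melissa Auf der Maur", "Ginger Pooley",
    "Mike Byrne", "Nicole Fiorentino",
]

# Zwan members.
zwan = [
    "Billy Corgan", "Jimmy Chamberlin", "Paz Lenchantin",
    "David Pajo", "Matt Sweeney",
]

# Musicians who appear in both lists.
in_both_bands = sorted(set(smashing_pumpkins) & set(zwan))
print(in_both_bands)
```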
Describing Relationships in Social Networks
Our next task is to build some lists of tuples that represent the relationships between Billy Corgan and each of these band members. We also need to consider the relationship between each band member and every other band member.
In graph theory, this type of relation is known as symmetric. If Billy is in a band with Jimmy, then Jimmy is also in a band with Billy.
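To see what symmetry means in practice, here is a small sketch using an undirected `nx.Graph`. The single edge is only for illustration.

```python
import networkx as nx

# nx.Graph is undirected, so each edge represents a symmetric relation.
G = nx.Graph()
G.add_edge("Billy Corgan", "Jimmy Chamberlin")

# The edge can be looked up in either direction.
billy_to_jimmy = G.has_edge("Billy Corgan", "Jimmy Chamberlin")
jimmy_to_billy = G.has_edge("Jimmy Chamberlin", "Billy Corgan")
print(billy_to_jimmy, jimmy_to_billy)  # True True
```

This is why we only need one tuple per pair of bandmates rather than one for each direction.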
To accomplish this, we can use Python to build a simple function that takes a list of band members and returns all the possible pairs.
Then, we can apply it to each list and combine the results to create a single list of tuples containing the relationships between all the band members of Zwan and the Smashing Pumpkins.
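One way to sketch these two steps is with `itertools.combinations`. The function name `band_pairs` and the variable names are hypothetical, chosen for illustration.

```python
from itertools import combinations


def band_pairs(members):
    """Return every unordered pair of band members as a list of tuples."""
    return list(combinations(members, 2))


smashing_pumpkins = [
    "Billy Corgan", "James Iha", "Jimmy Chamberlin", "Katie Cole",
    "D'arcy Wretzky", "Melissa Auf der Maur", "Ginger Pooley",
    "Mike Byrne", "Nicole Fiorentino",
]
zwan = [
    "Billy Corgan", "Jimmy Chamberlin", "Paz Lenchantin",
    "David Pajo", "Matt Sweeney",
]

# Pair up the members of each band, then combine the two lists.
relationships = band_pairs(smashing_pumpkins) + band_pairs(zwan)
print(len(relationships))  # 36 Smashing Pumpkins pairs + 10 Zwan pairs = 46
```

Note that the Billy Corgan and Jimmy Chamberlin pair appears twice, once for each band.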
The output will look something like this:
[('Billy Corgan', 'James Iha'),
('Billy Corgan', 'Jimmy Chamberlin'),
('Billy Corgan', 'Katie Cole'),
('Billy Corgan', "D'arcy Wretzky"),
('Billy Corgan', 'Melissa Auf der Maur'),
('Billy Corgan', 'Ginger Pooley'),
('Billy Corgan', 'Mike Byrne'),
('Billy Corgan', 'Nicole Fiorentino'),
('James Iha', 'Jimmy Chamberlin'),
('James Iha', 'Katie Cole'),
('James Iha', "D'arcy Wretzky"),
('James Iha', 'Melissa Auf der Maur'),
('James Iha', 'Ginger Pooley'),
('James Iha', 'Mike Byrne'),
('James Iha', 'Nicole Fiorentino'),
('Jimmy Chamberlin', 'Katie Cole'),
('Jimmy Chamberlin', "D'arcy Wretzky"),
('Jimmy Chamberlin', 'Melissa Auf der Maur'),
('Jimmy Chamberlin', 'Ginger Pooley'),
('Jimmy Chamberlin', 'Mike Byrne'),
('Jimmy Chamberlin', 'Nicole Fiorentino'),
('Katie Cole', "D'arcy Wretzky"),
('Katie Cole', 'Melissa Auf der Maur'),
('Katie Cole', 'Ginger Pooley'),
('Katie Cole', 'Mike Byrne'),
('Katie Cole', 'Nicole Fiorentino'),
("D'arcy Wretzky", 'Melissa Auf der Maur'),
("D'arcy Wretzky", 'Ginger Pooley'),
("D'arcy Wretzky", 'Mike Byrne'),
("D'arcy Wretzky", 'Nicole Fiorentino'),
('Melissa Auf der Maur', 'Ginger Pooley'),
('Melissa Auf der Maur', 'Mike Byrne'),
('Melissa Auf der Maur', 'Nicole Fiorentino'),
('Ginger Pooley', 'Mike Byrne'),
('Ginger Pooley', 'Nicole Fiorentino'),
('Mike Byrne', 'Nicole Fiorentino'),
('Billy Corgan', 'Jimmy Chamberlin'),
('Billy Corgan', 'Paz Lenchantin'),
('Billy Corgan', 'David Pajo'),
('Billy Corgan', 'Matt Sweeney'),
('Jimmy Chamberlin', 'Paz Lenchantin'),
('Jimmy Chamberlin', 'David Pajo'),
('Jimmy Chamberlin', 'Matt Sweeney'),
('Paz Lenchantin', 'David Pajo'),
('Paz Lenchantin', 'Matt Sweeney'),
('David Pajo', 'Matt Sweeney')]
Next, we can loop over the list of tuples to generate a graph with NetworkX.
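A minimal sketch of that loop is below. It rebuilds the relationship list so the snippet runs on its own. The variable names are assumptions, not necessarily those used in the original script.

```python
from itertools import combinations

import networkx as nx

smashing_pumpkins = [
    "Billy Corgan", "James Iha", "Jimmy Chamberlin", "Katie Cole",
    "D'arcy Wretzky", "Melissa Auf der Maur", "Ginger Pooley",
    "Mike Byrne", "Nicole Fiorentino",
]
zwan = [
    "Billy Corgan", "Jimmy Chamberlin", "Paz Lenchantin",
    "David Pajo", "Matt Sweeney",
]
relationships = (
    list(combinations(smashing_pumpkins, 2)) + list(combinations(zwan, 2))
)

# Build an undirected graph, one edge per relationship tuple.
G = nx.Graph()
for person_a, person_b in relationships:
    # Adding an edge that already exists is harmless, so the repeated
    # Billy Corgan / Jimmy Chamberlin pair is stored only once.
    G.add_edge(person_a, person_b)

print(G.number_of_nodes(), G.number_of_edges())  # 12 45
```

If matplotlib is installed, calling `nx.draw(G, with_labels=True)` is one way to render the network.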
Drawing the result produces this graph:
Let’s discuss two key observations that can be gleaned about the network from this graph.
- The upper right corner, where the Smashing Pumpkins members appear, is denser than the lower left corner, where the members of Zwan appear, because Zwan has fewer members.
- Billy Corgan and Jimmy Chamberlin appear in the middle because they’re in both bands.
Next, let’s consider how these observations may be reflected in degree centrality and betweenness centrality.
Degree Centrality and Betweenness Centrality with NetworkX
In Part 1, we calculated the degree centrality and betweenness centrality for Billy Corgan and the founding members of the Smashing Pumpkins. To do this, we called two NetworkX functions and wrote a simple script to run them. This time, since our graph is already assembled, we can simply pass the graph in to calculate the centrality measures.
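Here is a sketch of that calculation using `nx.degree_centrality` and `nx.betweenness_centrality`. The graph is rebuilt so the snippet runs on its own, and the plain-text table is an illustrative choice that may differ from the original script.

```python
from itertools import combinations

import networkx as nx

smashing_pumpkins = [
    "Billy Corgan", "James Iha", "Jimmy Chamberlin", "Katie Cole",
    "D'arcy Wretzky", "Melissa Auf der Maur", "Ginger Pooley",
    "Mike Byrne", "Nicole Fiorentino",
]
zwan = [
    "Billy Corgan", "Jimmy Chamberlin", "Paz Lenchantin",
    "David Pajo", "Matt Sweeney",
]

G = nx.Graph()
G.add_edges_from(combinations(smashing_pumpkins, 2))
G.add_edges_from(combinations(zwan, 2))

# Both functions return a dict mapping each node to its score.
degree = nx.degree_centrality(G)
betweenness = nx.betweenness_centrality(G)  # normalized by default

# Print a simple table, one row per musician.
print(f"{'name':<22}{'degree':>10}{'betweenness':>13}")
for name in G.nodes:
    print(f"{name:<22}{degree[name]:>10.6f}{betweenness[name]:>13.6f}")
```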
This will generate the following output:
Let’s discuss how to interpret these results.
What does this table tell us about the degree centrality of the band members?
1. Billy Corgan has the highest possible degree centrality score, 1.000, indicating that he has the greatest number of connections or collaborations across Smashing Pumpkins and Zwan. He’s directly connected to every other member of both bands.
2. Jimmy Chamberlin also has a degree centrality score of 1.000, meaning that he, like Billy Corgan, has direct connections to every other member of the two bands.
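3. James Iha, Katie Cole, D’arcy Wretzky, Melissa Auf der Maur, Ginger Pooley, Mike Byrne, and Nicole Fiorentino all have the same degree centrality score of 0.727273 (8 of 11 possible connections), because each is connected to the same eight Smashing Pumpkins bandmates.
4. Paz Lenchantin, David Pajo, and Matt Sweeney are each connected only to the four other members of Zwan. Based on the relationship list above, that works out to 4 of 11 possible connections, or about 0.363636, which is lower than the Smashing Pumpkins members’ score.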
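To check these scores by hand, recall that NetworkX computes degree centrality as a node’s number of neighbors divided by the number of other nodes in the graph. The sketch below redoes that arithmetic from the group sizes in our two band lists. The helper name `degree_centrality_by_hand` is hypothetical.

```python
# Group sizes taken from the two band lists in this tutorial.
pumpkins_only = 7  # Smashing Pumpkins members who were never in Zwan
zwan_only = 3      # Zwan members who were never in Smashing Pumpkins
in_both = 2        # Billy Corgan and Jimmy Chamberlin

n = pumpkins_only + zwan_only + in_both  # 12 nodes in total


def degree_centrality_by_hand(neighbors, total_nodes):
    """Share of the other nodes that a node is directly connected to."""
    return neighbors / (total_nodes - 1)


# A member of both bands is connected to all 11 other musicians.
both_bands = degree_centrality_by_hand(n - 1, n)
# A Smashing Pumpkins-only member has 6 peers plus Billy and Jimmy.
pumpkins_member = degree_centrality_by_hand(pumpkins_only - 1 + in_both, n)
# A Zwan-only member has 2 peers plus Billy and Jimmy.
zwan_member = degree_centrality_by_hand(zwan_only - 1 + in_both, n)

print(round(both_bands, 6), round(pumpkins_member, 6), round(zwan_member, 6))
# 1.0 0.727273 0.363636
```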
What does this table tell us about the betweenness centrality of the band members?
1. Billy Corgan and Jimmy Chamberlin also have the highest betweenness centrality scores, 0.190909 each, indicating that they are likely important intermediaries or bridges between other band members in terms of communication or collaboration.
2. None of the band members other than Billy Corgan and Jimmy Chamberlin has a non-zero betweenness centrality score, indicating that they do not bridge connections between other members. Each of them is already directly connected to all of their own bandmates, so no shortest path needs to pass through them.
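The 0.190909 figure can also be reproduced by hand. The sketch below assumes NetworkX’s default normalization for undirected graphs, which divides by the number of pairs that can be formed from the other nodes.

```python
# Group sizes taken from the two band lists in this tutorial.
pumpkins_only = 7  # Smashing Pumpkins members who were never in Zwan
zwan_only = 3      # Zwan members who were never in Smashing Pumpkins
n = 12             # total musicians in the graph

# Only cross-band pairs need an intermediary; bandmates are linked directly.
cross_band_pairs = pumpkins_only * zwan_only  # 21 pairs

# Each cross-band pair has two equally short paths, one through Billy and
# one through Jimmy, so each of them earns half a point per pair.
raw_score = cross_band_pairs * 0.5  # 10.5

# Normalize by the number of pairs among the other n - 1 nodes.
possible_pairs = (n - 1) * (n - 2) / 2  # 55.0

betweenness_by_hand = raw_score / possible_pairs
print(round(betweenness_by_hand, 6))  # 0.190909
```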
Strengthening Inferences with Domain Knowledge
While centrality metrics provide data points from which we can draw inferences, these inferences are based solely on the information provided in the table.
To draw more specific conclusions about Billy Corgan’s sphere of influence, you would need knowledge of nineties alternative music and musicians to develop a fully fledged hypothesis about the dynamics between the members of these bands.
So if you are a nineties music aficionado, let me know what you think of these results in the comments. Remember to stay tuned for Part 3, where we expand the network so we can explore closeness centrality, clustering, and communities in social network analysis.
If you would like the fully annotated Python script for this tutorial, visit my GitHub!
👩🏻💻 Christine Egan | medium | github | linkedin