In this article, we apply network analysis, a field at the intersection of mathematics and computer science, to the world of “Game of Thrones.” Using Python libraries such as NetworkX, pandas, and Pyvis, we will examine the intricate character relationships within the popular series. This analysis will provide insights into the dynamics of the narrative, identify key characters, and reveal the underlying connections that shape the fascinating world of Westeros.
In the course of this exploration, we will:
The datasets we will use offer a glimpse into the character relationships in the Game of Thrones books. There are five CSV files, one for each book in the series. Each row represents a connection between two characters, and the “weight” column measures the strength of that connection, that is, the number of interactions between the pair. By analysing this data, we can unravel the web of relationships, identify key characters, study how the network evolves from book to book, and gain a deeper understanding of the series’ plot lines and character dynamics.
Columns: Source, Target, Type, weight, book
The data sets can be found here
from google.colab import drive
drive.mount('/content/drive')
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
# Importing the required libraries
import networkx as nx
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# Reading in the data of book 1
d1=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/Complex Networks (blog)/book1.csv')
# Displaying the data (pandas shows the first and last rows)
d1
| | Source | Target | Type | weight | book |
|---|---|---|---|---|---|
| 0 | Addam-Marbrand | Jaime-Lannister | Undirected | 3 | 1 |
| 1 | Addam-Marbrand | Tywin-Lannister | Undirected | 6 | 1 |
| 2 | Aegon-I-Targaryen | Daenerys-Targaryen | Undirected | 5 | 1 |
| 3 | Aegon-I-Targaryen | Eddard-Stark | Undirected | 4 | 1 |
| 4 | Aemon-Targaryen-(Maester-Aemon) | Alliser-Thorne | Undirected | 4 | 1 |
| ... | ... | ... | ... | ... | ... |
| 679 | Tyrion-Lannister | Willis-Wode | Undirected | 4 | 1 |
| 680 | Tyrion-Lannister | Yoren | Undirected | 10 | 1 |
| 681 | Tywin-Lannister | Varys | Undirected | 4 | 1 |
| 682 | Tywin-Lannister | Walder-Frey | Undirected | 8 | 1 |
| 683 | Waymar-Royce | Will-(prologue) | Undirected | 18 | 1 |
684 rows × 5 columns
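We can count the number of unique characters that appear in the Source column. This undercounts the cast, because characters that only ever appear in the Target column are missed; the graph built later in this article contains 187 nodes.
# Printing out the number of unique characters in the Source column
print("Number of unique characters in the Source column: ", len(d1['Source'].unique()))
Number of unique characters in the Source column: 139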
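We can also calculate the average number of interactions per character. Because the code groups by Source, each edge is counted only for the character listed first, so the figure understates how connected the characters are: counting both ends of each of the 684 edges across all 187 characters gives an average of about 7.3.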
# Number of interactions per character
interactions_per_character = d1.groupby('Source')['Target'].count()
print("Average number of interactions per character: ", round(interactions_per_character.mean(),3))
Average number of interactions per character: 4.921
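Since each undirected edge is listed once, a character's full interaction count has to combine the Source and Target columns. The following is a minimal sketch on a small made-up edge list; the toy data and the helper name count_interactions are illustrative assumptions, not part of the dataset or the original notebook.

```python
import pandas as pd


def count_interactions(edges: pd.DataFrame) -> pd.Series:
    """Count, for every character, the edges where it appears as Source or Target."""
    both_ends = pd.concat([edges["Source"], edges["Target"]], ignore_index=True)
    return both_ends.value_counts()


# Hypothetical toy edge list with the same column names as the book CSVs
toy = pd.DataFrame({
    "Source": ["A", "A", "B"],
    "Target": ["B", "C", "C"],
    "weight": [3, 5, 4],
})

toy_counts = count_interactions(toy)
print("Characters in Source only:", toy["Source"].nunique())
print("Characters in either column:", len(toy_counts))
print("Average interactions per character:", toy_counts.mean())
```

Applied to d1, the same helper should give one entry per node of the graph built later, so its length can be checked against the node count.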
Let’s plot the distribution of the number of interactions per character.
# Distribution of Interactions per Character
sns.set_style("whitegrid")
plt.figure(figsize=(10,6))
sns.histplot(interactions_per_character, bins=30, color='skyblue', edgecolor='black', kde=True)
plt.title('Distribution of Interactions per Character', fontsize=15)
plt.xlabel('Number of Interactions', fontsize=12)
plt.ylabel('Number of Characters', fontsize=12)
plt.show()
The distribution is strongly right-skewed, with a few outliers on the right.
Ranking the top 10 characters by number of interactions shows which characters are the outliers and sit at the centre of this network.
# top 10 characters by number of interactions
top_characters = interactions_per_character.sort_values(ascending=False).head(10)
print("Top 10 characters by number of interactions: \n", top_characters)
Top 10 characters by number of interactions:
Source
Eddard-Stark          51
Catelyn-Stark         39
Bran-Stark            30
Arya-Stark            27
Cersei-Lannister      23
Joffrey-Baratheon     19
Daenerys-Targaryen    18
Jaime-Lannister       18
Jon-Snow              17
Drogo                 15
Name: Target, dtype: int64
We can also analyse the edges rather than the nodes of a network. The code below will give you a list of the pairs of characters (edges) that have the most interactions (highest weights), and a histogram showing the distribution of edge weights. The edge weight is the sum of the ‘weight’ column for each pair of characters, which represents the number of interactions between them.
# Create a DataFrame that counts the number of interactions (weights) between each pair of characters
edge_weights = d1.groupby(['Source', 'Target'])['weight'].sum().reset_index(name='weight')
# Sort the DataFrame by the weight and display the top edges
top_edges = edge_weights.sort_values(by='weight', ascending=False).head(10)
print("Top 10 edges by weight: \n", top_edges)
Top 10 edges by weight:
     Source              Target            weight
329  Eddard-Stark        Robert-Baratheon  291
134  Bran-Stark          Robb-Stark        112
62   Arya-Stark          Sansa-Stark       104
249  Daenerys-Targaryen  Drogo             101
479  Joffrey-Baratheon   Sansa-Stark       87
504  Jon-Snow            Samwell-Tarly     81
454  Jeor-Mormont        Jon-Snow          81
320  Eddard-Stark        Petyr-Baelish     81
257  Daenerys-Targaryen  Jorah-Mormont     75
225  Cersei-Lannister    Robert-Baratheon  72
The top edge is between Eddard Stark and Robert Baratheon, with a weight of 291. This indicates that Eddard Stark and Robert Baratheon had 291 interactions, far more than any other pair. Ned Stark and Robert Baratheon had a close relationship that dated back to their youth, when both were fostered by Jon Arryn at the Eyrie. They were longtime friends and trusted allies, and their bond was strengthened during Robert’s Rebellion, the war that overthrew the Mad King Aerys II Targaryen and placed Robert on the Iron Throne.
The second edge is between Bran Stark and Robb Stark, with a weight of 112. This implies that Bran Stark and Robb Stark had 112 interactions, indicating a strong connection between these two Stark brothers.
The third edge is between Arya Stark and Sansa Stark, with a weight of 104. This suggests that Arya Stark and Sansa Stark had 104 interactions, implying a significant level of connection or relationship between these two Stark sisters.
Let’s plot the distribution.
# Plot a histogram of edge weights
plt.figure(figsize=(10,6))
sns.histplot(edge_weights['weight'], bins=30, color='skyblue', edgecolor='black', kde=False)
plt.title('Distribution of Edge Weights', fontsize=15)
plt.xlabel('Edge Weight', fontsize=12)
plt.ylabel('Number of Edges', fontsize=12)
plt.show()
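Edge weights can also be summed per character, which gives each character's total interaction weight instead of a count of partners. The sketch below uses a made-up edge list; the helper name character_strength and the toy data are assumptions for illustration only.

```python
import pandas as pd


def character_strength(edges: pd.DataFrame) -> pd.Series:
    """Sum the weights of all edges touching each character, from either column."""
    as_source = edges.groupby("Source")["weight"].sum()
    as_target = edges.groupby("Target")["weight"].sum()
    # fill_value=0 keeps characters that appear in only one of the two columns
    return as_source.add(as_target, fill_value=0).sort_values(ascending=False)


# Hypothetical toy edge list with the same column names as the book CSVs
toy = pd.DataFrame({
    "Source": ["A", "A", "B"],
    "Target": ["B", "C", "C"],
    "weight": [3, 5, 4],
})

toy_strength = character_strength(toy)
print(toy_strength)
```

Once the NetworkX graph exists, b1.degree(weight='weight') should report the same quantity per node.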
NetworkX is a Python library that provides tools for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks. It is widely used for tasks related to network analysis and graph theory.
We will use it to build the character graph and to make visualisations.
First, we create an empty graph object and iterate through the data frame to add edges.
# Creating an empty graph object
b1 = nx.Graph()
# Iterating through the DataFrame to add edges
for _, edge in d1.iterrows():
    b1.add_edge(edge['Source'], edge['Target'], weight=edge['weight'])
# Printing out the number of nodes and edges in the graph
print("Total number of nodes: ", int(b1.number_of_nodes()))
print("Total number of edges: ", int(b1.number_of_edges()))
Total number of nodes: 187
Total number of edges: 684
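As an alternative to looping over the rows, NetworkX can build the same kind of graph directly from a DataFrame with nx.from_pandas_edgelist. Here is a small sketch on a made-up edge list; toy_edges and toy_graph are illustrative names, not from the original notebook.

```python
import networkx as nx
import pandas as pd

# Hypothetical toy edge list with the same column names as the book CSVs
toy_edges = pd.DataFrame({
    "Source": ["A", "A", "B"],
    "Target": ["B", "C", "C"],
    "weight": [3, 5, 4],
})

# Build an undirected graph, copying the 'weight' column onto each edge
toy_graph = nx.from_pandas_edgelist(
    toy_edges, source="Source", target="Target", edge_attr="weight"
)

print("Nodes:", toy_graph.number_of_nodes())
print("Edges:", toy_graph.number_of_edges())
print("Weight of A-C:", toy_graph["A"]["C"]["weight"])
```

Calling it the same way on d1 should produce a graph with the node and edge counts printed above.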
I will plot the network of Book 1 using three different layouts from the NetworkX library to see which one is the most readable.
plt.figure(figsize =(20, 20))
nx.draw(b1, with_labels= True)
plt.figure(figsize =(20, 20))
nx.draw_circular(b1, with_labels= True)
plt.figure(figsize =(20, 20))
nx.draw_kamada_kawai(b1, with_labels= True)
All of these graphs give some indication of how the network works, but I do not find them easy to read. Let us look for another visualisation library.
After some further research, I found the Pyvis library. Pyvis is a Python library for building interactive network visualisations; it is a wrapper around the vis.js JavaScript library. It is primarily used in Jupyter Notebook, but it can also generate standalone HTML files.
As I did not find the visualisation of the network using nx.draw() very readable, I will use Pyvis to visualise the network of Book 1.
from pyvis.network import Network
net = Network(notebook=True, height='950px', width='95%', bgcolor='#222222', font_color='white', cdn_resources='in_line')
net.repulsion()
node_degree = dict(b1.degree())
# Setting each node's size attribute to its degree
nx.set_node_attributes(b1, node_degree, 'size')
net.from_nx(b1)
net.save_graph("GameofThrones.html")
from IPython.display import HTML
HTML(filename="GameofThrones.html")
The HTML file can be viewed and downloaded here. The interactive visualisation is well worth a look :)
It is already a nice visualisation, but it is still difficult to read. We can try to use community detection to see if it is possible to make it more readable.
Network community theory is a concept in social network analysis that examines how individuals or groups form and interact within communities or clusters within a larger network. It focuses on understanding the structure and dynamics of these communities and their implications for various social phenomena.
Community detection algorithms are used to identify clusters or communities within the network based on the patterns of connections between nodes.
The theory explores the notion that individuals within a community tend to be more closely connected to each other than to nodes outside the community. These communities can exhibit characteristics such as a higher density of connections, stronger ties between members, and shared interests or attributes.
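To make the idea concrete, here is a minimal sketch on a made-up graph of two triangles joined by a single edge. It uses NetworkX's built-in greedy modularity maximisation, which is one of several community detection algorithms and not necessarily the one best suited to the book data; the toy graph and variable names are assumptions for illustration.

```python
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities, modularity

# Two tightly knit triangles (0-1-2 and 3-4-5) linked by one bridge edge (2-3)
toy = nx.Graph()
toy.add_edges_from([
    (0, 1), (0, 2), (1, 2),
    (3, 4), (3, 5), (4, 5),
    (2, 3),
])

# Each community is returned as a frozenset of nodes
toy_communities = greedy_modularity_communities(toy)

# Modularity compares the edge density inside communities with what a
# random graph with the same degrees would give; higher means clearer groups
score = modularity(toy, toy_communities)

for members in toy_communities:
    print(sorted(members))
print("Modularity:", round(score, 3))
```

Nodes inside each triangle share three edges among themselves but only one edge with the other group, which is the "denser inside than outside" pattern described above.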
We can easily run community detection with the python-louvain package, which is imported as community and implements the Louvain method.
from community import community_louvain
communities = community_louvain.best_partition(b1)
communities
# Adding the community to the nodes
nx.set_node_attributes(b1, communities, 'group')
net = Network(notebook=True, height='950px', width='95%', bgcolor='#222222', font_color='white', cdn_resources='in_line')
net.repulsion()
node_degree = dict(b1.degree())
# Setting each node's size attribute to its degree
nx.set_node_attributes(b1, node_degree, 'size')
net.from_nx(b1)
net.save_graph("GameofThronesCommunities.html")
HTML(filename="GameofThronesCommunities.html")
The HTML file with the communities can be viewed and downloaded here. The interactive visualisation is well worth a look :)
You can clearly see the different communities detected. Jon Snow and Daenerys Targaryen each have their own network and social circle in Book 1. This can be explained by the fact that both characters are at the edges of the known world: Jon finds himself on the Wall in the far north, and Daenerys starts in Pentos, across the Narrow Sea, before heading further east across Essos. The rest of the narrative mainly takes place in Westeros.
In conclusion, the article demonstrated how network analysis techniques can be applied to analyse the character relationships in “Game of Thrones.” By visualising the networks, identifying key characters, and performing community detection, valuable insights were gained into the narrative dynamics and the underlying connections that shape the world of Westeros.