LucaSantagata

In this article, we apply network analysis, a branch of mathematics and computer science, to the world of “Game of Thrones.” Using Python libraries such as NetworkX, Pandas, and Pyvis, we will examine the intricate character relationships within the popular series. This analysis will provide insights into the dynamics of the narrative, identify key characters, and reveal the underlying connections that shape the world of Westeros.

In the course of this exploration, we will:

- **Analyse the Game of Thrones dataset** to understand the character interactions and associations.
- **Visualise the networks using NetworkX** and experiment with different layouts for the most readable representation.
- **Use Pyvis for interactive network visualisations** and compare its effectiveness with NetworkX.
- **Perform community detection** to identify clusters or communities within the network based on the patterns of connections between nodes.

The datasets we will use cover the character relationships within the Game of Thrones book series. There are five CSV files, one for each book. Each row represents a connection between two characters, and the “weight” column measures the relationship’s significance or intensity. By analysing this data, we can unravel the web of relationships, identify key characters, study how the network evolves from book to book, and gain a deeper understanding of the series’ plot lines and character dynamics.

Columns:

- **Source**: This column identifies the character from which a relationship originates.
- **Target**: This column designates the character at the receiving end of the relationship.
- **Type**: This column describes the nature of the connection. All relationships are undirected, implying mutual interactions.
- **weight**: This column assigns a numerical value to each relationship, providing a measure of its significance or intensity.
- **book**: This column specifies the book number, so relationships can be told apart across books.

The data sets can be found on my GitHub.


```
from google.colab import drive
drive.mount('/content/drive')
```

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).

```
# Importing the required libraries
import networkx as nx
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
```

```
# Reading in the data of book 1
d1=pd.read_csv('/content/drive/MyDrive/Colab Notebooks/Complex Networks (blog)/book1.csv')
# Printing out the head of the data
d1
```

| | Source | Target | Type | weight | book |
|---|---|---|---|---|---|
| 0 | Addam-Marbrand | Jaime-Lannister | Undirected | 3 | 1 |
| 1 | Addam-Marbrand | Tywin-Lannister | Undirected | 6 | 1 |
| 2 | Aegon-I-Targaryen | Daenerys-Targaryen | Undirected | 5 | 1 |
| 3 | Aegon-I-Targaryen | Eddard-Stark | Undirected | 4 | 1 |
| 4 | Aemon-Targaryen-(Maester-Aemon) | Alliser-Thorne | Undirected | 4 | 1 |
| ... | ... | ... | ... | ... | ... |
| 679 | Tyrion-Lannister | Willis-Wode | Undirected | 4 | 1 |
| 680 | Tyrion-Lannister | Yoren | Undirected | 10 | 1 |
| 681 | Tywin-Lannister | Varys | Undirected | 4 | 1 |
| 682 | Tywin-Lannister | Walder-Frey | Undirected | 8 | 1 |
| 683 | Waymar-Royce | Will-(prologue) | Undirected | 18 | 1 |

684 rows × 5 columns

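We can count the number of unique characters in the `Source` column. Note that this undercounts the full cast: a character who only ever appears as a `Target` is missed, which is why the graph built later in this article has 187 nodes rather than 139.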

```
# Printing out the number of unique characters
print("Number of unique characters: ", len(d1['Source'].unique()))
```

Number of unique characters: 139

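We can also calculate the average number of interactions per character. Because the grouping is on `Source`, each edge is credited only to the character listed first, so this is the average number of rows per source character (684 / 139).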

```
# Number of interactions per character
interactions_per_character = d1.groupby('Source')['Target'].count()
print("Average number of interactions per character: ", round(interactions_per_character.mean(),3))
```

Average number of interactions per character: 4.921
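Since every relationship is undirected, it may be fairer to credit each edge to both of its endpoints. The following is a minimal sketch of that idea, not the approach used in the rest of this article. The six-row `toy` frame reuses rows from the Book 1 table above and stands in for `d1`; the names `toy`, `endpoints`, and `degree` are illustrative.

```python
import pandas as pd

# Small stand-in for d1, using six rows shown in the Book 1 table above
toy = pd.DataFrame({
    "Source": ["Addam-Marbrand", "Addam-Marbrand", "Aegon-I-Targaryen",
               "Aegon-I-Targaryen", "Tywin-Lannister", "Tywin-Lannister"],
    "Target": ["Jaime-Lannister", "Tywin-Lannister", "Daenerys-Targaryen",
               "Eddard-Stark", "Varys", "Walder-Frey"],
    "weight": [3, 6, 5, 4, 4, 8],
})

# Stack both endpoint columns so every edge is credited to both characters
endpoints = pd.concat([toy["Source"], toy["Target"]], ignore_index=True)

# Number of edges touching each character (its degree, if no pair is repeated)
degree = endpoints.value_counts()

print("Unique characters:", degree.size)
print("Average interactions per character:", round(degree.mean(), 3))
```

Applied to `d1` instead of `toy`, the character count should match the 187 nodes reported for the graph later on, and with 684 edges the average works out to 2 × 684 / 187 ≈ 7.3 interactions per character.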

Let’s plot the distribution of the number of interactions per character.

```
# Distribution of Interactions per Character
sns.set_style("whitegrid")
plt.figure(figsize=(10,6))
sns.histplot(interactions_per_character, bins=30, color='skyblue', edgecolor='black', kde=True)
plt.title('Distribution of Interactions per Character', fontsize=15)
plt.xlabel('Number of Interactions', fontsize=12)
plt.ylabel('Number of Characters', fontsize=12)
plt.show()
```

The distribution is strongly right-skewed: most characters have only a handful of interactions, while a few outliers on the right have many.

When calculating the top 10 characters, ranked on interactions, we can see which characters are outliers and are very central in this network.

```
# top 10 characters by number of interactions
top_characters = interactions_per_character.sort_values(ascending=False).head(10)
print("Top 10 characters by number of interactions: \n", top_characters)
```

Top 10 characters by number of interactions:

| Source | Target |
|---|---|
| Eddard-Stark | 51 |
| Catelyn-Stark | 39 |
| Bran-Stark | 30 |
| Arya-Stark | 27 |
| Cersei-Lannister | 23 |
| Joffrey-Baratheon | 19 |
| Daenerys-Targaryen | 18 |
| Jaime-Lannister | 18 |
| Jon-Snow | 17 |
| Drogo | 15 |

Name: Target, dtype: int64

We can also analyse the edges rather than the nodes of a network. The code below will give you a list of the pairs of characters (edges) that have the most interactions (highest weights), and a histogram showing the distribution of edge weights. The edge weight is the sum of the ‘weight’ column for each pair of characters, which represents the number of interactions between them.

```
# Create a DataFrame that counts the number of interactions (weights) between each pair of characters
edge_weights = d1.groupby(['Source', 'Target'])['weight'].sum().reset_index(name='weight')
# Sort the DataFrame by the weight and display the top edges
top_edges = edge_weights.sort_values(by='weight', ascending=False).head(10)
print("Top 10 edges by weight: \n", top_edges)
```

Top 10 edges by weight:

| | Source | Target | weight |
|---|---|---|---|
| 329 | Eddard-Stark | Robert-Baratheon | 291 |
| 134 | Bran-Stark | Robb-Stark | 112 |
| 62 | Arya-Stark | Sansa-Stark | 104 |
| 249 | Daenerys-Targaryen | Drogo | 101 |
| 479 | Joffrey-Baratheon | Sansa-Stark | 87 |
| 504 | Jon-Snow | Samwell-Tarly | 81 |
| 454 | Jeor-Mormont | Jon-Snow | 81 |
| 320 | Eddard-Stark | Petyr-Baelish | 81 |
| 257 | Daenerys-Targaryen | Jorah-Mormont | 75 |
| 225 | Cersei-Lannister | Robert-Baratheon | 72 |

The top edge is between Eddard Stark and Robert Baratheon, with a weight of 291. This indicates that the two had 291 interactions, by far the strongest connection in Book 1. Ned Stark and Robert Baratheon had a close relationship that dated back to their youth. They were longtime friends and trusted allies, and their bond was cemented during Robert’s Rebellion, a war that aimed to overthrow the Mad King Aerys II Targaryen and place Robert on the Iron Throne.

The second edge is between Bran Stark and Robb Stark, with a weight of 112, indicating a strong connection between these two Stark brothers.

The third edge is between Arya Stark and Sansa Stark, with a weight of 104, a similarly strong connection between the two Stark sisters.
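One caveat about grouping on `['Source', 'Target']`: in an undirected edge list, the same pair could in principle appear as (A, B) in one row and (B, A) in another, and the grouping would treat them as two different edges. In Book 1 this does not seem to happen, since the graph built later has as many edges as the CSV has rows (684). For data without that guarantee, a minimal sketch of one way to merge such rows is shown here. The `toy` rows and their weights are made up for illustration, and `ordered`, `canonical`, and `pair_weights` are hypothetical names.

```python
import numpy as np
import pandas as pd

# Hypothetical rows: the first two describe the same undirected pair
# in opposite order; the weights are made up for illustration
toy = pd.DataFrame({
    "Source": ["Eddard-Stark", "Robert-Baratheon", "Arya-Stark"],
    "Target": ["Robert-Baratheon", "Eddard-Stark", "Sansa-Stark"],
    "weight": [5, 2, 4],
})

# Sort the two names within each row so (A, B) and (B, A) share one key
ordered = np.sort(toy[["Source", "Target"]].to_numpy(dtype=object), axis=1)
canonical = toy.assign(Source=ordered[:, 0], Target=ordered[:, 1])

# Sum the weights per canonical pair and rank the pairs
pair_weights = (
    canonical.groupby(["Source", "Target"])["weight"]
    .sum()
    .reset_index(name="weight")
    .sort_values(by="weight", ascending=False)
)
print(pair_weights)
```

Sorting the names alphabetically within each row is only one possible canonical form; any rule that maps both orderings to the same key would do.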

Let’s plot the distribution.

```
# Plot a histogram of edge weights
plt.figure(figsize=(10,6))
sns.histplot(edge_weights['weight'], bins=30, color='skyblue', edgecolor='black', kde=False)
plt.title('Distribution of Edge Weights', fontsize=15)
plt.xlabel('Edge Weight', fontsize=12)
plt.ylabel('Number of Edges', fontsize=12)
plt.show()
```

NetworkX is a Python library that provides tools for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks. It is widely used for tasks related to network analysis and graph theory.

We will be using it to make visualisations.

First, we create an empty graph object and iterate through the data frame to add edges.

```
# Creating an empty graph object
b1 = nx.Graph()
# Iterating through the DataFrame to add edges
for _, edge in d1.iterrows():
    b1.add_edge(edge['Source'], edge['Target'], weight=edge['weight'])
```

```
# Printing out the number of nodes and edges in the graph
print("Total number of nodes: ", int(b1.number_of_nodes()))
print("Total number of edges: ", int(b1.number_of_edges()))
```

Total number of nodes: 187

Total number of edges: 684
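As an aside, NetworkX can also build a graph directly from a DataFrame with `nx.from_pandas_edgelist`, which avoids the explicit loop. The sketch below assumes the same column names as `d1` and uses a six-row `toy` frame taken from the Book 1 table as a stand-in; `toy` and `G` are illustrative names.

```python
import networkx as nx
import pandas as pd

# Small stand-in for d1, using six rows shown in the Book 1 table
toy = pd.DataFrame({
    "Source": ["Addam-Marbrand", "Addam-Marbrand", "Aegon-I-Targaryen",
               "Aegon-I-Targaryen", "Tywin-Lannister", "Tywin-Lannister"],
    "Target": ["Jaime-Lannister", "Tywin-Lannister", "Daenerys-Targaryen",
               "Eddard-Stark", "Varys", "Walder-Frey"],
    "weight": [3, 6, 5, 4, 4, 8],
})

# Build an undirected graph; the 'weight' column becomes an edge attribute
G = nx.from_pandas_edgelist(toy, source="Source", target="Target", edge_attr="weight")

print("Total number of nodes: ", G.number_of_nodes())
print("Total number of edges: ", G.number_of_edges())
print("Weight of one edge: ", G["Addam-Marbrand"]["Tywin-Lannister"]["weight"])
```

Passing `d1` in place of `toy` should give a graph equivalent to `b1`; the loop version is kept in this article because it makes each step explicit.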

I will plot the network of Book 1 using three different layouts from the NetworkX library to see which one is the most readable.

- **nx.draw()**: Uses the default spring layout.
- **nx.draw_circular()**: Positions nodes in a circle around the centre.
- **nx.draw_kamada_kawai()**: Positions nodes using the force-directed method of Kamada and Kawai.

```
plt.figure(figsize =(20, 20))
nx.draw(b1, with_labels= True)
```
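When `nx.draw()` is called without positions, the spring layout starts from random coordinates, so the picture changes on every run. A minimal sketch of one way to make the plot reproducible and a little easier to read is to compute the positions explicitly with a fixed seed and scale each node by its degree. The small graph `G` below stands in for `b1` and contains only the ten heaviest Book 1 edges listed earlier; the seed value, the size factor of 300, and the font size are arbitrary choices.

```python
import matplotlib.pyplot as plt
import networkx as nx

# Ten heaviest Book 1 edges from the output earlier in the article, standing in for b1
G = nx.Graph()
G.add_weighted_edges_from([
    ("Eddard-Stark", "Robert-Baratheon", 291),
    ("Bran-Stark", "Robb-Stark", 112),
    ("Arya-Stark", "Sansa-Stark", 104),
    ("Daenerys-Targaryen", "Drogo", 101),
    ("Joffrey-Baratheon", "Sansa-Stark", 87),
    ("Jon-Snow", "Samwell-Tarly", 81),
    ("Jeor-Mormont", "Jon-Snow", 81),
    ("Eddard-Stark", "Petyr-Baelish", 81),
    ("Daenerys-Targaryen", "Jorah-Mormont", 75),
    ("Cersei-Lannister", "Robert-Baratheon", 72),
])

# Fix the seed so the spring layout is identical on every run
pos = nx.spring_layout(G, seed=42)

# Scale node size by degree so well-connected characters stand out
node_sizes = [300 * G.degree(n) for n in G.nodes()]

fig, ax = plt.subplots(figsize=(10, 10))
nx.draw(G, pos=pos, ax=ax, with_labels=True, node_size=node_sizes, font_size=8)
```

The same `pos` dictionary can be reused across several plots so that characters stay in the same place when only colours or sizes change.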