Analyzing Character Relationships in Game of Thrones Using NetworkX, Gephi, and Nebula Graph — Part 1

EpiK Protocol
5 min readOct 8, 2024

--

As we all know, Game of Thrones has a large number of loyal fans worldwide. Apart from the unexpected “surprises” of who might die next, the complex and intertwined character relationships are one of the reasons for its popularity. This article introduces how to access the open-source distributed graph database Nebula Graph using NetworkX and visualize the complex character relationships in Game of Thrones with the help of the visualization tool Gephi.

Dataset

The dataset used in this article is sourced from A Song of Ice and Fire, Volume 1 (to Volume 5) [1].

  • Character Set (Nodes): Each character in the book is modeled as a node, with a single attribute: name.
  • Relationship Set (Edges): An edge exists between two characters if they have had direct or indirect interactions in the book; each edge has a single attribute: weight, which represents the strength of the interaction.

This collection of nodes and edges forms a network, stored in the graph database Nebula Graph [2].

Community Detection — Girvan-Newman Algorithm

We use the built-in community detection algorithm Girvan-Newman from NetworkX [3] to identify communities within our network.

Explanation of the Girvan-Newman Algorithm

In a network graph, tightly connected parts can be considered as communities. Nodes within each community have strong connections, while connections between different communities are sparser. Community detection is the process of finding these distinct communities within a given network graph.

The Girvan-Newman algorithm is a betweenness-based community detection algorithm. Its core idea is to iteratively remove edges based on their edge betweenness centrality, from highest to lowest, until the entire network is divided into separate communities. Thus, the Girvan-Newman algorithm effectively functions as a splitting method.

Basic Steps of the Girvan-Newman Algorithm:

  1. Calculate the edge betweenness for all edges in the network.
  2. Identify the edge with the highest betweenness and remove it from the network.
  3. Repeat step 2 until each node becomes an independent community, i.e., no edges remain in the network.

With the concepts explained, let’s move into practical implementation.

1.Use the Girvan-Newman algorithm to detect communities. Example code in NetworkX:

2.Add a community attribute to each node in the graph, recording the community number.

Node Styling — Betweenness Centrality Algorithm

Next, we will adjust the node sizes and the font size of the character names using the Betweenness Centrality algorithm from NetworkX to determine the size of the nodes and the labels.

The importance of each node in the graph can be measured by its centrality. Different networks often adopt various definitions of centrality to describe the importance of nodes. Betweenness Centrality assesses a node’s importance based on the number of shortest paths that pass through it.

1.Calculate the betweenness centrality for each node.

betweenness_dict = nx.betweenness_centrality(G) # Run betweenness

centrality

2.Add a betweenness attribute to each node in the graph.

nx.set_node_attributes(G, betweenness_dict, ‘betweenness’)

Edge Thickness

The thickness of the edges is determined directly by the weight attribute of the edges. After the above processing, our nodes now possess three attributes: name, community, and betweenness, while the edges have a single weight attribute.

Visualization

To visualize the graph, you can use the following code:

While NetworkX has several visualization features, Gephi [4] offers better interactivity and visualization effects.

Integrating with the Visualization Tool Gephi

Now, let’s export the NetworkX data to a file named game.gexf and import it into Gephi.

Gephi Visualization Demonstration

Open the exported game.gexf file in Gephi, and fine-tune the various parameters to achieve a satisfactory visualization:

  1. Set the Layout to Force Atlas: Adjust the repulsion strength to 500.0 and check the “Adjust by Size” option to minimize node overlap.
    Force Atlas is a force-directed layout method that produces aesthetically pleasing network layouts and effectively showcases the overall structure and isomorphic features of the network. It mimics the forces of attraction and repulsion in the physical world, automatically adjusting until equilibrium is reached.
  2. Color Each Community Differently:
    In the Appearance panel, go to Nodes → Color → Partition and select community (this is the community attribute we added to each node earlier).
  3. Determine the Size of Nodes and the Font Size of Character Names:
    In the Appearance panel, go to Nodes → Size → Ranking and select betweenness (this is the betweenness attribute we added to each node).
  4. Edge Thickness Based on Weight:
    In the Appearance panel, go to Edges → Size → Ranking and select the edge weight.
  5. Export the Graph Image with Avatar Effects.

Each node will display the corresponding character information.

This article primarily introduces how to use NetworkX and visualize the data with Gephi. The next installment will cover how to access data in the Nebula Graph database using NetworkX.

--

--