Analyzing Character Relationships in Game of Thrones Using NetworkX, Gephi, and Nebula Graph — Part 2

EpiK Protocol
Oct 11, 2024

In the previous article, we showcased character relationships in Game of Thrones using NetworkX and Gephi. In this article, we will demonstrate how to load graph data from the Nebula Graph database into NetworkX.

NetworkX

NetworkX is a Python library for graph theory and complex network modeling. It ships with many commonly used algorithms for graph and network analysis, which makes it convenient for tasks such as network data analysis and simulation modeling.

In NetworkX, a graph is a data structure composed of vertices, edges, and optional attributes. Vertices represent data, and each edge connects two vertices and represents the relationship between them. In a simple graph an edge is uniquely identified by its two endpoints, whereas a multigraph allows several edges between the same pair. Both vertices and edges can carry additional attributes to store more information.

NetworkX supports four types of graphs:

  • Graph: Undirected graph
  • DiGraph: Directed graph
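  • MultiGraph: Undirected graph that allows multiple edges between the same pair of vertices
  • MultiDiGraph: Directed graph that allows multiple edges between the same pair of vertices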

Creating an Undirected Graph in NetworkX
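A minimal sketch of this step, assuming NetworkX is installed and imported under the conventional alias nx. The second half is only an illustration of how a multigraph differs from a simple graph:

```python
import networkx as nx

# Create an empty undirected graph. DiGraph, MultiGraph, and MultiDiGraph
# are constructed the same way.
G = nx.Graph()
print(G.number_of_nodes(), G.number_of_edges())
print(G.is_directed(), G.is_multigraph())

# A simple Graph keeps one edge per vertex pair; a MultiGraph keeps them all.
simple = nx.Graph()
simple.add_edge(1, 2)
simple.add_edge(1, 2)

multi = nx.MultiGraph()
multi.add_edge(1, 2)
multi.add_edge(1, 2)

print(simple.number_of_edges(), multi.number_of_edges())
```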

Adding Vertices
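As a sketch, vertices can be added one at a time or in bulk. The IDs and attribute values below are hypothetical placeholders, not data from the article:

```python
import networkx as nx

G = nx.Graph()

# Add a single vertex, optionally with attributes.
G.add_node(100, name="hypothetical character A")

# Add several vertices at once from any iterable.
G.add_nodes_from([101, 102, 103])

# Add vertices together with per-vertex attribute dictionaries.
G.add_nodes_from([(104, {"name": "hypothetical character B"})])

print(list(G.nodes))
print(G.nodes[100])
```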

Adding Edges
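Edges can be added in the same way. In this sketch the vertex IDs and the weight attribute are hypothetical:

```python
import networkx as nx

G = nx.Graph()

# Adding an edge creates any missing endpoint automatically.
G.add_edge(100, 101)

# Edge attributes can be passed as keyword arguments...
G.add_edge(101, 102, weight=5)

# ...or as a dict inside (source, target, attributes) triples.
G.add_edges_from([(102, 103, {"weight": 2}), (103, 100)])

print(G.number_of_nodes(), G.number_of_edges())
print(G[101][102]["weight"])
```

Because the graph is undirected, G.has_edge(100, 103) and G.has_edge(103, 100) both report the same edge.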

In the previous article, we demonstrated the Girvan-Newman community detection algorithm in NetworkX.

Nebula Graph Database

NetworkX typically uses local files as data sources, which works well for studying static networks. However, when the graph changes frequently, for example when certain central nodes disappear (Fig. 1) or the topology changes significantly (Fig. 2), generating and loading new static files each time becomes cumbersome. It is better to persist the entire change process in a database and load subgraphs or the whole graph in real time for analysis. In this article, we use Nebula Graph as the graph database for storing the graph data.

Fig. 1

Fig. 2

Nebula Graph provides two methods for retrieving graph structures:

  1. Write a query statement to pull a subgraph.
  2. Run a full scan of the underlying storage to obtain the complete graph.

The first method is suitable for obtaining a few specific nodes and edges in a large-scale graph network through detailed filtering and pruning conditions. The second method is more appropriate for analyzing the entire graph, usually performed during the early stages of a project for heuristic exploration, followed by detailed pruning analysis using the first method.

After analyzing the two methods of obtaining graph structures in Nebula Graph, let’s look at the Python client code for Nebula Graph. The files nebula-python/nebula/ngStorage/StorageClient.py and nebula-python/nebula/ngMeta/MetaClient.py serve as APIs for interacting with the underlying storage, featuring a rich set of interfaces for scanning vertices, edges, reading various attributes, and more.

The following two interfaces can be used to read all vertex and edge data:
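The shape of these two interfaces can be sketched as a typing Protocol. This is a description only, not the client's actual class: the parameter list of scan_edge follows the call shown later in this article, while the name scan_vertex and its matching parameter list are assumptions based on the statement that vertex reading works the same way. The type hints are also assumptions.

```python
from typing import Any, Dict, List, Protocol


class ScanClient(Protocol):
    """Hypothetical description of the two scan interfaces."""

    def scan_edge(self, space_name: str, return_cols: Dict[str, List[str]],
                  all_cols: bool, limit: int,
                  start_time: int, end_time: int) -> Any:
        """Return an iterator over scan-edge responses."""
        ...

    def scan_vertex(self, space_name: str, return_cols: Dict[str, List[str]],
                    all_cols: bool, limit: int,
                    start_time: int, end_time: int) -> Any:
        """Assumed counterpart of scan_edge for vertex data."""
        ...
```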

Steps to Initialize and Use the Client

1. Initialize a Client and a Scan Edge Processor: The scan_edge_processor is used to decode the retrieved edge data.

2. Set Parameters for the scan_edge Interface:
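A sketch of what these parameters might look like. The variable names match the scan_edge call in the next step, but the concrete values, the space name, the edge type, the property name, and the meaning given in each comment are all assumptions made for illustration:

```python
import sys

# Name of the graph space to scan (hypothetical).
space_name = "got"

# Assumed shape: map each edge type to the properties to return.
# The edge type "interacts" and the property "weight" are hypothetical.
return_cols = {"interacts": ["weight"]}

# Assumed meaning: when True, return every property and ignore return_cols.
all_cols = False

# Assumed meaning: maximum number of rows returned per response.
limit = 100

# Assumed meaning: only scan data written between start_time and end_time.
start_time = 0
end_time = sys.maxsize

print(space_name, return_cols, all_cols, limit, start_time, end_time)
```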

3. Call the scan_edge Interface: This will return an iterator for a scan_edge_response object.

scan_edge_response_iterator = storage_client.scan_edge(
    space_name, return_cols, all_cols, limit, start_time, end_time)

4. Continuously Read Data from the Iterator: Read all data until all entries are processed.

Here, process_edge is a user-defined function that processes the edge data read from the response. This function can use scan_edge_processor to decode the data, which can then be printed or further processed, such as loading into the NetworkX framework.
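The read loop can be sketched with a stand-in iterator so that it runs without a database. FakeScanIterator and its has_next() and next() methods are assumptions for illustration, and the real client's iterator may expose a different interface. A real process_edge would also decode each response with scan_edge_processor first.

```python
from typing import List, Optional, Tuple

Edge = Tuple[int, int]


class FakeScanIterator:
    """Stand-in for the response iterator; the real client class differs."""

    def __init__(self, batches: List[List[Edge]]) -> None:
        self._batches = list(batches)

    def has_next(self) -> bool:
        return bool(self._batches)

    def next(self) -> Optional[List[Edge]]:
        return self._batches.pop(0) if self._batches else None


collected: List[Edge] = []


def process_edge(response: List[Edge]) -> None:
    """User-defined handler: here it only collects (source, target) pairs."""
    for src, dst in response:
        collected.append((src, dst))


# Two fake responses containing edges from this article's sample output.
scan_edge_response_iterator = FakeScanIterator(
    [[(109, 100), (109, 125)], [(100, 204)]]
)

# Keep reading until the iterator is exhausted.
while scan_edge_response_iterator.has_next():
    response = scan_edge_response_iterator.next()
    if response is None:
        print("error while scanning edges")
        break
    process_edge(response)

print(collected)
```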

5. Process the Data: In this step, we add all retrieved edges to the graph G in NetworkX.
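Loading the decoded edges into NetworkX might look like the sketch below. The decoded_rows list stands in for whatever the scan edge processor returns, and the weight property is hypothetical:

```python
import networkx as nx

G = nx.Graph()

# Stand-in for decoded rows: (source id, destination id, properties).
decoded_rows = [
    (109, 100, {"weight": 3}),
    (109, 125, {"weight": 1}),
    (100, 204, {"weight": 7}),
]

# Each row becomes one edge; its properties become edge attributes.
for src_id, dst_id, props in decoded_rows:
    G.add_edge(src_id, dst_id, **props)

print(G.number_of_nodes(), G.number_of_edges())
print(G[100][204])
```

Endpoints are created on demand, so there is no need to add vertices first unless they carry their own attributes.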

Reading vertex data follows a process similar to the one described above.

Furthermore, for distributed graph computation frameworks, Nebula Graph offers concurrent batch reading capabilities based on partitions, which will be demonstrated in future articles.

Performing Graph Analysis in NetworkX

After importing all node and edge data into NetworkX as described above, we can perform some basic graph analysis and calculations:

1. Drawing the Graph:
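A minimal drawing sketch, assuming Matplotlib is installed. It uses a small subgraph built from edges in this article's edge list and saves the picture to a file instead of opening a window. The file name and the layout seed are arbitrary choices:

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import networkx as nx

G = nx.Graph()
G.add_edges_from([(109, 100), (109, 125), (100, 125), (100, 204), (204, 125)])

# A fixed seed makes the spring layout reproducible between runs.
pos = nx.spring_layout(G, seed=42)
nx.draw(G, pos, with_labels=True, font_weight="bold")

out_path = os.path.join(tempfile.gettempdir(), "got_sample.png")
plt.savefig(out_path)
plt.close()
print(out_path)
```

In an interactive session, plt.show() can replace the savefig call.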

The resulting graph will be displayed.

2. Printing All Nodes and Edges in the Graph:
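A sketch of the two print calls on a four-vertex sample built from edges in this article's edge list. On the full graph loaded from Nebula Graph, the same calls produce the complete listing:

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    (109, 100), (109, 125), (109, 204),
    (100, 125), (100, 204), (204, 125),
])

# G.nodes and G.edges are views; list() materializes them for printing.
print("nodes:", list(G.nodes))
print("edges:", list(G.edges))
```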

The output will be:

nodes: [109, 119, 129, 139, 149, 209, 219, 229, 108, 118, 128, 138, 148,

208, 218, 228, 107, 117, 127, 137, 147, 207, 217, 227, 106, 116, 126, 136,

146, 206, 216, 226, 101, 111, 121, 131, 141, 201, 211, 221, 100, 110, 120,

130, 140, 150, 200, 210, 220, 102, 112, 122, 132, 142, 202, 212, 222, 103,

113, 123, 133, 143, 203, 213, 223, 104, 114, 124, 134, 144, 204, 214, 224,

105, 115, 125, 135, 145, 205, 215, 225]

edges: [(109, 100), (109, 125), (109, 204), (109, 219), (109, 222), (119,

200), (119, 205), (119, 113), (129, 116), (129, 121), (129, 128), (129,

216), (129, 221), (129, 229), (129, 137), (139, 138), (139, 212), (139,

218), (149, 130), (149, 219), (209, 123), (219, 130), (219, 112), (219,

104), (229, 147), (229, 116), (229, 141), (229, 144), (108, 100), (108,

101), (108, 204), (108, 206), (108, 214), (108, 215), (108, 222), (118,

120), (118, 131), (118, 205), (118, 113), (128, 116), (128, 121), (128,

201), (128, 202), (128, 205), (128, 223), (138, 115), (138, 204), (138,

210), (138, 212), (138, 221), (138, 225), (148, 127), (148, 136), (148,

137), (148, 214), (148, 223), (148, 227), (148, 213), (208, 127), (208,

103), (208, 104), (208, 124), (218, 127), (218, 110), (218, 103), (218,

104), (218, 114), (218, 105), (228, 146), (228, 145), (107, 100), (107,

204), (107, 217), (107, 224), (117, 200), (117, 136), (117, 142), (127,

114), (127, 212), (127, 213), (127, 214), (127, 222), (127, 226), (127,

227), (137, 136), (137, 213), (137, 150), (147, 136), (147, 214), (147,

223), (207, 121), (207, 140), (207, 122), (207, 134), (217, 126), (217,

141), (217, 124), (217, 144), (106, 204), (106, 212), (106, 113), (116,

141), (116, 126), (116, 210), (116, 216), (116, 121), (116, 113), (116,

105), (126, 216), (136, 210), (136, 213), (136, 214), (146, 202), (146,

210), (146, 215), (146, 222), (146, 226), (206, 123), (216, 144), (216,

105), (226, 140), (226, 112), (226, 114), (226, 144), (101, 100), (101,

102), (101, 125), (101, 204), (101, 215), (101, 113), (101, 104), (111,

200), (111, 204), (111, 215), (111, 220), (121, 202), (121, 215), (121,

113), (121, 134), (131, 205), (131, 220), (141, 124), (141, 205), (141,

225), (201, 145), (211, 124), (221, 104), (221, 124), (100, 125), (100,

204), (100, 102), (100, 113), (100, 104), (100, 144), (100, 105), (110,

204), (110, 220), (120, 150), (120, 202), (120, 205), (120, 113), (140,

114), (140, 214), (140, 224), (150, 143), (150, 213), (200, 142), (200,

104), (200, 145), (210, 124), (210, 144), (210, 115), (210, 145), (102,

203), (102, 204), (102, 103), (102, 135), (112, 204), (122, 213), (122,

223), (132, 225), (202, 133), (202, 114), (212, 103), (222, 104), (103,

204), (103, 114), (113, 104), (113, 105), (113, 125), (113, 204), (133,

114), (133, 144), (143, 213), (143, 223), (203, 135), (213, 124), (213,

145), (104, 105), (104, 204), (104, 215), (114, 115), (114, 204), (134,

224), (144, 145), (144, 214), (204, 105), (204, 125)]

3. Calculating the Shortest Path Between Two Nodes:
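A sketch using nx.shortest_path on a small subgraph whose edges all appear in the edge list above. The choice of source and target is arbitrary and not necessarily the pair used by the author:

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    (114, 127), (127, 208), (208, 124), (124, 211),
    (114, 115), (115, 210), (210, 124),
])

source, target = 114, 211

# For an unweighted graph this is a breadth-first search; when several
# shortest paths exist, only one of them is returned.
path = nx.shortest_path(G, source=source, target=target)
hops = nx.shortest_path_length(G, source=source, target=target)

print(path, hops)
```

Passing weight="weight" would make both calls minimize the sum of that edge attribute instead of the number of hops.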

The output will be:

4. Calculating the PageRank Value of Each Node:
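A sketch using nx.pagerank with its default settings on a small sample whose edges come from the edge list above. Recent NetworkX releases rely on SciPy for this function, so SciPy is assumed to be installed:

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    (100, 204), (104, 204), (105, 204), (125, 204), (100, 125),
])

# Returns a dict mapping each node to its score; the scores sum to 1.
# For an undirected graph, every edge counts in both directions.
pr = nx.pagerank(G)

# List nodes from most to least important.
for node, score in sorted(pr.items(), key=lambda item: item[1], reverse=True):
    print(node, round(score, 4))
```

Node 204 has the most connections in this sample, so it receives the highest score.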

The output will show the PageRank values for all nodes, such as:

{109: 0.011507076520104863, 119: 0.007835838669313514, 129:

0.015304593799331218, 139: 0.007772926737873626, 149:

0.0073896601012629825, 209: 0.0065558926178649985, 219:

0.014100908598251508, 229: 0.011454115940170253, 108: 0.01645334474680034,

118: 0.01010598371500564, 128: 0.01594717876199238, 138:

0.01671097227127263, 148: 0.015898676579503977, 208: 0.009437234075904938,

218: 0.0153795416919104, 228: 0.005900393773635255, 107:

0.009745182763645681, 117: 0.008716335675518244, 127:

0.021565565312365507, 137: 0.011642680498867146, 147:

0.009721031073465738, 207: 0.01040504770909835, 217: 0.012054472529765329,

227: 0.005615576255373405, 106: 0.007371191843767635, 116:

0.020955704443679106, 126: 0.007589432032220849, 136:

0.015987209357117116, 146: 0.013922108926721374, 206:

0.008554794629575304, 216: 0.011219193251536395, 226:

0.013613173390725904, 101: 0.016680863106330837, 111:

0.010121524312495604, 121: 0.017545503989576015, 131:

0.008531567756846938, 141: 0.014598319866130227, 201:

0.0058643663430632525, 211: 0.003936285336338021, 221:

0.009587911774927793, 100: 0.02243017302167168, 110: 0.007928429795381916,

120: 0.011875669801396205, 130: 0.0073896601012629825, 140:

0.01205992633948699, 150: 0.010045605782606326, 200: 0.015289870550944322,

210: 0.017716629501785937, 220: 0.008666577509181518, 102:

0.014865431161046641, 112: 0.007931095811770324, 122:

0.008087439927630492, 132: 0.004659566123187912, 142:

0.006487446038191551, 202: 0.013579313206377282, 212: 0.01190888044566142,

222: 0.011376739416933006, 103: 0.013438110749144392, 113:

0.02458154500563397, 123: 0.01104978432213578, 133: 0.00743370900670294,

143: 0.008011123394996112, 203: 0.006883198710237787, 213:

0.020392557117890422, 223: 0.012345866520333572, 104:

0.024902235588979776, 114: 0.019369722463816744, 124:

0.017165705442951484, 134: 0.008284361176173354, 144:

0.019363506469972095, 204: 0.03507634139024834, 214: 0.015500649025348538,

224: 0.008320315540621754, 105: 0.01439975542831122, 115:

0.007592722237637133, 125: 0.010808523955754608, 135:

0.006883198710237788, 145: 0.014654713389044883, 205:

0.014660118545887803, 215: 0.01337467974572934, 225: 0.009909720748343093}

Additionally, as in the previous article, you can connect to Gephi for better graph visualization. The code for this article can be referenced here.
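One file-based way to hand a graph to Gephi is to export it in GEXF format, which Gephi can open. This is only a sketch of an alternative and not necessarily the approach used in the article's own code. The output file name is arbitrary:

```python
import os
import tempfile

import networkx as nx

G = nx.Graph()
G.add_edges_from([(109, 100), (109, 125), (100, 125)])

# Write the graph as GEXF, then open the file from Gephi.
out_path = os.path.join(tempfile.gettempdir(), "got_sample.gexf")
nx.write_gexf(G, out_path)
print(out_path)
```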

Reference

[1] https://www.kaggle.com/mmmarchetti/game-of-thrones-dataset

[2] https://github.com/vesoft-inc/nebula

[3] https://networkx.github.io/

[4] https://gephi.org/

[5] https://github.com/jievince/nx2gephi

[6] https://www.lyonwj.com/2016/06/26/graph-of-thrones-neo4j-social-network-analysis/

[7] https://nebula-graph.com.cn/posts/game-of-thrones-relationship-networkx-gephi-nebula-graph/

[8] https://networkx.github.io/

[9] https://github.com/vesoft-inc/nebula

[10] https://spark.apache.org/graphx/

[11] https://gephi.org/

[12] https://github.com/vesoft-inc/nebula-python/pull/31
