Anime Basic Information Knowledge Graph

EpiK Protocol
Sep 8, 2024

Project Introduction

This project builds a knowledge graph of basic information about popular anime. The data collected for each anime includes its Chinese name, character design, genre, main voice actors, director, animation supervisor, and first broadcast television station. The graph is rendered with the D3 visualization library and comes with a search function: users enter the anime they want to look up in the search box, and the webpage displays the relevant information while hiding the rest.

1. Data

Data Crawling

The list of anime is sourced from Wikipedia, Bilibili, Douban, and Baidu, covering 345 popular Japanese anime from recent years. Relevant information about these anime was then crawled from Baidu Baike.

Data Cleaning

During crawling, missing fields were filled with the placeholder “no information.” Cleaning focused mainly on the “genre” field: based on the original genre strings, all anime were grouped into 13 types: action, inspirational, fantasy, romance, musical, social, comedy, tragedy, mystery, horror, healing, daily life, and emotional. Anime with missing genre information were assigned to “social.”
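The genre cleaning step can be sketched roughly as follows. This is a minimal illustration, not the project's actual cleaning code. The 13 category names and the “social” fallback for missing data come from the description above. The synonym table and the substring matching are hypothetical, and so is the choice to send unrecognized strings to “social” as well.

```python
CATEGORIES = [
    "action", "inspirational", "fantasy", "romance", "musical", "social",
    "comedy", "tragedy", "mystery", "horror", "healing", "daily life",
    "emotional",
]
MISSING = "no information"
DEFAULT_CATEGORY = "social"

# Hypothetical synonyms; the article does not list the real matching rules.
SYNONYMS = {
    "battle": "action",
    "love": "romance",
    "music": "musical",
    "detective": "mystery",
    "slice of life": "daily life",
}


def normalize_genres(raw):
    """Map a raw genre string to a list of the 13 categories."""
    text = raw.strip().lower()
    if not text or text == MISSING:
        return [DEFAULT_CATEGORY]

    # Direct matches first, reported in the order of CATEGORIES.
    found = [category for category in CATEGORIES if category in text]

    # Then synonym matches, skipping categories already found.
    for keyword, category in SYNONYMS.items():
        if keyword in text and category not in found:
            found.append(category)

    # Assumption: strings that match nothing also fall back to "social".
    return found or [DEFAULT_CATEGORY]


print(normalize_genres("Fantasy, Action"))
print(normalize_genres("no information"))
```

Returning a list rather than a single label lets one anime belong to several types, which matches a graph where several type nodes connect to the same title.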

The final data file is named all.json. Each anime entry includes the fields “main voice actors,” “director,” “original name,” “number of episodes,” “animation supervisor,” “Chinese name,” “production,” “character design,” “genre,” and “first broadcast television station.” In total, the file contains 345 anime entries.
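As a rough sketch of what loading such a file might look like: the code below assumes that all.json is a JSON array of objects and that the keys are the English field names listed above. Both are assumptions, since the article does not show the file layout. The helper names `complete_record` and `load_records` are hypothetical.

```python
import json

# Field names as given in the article; the real keys in all.json may differ.
FIELDS = [
    "main voice actors", "director", "original name", "number of episodes",
    "animation supervisor", "Chinese name", "production", "character design",
    "genre", "first broadcast television station",
]
MISSING = "no information"


def complete_record(raw):
    """Return a record with every expected field, filling gaps with MISSING."""
    record = {}
    for field in FIELDS:
        value = raw.get(field)
        # Treat absent, empty-string, and empty-list values as missing.
        record[field] = MISSING if value in (None, "", []) else value
    return record


def load_records(path):
    """Load a JSON array of anime objects (assumed layout) and complete each."""
    with open(path, encoding="utf-8") as handle:
        return [complete_record(item) for item in json.load(handle)]


print(complete_record({"Chinese name": "Digimon"})["director"])
```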

2. Visualization

After cleaning, the data is visualized with D3. The data is first transformed into two structures: nodes, which store the entities, and links, which represent the relationships between entities by connecting pairs of nodes. Nodes can be rendered either as solid circles colored by type or as colored text. An example is shown in the figure below.
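The node and link transformation described above could look something like the following sketch. It assumes each record stores its genres and attributes as lists of strings under English field names. The id scheme (`"title:..."`, `"genre:..."`) and the choice to skip “no information” values are illustrative assumptions rather than the project's actual code.

```python
import json

MISSING = "no information"

# Hypothetical English field names; the keys in all.json may differ.
ATTRIBUTE_FIELDS = [
    "director", "main voice actors", "first broadcast television station",
]


def build_graph(records):
    """Convert anime records into a {"nodes": [...], "links": [...]} dict."""
    nodes = {}  # node id -> node dict, so shared entities are stored once
    links = []

    def add_node(node_id, label, kind):
        nodes.setdefault(node_id, {"id": node_id, "label": label, "type": kind})

    add_node("root", "anime", "root")
    for record in records:
        title = record["Chinese name"]
        title_id = "title:" + title
        add_node(title_id, title, "title")

        for genre in record["genre"]:
            genre_id = "genre:" + genre
            if genre_id not in nodes:
                # First time this genre appears: hang it off the root node.
                add_node(genre_id, genre, "genre")
                links.append({"source": "root", "target": genre_id})
            links.append({"source": genre_id, "target": title_id})

        for field in ATTRIBUTE_FIELDS:
            for value in record.get(field, []):
                if value == MISSING:
                    continue  # assumption: placeholders are not drawn
                value_id = field + ":" + value
                add_node(value_id, value, field)
                links.append({"source": title_id, "target": value_id})

    return {"nodes": list(nodes.values()), "links": links}


sample = [{
    "Chinese name": "Digimon",
    "genre": ["fantasy", "action"],
    "first broadcast television station": ["Fuji TV"],
}]
print(json.dumps(build_graph(sample), indent=2))
```

Because the links refer to nodes by string id, a D3 force layout consuming this output would need an id accessor on its link force, for example `d3.forceLink(links).id(d => d.id)`. The `type` field on each node is what a renderer could use to pick the circle or text color.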

For instance, at the center of the graph is the “anime” node, which connects to type nodes such as “fantasy” and “action.” The four type nodes in this example then connect to the anime title node “Digimon,” from which further nodes branch out, including the director “Kazunori Kato” and first broadcast television stations such as “Yomiuri TV,” “Toei Animation,” and “Fuji TV.” An example is illustrated in the figure.

The application also provides search functionality and the ability to hide extra data. Users can query the data to display only the related nodes. For example, entering “Assassination Classroom main voice actors” hides unrelated nodes and keeps only the information for the queried anime. The result is shown in the figure.
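The search behavior can be approximated by filtering the node and link lists before they are drawn. The sketch below is one possible approach under stated assumptions, not the project's implementation. It assumes nodes carry `id`, `label`, and `type` keys, that links carry `source` and `target` node ids, and that an anime is matched when its title appears as a substring of the query. The function name `filter_by_query` is hypothetical.

```python
def filter_by_query(graph, query):
    """Keep only the queried anime's title node and its direct neighbors."""
    query_text = query.lower()
    matched = {
        node["id"]
        for node in graph["nodes"]
        if node["type"] == "title" and node["label"].lower() in query_text
    }
    if not matched:
        # Assumption: an unmatched query hides nothing.
        return graph

    # Keep every link that touches a matched title node.
    links = [
        link for link in graph["links"]
        if link["source"] in matched or link["target"] in matched
    ]

    # Visible nodes are the matched titles plus both endpoints of kept links.
    visible = set(matched)
    for link in links:
        visible.add(link["source"])
        visible.add(link["target"])
    nodes = [node for node in graph["nodes"] if node["id"] in visible]
    return {"nodes": nodes, "links": links}
```

This operates on the raw data, where `source` and `target` are still string ids. Once a D3 force simulation has been initialized with the links, it replaces those ids with references to the node objects, so an in-browser version of the same filter would compare `link.source.id` instead.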

Users can also display only specific sub-branches based on the mouse position, hiding the rest of the graph when the visualization becomes too crowded. Examples are shown in the figure.
