While many projects undermine their own sustainability by relying on short-term narratives and airdrops, there are certainly protocols that have steadily developed and refined their products over the past few years.
The Graph is a prime example of such projects. As a data indexing/querying infrastructure, The Graph has provided data querying and indexing services to various applications and blockchains over the past three years. It is an encouraging case, as usage has steadily increased even without airdrops or point systems.
Moreover, The Graph aims to become a core infrastructure in the integration of AI and crypto by leveraging the infrastructure built over the past three years, and it is already releasing interesting products like AgentC. Given The Graph’s consistent actions over words, the products it will unveil in the future are highly anticipated.
As I mentioned in previous articles, consistency and sustainability are just as important as new technologies and narratives in this market (at least, that’s my belief). As a researcher, my duty is not only to explain and introduce new narratives and industry trends but also to spotlight projects that consistently reflect their values and concerns in this market, narrating their history and future potential. In this regard, the protocol I introduce today deserves attention due to its intriguing history and transformations. Moreover, with the ongoing integration of blockchains and data services and the AI use-cases in various fields, I believe the role of today’s protocol will become increasingly significant. The protocol I introduce today is The Graph, which has been the longest-standing decentralized indexer in the market. In this article, I will focus on why now is the right time to pay attention to The Graph and why its future growth potential is undeniable.
Why am I discussing The Graph now? At the protocol level, The Graph is currently at a significant inflection point (as I will elaborate later, it is transitioning from a simple indexing protocol to a decentralized data service protocol). Furthermore, protocols like The Graph that generate consistent and regular demand (for their services) are extremely rare in the market. Most blockchains and the myriad applications running on them struggle to organically attract users and often resort to incentives like points and airdrops to temporarily boost user engagement. In contrast, The Graph has created sustained demand for its protocol without such visible incentives (it currently averages about 1 billion queries per month, and the figure is growing).
In this regard, if someone values the fundamentals of blockchain, I believe The Graph stands as an exemplary answer. Moreover, considering the future landscape of the blockchain market, where the integration of AI and blockchain will become increasingly important, I foresee a growing demand for data indexers like The Graph. Through this report, I aim to explore the history and evolution of The Graph, as well as the changes in market conditions that drive demand for The Graph, to explain why we should pay attention to it now.
Before delving into the details of The Graph, let’s first briefly examine the two main reasons why I am focusing on it.
1.1.1 In an Era Where AI and Blockchain Merge, The Graph Can Become the Core Infrastructure
In the current blockchain market, there is a lively discussion about the potential synergies that the combination of AI and blockchain can bring. Some projects aim to verify human identity to distinguish between AI and humans, while others focus on sharing surplus computing resources to meet the demands of AI. The market is buzzing with a variety of ideas. However, from my perspective, the most clear-cut synergy between these two technologies lies in leveraging on-chain data to make blockchain applications more user-friendly.
Let me give you an example. If AI were to learn all the data related to decentralized exchanges (DEXs) within the Ethereum ecosystem, it could aggregate the fragmented information across various Ethereum DEXs to facilitate smarter trading. Taking it a step further, if agents could perform complex tasks on behalf of users (such as completing necessary steps to receive airdrops), the on-chain experience for users would be significantly improved. However, there is a prerequisite for these operations: on-chain data indexing.
AI requires data to perform computations, and in scenarios where on-chain data needs to be used regularly, it must be accessed quickly and efficiently for computation. Now, imagine if the entity providing the data also supported the computing resources needed for AI operations. Remarkably, this is exactly the direction in which The Graph is heading. The Graph not only possesses vast amounts of on-chain data but also has nodes that maintain the network and provide computing resources. This dual capability makes The Graph a crucial infrastructure resource for AI, offering both the data and the computing resources required.
Therefore, as more AI-related services begin to utilize on-chain data, the necessity for data service protocols like The Graph will increase, which is why we should pay attention to it.
1.1.2 In an Age of Points and Airdrops, The Graph Is a Case of Creating Real Demand
When Pacman from Blur, now a leading NFT marketplace, introduced a point system, it effectively attracted initial users. However, the overuse of any strategy diminishes its value. Following Blur’s success, many projects now artificially create demand using point systems, focusing on this rather than the quality of their products. However, such artificial demand is unsustainable. When points convert to tokens, users will likely abandon these protocols. Many protocols resort to points or airdrops because they haven’t found true product-market fit. People seem tired of points and airdrops. The Graph is a significant case because it identified blockchain market issues and necessary products, providing solutions that created organic and sustained demand.
While the blockchain market is often driven by narratives and short-term hype, those who value protocol fundamentals should definitely watch The Graph.
Understanding The Graph's importance is one thing, but what exactly does it do? Before explaining in detail, I assert that The Graph is likely one of the most used protocols people aren’t aware of. If you’ve ever used Farcaster, Lido, Uniswap, ENS, Aave, Compound, GMX, Sushiswap, or Curve, you’ve indirectly used The Graph. Let's explore its primary roles, structure, and operational mechanism.
1.2.1 The Graph as a Data Indexing and Querying Infrastructure
The most well-known role of The Graph is to index and query data on the blockchain. What is data indexing and querying? Indexing helps locate data stored in a database more easily. For instance, instead of reading a book from the first page to find specific information, indexing is like marking where the desired information is, making retrieval efficient. Similarly, an indexer helps quickly locate data by creating a data structure that includes the data and its location in the database.
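The book-index analogy can be made concrete with a minimal sketch. The records and field names below are hypothetical stand-ins for on-chain data, not The Graph's actual storage format; the point is only that an index records where each item lives so a lookup does not have to scan everything.

```python
from collections import defaultdict

def build_index(records):
    """Map each sender to the positions of its records (the 'page numbers')."""
    index = defaultdict(list)
    for position, record in enumerate(records):
        index[record["sender"]].append(position)
    return index

# Hypothetical transfer records standing in for on-chain data.
records = [
    {"sender": "alice", "amount": 5},
    {"sender": "bob", "amount": 7},
    {"sender": "alice", "amount": 2},
]

index = build_index(records)

# Without an index: read every record from the first "page".
scan = [r for r in records if r["sender"] == "alice"]

# With an index: jump straight to the stored positions.
lookup = [records[i] for i in index["alice"]]
```

Both approaches return the same records, but the indexed lookup touches only the entries it needs, which is what matters once the data set is the size of a blockchain.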
Data querying refers to the process of requesting specific data from a database. In other words, The Graph helps to easily locate data on the blockchain and delivers the requested data when a query is made. To further clarify, let’s explain the process by which The Graph queries data as follows:
Lifecycle of Data Querying
Users interact with the front-end of a DApp.
The DApp requests data from The Graph's gateway using a GraphQL query.
The gateway finds an indexer that can provide the requested data.
The selected indexer, having already stored the requested data, extracts and provides it, receiving a fee for the query.
The gateway delivers the data to the DApp.
The DApp displays or utilizes the data on the front-end.
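The lifecycle above can be sketched as a toy simulation. Everything here is an assumption made for illustration: the class names, the fee unit, and especially the selection rule (cheapest indexer wins), which is far simpler than whatever a real gateway weighs.

```python
from dataclasses import dataclass, field

@dataclass
class Indexer:
    name: str
    price_per_query: float                      # hypothetical fee unit
    store: dict = field(default_factory=dict)   # subgraph id -> indexed data
    earned: float = 0.0

    def serve(self, subgraph_id, key):
        # The indexer already holds the data, so it only has to look it up.
        self.earned += self.price_per_query
        return self.store[subgraph_id][key]

class Gateway:
    def __init__(self, indexers):
        self.indexers = indexers

    def query(self, subgraph_id, key):
        # Find indexers that have indexed the requested subgraph.
        candidates = [i for i in self.indexers if subgraph_id in i.store]
        if not candidates:
            raise LookupError(f"no indexer serves {subgraph_id}")
        # Assumed rule for this sketch: pick the cheapest candidate.
        chosen = min(candidates, key=lambda i: i.price_per_query)
        return chosen.serve(subgraph_id, key)

indexer_a = Indexer("indexer-a", 0.02, {"dex-subgraph": {"pairCount": 42}})
indexer_b = Indexer("indexer-b", 0.01, {"dex-subgraph": {"pairCount": 42}})
gateway = Gateway([indexer_a, indexer_b])

# The DApp asks the gateway; the gateway routes to an indexer and returns data.
result = gateway.query("dex-subgraph", "pairCount")
```

In this run the query is routed to `indexer_b`, which collects the fee, and the DApp receives the value without knowing which indexer served it.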
Indexing and Querying Blockchain Data Is Hard
While this sounds simple, indexing and querying blockchain data is complex. Blockchains have properties like finality, potential chain reorganizations, and orphan blocks, making efficient data retrieval challenging. Directly extracting data from smart contracts isn’t too hard, but using data not directly available in contracts requires complex processing, making efficient blockchain data provision a tough task. How does The Graph manage this complexity?
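To see why chain reorganizations complicate indexing, consider a minimal sketch of an index that must undo work when a block is orphaned. This is not how The Graph handles reorgs internally; the class, the block hashes, and the transfer amounts are all hypothetical, and the sketch assumes the parent of every new block is already known to the index.

```python
class ReorgAwareIndex:
    """Keeps indexed data per block so orphaned blocks can be rolled back."""

    def __init__(self):
        self.blocks = []  # list of (block_hash, parent_hash, transfers)

    def head_hash(self):
        return self.blocks[-1][0] if self.blocks else None

    def add_block(self, block_hash, parent_hash, transfers):
        # If the new block does not build on our head, the head was orphaned:
        # drop blocks until we reach the new block's parent.
        while self.blocks and self.head_hash() != parent_hash:
            self.blocks.pop()
        self.blocks.append((block_hash, parent_hash, transfers))

    def total_transferred(self):
        return sum(sum(transfers) for _, _, transfers in self.blocks)

index = ReorgAwareIndex()
index.add_block("a", "genesis", [10])
index.add_block("b", "a", [5])
index.add_block("c", "b", [1])    # this block is later orphaned
index.add_block("c2", "b", [3])   # competing block that also builds on "b"

total = index.total_transferred()
```

After the competing block arrives, the data from block `c` is gone and the total reflects only blocks `a`, `b`, and `c2`. A naive index that only ever appends would have kept the orphaned transfer and served wrong answers.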
1.2.2 A Few Concepts We Need to Know
Before diving into The Graph's structure and participants, we need to understand GraphQL and Subgraphs:
GraphQL: A query language for APIs, created by Facebook (now Meta), known for being more efficient than REST APIs. While the name includes "Graph," it’s not exclusive to The Graph and has been widely adopted since Facebook open-sourced it.
Subgraphs: Custom APIs for blockchain data, used exclusively by The Graph. Subgraphs define how data is collected, organized, and accessed. Creating and using subgraphs involves several steps:
Install Graph CLI: A command-line tool provided by The Graph to manage subgraphs.
Set Up Initial State and Project Structure: Create the project structure with templates for Manifest File, Schema, and Mapping Script.
Define Schema: Specify the types of data needed and their relationships.
Write Manifest: Guide what data to index from the blockchain.
Write Mapping Scripts: Describe how to transform raw data into the defined schema.
Deploy Subgraph to The Graph: Use the schema, manifest, and mapping scripts for data indexing.
Query Data with GraphQL: Once data is organized, developers can query it using GraphQL.
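The "Write Mapping Scripts" step is the least intuitive of these, so here is a rough sketch of the idea. Actual subgraph mappings are written in AssemblyScript against The Graph's own libraries; the Python below is only a stand-in, and the entity, the event layout, and the `store` dictionary are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class TransferEntity:
    # Plays the role of a type declared in the subgraph schema.
    id: str
    sender: str
    receiver: str
    amount: int

def handle_transfer(raw_event, store):
    """Mapping step: turn one raw event into a schema-shaped entity."""
    entity = TransferEntity(
        id=f'{raw_event["tx_hash"]}-{raw_event["log_index"]}',
        sender=raw_event["args"][0],
        receiver=raw_event["args"][1],
        amount=int(raw_event["args"][2]),
    )
    store[entity.id] = entity  # persisted so later queries can find it
    return entity

store = {}
raw = {"tx_hash": "0xabc", "log_index": 0, "args": ["0xA", "0xB", "1000"]}
entity = handle_transfer(raw, store)
```

The manifest decides which events reach a handler like this, the schema decides what `TransferEntity` looks like, and the mapping is the glue between the two.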
Understanding these concepts is crucial for explaining The Graph. Let’s now explore The Graph's structure in detail.
1.2.3 Structure of The Graph
The Graph Network is composed of Indexers, Curators, Delegators, and Developers. They all contribute to providing data to web3 applications. Let’s look at their respective roles:
Indexers Indexers perform roles similar to validators or nodes in Layer 1 networks. To become an Indexer, one must stake Graph Tokens (GRT) in the network. The primary role of Indexers is to provide indexing and query services. Like validators in Layer 1 networks, Indexers are rewarded with transaction fees and inflation incentives. If an Indexer acts maliciously, their staked tokens are slashed, which ties GRT staking closely to the overall security of the protocol.
Indexers do not index all data. They only index subgraphs that are signaled by Curators, as will be explained below.
Currently, the Graph Network has over 140 Indexers, which is a sufficient number compared to other blockchains.
Delegators Delegators are those who delegate their GRT tokens to Indexers, similar to how tokens are staked and delegated in dPoS (Delegated Proof of Stake) systems. Delegators earn a portion of the query fees that Indexers receive (Note: Delegation involves a 0.5% tax and a 28-day unbonding period, so if you plan to delegate GRT tokens, be sure to understand these conditions).
Curators Curators recommend suitable subgraphs to Indexers. To incentivize good curation, The Graph created the Graph Curation Shares (GCS) token. This token allows Curators to earn a portion of the query fees generated by the subgraphs they curate. High-quality subgraphs are likely to generate more query fees, aligning the interests of Curators and Indexers.
Developers While the above three roles are contributors from a supply perspective, Developers are contributors from the demand side. The primary customers of The Graph are developers who use the data. As mentioned earlier, Developers create and submit subgraphs to The Graph Network.
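The delegation conditions mentioned above are easy to work through with numbers. The sketch below uses the 0.5% tax and 28-day unbonding period stated in this article; the function names and the example amount are hypothetical, and the date arithmetic is a simplification that treats unbonding as a plain count of calendar days.

```python
from datetime import date, timedelta
from decimal import Decimal

DELEGATION_TAX = Decimal("0.005")   # 0.5%, as stated above
UNBONDING_DAYS = 28                 # as stated above

def delegate(amount_grt):
    """Return the tax deducted and the amount that actually gets delegated."""
    amount = Decimal(str(amount_grt))
    tax = amount * DELEGATION_TAX
    return {"tax": tax, "delegated": amount - tax}

def withdrawable_on(undelegated_on):
    """Earliest withdrawal date under the simplified calendar-day assumption."""
    return undelegated_on + timedelta(days=UNBONDING_DAYS)

result = delegate(10_000)
unlock = withdrawable_on(date(2024, 7, 1))
```

Delegating 10,000 GRT therefore costs 50 GRT up front and puts 9,950 GRT to work, and tokens undelegated on July 1 would be withdrawable on July 29 under this simplified model.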
1.2.4 The Graph's Efficiency
So far, we have looked at the concepts and participants necessary to understand The Graph. Now, let’s explore how The Graph achieves efficiency.
Subgraphs are a custom indexing solution tailored to the type and nature of the data, making querying more efficient. Additionally, GraphQL, the query language used, is very efficient and further enhances query performance. Lastly, Graph CLI, developed by The Graph, allows developers to easily manage and deploy subgraphs, making even the complex tasks of blockchain data indexing and querying quick and efficient.
Furthermore, The Graph continually strives to improve its indexing and querying environment. These efforts have resulted in innovations like Firehose and Substreams, which will be discussed in more detail later.
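One reason GraphQL is considered efficient is that the client names exactly the fields it wants, so nothing extra is sent back. The sketch below builds such a request body without sending it anywhere. The entity name `pairs` and the field `volumeUSD` are illustrative assumptions about a DEX subgraph's schema, and `build_query` is a hypothetical helper, not part of any library.

```python
import json

def build_query(entity, fields, first):
    """Build a GraphQL request body that asks only for the listed fields."""
    selection = " ".join(fields)
    query = f"{{ {entity}(first: {first}) {{ {selection} }} }}"
    return json.dumps({"query": query})

# Ask for five pairs, and only their id and USD volume.
body = build_query("pairs", ["id", "volumeUSD"], first=5)
query_text = json.loads(body)["query"]
```

Adding or removing a field changes only the selection list; the endpoint stays the same, which is the practical difference from maintaining one REST route per data shape.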
1.2.5 Farcaster Frames as an Example (How The Graph Is Being Used)
Source: limone.eth
To make the concepts explained so far easier to understand, let’s take a look at a widely-used application, Farcaster, as an example. Farcaster’s most famous feature is the Frame, which allows applications to be embedded into social media posts. Users can mint NFTs or enjoy games through Frames. Since most Frames involve embedding on-chain applications on the web, on-chain data is necessary. In these cases, developers can use The Graph’s subgraphs to fetch the required data for Frames.
For instance, consider the example of 3070, which created an NFT browser using an NFT subgraph indexing data from Ethereum and Base. Similarly, limone utilized The Graph to fetch on-chain event data to create a Frame for voting on on-chain proposals. These examples demonstrate how The Graph can simplify the process of building Frames.
Currently, The Graph actively participates in Farcaster’s bounty program, BountyCaster, encouraging Farcaster users to utilize The Graph in various ways.
Source: The Graph x AI whitepaper
So far, we have explored what The Graph protocol does, how it operates, and who the participants are. However, as I mentioned at the beginning, the purpose of this article is not merely to explain The Graph itself (though that is important), but to discuss why The Graph will become a crucial infrastructure in the future—not just as an indexer, but as a decentralized data service protocol. Therefore, in this section, we will delve into the first of the two main reasons I briefly mentioned earlier: the integration of AI and blockchain, and why The Graph will play a vital role between the two.
To fully utilize AI, data is just as essential as computing resources. Without data, even the most advanced AI cannot provide accurate answers, as these systems learn from historical data. In this context, The Graph’s extensive history of indexing on-chain data becomes a highly valuable asset. If AI needs to efficiently learn from on-chain data, it would be difficult to find a better source than the subgraphs within The Graph network. Additionally, the vast indexer infrastructure that The Graph has built also provides the necessary computing resources to operate AI.
In other words, The Graph aims to become the core infrastructure for both AI and blockchain technologies by offering the computing resources needed to run AI and providing the vast amount of data required for learning. Even if new competitors emerge, it will not be easy to overcome the extensive data and indexer network that The Graph has established.
So, what specific AI services can we expect by leveraging The Graph? In my view, there are two main types of services: 1) inference services and 2) agent services. Let’s take a closer look at each of these services.
Inference services refer to the hosting of AI models on The Graph network. Developers can enhance user experience by integrating features similar to ChatGPT into the front end of their applications. This capability is possible because it leverages the computing assets of indexers, the nodes of The Graph network. The availability of inference services indicates The Graph’s evolution from merely an indexing and querying infrastructure to one that also hosts AI. Hosting AI models on The Graph involves a process similar to data indexing and querying, outlined as follows:
Users interact with the frontend of an AI dApp.
AI dApps request the desired inference model from The Graph’s AI gateway.
The AI gateway identifies an indexer capable of providing the requested inference model.
The selected indexer uploads the requested model, performs the inference, returns the result to the gateway, and receives a fee for the service.
The gateway transmits the data to the AI dApp.
The AI dApp displays or utilizes the data on its frontend.
Why should developers host AI models on The Graph? There are several compelling reasons: 1) Unlike centralized services, which may impose restrictions based on their policies, The Graph is decentralized, eliminating concerns about such limitations. 2) Operating dedicated hardware independently can be cost-prohibitive and requires specialized knowledge of AI models. The Graph provides a censorship-free, open marketplace where developers can easily select models that meet their needs, making it a cost-effective and optimized choice.
Agent services involve third parties performing complex tasks. These services not only process data but also transform natural language into queries and handle other complex tasks. The Graph has already introduced AgentC, showcasing the convenience of agent services.
2.3.1 AgentC as an example
Source: AgentC
AgentC can be likened to a tool similar to ChatGPT but built upon the data from decentralized exchanges indexed by The Graph. For researchers like myself, AgentC proves to be an invaluable tool, primarily because it eliminates the complexity of analyzing on-chain data. Instead, it leverages The Graph's data to provide the information I seek in a straightforward manner. As demonstrated in the provided image, I simply asked which decentralized exchange had the highest trading volume over the past seven days and what that volume was. AgentC promptly queried this data and provided the name of the exchange and the trading volume converted into USD. Although it is still a demo and currently limited to data from about seven top DEXs, the potential is clear. If it expands to encompass data from all chains and applications supported by The Graph, it will become an excellent analytical tool for researchers like myself.
Over the past four years, The Graph has been indexing and querying data from various chains, including Ethereum, establishing a robust infrastructure capable of supporting a wide array of tools. AgentC is a prime example of leveraging The Graph's data combined with AI, showcasing how AI can make previously cumbersome on-chain data easily accessible. The era of user-friendly on-chain data utilization is on the horizon, thanks to AI and The Graph.
AI has already deeply infiltrated our daily lives, and it is poised to significantly enhance blockchain services by resolving many UI issues that users currently face. Given the inevitable future where AI is extensively utilized on-chain, we must take The Graph more seriously. This is not only because AI needs The Graph to access and learn from on-chain data, but also because The Graph is evolving into an infrastructure capable of hosting AI. While services like AgentC are currently in the demo stage, envisioning a future where AgentC leverages data from all subgraphs on The Graph network is tantalizing. For researchers like myself, this would be an indispensable tool. The potential synergy between AI advancements and the vast data indexed by The Graph is something to eagerly anticipate.
The significance of The Graph in the AI era is apparent from the demand side. However, it is equally important to examine The Graph from the supply side, particularly its sustainability. Despite being over three years old, The Graph network has consistently demonstrated sustainability, setting a benchmark for other token-based networks. How has The Graph managed to create a sustainable protocol?
Firstly, The Graph has generated consistent market demand by offering essential services without artificially inflating demand. Secondly, in a market where finding product-market fit (PMF) is challenging, The Graph has not only identified its PMF but also continually enhanced it, rolling out various initiatives to improve protocol performance and ensure customer satisfaction. Let’s explore these two aspects in detail.
(The above graph shows the number of queries processed monthly by The Graph. The sudden increase in queries is because most queries were previously handled by centralized entities, but after the Sunrise initiative, the decentralized network started processing them. Sunrise is an initiative for decentralization, and after its implementation, there was a surge in query volume. The grey area represents predicted future queries, calculated based on the daily average queries from July 1st to the present, as July is not yet complete.)
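The projection method described in the note above is simple enough to write down. The figures in this sketch are hypothetical placeholders, not The Graph's actual query counts; only the method (daily average so far, multiplied by the number of days in the month) comes from the text.

```python
def project_month_total(queries_so_far, days_elapsed, days_in_month):
    """Extrapolate a full-month total from the month-to-date daily average."""
    daily_average = queries_so_far / days_elapsed
    return daily_average * days_in_month

# Hypothetical example: 600 million queries in the first 20 days of July.
projected = project_month_total(600_000_000, 20, 31)
```

With these placeholder inputs the daily average is 30 million queries, which projects to 930 million for the full month. The estimate assumes the rest of the month looks like the part already observed.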
As mentioned briefly in the introduction, airdrops have been considered essential for launching new projects in the crypto space. For projects, issuing tokens and distributing them as rewards to users was an effective tool to create network effects without marketing expenses, aligning the interests of early users with the network, encouraging continuous contributions.
Points have played a similar role to airdrops. However, points, being non-tokens, offered flexibility in application and served as a good tool to build user anticipation without a fixed token conversion rate.
Yet, as with anything excessive, the primary purpose of airdrops and point systems has become diluted. Originally intended to reward early network participants and align their interests, over time, people began participating solely for airdrops and points, causing protocol usage to drop once the airdrops ended. This led to skepticism about the sustainability of airdrops and point systems.
While attracting initial users with airdrops is important, the most crucial aspect is whether the project genuinely provides a needed solution and clearly addresses a problem. Airdrops and points work well for projects with good products that need marketing and networking to gain recognition.
What constitutes a good product? One that solves problems. In this regard, The Graph is a prime example.
Despite rewarding early contributors with tokens, the amount of data queried through The Graph continued to increase post-airdrop. This indicates that the demand for the protocol wasn’t artificially created by the airdrop but was a genuine reward for contributions.
The reason The Graph maintained steady demand post-airdrop is simple: it provided a necessary service in data indexing and querying.
With ongoing discussions about airdrops and point systems, it’s worth considering The Graph’s approach. Airdrops themselves aren’t bad, but replicating existing services and using airdrops to attract users should be criticized. While airdrops are effective marketing tools, they are
3.1.1 Expanding Outside Ethereum
Originally, The Graph started with Ethereum, but it has since expanded to index and query data from numerous other chains, both Layer 1s and Layer 2s. Currently, The Graph supports a wide array of networks, including Arbitrum, Polygon, Avalanche, Solana, Binance Smart Chain, NEAR Protocol, Celo, Harmony, Arweave, Cosmos, Osmosis, zkSync, and Base, which represent the leading chains in the web3 space. In just three years, The Graph has expanded its support to over 50 blockchains. Naturally, the more chains The Graph supports, the more query requests it will receive, increasing its usage. This growth trajectory suggests a positive future for The Graph.
The Graph is a protocol, so understanding its business model requires an understanding of its tokenomics. The GRT token is used as a medium of value exchange among network participants (indexers, curators, developers, and delegators). Among these, indexers, curators, and delegators receive GRT, while developers pay GRT. The business model is simple: the more developers pay for queries, the more revenue the network generates. Before Sunrise, network fees were minimal because most queries went through the centralized hosted service; with the Sunrise initiative, those queries move to the decentralized network, which increases the fees developers pay and, in turn, network revenue.
Sunrise marks the complete transition of all subgraphs from the hosting service to a fully decentralized network. As of the article’s writing, over 6,000 subgraphs have successfully migrated to the decentralized network through the Sunrise initiative.
Sunrise of a Profitable Future
Post-Sunrise Initiative, The Graph’s profitability is expected to improve significantly (evidenced by a 376% year-over-year increase in data service fees over the past 30 days, indicating a positive trend in profitability). This improvement is because, under the hosted service, the network’s fee revenue was nearly zero. With the network processing over a billion queries monthly, transitioning to a fully decentralized network not only enhances The Graph’s profitability but also increases the demand for GRT, the core currency of The Graph ecosystem. This transition will have a highly positive impact on the entire ecosystem.
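Percentage increases above 100% are easy to misread, so a short sketch of the arithmetic may help. The fee amounts below are hypothetical index values chosen to reproduce the 376% figure quoted above; they are not actual fee data.

```python
def percent_increase(previous, current):
    """Year-over-year change expressed as a percentage of the earlier value."""
    return (current - previous) * 100 / previous

def multiple_from_increase(pct):
    """Convert a percentage increase into 'how many times the old value'."""
    return 1 + pct / 100

# Hypothetical index values: fees of 100 a year ago and 476 now.
increase = percent_increase(100, 476)
multiple = multiple_from_increase(increase)
```

In other words, a 376% increase means fees are roughly 4.76 times what they were a year earlier, not 3.76 times.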
What makes The Graph impressive is its commitment to simultaneously decentralize the network and continuously advance indexing technology to enhance both decentralization and performance. Two key technologies for improving performance are Firehose and Substreams.
3.3.1 Firehose
Source: The Graph docs
Firehose is a technology developed by one of the core developers of The Graph, StreamingFast, for The Graph Network. This technology enables the processing of blockchain data with remarkable efficiency and speed. Firehose directly retrieves data from specially instrumented blockchain nodes, allowing for real-time data access and processing while significantly reducing latency.
Written in the Go programming language, Firehose supports parallel computing, which contributes to its high performance. Through Firehose, subgraphs can receive data more efficiently, resulting in unprecedented data processing speeds. This capability enhances the overall performance of The Graph, enabling faster and more efficient data delivery and analysis.
3.3.2 Substreams
Source: The Graph docs
Firehose focuses on the performance of data storage and retrieval, whereas Substreams focuses on the processing of that data and emphasizes interoperability across various ecosystems. Substreams allows data to be fetched from multiple blockchains and processed using Rust functions, which can then send the processed data to any destination.
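The Substreams idea of chaining small processing functions and sending the result to a destination can be sketched roughly as follows. Real Substreams modules are written in Rust; this Python stand-in, with its block layout, function names, and list-based sink, is purely illustrative.

```python
def extract_transfers(block):
    """Module 1: pull transfer events out of a raw block."""
    return [e for e in block["events"] if e["type"] == "transfer"]

def sum_volume(transfers):
    """Module 2: reduce the transfers to a single volume figure."""
    return sum(t["amount"] for t in transfers)

def run_pipeline(blocks, modules, sink):
    """Feed each block through the modules in order, then hand off to a sink."""
    for block in blocks:
        data = block
        for module in modules:
            data = module(data)
        sink(block["chain"], block["number"], data)

# Hypothetical blocks from two different chains.
blocks = [
    {"chain": "ethereum", "number": 1, "events": [
        {"type": "transfer", "amount": 5},
        {"type": "approval", "amount": 9},
    ]},
    {"chain": "base", "number": 7, "events": [
        {"type": "transfer", "amount": 2},
        {"type": "transfer", "amount": 3},
    ]},
]

results = []  # the sink here is just a list; it could be a database or a file
run_pipeline(
    blocks,
    [extract_transfers, sum_volume],
    lambda chain, number, volume: results.append((chain, number, volume)),
)
```

The same two modules run unchanged over blocks from both chains, and swapping the sink changes where the output goes without touching the processing logic.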
The Graph actively leverages these two technologies to make blockchain data indexing and querying significantly more efficient and faster. This strategy aims to encourage more applications to use The Graph. Rather than being content with the moat of being a first mover, The Graph is enhancing its services to solidify its position as a leading data infrastructure provider.
The blockchain market often tends to focus more on "what looks promising" rather than "what is truly necessary." Products that merely ride the market’s hype wave frequently garner high valuations and expectations, regardless of whether the problems they tackle are genuinely pressing or whether their solutions are effective.
However, The Graph has consistently identified and strived to solve clear, essential problems over the years. When assessing the importance of a protocol, one useful approach is to imagine the impact of its sudden absence. If The Graph were to disappear today, countless applications that send hundreds of thousands of queries daily would experience significant disruptions, causing considerable inconvenience for users. Unbeknownst to many, we rely heavily on the services provided by The Graph. Therefore, it is worth studying The Graph to appreciate its contributions. Continuous attention to such protocols ensures that the market focuses on "problems that genuinely need solving" rather than on mere "hyped-up issues."
The Graph aims to evolve from a simple indexing protocol into a comprehensive data service protocol. With the visible potential of integrating AI with blockchain, The Graph aspires to become the critical infrastructure bridging these two technologies. Additionally, The Graph is transitioning into a fully decentralized data protocol. This transition is not only crucial for decentralization but also highlights the growing importance of the GRT token within The Graph ecosystem.
Though it has been three years since the launch of The Graph network, it feels like only the beginning. The past three years have been a preparatory phase for becoming a superior data service. Now, The Graph is ready to establish itself as a fully decentralized data service infrastructure. With on-chain data becoming more critical than ever, The Graph's growth is highly anticipated and promising.