Preface

Web3 data was born on January 3, 2009, when Satoshi Nakamoto mined the Bitcoin genesis block, reportedly on a small server in Helsinki, Finland.

In the Web2 world, giants such as Google, Meta (Facebook), and Tencent (WeChat) follow a closed monopoly loop. They first attract users and third-party developers with an open ecosystem, then build momentum through the flywheel effect (user growth drives up platform value), and finally use the Matthew effect to establish data hegemony: personal data and social relationships become the platform's private property. Moreover, each platform's data is siloed and social graphs are fragmented. A user's WeChat friends, for example, cannot be carried over to Alipay.

In the Web3 world, public chain ecosystems such as Bitcoin, Ethereum, and Solana store data on the blockchain, where it is tamper-proof, traceable, open, and transparent. This not only returns data sovereignty to individuals but also lets any individual or organization freely access the data for their own analysis. Web3 data is like an ownerless gold mine, and as the Web3 ecosystem develops and DApps multiply, the ore in this mine will only get richer.

Web3 Data Ecosystem

Where there is a gold mine, there are merchants selling shovels; in Web3, that is the data ecosystem. From bottom to top it can be divided into three layers: node services, data indexing services, and data applications, roughly comparable to IaaS, PaaS, and SaaS in Web2.

Data Mythology - Overview of Web3 Data Ecosystem

Node Service

In a blockchain network, each participant is a node. Broadly, there are two main types of nodes. A full node stores the complete ledger and safeguards the security and correctness of on-chain data by verifying it. A light node, typically run by an ordinary user, must connect to a full node to synchronize the current state of the network and take part in it.

Although anyone can deploy a blockchain node and take part in the accounting (consensus) process, bandwidth limits and hardware requirements stand in the way. To lower this technical and hardware threshold, many blockchain node service providers have emerged. They deploy and maintain nodes for the blockchains their customers need and offer efficient, convenient node services on top of them, including RPC interface access, data access, and, for some PoS chains, staking.
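As a rough illustration of the RPC interface access mentioned above, the following Python sketch builds a standard Ethereum JSON-RPC `eth_blockNumber` request and decodes a response. The endpoint URL is a placeholder rather than a real provider address, and the response is hand-written sample data, so no network call is made.

```python
import json

# Placeholder endpoint: a real provider issues its own URL and API key.
RPC_URL = "https://example-node-provider.invalid/v1/YOUR_API_KEY"


def build_rpc_request(method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 request body as used by Ethereum-style nodes."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,
        "params": params if params is not None else [],
    }


def parse_block_number(response_text):
    """Decode the hex-encoded block height from an eth_blockNumber response."""
    response = json.loads(response_text)
    if "error" in response:
        raise RuntimeError(f"RPC error: {response['error']}")
    return int(response["result"], 16)


request_body = json.dumps(build_rpc_request("eth_blockNumber"))

# Hand-written sample response; a real one would come from sending
# request_body to RPC_URL in an HTTP POST.
sample_response = '{"jsonrpc": "2.0", "id": 1, "result": "0x10d4f"}'
latest_block = parse_block_number(sample_response)
print(latest_block)
```

In practice the request body would be sent to the provider's endpoint with any HTTP client; the provider's value lies in keeping that endpoint fast and available.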

Popular blockchain node service providers currently on the market include ZAN, Infura, Alchemy, QuickNode, and Pocket Network.

Founded in 2023, ZAN is the Web3 technology brand of Ant Digital Technologies, part of Ant Group, dedicated to protecting and optimizing Web3 applications, platforms, and businesses. ZAN's node RPC access service provides stable, fast connections to more than 22 popular blockchain networks, making it easy to deploy Web3 projects and DApps. ZAN also provides a variety of enhancements for data retrieval and for dynamic monitoring of smart contracts and assets.

Infura was founded in 2016 and was fully acquired in 2019 by ConsenSys, the parent company of MetaMask, becoming one of its business units. One of the most popular Ethereum infrastructure service providers, Infura aims to lower the threshold for developers and users to access Ethereum data. Its node management and API management services include filtering of malicious attacks and spam requests, which helps keep nodes flexible and efficient.

Alchemy is a blockchain infrastructure company founded in 2017. The company's main business is to provide blockchain development platform services for blockchain developers and is committed to becoming the AWS of the Web3 world. Through Alchemy, developers can quickly connect to and interact with multiple blockchain networks while also gaining strong reliability, data correctness, and resilience, which are very important for blockchain applications.

Founded in 2017, QuickNode is a company dedicated to building a Web3 cloud platform that gives developers efficient, reliable blockchain services and makes blockchain applications easier to build. QuickNode serves its API from dedicated nodes and supports multiple regions, multiple test networks, and archive nodes, giving developers better access performance and stability.

The Pocket Network mainnet was launched in 2020. It is a distributed API infrastructure built for Web3 applications, providing a trustless API layer that can easily access any blockchain. The project aims to build a complete distributed network of blockchain nodes. Through the trustless protocol provided by the Pocket API, developers can seamlessly access these nodes and create a DAO ecosystem combined with crypto-economic incentives.

Data Indexing Service

A blockchain is essentially a database that stores blocks in chronological order. Querying the data in a specific block is fairly simple, but looking up specific information or running aggregate queries across many blocks can be very complex.

For DApps, looking up specific information or running aggregate queries across many blocks is very common. Suppose a smart contract sells e-books, and once a user buys a book we want to email them a copy. A smart contract cannot do this on its own, so the smart contract should not be regarded as the "backend" of the DApp. The real "backend" has to keep pulling new blocks from the blockchain (for example, polling once per second), inspect the transactions in each block for one that calls the "buy an e-book" function of our smart contract, confirm that the transaction succeeded, and only then trigger the business code that emails the purchased e-book to the user.
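The scanning step of such a backend can be sketched as follows. This is a minimal illustration, not a production design: the contract address, the function selector, and the sample block and receipts are all hypothetical values shaped like Ethereum JSON-RPC results, and the email step is replaced by a print statement.

```python
# Hypothetical values for illustration only; not taken from a real contract.
BOOK_CONTRACT = "0x" + "ab" * 20   # assumed address of the e-book contract
BUY_SELECTOR = "0xa1b2c3d4"        # assumed 4-byte selector of its buy function


def find_purchases(block, receipts):
    """Return the senders of successful buy calls found in one block."""
    buyers = []
    for tx in block["transactions"]:
        # "to" is None for contract-creation transactions.
        to_address = (tx.get("to") or "").lower()
        if to_address != BOOK_CONTRACT:
            continue
        # The call data starts with the selector of the called function.
        if not tx["input"].startswith(BUY_SELECTOR):
            continue
        # A receipt status of 0x1 means the transaction did not revert.
        if receipts[tx["hash"]]["status"] == "0x1":
            buyers.append(tx["from"])
    return buyers


# Hand-written sample data; a real backend would fetch each new block
# and its receipts from a node. Addresses are shortened for readability.
sample_block = {
    "number": "0x64",
    "transactions": [
        {"hash": "0x01", "from": "0xalice", "to": BOOK_CONTRACT,
         "input": BUY_SELECTOR + "00" * 32},
        {"hash": "0x02", "from": "0xbob", "to": BOOK_CONTRACT,
         "input": BUY_SELECTOR + "00" * 32},
        {"hash": "0x03", "from": "0xcarol", "to": "0x" + "cd" * 20,
         "input": "0x"},
        {"hash": "0x04", "from": "0xdave", "to": None, "input": "0x6080"},
    ],
}
sample_receipts = {
    "0x01": {"status": "0x1"},  # successful purchase
    "0x02": {"status": "0x0"},  # reverted purchase
    "0x03": {"status": "0x1"},  # unrelated transfer
    "0x04": {"status": "0x1"},  # contract creation
}

buyers = find_purchases(sample_block, sample_receipts)
for buyer in buyers:
    print(f"send e-book copy to the buyer at {buyer}")
```

Note that this sketch only sees direct calls to the contract. Purchases routed through another contract would be missed, which is one reason real backends often rely on contract event logs instead.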

This kind of business scenario calls for a data indexing service. High-performance public chains (such as NEAR, Aptos, and Sui) generally provide official, ready-made open source indexing software, which is usually run by centralized node service providers, but there are also decentralized data indexing service providers such as The Graph. Data indexing services mainly target DApp developers and give them a faster, higher-level data query interface. In the e-book example, the DApp "backend" only needs to query the indexing service for transactions related to "buying an e-book" and then trigger the same business code to email the purchased e-book to the user, which greatly reduces cost and complexity.

The Graph was started by three software engineers in late 2017 and launched in December 2020. It is a decentralized network for indexing and querying blockchain data. Through custom subgraphs, it organizes complex blockchain information into a format that is easy to retrieve, so developers can build blockchain data APIs and read data through GraphQL, which helps them create fully decentralized applications.
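To make the GraphQL access pattern concrete, here is a minimal sketch of a query a DApp backend might send to a subgraph for the e-book example. The `purchases` entity and its fields are assumptions about a hypothetical subgraph schema, not a published one, and the response is hand-written sample data; the endpoint is deliberately left out.

```python
import json

# Query against a hypothetical subgraph that indexes e-book purchases.
PURCHASES_QUERY = """
query RecentPurchases($first: Int!, $afterBlock: BigInt!) {
  purchases(
    first: $first
    orderBy: blockNumber
    orderDirection: asc
    where: { blockNumber_gt: $afterBlock }
  ) {
    id
    buyer
    bookId
    blockNumber
  }
}
"""


def build_graphql_payload(first, after_block):
    """Build the JSON body for a GraphQL POST request."""
    return {
        "query": PURCHASES_QUERY,
        # BigInt values are passed as strings.
        "variables": {"first": first, "afterBlock": str(after_block)},
    }


payload = build_graphql_payload(first=100, after_block=100)
request_body = json.dumps(payload)

# Hand-written sample response in the usual GraphQL {"data": ...} envelope.
sample_response = json.dumps({
    "data": {
        "purchases": [
            {"id": "0x01-0", "buyer": "0xalice", "bookId": "7",
             "blockNumber": "101"},
        ]
    }
})
purchases = json.loads(sample_response)["data"]["purchases"]
for purchase in purchases:
    print(purchase["buyer"], purchase["bookId"])
```

Compared with scanning blocks directly, the backend only has to remember the last block it processed and ask for purchases after it.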

It is worth mentioning that blockchain domain name services, such as ENS (Ethereum Name Service), can also be classified as decentralized data indexing services. ENS started at the Ethereum Foundation in early 2017 and has operated independently since 2018. It is part of the community's Internet infrastructure. Its job is to resolve human-readable names (such as "alice.eth") into machine-readable identifiers such as Ethereum addresses, content hashes, and metadata. ENS also supports "reverse resolution", which makes it possible to associate metadata (such as canonical names or interface descriptions) with Ethereum addresses.

Data Application

The main difference between data applications and data indexing services is that data applications are generally aimed at ordinary users (rather than DApp developers), providing tools or products such as blockchain data browsing, data analysis, and market analysis.

Block explorers are among the most important infrastructure of a public chain ecosystem and are also the data applications that users rely on most. Although sometimes called blockchain browsers, they are really tools for searching and analyzing on-chain data. The most popular and widely used block explorer for Ethereum is the third-party Etherscan. Solana has an official block explorer as well as the commonly used third-party Solscan. Besides single-chain explorers, there are also multi-chain explorers such as OKLink.

Data analysis tools generally focus on customized data or on vertical fields and are aimed at professional data analysts or researchers. Commonly used data analysis tools include Dune Analytics, Footprint Analytics, DefiLlama, NFTScan, and Chainalysis.

Dune Analytics is a powerful web-based blockchain analysis platform. With Dune, users can query on-chain data from public blockchain databases using simple SQL, extract the information they want, and visualize it. Dune is often described as a "Google Analytics for the crypto world".
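The kind of aggregate SQL such a platform exposes can be sketched with Python's built-in `sqlite3` module standing in for the hosted database. The `transfers` table, its columns, and its rows are made up for illustration and do not reflect Dune's actual schema.

```python
import sqlite3

# In-memory database with a made-up table of token transfers.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transfers (block_number INTEGER, token TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO transfers VALUES (?, ?, ?)",
    [
        (100, "USDC", 50.0),
        (100, "DAI", 20.0),
        (101, "USDC", 25.0),
        (102, "DAI", 5.0),
    ],
)

# Aggregate across blocks: transfer count and total amount per token.
rows = conn.execute(
    """
    SELECT token, COUNT(*) AS transfer_count, SUM(amount) AS total_amount
    FROM transfers
    GROUP BY token
    ORDER BY total_amount DESC
    """
).fetchall()
conn.close()

for token, transfer_count, total_amount in rows:
    print(token, transfer_count, total_amount)
```

A cross-block aggregate like this takes one statement in SQL, whereas the same result from raw node data would require scanning every block in the range.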

Footprint Analytics is an all-in-one analytics platform for visualizing blockchain data and discovering insights. Footprint allows users to convert raw data tables into charts through an easy-to-use drag-and-drop interface, and users can find the dashboard they need based on topic, chain, or data category.

DefiLlama is a DeFi information aggregator that serves as a reliable, transparent source for tracking the growth and activity of the DeFi space. It integrates information from various DeFi protocols, allowing users to explore key metrics such as total value locked (TVL), liquidity, and trading volume.
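As a worked example of the TVL metric, the sketch below uses the common definition: the sum of each asset balance held by a protocol multiplied by that asset's price. The balances and prices are invented sample numbers, and the function is an illustration of the formula rather than DefiLlama's actual methodology.

```python
def total_value_locked(balances, prices_usd):
    """Sum balance * price over every asset held by a protocol."""
    return sum(amount * prices_usd[asset] for asset, amount in balances.items())


# Invented sample data for a single hypothetical protocol.
protocol_balances = {"ETH": 1_000, "USDC": 500_000}
sample_prices_usd = {"ETH": 2_000, "USDC": 1}

tvl = total_value_locked(protocol_balances, sample_prices_usd)
print(f"TVL: ${tvl:,}")
```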

NFTScan is an integrated NFT analysis platform. It lets users and investors follow the real-time on-chain activity and holdings of projects and users, gauge the latest performance of the NFT market across multiple data dimensions, and get NFT information as soon as it appears.

Chainalysis is an enterprise data solutions company that monitors and analyzes on-chain data to help clients (such as governments, cryptocurrency exchanges, international law enforcement agencies, and banks) comply with regulations, assess risks, and identify illegal activities. Chainalysis is also known as the "FBI on the chain."

Market analysis tools analyze and visualize crypto market data, evaluate market trends, and assist investment decisions. They are generally aimed at professional investors. Common market analysis tools include CoinGecko, CoinMarketCap, Glassnode, Messari, DeBank, 0xScope, Nansen, Arkham, Dexscreener, Dextools, and the recently popular GMGN.

CoinGecko and CoinMarketCap are professional token analysis tools used to observe and track token prices, trading volumes, market capitalization, etc.

Glassnode and Messari are blockchain data and information providers that enable investors to access on-chain data and transaction intelligence from different perspectives.

DeBank is a DeFi portfolio tracker. With DeBank, users can track and manage the DeFi applications they have interacted with in one place. They can also track address balances and their changes, asset allocations, token approvals, pending rewards, loan positions, and more.

0xScope, Nansen, and Arkham are professional on-chain analysis platforms. They visualize on-chain data and proactively analyze smart money wallets, large transfers, and project or whale wallets, and they track hot events on-chain.

Dexscreener and Dextools are decentralized exchange (DEX) data platforms that allow traders and investors to track and analyze real-time data from multiple DEXs. With either tool, users can easily monitor the price, trading volume, and on-chain transactions of various tokens.

GMGN is a tool website that combines two major functions: candlestick (K-line) charting and an on-chain asset dashboard. Compared with Dexscreener and Dextools, GMGN provides faster, more timely data and supports candlestick charts for Pump.fun tokens. It also offers address tracking, position management, and exploration of smart money and KOL wallets. It is the most popular Solana on-chain tool among mainstream Degen users.

Summary

At its core, Web3 lets users take part in creating data and thereby become "shareholders" in its value. This idea overturns the giant-dominated model of data ownership and value distribution in Web2 and opens a new path toward fair use and sharing of data. That is what gives Web3 data its unique value.

In fact, Web3 data has become a core asset. Whether it is a token transaction, an NFT mint, or DApp activity, every detail of on-chain activity contributes to a full picture of value flow, reflecting market dynamics, user behavior, and even the health of the entire blockchain ecosystem.

The Web3 data ecosystem is flourishing. Node service providers offer convenient blockchain access and infrastructure support, data indexing services give DApp developers efficient data query and analysis capabilities, and data applications give ordinary users on-chain data browsing and investors market analysis.

Web3 data is fertile soil. Every layer of the Web3 data ecosystem still has huge potential and broad prospects, as Arkham in 2023 and GMGN in 2024 have shown. The author believes the data sector will bring new surprises in 2025.