Why leverage blockchain data?

When looking at the most successful companies in web2, you'll quickly realize that they all derive their edge in one form or another from data. Whatever we do online is turned into data and stored in centralized, private databases that can only be accessed via commercial APIs or not at all. 

Blockchains and web3 do the polar opposite. At their core, blockchains are just databases. As such, they store everything that has ever happened on them. 

Yet, despite vibrant discussions around building apps for mass adoption, trading, and scaling monolithic chains, the potential of blockchain data often goes overlooked. 

In this post, we'll explain what blockchain data is, outline ways for individuals and projects to access it, and provide an outlook on how blockchain data can catalyze future growth. 

What is blockchain data? 

Blockchain data, also referred to as on-chain data, provides a bird's-eye view of what's happening on a blockchain. It's the opposite of off-chain data, which lives outside of blockchains. Blockchain data covers all the publicly available information about transactions, wallets, and activity. Data stored on-chain is tamperproof and guaranteed to agree with the consensus rules of the underlying chain, making it highly attractive for analysis and business purposes. 

Information one can retrieve from chains includes: 

  • Wallets: how many wallets have been created on a certain blockchain, and how active are they? 
  • Transactions: how many transactions happen on a chain, at what times do they occur, and do they come from real users or bots? What are the average gas fees users pay? 
  • Decentralization: how many nodes are in the ecosystem? Which validators validate transactions, and how are funds and power distributed? 
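
To make these questions concrete, here is a minimal Python sketch that answers a few of them from a handful of transaction records. The records and their field names (`sender`, `hour`, `gas_price_gwei`) are made up for illustration and do not reflect the schema of any particular chain or API.

```python
from collections import Counter
from statistics import mean

# Hypothetical, hand-written sample records. The field names are assumptions
# for this sketch, not the format any real chain or API returns.
transactions = [
    {"sender": "0xaaa", "hour": 9, "gas_price_gwei": 21.0},
    {"sender": "0xbbb", "hour": 9, "gas_price_gwei": 30.0},
    {"sender": "0xaaa", "hour": 14, "gas_price_gwei": 18.0},
    {"sender": "0xccc", "hour": 14, "gas_price_gwei": 27.0},
    {"sender": "0xaaa", "hour": 14, "gas_price_gwei": 24.0},
]


def summarize(txs):
    """Compute a few simple activity metrics from a list of transactions."""
    senders = Counter(tx["sender"] for tx in txs)
    hours = Counter(tx["hour"] for tx in txs)
    return {
        "tx_count": len(txs),
        "active_wallets": len(senders),
        "busiest_hour": hours.most_common(1)[0][0],
        "most_active_wallet": senders.most_common(1)[0][0],
        "average_gas_price_gwei": mean(tx["gas_price_gwei"] for tx in txs),
    }


summary = summarize(transactions)
print(summary)
```

Real analyses would run over millions of records rather than five, but the shape of the questions stays the same.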

Because blockchains are open, public networks, they allow for real-time analysis of what is happening. Onchain data not only gives traders a better understanding of the market but is also a fundamental building block for anyone looking to develop an app or service in web3. 

Using Blockchain Data 

Source: Etherscan screenshot

If you're a savvy crypto user, chances are you already have some experience using block explorers like Etherscan to check the status of your transactions or to figure out who sent you an NFT. 

Nevertheless, while block explorers are great for checking single transactions, they are not well suited to spotting trends, nor do they make it easy for projects to use onchain data. 

Blockchain data isn't provided in an easily digestible format, nor is there any standardization across chains for naming functions or structuring smart contracts. Retrieving and analyzing blockchain data is further complicated by the need to download the whole chain. Considering that, at the time of writing, Ethereum's blockchain measures 658 GB and Bitcoin's stands at 524 GB, it's not something you could quickly do on your phone.

For individuals and projects looking to use blockchain data to inform decision-making, fact-check statements such as "XYZ is the chain with the most active wallets," or find out what tokens their favorite influencer is holding, data analytics platforms offer an easy way to find answers without the hassle of retrieving the data yourself. 

Blockchain Data Analytics 

Popular blockchain data analytics tools that often show up on X timelines include: 

  • Dune Analytics: an analytics platform that simplifies querying blockchain data, letting users run SQL queries against pre-populated databases. Results are then displayed in graphs or dashboards that can be easily understood and shared. More recently, Dune also added the ability to query with natural language. 
  • Nansen: another analytics platform that gives users access to data from various blockchains. With Nansen, one can quickly gauge the health of an ecosystem, learn more about interactions with specific smart contracts, and track wallets. By adding labels to wallets, Nansen provides further context to what's happening onchain. 
  • Glassnode: offers a comprehensive library of on-chain and financial metrics for popular crypto assets. It's built for institutional needs, aiding quantitative trading, risk management, and more. 
  • Chainalysis: caters to institutions, governments, and businesses looking to establish compliance with AML/KYC requirements and offers analytics tools to track down hackers. 
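
To give a feel for SQL-based analysis of the kind described above, the sketch below runs a query against a tiny in-memory SQLite database. The `transactions` table, its columns, and its rows are invented for this example; they are not the schema of Dune or any other platform.

```python
import sqlite3

# In-memory database standing in for a pre-populated analytics database.
# Table name, columns, and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (day TEXT, sender TEXT, value_eth REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        ("2024-01-01", "0xaaa", 1.5),
        ("2024-01-01", "0xbbb", 0.2),
        ("2024-01-01", "0xaaa", 0.3),
        ("2024-01-02", "0xccc", 2.0),
    ],
)

# Daily active wallets and transaction counts.
query = """
    SELECT day,
           COUNT(DISTINCT sender) AS active_wallets,
           COUNT(*)               AS tx_count
    FROM transactions
    GROUP BY day
    ORDER BY day
"""
daily_stats = conn.execute(query).fetchall()
for row in daily_stats:
    print(row)
```

A query like this is what typically sits behind an "active wallets per day" chart on a shared dashboard.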

All of these platforms are dedicated to analyzing data, aggregating information, and presenting it in digestible ways. However, gaining insights into what's happening onchain is just one use for data. Looking beyond analytics, any project building web3 products needs onchain data. 

Wallets need to be able to display the correct balance and show the status of transactions. NFT marketplaces tap into onchain data to show users their portfolio, while games rely on onchain data to render in-game characters. 
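
As a rough illustration of the wallet case, the sketch below builds an Ethereum JSON-RPC `eth_getBalance` request and decodes a response into ETH. No network call is made: the zero address and the sample response are placeholders, and the helper names are hypothetical.

```python
import json
from decimal import Decimal

WEI_PER_ETH = Decimal(10) ** 18


def build_balance_request(address, request_id=1):
    """Build the JSON-RPC body for eth_getBalance at the latest block."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": "eth_getBalance",
        "params": [address, "latest"],
        "id": request_id,
    })


def parse_balance_response(raw):
    """Convert the hex-encoded wei balance in a response into ETH."""
    result = json.loads(raw)["result"]
    return Decimal(int(result, 16)) / WEI_PER_ETH


# Placeholder address and a hand-written sample response (not real data).
request_body = build_balance_request("0x0000000000000000000000000000000000000000")
sample_response = '{"jsonrpc": "2.0", "id": 1, "result": "0x1bc16d674ec80000"}'
balance = parse_balance_response(sample_response)
print(request_body)
print(balance)
```

In a real wallet, the request body would be sent to a node or RPC provider, and other chains would need their own request formats and units.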

As mentioned previously, blockchain data isn't standardized. Functions are named differently across chains, and retrieving onchain data directly requires a deep understanding of the underlying protocols. Fortunately, teams don't have to build their own solutions from scratch because they can use indexers. 


Indexers crawl through blockchains and scan every single block. They then store the information in a consistent, easy-to-query format. The Graph and Subsquid are both indexing protocols flexibly designed to act as a foundation on which developers can build. 

Subsquid in particular is fully open source, with complete transparency around its inner workings and support for a multichain future. 
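
The core idea of an indexer can be sketched in a few lines of Python. This is a toy illustration under simplified assumptions, not how The Graph or Subsquid are implemented: the block format and the `build_index` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical raw blocks, heavily simplified for illustration.
raw_blocks = [
    {"number": 1, "txs": [{"from": "0xaaa", "to": "0xbbb", "value": 5}]},
    {"number": 2, "txs": [
        {"from": "0xbbb", "to": "0xccc", "value": 2},
        {"from": "0xaaa", "to": "0xccc", "value": 1},
    ]},
]


def build_index(blocks):
    """Scan every block once and store each transfer under both addresses."""
    index = defaultdict(list)
    for block in blocks:
        for tx in block["txs"]:
            record = {
                "block": block["number"],
                "sender": tx["from"],
                "recipient": tx["to"],
                "value": tx["value"],
            }
            index[tx["from"]].append(record)
            if tx["to"] != tx["from"]:
                index[tx["to"]].append(record)
    return index


index = build_index(raw_blocks)

# An address history is now a single lookup instead of a scan of the chain.
history = index["0xccc"]
print(history)
```

The expensive scan happens once, up front; every later query only touches the prepared index.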

Indexers act as data aggregators and middleware that sit between the underlying blockchain(s) and dApps built on top. And this is where the real magic happens. 

The hidden catalyst

Some people still argue that all we need for broader web3 adoption is better wallet UX, cheaper transactions, and education. Yet, with plenty of affordable block space available that is barely being used, that cannot be the whole story. 

Source: Lattice Blog

Maybe the real problem is that building on top of blockchains is hard because data is complex or expensive to retrieve, especially across chains, which deters small teams from even trying. Indexers, acting as middleware between the blockchain and applications, can unify the data and allow developers to mix and match the tasks they want to accomplish. 

When devs can stop worrying about understanding five different base layers and the data structures on 10+ L2s, they will have more time to focus on building compelling experiences. 
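
One way to picture this unification is a thin adapter layer that maps each chain's raw records onto a single schema. In the sketch below, the two chains, their field names, and their decimal precisions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    """Unified transfer record that application code can rely on."""
    chain: str
    sender: str
    recipient: str
    amount: float


# Hypothetical chain-specific formats: different field names and units.
def from_chain_a(raw):
    return Transfer("chain_a", raw["from"], raw["to"], raw["value"] / 10**18)


def from_chain_b(raw):
    return Transfer("chain_b", raw["source"], raw["dest"], raw["amount"] / 10**9)


ADAPTERS = {"chain_a": from_chain_a, "chain_b": from_chain_b}


def normalize(chain, raw):
    """Route a raw record through the adapter for its chain."""
    return ADAPTERS[chain](raw)


transfers = [
    normalize("chain_a", {"from": "alice", "to": "bob", "value": 2 * 10**18}),
    normalize("chain_b", {"source": "carol", "dest": "bob", "amount": 5 * 10**8}),
]

# Application code can now ignore which chain a transfer came from.
received_by_bob = sum(t.amount for t in transfers if t.recipient == "bob")
print(received_by_bob)
```

Supporting another chain then means writing one more adapter rather than touching the application logic.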

Ultimately, one of the biggest benefits of blockchains is that data is now publicly available and open to anyone, removing the silos we've seen in web2. By unifying on- and off-chain data, indexers like Subsquid empower devs to create crypto-native use cases and to go beyond them by building apps for non-crypto users. 

Unified comprehensive data access -> better apps 

Better access to data and better insights from it allow businesses to build better products. A few use cases that are greatly improved with better data access: 


DeFi

Liquidity is fragmented across chains, but with unified access to data, DeFi protocols can offer users a more holistic experience where they can see all their trades and positions in one interface, with real-time updates on what's happening on-chain. Combined with the rise of intents and AI, blockchain data in DeFi could create experiences superior to anything in traditional trading.


NFTs & Gaming

NFT marketplaces benefit from better access to data, which enables them to show collections on different chains, prices, and holistic insights into what's happening with collections. Additionally, NFTs used in games can only work seamlessly if in-game and on-chain state stay in constant sync. Projects might even set up two-way relationships where on-chain activity changes the appearance of the NFT itself (dynamic NFTs). 

Personalization & Recommendations

The stickiness of web2 is partly explained by its ability to show us things we like. With access to all the data connected to a user's wallet, platforms can create more personalized experiences catering to niche interests. The individual retains a degree of privacy since wallets aren't directly linked to the natural person behind them. And depending on which wallet one connects with, results could differ, creating an interactive experience. 


Decentralized Social

Decentralized social graphs like Lens and Farcaster are carving out their niche communities and continue to grow. All data on who is following whom and what their interests are is available on-chain: an attractive playground for devs to build on. Farcaster already boasts a variety of clients catering to different needs, such as image-only feeds or showing data in different formats. With permissionless networks, curation will be key to attracting and retaining users. 

LLMs can use data from social graphs to identify trends, curate content, and set reputation scores, which could eventually find their way into DeFi. 

There are countless more examples of how unified, easy-to-query data powers next-gen crypto use cases.

In essence, blockchain data could catalyze a new wave of dApps thanks to its inherent transparency and accessibility. From DeFi to DeSo, web3 use cases are unthinkable without data to fuel them. As blockchains grow in adoption and variety, tools that unify, standardize, and offer access to as many chains' data as possible will be key to powering that wave. Building next-gen apps will require access to next-gen data tools.


Article by @Naomi_fromhh