SQD Data Talk: Insights on Building an RPC Proxy

Earlier in May, we went live for the first episode of our new series: SQD Data Talks.

We’ve been data nerds ever since our founders ran into the problem of accessing blockchain data, but we realized we don’t talk about it enough.

That’s why we decided to open some of the conversations that are usually confined to our internal channels to the public and start discussing all things blockchain, data, and open-source with people who’re just as deep into the topic as we are.

Our first stream featured Aram, a contributor to eRPC, an RPC proxy used by many well-established crypto projects such as Pimlico. He’s also the creator of the Blockchain Data Standards Group, where developers discuss how blockchain data can be improved.

Read on for the insights from our conversation, or watch the recording of the talk here.

On why Aram got into crypto

For Aram, crypto is much more than just a nice-to-have. Coming from Iran, he shared a belief in the Western ideals of creating a better world and eventually left Web2 for blockchain. He considers it a duty toward his family: for them, as for everyone living under sanctions, crypto is the only hope of accessing a financial system.

How eRPC came to be

When Aram and his team started writing smart contracts in Solidity to develop apps, they quickly realized that retrieving data took a lot more time than they had anticipated.

“Accessing the data took even longer than writing the smart contract itself.”

In search of a solution, they were introduced to The Graph and SQD. Nevertheless, the biggest bottleneck remained the actual node and the RPC under the hood. That’s when they took matters into their own hands and built an RPC proxy to save money and improve resilience.

They released it as an open-source project, and eRPC was born. As more people became interested in and started using their framework, the team shifted its focus to dedicate itself fully to eRPC.

How it feels to use eRPC to manage RPCs

Aram points out that one core contributing feature to their success, and the general need for reliable data access in crypto is that, unlike in Web2, where companies store data on their own servers, in Web3, most critical data is stored outside of the company’s direct control: on public ledgers, distributed across nodes. He foresees that this problem will only grow in importance.

Why use an RPC proxy in the first place?

Every multichain dApp eventually needs to figure out how to access the data it needs across different chains. Instead of running nodes on every chain, a common move is to contract third-party providers that offer an RPC endpoint.

However, a single provider might not be enough if you want high uptime, resiliency, and a hedge against failure. And as soon as you contract more than one, new questions arise:

  • How do you decide which one to use?
  • How do you go about load balancing?
  • Which “latest” block do you trust when the two report different results?

Web2 load balancers fail in this scenario because they aren’t built for chain-specific concerns such as differing finalized blocks or forks. To make these decisions you need a smart proxy, and that’s what eRPC offers. It keeps the app online even when one or more providers are down, routing each request on a best-effort basis.
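To make the failover part concrete, here is a minimal sketch of the logic such a proxy performs on every request. It is purely illustrative: the names (`call_with_failover`, the provider callables) are hypothetical and not eRPC’s actual API, and a real proxy would also handle retries, hedging, and health scoring.

```python
from typing import Any, Callable, Sequence

# A "provider" here is just something that takes a JSON-RPC method and
# params and returns a response, or raises on failure.
Provider = Callable[[str, list], Any]

def call_with_failover(providers: Sequence[Provider], method: str, params: list) -> Any:
    """Try each upstream provider in order; return the first successful result."""
    last_error = None
    for send in providers:
        try:
            return send(method, params)
        except Exception as exc:  # a real proxy would distinguish retryable errors
            last_error = exc
    raise RuntimeError(f"all providers failed: {last_error}")

# Example upstreams: the first is down, the second answers.
def flaky(method, params):
    raise ConnectionError("provider unreachable")

def healthy(method, params):
    return {"jsonrpc": "2.0", "id": 1, "result": "0x10d4f"}

print(call_with_failover([flaky, healthy], "eth_blockNumber", []))
```

The caller never sees the first provider’s outage; that transparency is the whole point of putting a proxy between the app and its upstreams.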

Will RPCs stay the main way to access data?

RPC is great because it is a standard and widely adopted; at the same time, it is ill-suited for querying data and offers barely any filtering. Whether JSON-RPC remains the go-to solution, though, depends much more on demand than on what’s technically there.
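As an illustration of those limits, the sketch below builds a standard `eth_getLogs` JSON-RPC request (plain Python, no proxy involved; the address shown is only an example). Block range, contract address, and indexed topics are essentially the only filters the interface offers; aggregation, joins, or filtering on decoded values all have to happen client-side.

```python
import json

# A standard eth_getLogs request. The filter object is as expressive as
# JSON-RPC gets: block range, address, and indexed event topics.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "eth_getLogs",
    "params": [{
        "fromBlock": "0x112A880",
        "toBlock": "0x112A8E4",
        # Illustrative contract address
        "address": "0xA0b86991c6218b36c1d19D4a2e9Eb0cE3606eB48",
        # keccak256("Transfer(address,address,uint256)")
        "topics": ["0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"],
    }],
}
print(json.dumps(request))
```

Anything beyond this, such as “all transfers above 1,000 USDC grouped by sender”, requires fetching the raw logs and processing them yourself, which is exactly the gap indexers and data pipelines fill.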

Aram thinks that hackers and small innovators who don’t want to rely on anyone else will continue running their own nodes and using RPC. The growing crowd of revenue-driven companies focused on providing great DX and UX, however, will look for reliable data sources while making it less of a priority to decentralize the whole stack.

People in this group are used to working with raw data and have in-house data engineers who build their data pipelines from scratch.


Subgraphs in a high tps world

Earlier on, subgraphs were the only game in town, which explains how their market share grew so quickly. When SQD first launched, teams frequently wanted to migrate entire subgraphs to our network. That has since changed.

Subgraphs were built to fit the data demands of their time. As higher-throughput, parallel-processing blockchains emerged, however, their sequential processing model stopped keeping up.

Aram, too, points out that it’s time to rethink the data layer; he has observed a pivot away from sequential processing toward anything that allows distributing work and building data pipelines.

“Subgraphs are still a de-facto standard, but in the next 5 years they’ll probably not be anymore due to data growth.”

How to fund open-source data access?

One challenge for all decentralized data networks has been continuously funding the peers in their networks without running out of funds. Dmitry points out that The Graph spends up to $9 million in rewards to earn, at best, a few hundred thousand dollars in queries on the demand side. That obviously isn’t a sustainable long-term model. But what could be?

For Aram, willingness to pay for data access comes down to how reliable one needs the data to be. Oracle data, in that regard, is more valuable, as it comes with a higher necessity for trustworthiness. For public data, in case one provider goes down, projects can easily pivot to another. In the case of an Oracle failure, there might not be much left to transition.

What’s clear to him is that the current pay-per-query model won’t scale. In the end, providers will compete on serving data in the best way possible, cutting overhead and focusing on delivering real value to consumers.

What’s the Blockchain Data Standards Group about?

When working on eRPC, Aram and his team noticed that Web3 data spans multiple layers.

While layers 1 & 6 are out of scope, the members of the Blockchain Data Standards Group have come together to introduce standards for layers 3–5, covering querying, schemas, and transport.

If you’re interested in contributing, join their Telegram group. You can also read more about their mission on GitHub.


Thanks again to Aram from eRPC. If you’re looking for an RPC proxy to manage multiple providers, check them out.