Indexing Solana with Subsquid
Reasons Solana is great:
- At least 1 bald developer (bullish)
- Founder likes dragon costumes (it’s the year of the dragon)
- So fast, you can get liquidated faster than saying f*ck
- Home of DePIN - and we all know there are more IoT devices in the world than humans, so this might be the path to adoption by billions (of devices)
- Memecoins wif and without hats
- Solana Summer sunglasses
(For a more serious rundown, you can go here)
The one thing that wasn’t great until recently was the indexing situation. There was a partnership with Arweave to store the state, and a Google Bigtable instance that one could query for Solana transaction history, but it was far from a perfect setup in a decentralized, permissionless world.
Enter the brave new world of decentralized indexing. Unlike in Huxley’s novel, you won’t even need drugs; all you need is the Subsquid SDK.
Whether you’re a developer trying to build the next big meme with a hat or an analyst trying to figure out what the bald guy has invested in, Subsquid allows you to access raw data and make sense of it. How you use the data is completely up to you; we just provide the tools.
We do. Now go, get it.
Test Project
You can find the readme here.
In this brief article, we will explore how to index transactions on Solana. There are also other kinds of indexable data, organized in Solana-specific ways, which we will cover in future posts.
For general Web3 developers who may be new to Solana, we will begin by discussing the differences between Ethereum's and Solana's data structures. If you would like to get started with indexing, skip ahead to the tutorial section just afterward.
Solana vs Ethereum
Ethereum transactions are straightforward, consisting of essential fields such as the sender and receiver addresses, a signature confirming the sender's authorization, a nonce to indicate the transaction's order, the amount of ETH transferred, optional input data, and gas-related parameters.
Most transactions on Ethereum involve interactions with contract accounts, typically written in Solidity. These contracts utilize the input data field, interpreting it according to the Ethereum Application Binary Interface (ABI) for executing contract-specific operations.
In contrast, Solana's transaction structure is more complex, designed to support multiple instructions within a single transaction. As a result, a Solana transaction closely resembles a series of Ethereum internal calls (traces) within one overarching transaction. A Solana transaction begins with an array of signatures, ensuring authenticity and authorization. The core of a Solana transaction is its instructions component, which details the operations to be performed. Each instruction specifies a program to execute the operation, the accounts involved, the operation's data, and any inner instructions. Inner instructions record Cross Program Invocations (CPIs), Solana's counterpart to Ethereum's internal calls. This structure supports intricate transaction flows within a single Solana transaction.
Comparing decentralized finance platforms, Uniswap on Ethereum and Orca on Solana each adapt to their underlying blockchain's transaction model. Uniswap leverages Ethereum's transaction model to facilitate swaps and liquidity provision within its Automated Market Maker (AMM) framework. Orca's Whirlpool, a Solana-based AMM, introduces concentrated liquidity with added yield farming mechanics, mirroring Uniswap v3's functionality but tailored to Solana's architecture. Uniswap pairs are managed through individual contract addresses, while Orca pools are accounts handled by a single Whirlpool program, which is identified by its program ID. Data organization differs significantly: Ethereum's architecture simplifies accessing swap amounts and token addresses, whereas Solana requires decoding this information from the transaction's inner instructions.
On Ethereum, a submitted transaction includes the following information:
- from – the address of the sender that will be signing the transaction. This will be an externally owned account, as contract accounts cannot send transactions.
- to – the receiving address (if an externally-owned account, the transaction will transfer value. If a contract account, the transaction will execute the contract code)
- signature – the identifier of the sender. This is generated when the sender's private key signs the transaction and confirms the sender has authorized this transaction
- nonce – a sequentially incrementing counter that indicates the transaction number from the account
- value – the amount of ETH to transfer from sender to recipient (denominated in wei, where 1 ETH equals 1e18 wei)
- input data – an optional field to include arbitrary data
- gasLimit – the maximum amount of gas units that can be consumed by the transaction. The EVM specifies the units of gas required by each computational step
- maxPriorityFeePerGas – the maximum price of the consumed gas to be included as a tip to the validator
- maxFeePerGas – the maximum fee per unit of gas willing to be paid for the transaction (inclusive of baseFeePerGas and maxPriorityFeePerGas)
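To make the interplay of the three gas fields concrete, here is a minimal sketch of how the price paid per unit of gas follows from them under EIP-1559. The type and function names are illustrative and not taken from any library, and the sketch assumes maxFeePerGas is at least the block's base fee (otherwise the transaction cannot be included).

```typescript
// Illustrative names only; all values are in wei.
interface FeeParams {
  baseFeePerGas: bigint;        // set by the protocol for each block
  maxPriorityFeePerGas: bigint; // tip cap chosen by the sender
  maxFeePerGas: bigint;         // total cap chosen by the sender
}

const WEI_PER_GWEI = BigInt(1_000_000_000);

function minBigInt(a: bigint, b: bigint): bigint {
  return a < b ? a : b;
}

// Price per gas unit the sender pays: base fee plus tip, capped by maxFeePerGas.
function effectiveGasPrice(p: FeeParams): bigint {
  return minBigInt(p.maxFeePerGas, p.baseFeePerGas + p.maxPriorityFeePerGas);
}

// Share of that price that goes to the validator as a tip.
// Assumes maxFeePerGas >= baseFeePerGas.
function validatorTipPerGas(p: FeeParams): bigint {
  return effectiveGasPrice(p) - p.baseFeePerGas;
}

// Made-up numbers: the cap is tighter than base fee + tip, so the tip shrinks.
const exampleFees: FeeParams = {
  baseFeePerGas: BigInt(30) * WEI_PER_GWEI,
  maxPriorityFeePerGas: BigInt(2) * WEI_PER_GWEI,
  maxFeePerGas: BigInt(31) * WEI_PER_GWEI,
};
console.log(effectiveGasPrice(exampleFees), validatorTipPerGas(exampleFees));
```

With these numbers the sender pays the 31 gwei cap, and the validator receives 1 gwei per gas unit instead of the requested 2.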
The vast majority of transactions access a contract from an externally-owned account. Most contracts are written in Solidity and interpret their data field in accordance with the application binary interface (ABI).
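As a rough illustration of what "interpreting the data field according to the ABI" means, the sketch below splits hex-encoded input data into the 4-byte function selector and the 32-byte argument words that follow it. The helper name and the recipient address are made up for the example; only static argument types are handled.

```typescript
// Hypothetical helper, not part of any SDK.
interface SplitInput {
  selector: string; // first 4 bytes, e.g. '0xa9059cbb'
  words: string[];  // following 32-byte words, hex without the 0x prefix
}

function splitInput(input: string): SplitInput {
  const hex = input.startsWith('0x') ? input.slice(2) : input;
  if (hex.length < 8) {
    throw new Error('input data is shorter than a 4-byte selector');
  }
  const words: string[] = [];
  for (let i = 8; i < hex.length; i += 64) {
    words.push(hex.slice(i, i + 64));
  }
  return {selector: '0x' + hex.slice(0, 8), words};
}

// Example: an ERC-20 transfer(address,uint256) call with a made-up recipient.
const recipientHex = '1111111111111111111111111111111111111111';
const exampleInput =
  '0xa9059cbb' +                      // selector of transfer(address,uint256)
  recipientHex.padStart(64, '0') +    // address, left-padded to 32 bytes
  (1_000_000).toString(16).padStart(64, '0'); // amount, left-padded to 32 bytes

const split = splitInput(exampleInput);
const recipient = '0x' + split.words[0].slice(24); // last 20 bytes of the word
const amount = BigInt('0x' + split.words[1]);
console.log(split.selector, recipient, amount);
```

On Ethereum the selector and arguments sit directly in the transaction's input data, which is why swap parameters are comparatively easy to reach there.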
Instructions in Solana convey data and command execution to on-chain programs, serving as the building blocks of transactions. Here's a breakdown of the components within an instruction:
- Executing Account: Identified by the executing_account field, this is the program ID of the called program, utilizing Solana's base58 encoding for addresses.
- Account Arguments: The account_arguments array lists all accounts involved in the instruction. Unlike the labeled approach you might see in other systems, Solana places these accounts in an unlabeled array, though they are essential for interactions across the instruction's execution.
- Data: Encoded in base58 within transactions and decoded into a byte array for ease of use, the data field specifies the function call and parameters.
- Inner Instructions: Representing program-to-program calls, known as Cross Program Invocations (CPIs), inner_instructions capture the sequential calls made by programs during the transaction's execution.
It's important to note that inner instructions follow the same structure but do not contain their own inner instructions, simplifying the nesting of calls to a sequential list.
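The instruction layout described above can be written down as a pair of types. This is only a sketch: the field names mirror the ones used in the text, not an SDK API, and the program IDs in the example are placeholders rather than real base58 addresses.

```typescript
// Field names follow the description above; they are not an SDK interface.
interface InnerInstructionShape {
  executing_account: string;   // program ID of the called program
  account_arguments: string[]; // unlabeled array, order matters
  data: Uint8Array;            // decoded instruction data
}

// Top-level instructions additionally carry a flat list of inner instructions.
interface InstructionShape extends InnerInstructionShape {
  inner_instructions: InnerInstructionShape[];
}

// List every program invoked by a transaction in execution order:
// each top-level instruction followed by its inner instructions (CPIs).
function programCallOrder(instructions: InstructionShape[]): string[] {
  const order: string[] = [];
  for (const ins of instructions) {
    order.push(ins.executing_account);
    for (const inner of ins.inner_instructions) {
      order.push(inner.executing_account);
    }
  }
  return order;
}

// Placeholder IDs: a swap instruction that triggers two token transfers.
const exampleTx: InstructionShape[] = [
  {
    executing_account: 'AmmProgram',
    account_arguments: ['pool', 'userTokenA', 'userTokenB'],
    data: new Uint8Array([1, 2, 3]),
    inner_instructions: [
      {executing_account: 'TokenProgram', account_arguments: ['userTokenA', 'vaultA'], data: new Uint8Array([3])},
      {executing_account: 'TokenProgram', account_arguments: ['vaultB', 'userTokenB'], data: new Uint8Array([3])},
    ],
  },
];
console.log(programCallOrder(exampleTx));
```

Because inner instructions never nest further, a two-level loop is enough to walk every call in a transaction.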
Querying Instruction Data
Subsquid’s Solana processor architecture allows for complex queries, such as instructions[1].inner_instructions[2].data, enabling deep dives into the layers of transaction instructions. This model provides a granular look at the interaction between programs within a single transaction.
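In plain TypeScript terms, a path such as instructions[1].inner_instructions[2].data is two array lookups followed by a field access. The helper below is a hypothetical illustration of that addressing scheme on mock data, not the processor's query API; it returns undefined instead of throwing when either index is out of range.

```typescript
// Minimal shapes for the example; names are illustrative.
interface InnerNode {
  data: Uint8Array;
}
interface InstructionNode {
  data: Uint8Array;
  inner_instructions: InnerNode[];
}

// Resolve instructions[insIndex].inner_instructions[innerIndex].data safely.
function innerData(
  instructions: InstructionNode[],
  insIndex: number,
  innerIndex: number
): Uint8Array | undefined {
  return instructions[insIndex]?.inner_instructions[innerIndex]?.data;
}

// Mock transaction: the second instruction has three inner instructions.
const mockInstructions: InstructionNode[] = [
  {data: new Uint8Array([0]), inner_instructions: []},
  {
    data: new Uint8Array([1]),
    inner_instructions: [
      {data: new Uint8Array([10])},
      {data: new Uint8Array([11])},
      {data: new Uint8Array([12])},
    ],
  },
];

const found = innerData(mockInstructions, 1, 2);   // bytes of the third inner instruction
const missing = innerData(mockInstructions, 0, 0); // first instruction has no inner calls
console.log(found, missing);
```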
Indexing Orca Whirlpool AMM
Indexing Orca's new AMM contract begins with initializing the processor and configuring it to observe a specified range of blocks. By setting parameters like the program ID and instruction signature, we set up our processor to specifically capture transactions related to Orca Whirlpool.
Next, let's dive into decoding the data.
// Import paths assume the layout of the Subsquid Solana example project.
import {DataSourceBuilder, SolanaRpcClient} from '@subsquid/solana-stream'
import * as whirlpool from './abi/whirlpool'

const dataSource = new DataSourceBuilder()
  // Provide Subsquid Network Gateway URL.
  .setGateway('https://v2.archive.subsquid.io/network/solana-mainnet')
  // Subsquid Network is always about 1000 blocks behind the head.
  // We must use regular RPC endpoint to get through the last mile
  // and stay on top of the chain.
  // This is a limitation, and we promise to lift it in the future!
  .setRpc(process.env.SOLANA_NODE == null ? undefined : {
    client: new SolanaRpcClient({
      url: process.env.SOLANA_NODE,
      // rateLimit: 100 // requests per sec
    }),
    strideConcurrency: 10
  })
  // Currently only blocks from 240_000_000 and above are stored in Subsquid Network.
  // When we specify it, we must also limit the range of requested blocks.
  //
  // Same applies to RPC endpoint of a node that cleans up its history.
  //
  // NOTE that block ranges are specified in heights, not in slots !!!
  //
  .setBlockRange({from: 240_000_000})
  //
  // Block data returned by the data source has the following structure:
  //
  // interface Block {
  //   header: BlockHeader
  //   transactions: Transaction[]
  //   instructions: Instruction[]
  //   logs: LogMessage[]
  //   balances: Balance[]
  //   tokenBalances: TokenBalance[]
  //   rewards: Reward[]
  // }
  //
  // For each block item we can specify a set of fields we want to fetch via `.setFields()` method.
  // Think of it as an SQL projection.
  //
  // Accurate selection of only required fields can have a notable positive impact
  // on performance when data is sourced from Subsquid Network.
  //
  // We do it below only for illustration as all fields we've selected
  // are fetched by default.
  //
  // It is possible to override default selection by setting undesired fields to `false`.
  .setFields({
    block: { // block header fields
      timestamp: true
    },
    transaction: { // transaction fields
      signatures: true
    },
    instruction: { // instruction fields
      programId: true,
      accounts: true,
      data: true
    },
    tokenBalance: { // token balance record fields
      preAmount: true,
      postAmount: true,
      preOwner: true,
      postOwner: true
    }
  })
  // By default, a block can be skipped if it doesn't contain explicitly requested items.
  //
  // We request items via `.addXxx()` methods.
  //
  // Each `.addXxx()` method accepts item selection criteria
  // and also allows requesting related items.
  //
  .addInstruction({
    // select instructions that:
    where: {
      programId: [whirlpool.programId], // were executed by the Whirlpool program
      d8: [whirlpool.swap.d8], // have first 8 bytes of .data equal to swap descriptor
      ...whirlpool.swap.accountSelection({ // limiting to USDC-SOL pair only
        whirlpool: ['7qbRF6YsyGuLUVs6Y1q64bdVrfe4ZcUUz1JRdoVNUJnm']
      }),
      isCommitted: true // were successfully committed
    },
    // for each instruction selected above
    // make sure to also include:
    include: {
      innerInstructions: true, // inner instructions
      transaction: true, // transaction that executed the given instruction
      transactionTokenBalances: true, // all token balance records of executed transaction
    }
  }).build()
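The `where` clause above matches on the program ID and on the first 8 bytes of the instruction data. The sketch below shows that matching logic in isolation. It is not the SDK's implementation: the descriptor is assumed here to be a 0x-prefixed hex string, and the program ID and descriptor bytes are placeholders.

```typescript
// Hex-encode the first 8 bytes of instruction data, '0x'-prefixed.
// Returns undefined when the data is too short to carry a descriptor.
function d8Of(data: Uint8Array): string | undefined {
  if (data.length < 8) return undefined;
  let hex = '0x';
  for (let i = 0; i < 8; i++) {
    hex += data[i].toString(16).padStart(2, '0');
  }
  return hex;
}

// Illustrative shapes; the real SDK types are richer.
interface RawInstruction {
  programId: string;
  data: Uint8Array;
  isCommitted: boolean;
}
interface InstructionFilter {
  programId: string[];
  d8: string[];
  isCommitted: boolean;
}

// An instruction matches when every listed criterion holds.
function matchesFilter(ins: RawInstruction, where: InstructionFilter): boolean {
  const d8 = d8Of(ins.data);
  return (
    where.programId.includes(ins.programId) &&
    d8 !== undefined &&
    where.d8.includes(d8) &&
    ins.isCommitted === where.isCommitted
  );
}

// Placeholder program ID and descriptor bytes.
const swapFilter: InstructionFilter = {
  programId: ['ExampleAmmProgram'],
  d8: ['0x0102030405060708'],
  isCommitted: true,
};
const swapLike: RawInstruction = {
  programId: 'ExampleAmmProgram',
  data: new Uint8Array([1, 2, 3, 4, 5, 6, 7, 8, 42, 0]),
  isCommitted: true,
};
console.log(matchesFilter(swapLike, swapFilter));
```

Filtering on a fixed-length prefix is what lets the data source select swap instructions without decoding the rest of the payload.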
Through iterating over blocks and their instructions, we identify transactions matching the Whirlpool program ID and signature. Each relevant transaction initiates the creation of an Exchange entity, encapsulating critical data such as transaction ID, slot, and timestamp.
let exchanges: Exchange[] = [];

for (let block of ctx.blocks) {
  for (let ins of block.instructions) {
    if (
      ins.programId === whirlpool.programId &&
      ins.d8 === whirlpool.swap.d8
    ) {
      let exchange = new Exchange({
        id: ins.id,
        slot: block.header.slot,
        tx: ins.getTransaction().signatures[0],
        timestamp: new Date(block.header.timestamp),
      });
Decoding inner instructions requires an understanding of Solana's nested data structures. By verifying the number of inner instructions and decoding them, we extract source and destination transfer details, alongside the associated token balances. This step is crucial for accurately capturing the information about swap transactions.
The complexity of Solana's nested data structures contrasts with the more linear transaction model of Ethereum. However, this complexity allows for a more dynamic interaction model within transactions, enriching the ecosystem with multifaceted transaction patterns.
The loop above checks that the instruction has the correct program ID and swap descriptor, and creates an Exchange entity. The next step decodes the inner instructions:
      assert(ins.inner.length === 2);
      let srcTransfer = tokenProgram.transfer.decode(ins.inner[0]);
      let destTransfer = tokenProgram.transfer.decode(ins.inner[1]);
      let srcBalance = ins
        .getTransaction()
        .tokenBalances.find(
          (tb) => tb.account === srcTransfer.accounts.source
        );
      let destBalance = ins
        .getTransaction()
        .tokenBalances.find(
          (tb) => tb.account === destTransfer.accounts.destination
        );
After checking that inner instructions have the correct length, we unpack inner transfer instructions and initialise srcTransfer and destTransfer respectively.
We retrieve the source and destination balances from the token balance records of the transaction that executed the instruction. Each record is found by matching its account against the source and destination accounts of the inner transfer instructions we decoded before.
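For readers wondering what decoding a transfer involves, here is a hand-written sketch for the plain SPL Token Transfer instruction: a one-byte instruction tag (3) followed by the amount as a little-endian u64, with the accounts ordered source, destination, authority. It is not the decoder used in the snippets above, and the type names and account labels are illustrative.

```typescript
// Illustrative shapes; not the SDK's generated types.
interface RawTokenInstruction {
  accounts: string[];
  data: Uint8Array;
}
interface DecodedTransfer {
  accounts: {source: string; destination: string; authority: string};
  data: {amount: bigint};
}

const TRANSFER_TAG = 3; // index of Transfer in the SPL Token instruction set

function decodeTransfer(ins: RawTokenInstruction): DecodedTransfer {
  if (ins.data.length < 9 || ins.data[0] !== TRANSFER_TAG) {
    throw new Error('not an SPL Token Transfer instruction');
  }
  if (ins.accounts.length < 3) {
    throw new Error('Transfer expects source, destination and authority accounts');
  }
  // Respect byteOffset in case the array is a view into a larger buffer.
  const view = new DataView(ins.data.buffer, ins.data.byteOffset, ins.data.byteLength);
  const amount = view.getBigUint64(1, true); // u64, little-endian, after the tag byte
  return {
    accounts: {
      source: ins.accounts[0],
      destination: ins.accounts[1],
      authority: ins.accounts[2],
    },
    data: {amount},
  };
}

// 1_000_000 = 0x0f4240, written little-endian after the tag byte.
const exampleTransfer = decodeTransfer({
  accounts: ['userTokenAccount', 'poolVault', 'userWallet'],
  data: new Uint8Array([3, 0x40, 0x42, 0x0f, 0, 0, 0, 0, 0]),
});
console.log(exampleTransfer.data.amount);
```

The account labels come purely from position in the array, which is why the decoder has to know the instruction's layout in advance.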
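Next, we get the source and destination mints from the source and destination balances respectively. If a balance record is missing, we fall back to the record of the account on the other side of the same transfer.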
      let srcMint =
        srcBalance?.mint ||
        ins
          .getTransaction()
          .tokenBalances.find(
            (tb) => tb.account === srcTransfer.accounts.destination
          )?.mint;
      let destMint =
        destBalance?.mint ||
        ins
          .getTransaction()
          .tokenBalances.find(
            (tb) => tb.account === destTransfer.accounts.source
          )?.mint;
As you can see, compared to Ethereum there is a lot more nesting of data going on due to Solana’s complex data organisation and execution flow.
      assert(srcMint);
      assert(destMint);
      exchange.fromToken = srcMint;
      exchange.fromOwner =
        srcBalance?.preOwner ||
        srcBalance?.postOwner ||
        srcTransfer.accounts.source;
      exchange.fromAmount = srcTransfer.data.amount;
      exchange.toToken = destMint;
      exchange.toOwner =
        destBalance?.postOwner ||
        destBalance?.preOwner ||
        destTransfer.accounts.destination;
      exchange.toAmount = destTransfer.data.amount;
      exchanges.push(exchange);
    }
  }
}
Finally, we unpack the remaining pieces of data from the dest and source balances and save the exchange.
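The whole decoding step can be tried out without a running processor. The sketch below reproduces the balance lookup, the mint fallback and the owner fallback from the walkthrough on mock data. All account names, mints and amounts are made up, and the types are simplified stand-ins for the SDK's.

```typescript
// Simplified stand-ins for the decoded transfer and token balance records.
interface MockTransfer {
  accounts: {source: string; destination: string};
  data: {amount: bigint};
}
interface MockTokenBalance {
  account: string;
  mint: string;
  preOwner?: string;
  postOwner?: string;
}
interface ExchangeRecord {
  fromToken: string; fromOwner: string; fromAmount: bigint;
  toToken: string; toOwner: string; toAmount: bigint;
}

function toExchange(
  src: MockTransfer,
  dest: MockTransfer,
  balances: MockTokenBalance[]
): ExchangeRecord {
  const byAccount = (account: string) => balances.find((tb) => tb.account === account);
  const srcBalance = byAccount(src.accounts.source);
  const destBalance = byAccount(dest.accounts.destination);
  // Fall back to the other side of the same transfer when a record is missing.
  const srcMint = srcBalance?.mint || byAccount(src.accounts.destination)?.mint;
  const destMint = destBalance?.mint || byAccount(dest.accounts.source)?.mint;
  if (!srcMint || !destMint) throw new Error('mint not found in token balances');
  return {
    fromToken: srcMint,
    fromOwner: srcBalance?.preOwner || srcBalance?.postOwner || src.accounts.source,
    fromAmount: src.data.amount,
    toToken: destMint,
    toOwner: destBalance?.postOwner || destBalance?.preOwner || dest.accounts.destination,
    toAmount: dest.data.amount,
  };
}

// Mock swap: the user sends USDC to the pool and receives SOL from it.
// The user's SOL token account has no balance record, so both fallbacks apply.
const mockExchange = toExchange(
  {accounts: {source: 'userUsdc', destination: 'vaultUsdc'}, data: {amount: BigInt(100)}},
  {accounts: {source: 'vaultSol', destination: 'userSol'}, data: {amount: BigInt(5)}},
  [
    {account: 'userUsdc', mint: 'USDC', preOwner: 'user', postOwner: 'user'},
    {account: 'vaultSol', mint: 'SOL', preOwner: 'pool', postOwner: 'pool'},
  ]
);
console.log(mockExchange);
```

Keeping this logic in a pure function like this makes it easy to unit test against recorded transactions before wiring it into the processor loop.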
On a higher level, the process is similar to indexing, for example, Uniswap. The SDK provides a familiar experience, making sure that switching to the new network architecture is seamless. At the same time, we ensure that the Solana specifics are handled correctly.
Solana, with its unique approach of allowing multiple contract calls in a single transaction, offers a treasure trove of data and flexibility that stands apart from the familiar territory of Ethereum's EVM. This isn't just about technical prowess; it's a testament to the evolving narrative of blockchain's potential.
Indexing Orca Whirlpool is a great example of using Subsquid’s new SDK to bring top-tier indexing to Solana. It showcases the power of Subsquid’s toolkit in making sense of Solana's complex and data-heavy structure. This effort ensures that the decentralized finance landscape remains a place where transparency and accessibility aren't just ideals but realities across all networks.
We’re looking forward to seeing what developers on Solana will build with Subsquid. If you have feedback or questions for us, don’t hesitate to get in touch!