在之前关于以太坊存储的系列博文中(见最后的完整文章列表),我们解释了以太坊存储对于具有固定大小的变量的简单合约是如何工作的。
在这篇博文中,我将继续解释像map和array这样的动态大小的变量是如何工作的,我们还将通过现实世界的例子来解释和测试案例来验证公式。
动态大小的数据结构
以太坊在32字节宽的槽中存储值。槽是由一个uint256数字来索引的。
对于静态大小的变量,它们是按顺序存储在槽中的。
注意,槽位索引不是一个全局索引,而是在智能合约内的相对索引。每个合约从槽位0开始依次存储变量(槽位索引为0)。
然而,对于像地图或数组这样的动态大小的数据结构,其项目的存储位置不能按顺序存储在槽中。相反,项目的槽位是根据地图中的项目键或数组中的项目索引,使用一些公式计算出来的。
让我们逐一分析。
地图中键值对的槽位索引
对于地图来说,每个项目(键值对)的槽位索引的计算方法是:。
- (keyN, valueN) is the key value pair in the map variable
- ++ means string concatenation.
- M is the slot index for the map variable.
- keccak is the hash function
- The following diagram shows how the key-value pair is stored in slots.
Let’s look at a real world example and verify the above formula.
The USDC contract uses a map variable to store each USDC holder’s balance. The key is the USDC holder’s address, the value is a uint256 value as the holder’s USDC balance:
How to get the storage location for a certain account’s USDC balance? For instance the USDC balance of account 0x467d543e5e4e41aeddf3b6d1997350dd9820a173. Let’s just call this account, account K.
According to the formula, in order to know the slotIndex for the this account, we need to know the map key, which is the account address 0x467d543e5e4e41aeddf3b6d1997350dd9820a173.
And we need to know the slot index for the map variable, which is 9. How do we know that? I will explain later.
With these data, let’s compute the slot index for the account’s USDC balance:
Note that both the key and the slot index for the map variable needs to be 32 bytes long left padded with zeros before passing to the keccak hash function.
The computed slot index for account 0x467d543e5e4e41aeddf3b6d1997350dd9820a173is 0x4065d4ec50c2a4fc400b75cca2760227b773c3e315ed2f2a7784cd505065cb07.
With the slot index, we can query the proof from a full node using eth_getProof method and verify it:
- 0xa0b86991c6218b36c1d19d4a2e9eb0ce3606eb48 is the USDC contract address.
- 0x4065d4ec50c2a4fc400b75cca2760227b773c3e315ed2f2a7784cd505065cb07 is the slot index of the USDC balance of account 0x467d543e5e4e41aeddf3b6d1997350dd9820a173.
With the Merkle proof, we can verify it by creating a Merkle trie in local and traversing the path from the root node with contract storage root hash, which is included in the proof, all the way to the leaf node. We have explained this in previous blog post. The detail can be found in the source code linked at the end of this blog post.
The storage value is 140080592961691. After verifying the proof, if it’s valid, then we know the USDC balance, which is 140,080,592.961691 USDC token at block 15245000 (0xE89EC8).
Finding the slot index for a map variable
OK. Let’s go back to the previous question, how did we know the slot index for the map variable balances is 9?
We could use a brutal force approach to find it.
Since contract variables are stored in each slot sequentially, we could iterate through each slot index starting from 0. For each slot index, we assume the target map variable is stored in that slot, and use the formula to get the slot index for the balance of an account, and then use the eth_getStorageAt method to get the value stored at that position within the contract.
If the slot index is incorrect, then it will return 0, meaning nothing is stored at that location, and we should try another slot index.
If the slot index is correct, then it should return the balance, and we can simply verify it against the balanceOf interface method call.
This is one way to find slot index for a state variable. Although not very efficient, it’s probably enough, since a smart contract can’t define too many variables. And you only need to iterate through once, once you have found the slot index, you can save it to use for next time, since smart contract can’t be updated after deployed.
OK, we’ve explained how dynamic sized variable like map stores its items (key-value pairs). Next let’s take a look another dynamic sized data structure — Array.
Slot Index for array item
For array variable, the slot index for each array item is calculated as:
- A is the slot index of an array variable
- itemIndex is the index of the item.
- itemSize is the number of word (32 bytes) each item takes.
- The following diagram shows how array items are stored in contract slots, if each array item is 64 bytes long, which takes 2 slots to store:
And the array length is stored at the slot index of the array variable, and then all the array items are stores sequentially from the slot at the hash of the slot index of the array.
Let’s look at another real world example and verify the above formula.
This time, we will dive into the popular NFT contract — CryptoKitties.
CryptoKitties smart contract uses an array variable, called kitties, to store the meta data for each crypto kitty:
We will explain the Kitty data structure later.
The slot index for the kitties variable is 6, which can be found using the same approach I explained earlier.
Now we can query the kitty count, which is the length of the kitties array from slot 6 using eth_getProof.
- 0x06012c8cf97BEaD5deAe237070F9587f8E7A266d is CryptoKitties’s Core contract address, which stores the kitties variable
- 0x6 is the slot index of kitties in hex format.
It returns the storage value as 0x1eb82c, meaning there are a total of 2013228 kitties at block 15244590 (0xE89D2E).
Storage Location for each crypto kitty’s meta data
Now let’s look at how to get each kitty data from the kitties array.
Kitty is a struct that takes two words, meaning, each kitty data will take two slots to store. The first slot stores the kitty genes, which is a uint256 number, and the second slot stores the rest of the kitty’s properties:
Since each array item is stored sequentially starting from the first item, we can calculate that Kitty 1, for instance, will be stored at slot index: 1.
- 0000000000000000000000000000000000000000000000000000000000000006 is the slot index for the kitties variable in hex format left padded with zeros
- 1 is the kitty 1’s index in kitties array;
- 2 is the size of the Kitty struct.
- + is the number addition operation.
So the slot index for Kitty 1 is 111414077815863400510004064629973595961579173665589224203503662149373724986689 , a very big number.
With the slot index, we can request proof for the storage data the same way as for the USDC account balance example explained earlier:
- 0x06012c8cf97BEaD5deAe237070F9587f8E7A266d is CryptoKitties’s Core contract address.
- 0xf652222313e28459528d920b65115c16c04f3efc82aaedc97be59f3f377c0d41 is the hex format of the slot index for Kitty 1.
- 0xE89D2E is block 15244590 .
The above query should show the storage value for Kitty 1 at the first slot is 0x5ad2b318e6724ce4b9290146531884721ad18c63298a5308a55ad6b6b58d. This is the hex format of the Kitty 1’s genes — 626837621154801616088980922659877168609154386318304496692374110716999053, which is a uint256 number.
If we use CryptoKitties’ getKitty method to query Kitty 1’s genes on Etherscan, it should return us the same number.
Summary
In this tutorial, we explained how dynamic sized variable is stored in smart contract, such as map and array. And how to query the contract variable from storage and how to verify the proof for the storage value.
This is the last blog post of the full series about Ethereum storage. This is the full list of blog posts:
- Merkle Patricia Trie Explained
- Verify Ethereum Account Balance with State Proof
- EIP 1186 — the standard for getting account proof
- Verify Ethereum Smart Contract State with Proof
- Verify USDC Balance and NFT Meta Data with Proof
References
Source code
The source code of the above code and test cases are open sourced as below:
- Test cases for verifying USDC balance of an account with proof
- Test cases for verifying CryptoKitties Meta data with proof