“Merry Go Round” sync


Here I propose a new approach to syncing the Ethereum state data that I’m tentatively naming “Merry-Go-Round Sync”, or MGR sync. MGR aims to provide a new mechanism for syncing the Ethereum state that exhibits BitTorrent-style swarming behavior and mitigates against other uses of the state syncing network primitives.

At a high level, MGR operates by enumerating the full state in a predetermined order and gossiping this data among the clients which are actively syncing. For a client to fully sync it needs to “ride” one full rotation of the merry-go-round.

This is aimed at just being a rough write-up. It doesn’t get into specifics in many places, mainly because there are no firm specifics at this stage, only the high level concept and rough ideas of how it would work.

State Enumeration

MGR requires a well defined way to enumerate the state data such that:

  1. Minimize duplication: During a single revolution of the MGR there should be little or no duplication of data.
  2. Zero coordination: There should be no coordination required for clients to know what part of the state is being distributed at any given time.
  3. Flexible with respect to state size: For a chain with a very small state, each revolution of the MGR should be very quick. For a chain with a very large state, each revolution would take longer.

No obvious solution has been posed that fits all three of these criteria well. Here is my best idea thus far.

Use the block hash

In this model we would have some concept of a sync epoch. At each epoch boundary, the block hash of the boundary block would determine the location in the trie at which enumeration should begin. The state data would be enumerated starting at that location and working outwards until the next epoch boundary.

Mitigating against re-sending duplicate data when there are two epochs with block hashes that are close together in the trie would require something like tracking and broadcasting what ranges of the trie each client has fully synced.

This approach does not require any coordination for determining where to enumerate the trie, but it does require some level of coordination for clients to keep track of which of their peers has which data. This might be acceptable.

This approach seems to scale well with different state sizes.

One point of note is that under this model there may not be a clear definition of “one revolution”, because two clients may ingest data at different rates, meaning they would take different amounts of time to fully enumerate the trie since the slower client would make less progress during each epoch. This might actually be a good thing.
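The epoch mechanics above can be sketched roughly as follows. `EPOCH_LENGTH` and both function names are my assumptions for illustration, not part of any spec:

```python
EPOCH_LENGTH = 256  # hypothetical number of blocks per sync epoch


def epoch_boundary(block_number: int) -> int:
    """Block number of the epoch boundary governing this block."""
    return block_number - (block_number % EPOCH_LENGTH)


def enumeration_start(boundary_block_hash: bytes) -> bytes:
    """The 32-byte boundary block hash doubles as a position in the
    (keccak-keyed) account trie keyspace.  Enumeration begins at this
    key and works outwards until the next epoch boundary."""
    assert len(boundary_block_hash) == 32
    return boundary_block_hash
```

Because every client sees the same boundary block hash, they all derive the same starting point with zero coordination.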


This likely belongs in a separate protocol from the existing DevP2P ETH protocol. For the sake of simplicity we’ll assume this is a new DevP2P protocol, MGR. Currently we would require the benevolent participation of fully synced nodes.

We would need something like the following commands.


Allow clients to broadcast information to each other. This should probably contain forkid as well as metadata about what parts of the state the client has already synced.


A data packet that includes a provable chunk of the state. Assuming we use the witness spec, it may be that the “top” level of the tree is only sent around occasionally and that sub-trees are the primary unit of distribution in between re-broadcasts of the top levels of the tree.

This message would be gossiped around the network…
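As a sketch, the two commands described above might look like the following. The message names and fields are my assumptions, not a wire spec:

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Status:
    """Broadcast to peers: fork identity plus sync progress, so peers
    can avoid re-sending trie ranges this client already has."""
    fork_id: bytes
    # Fully synced trie ranges, as (start_key, end_key) pairs.
    synced_ranges: List[Tuple[bytes, bytes]] = field(default_factory=list)


@dataclass
class StateData:
    """A provable chunk of state: leaves from a sub-tree plus the
    witness nodes needed to verify them against a recent state root."""
    state_root: bytes
    proof_nodes: List[bytes]
    leaves: List[Tuple[bytes, bytes]]  # (trie path, encoded value)
```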

Getting a full picture of the state

One problem that arises with any state syncing approach is that the state is constantly changing which makes it difficult to get a full picture of the state within the garbage collection window that most clients use to discard old trie data.

Beam sync addresses this; however, for clients to rely on beam sync we will need witnesses propagating around the network.

A “DH7” style argument for incentives in a stateless network


@pipermerriam wrote:

Original HackMD document

Argument for Incentives

Based on DH7 from: https://www.lesswrong.com/posts/FhH8m5n8qGSSHsAgG/better-disagreement

This is not an actual argument for incentives but rather me trying to understand the argument for incentives. I attempt to make the strongest argument for incentives in a stateless network, and in doing so I found some useful insight which I want to share. This is mostly a loose stream of thought and has not been wordsmithed.

The Ethereum network has a lot of data.

  • 40 GB of account data
  • 100 GB of chain data

By their nature, stateless clients are leechers. Beyond simple block verification, which scales fine via gossip, they cannot expose the standard client functionality without making on-demand requests for data from the network. Some nodes may be semi-stateful, retaining a cache of recently seen or requested state that allows them to serve some requests; however, for the purposes of this document we can treat the majority of such semi-stateful nodes roughly the same as fully stateless ones, since they would still exhibit similar needs, just on a potentially smaller scale.

For the purposes of this document we will be exploring the idea of a network dominated by stateless clients which run on lightweight hardware and retain very little state. These clients would expose the maximum possible JSON-RPC API, specifically supporting eth_call, which requires on-demand access to arbitrary accounts and storage from the state. These clients would also participate in relaying transactions, requiring on-demand access to a broad range of account balances and nonces.

Asymmetry in the Network

For now, we make no assumptions about the network topology. This means we won’t start with assuming continued use of the DevP2P ETH network topology, nor do we assert any new network topology.

However, what we can observe is that there will be some number of “full” nodes which have data, and a much larger number of “stateless” nodes which need the data. The full nodes have some combined capacity C and the stateless nodes have some combined need N. In the event that N > C the network can no longer meet the demands of the stateless clients.

The situation is more complex than this, because even in the case where N <= C the full nodes are still paying a cost to serve this data. This cost manifests as non-trivial disk I/O and CPU overhead. For a miner, this can have a direct negative impact on their earnings.

Natural tendency towards selfishness

In such an asymmetric network, stateless clients are consuming a limited “public” resource, and the full nodes providing that resource will naturally exhibit self-preserving behavior. Thus, the “tipping” point where demand exceeds capacity is likely much lower than the theoretical tipping point.

Other factors that play into this are basic client defaults. Currently it isn’t easy for a client to be selfish in a manner not already codified by the client developers, and the network is dominated by a single client, go-ethereum, which naturally has an incentive to make sure it shares well enough with itself.

If we think ahead to a network that supports statelessness, where clients are easier to write and maintain and there are a larger number of different clients operating on the network, it becomes more likely for one or more clients to exhibit behaviors that take advantage of the network, as well as clients that facilitate more selfish behavior by sharing less data than they could.

The natural end result is a network comprised of stateless nodes which take without giving back and full nodes which act selfishly with their data. In order to change the outcome, we have to change the behavior.

Changing Selfish Behavior

In decentralized networks we will tend to have little direct control over how people behave. Our only areas of concrete control are in the bounds placed by the protocol rules.

Behavior changes thus come either from internal or external motivation. Internal motivations can be considered but seem unreliable, since we expect the network participants to be a diverse group with different and often disparate internal motivations. This leads us to external motivators, where there is potentially easier common ground.

  1. People are universally lazy, meaning that we can reliably expect most to take the path of least resistance.
  2. People are motivated by money, meaning that people will offer a service if they are adequately compensated for it and people will pay for a service if it provides enough value.

If full nodes are hard to run…

If a full client is hard to run and maintain, then the lazy option is the one that preserves the client’s ability to function. Hard-to-run full clients naturally result in fewer of them.

In this mode, very few will serve data if it is off by default, and eventually most will turn the feature off if it negatively affects their client’s performance, which would likely occur in the network described above.

If full nodes are easy to run…

If a full client is easy to run and maintain… then we’ve solved a problem we don’t know how to solve yet, and we don’t need stateless clients.

If contribution can happen beyond full nodes…

If we lower the barriers to contributing to the data network then … why do we need incentives??? It seems that as long as you constrain contribution to the data network to full nodes, there remains a reasonable argument that incentives are necessary. However, once you open up the option for stateless nodes to make meaningful contributions back to the network, the asymmetry in the network goes away, and the argument that financial incentives are necessary no longer holds.

Failure case for the homogeneous network

The “homogeneous” network fails in the same way as the asymmetric network: there is still some “capacity” and some “demand”, and when demand outpaces capacity the network can no longer serve requests. To examine this, we need to consider the expected behaviors of network participants.

  • Full Nodes: stay online, maybe serve some data or even a lot of it.
  • Stateless and Semi-Stateless: some stay online for long periods. Some may hop on and off quickly.

I assert that the ones that stay online for a long time will predominantly have more capacity to serve data than they consume. There will be actors that consistently consume more than they provide, but I would expect those to be in the minority, since such a node would be less performant than a full node, and the use cases which would put such a node under constant demand are those that typically have inherent incentives to pay the costs of running a full node (backend infrastructure).

Thus, the leechers are those that jump on the network briefly and then exit. It seems that the very nature of this type of node is such that its consumption would naturally be quite low. Examples are quickly performing one-off operations like checking a balance, issuing a transaction, or reading some state. All of these are pure leeching behaviors, but they all exhibit a very low usage footprint.

So for this network to fail, it would need to be dominated by leecher nodes which consume more than they provide. It seems this would involve a massive number of transient stateless nodes, or a large number of stateless nodes with consistently high usage. Neither of these situations seems likely…


DHT based solution for the “Data Retrieval” problem under the stateless paradigm


@pipermerriam wrote:

I’m providing both a link to the working document and a copy of its current state. The document is likely to continue to evolve. The current state hasn’t been wordsmithed or really edited much for easy digestibility, but rather is a loosely structured dump of my thoughts on the subject.

Living document: https://notes.ethereum.org/g-npI1x7QRqoSzclvkLYzw?both

DHT Ethereum Data Network

An idea to try and use a single DHT for data storage.

TODO: This paper looks promising: https://arxiv.org/pdf/1608.04017.pdf

Data Structures

  • Chain Data:
    • headers
    • blocks (bundles of transactions and uncles)
    • receipts (bundled by block)
    • witnesses (need to be referenced by header)
  • State Data
    • accounts
      • referenced by path?
      • bundled by tile?
    • contract storage
      • dunno yet, tiling?
    • contract code
      • dunno yet
  • Canonical Chain Metadata
    • canonical header chain (trie of tries proof)
    • block-number-to-canonical-header-hash (trie of tries proof)
    • transaction-hash-to-canonical-header-hash (trie of tries proof)

DHT Keyspace

Data from different chains needs to be namespaced differently?



TODO: is forkid suitable for this? Probably yes for old forks and maybe no for new forks, since the forkid changes as soon as there is a pending new fork; need a better mechanism…

  • Chain Data:
    • Headers: <prefix>/header/<block-hash>
    • Blocks: <prefix>/block/<block-hash>
    • Receipts: <prefix>/receipt/<block-hash>
    • Witnesses: <prefix>/witness/<block-hash>
  • State Data:
    • Accounts: <prefix>/account/<??tile-identifier??>
      • TODO: stateroot based granularity?
    • Contract Code: <prefix>/contract-code/<code-hash>
      • Is this an acceptable way to do contract code that handles deduplication?
      • How do we mitigate against spam (we only want to store code that is part of at least one contract…)
    • Account Storage: <prefix>/storage/<address>/<slot>
      • TODO: stateroot base granularity?

A possible alternate keyspace for state.

  • <prefix>/state/<block-hash>/<address>/account
  • <prefix>/state/<block-hash>/<address>/code
  • <prefix>/state/<block-hash>/<address>/storage/<slot>
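A tiny helper showing how the alternate keys would be assembled for one account; the function name and argument shapes are purely illustrative:

```python
from typing import List


def state_keys(prefix: str, block_hash: str, address: str,
               slots: List[str]) -> List[str]:
    """Build the per-block state keys for a single account under the
    alternate keyspace: account record, code, and one key per slot."""
    base = f"{prefix}/state/{block_hash}/{address}"
    keys = [f"{base}/account", f"{base}/code"]
    keys.extend(f"{base}/storage/{slot}" for slot in slots)
    return keys
```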

These give state-root-level granularity but result in high key churn: something like a few thousand new keys for each block, many of which will not have their values changed…

Mapping keys to storage locations…

One desired property of the function which maps content to locations on the network would be for it to facilitate nodes storing chunks of the data.

  • For headers/blocks/receipts/witnesses this would mean having data for subsequent block numbers map onto adjacent-ish locations in the network.
  • For accounts we can accomplish this pretty easily by distributing the whole account range evenly across the network keyspace.
    • This is potentially attackable by someone mining addresses that are close together and creating a heavy spot in the key space…
  • For contract storage we’d want the keyspace for all contract storage to map evenly across the entire keyspace which isn’t trivial since contract storage isn’t evenly distributed…

Whatever the mechanism, we want uniform distribution across the entire address space…

Rendezvous hashing looks promising as a simple, well-tested way to map data into multiple locations across the network. Multiple locations might not be necessary, but it seems like a nice mechanism to encourage redundancy.
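A minimal sketch of rendezvous (highest-random-weight) hashing as it might apply here, using SHA-256 purely for illustration: every node deterministically scores the key, and the top `count` scorers are the storage locations, so all participants compute the same locations with no coordination. Adding or removing a node only remaps the keys that node wins.

```python
import hashlib
from typing import Iterable, List


def rendezvous_locations(key: bytes, node_ids: Iterable[bytes],
                         count: int = 3) -> List[bytes]:
    """Return the `count` node ids responsible for storing `key`,
    chosen by highest-random-weight (rendezvous) hashing."""
    def score(node_id: bytes) -> int:
        # Each (node, key) pair gets a deterministic pseudo-random weight.
        return int.from_bytes(hashlib.sha256(node_id + key).digest(), "big")

    return sorted(node_ids, key=score, reverse=True)[:count]
```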

Deciding who stores what…

One aim is for the network to have as simple a model as possible for participation. That means that writing a client for this network should be as simple as possible.

Much of what follows may in many ways mirror the functionality of the SWARM network.

Content in the network is comprised of a key and a value. The key would be one of the namespaces above. Each key maps to a set of storage locations in the network: s0, s1, s2, ....

Each node’s public key denotes its position in the network. The prescribed behavior of each node is as follows.

  • Each node maintains a “Data Store” which is tied to their NodeID: D
  • Each data store has a capacity C measured in bytes.
  • The current total size of all data in the data store (excluding key sizes) is S
  • For each key k, let r = radius(k, nodeid), where radius(k, nodeid) calculates the distance between the node’s location and the closest storage location for the given key.
  • Each time a new piece of data with key k and raw bytes value v is encountered:
    • If S + len(v) <= C: store the data
    • If the radius of the new data is smaller than any existing data in the data store, then the data store evicts the items with the largest radius until there is room for the new item.
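A sketch of the storage/eviction rule above, simplified to a single storage location per key and XOR distance as the radius metric (both of which are my assumptions):

```python
class DataStore:
    """Radius-bounded data store: keep the data whose storage location
    is nearest our node id, evicting the farthest-away items first."""

    def __init__(self, node_id: int, capacity: int):
        self.node_id = node_id
        self.capacity = capacity   # C, in bytes
        self.items = {}            # the data store D: key -> value
        self.size = 0              # S: total stored bytes (values only)

    def radius(self, key: int) -> int:
        # Distance from our node id to the key's storage location,
        # simplified here to XOR distance to the key itself.
        return self.node_id ^ key

    def offer(self, key: int, value: bytes) -> bool:
        """Apply the storage rule to a newly encountered (key, value):
        store if there is room; otherwise evict largest-radius items,
        but only while the new item is closer than what it displaces."""
        if key in self.items:
            return True
        while self.size + len(value) > self.capacity:
            if not self.items:
                return False  # value alone exceeds capacity
            worst = max(self.items, key=self.radius)
            if self.radius(worst) <= self.radius(key):
                return False  # new item is no closer; reject it
            self.size -= len(self.items.pop(worst))
        self.items[key] = value
        self.size += len(value)
        return True
```

The effect is that each node converges on holding the slice of the keyspace closest to it, without any explicit range assignment.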

Each client will also need to maintain some global state of the network. This is done by tracking the header chain. This global network state is vital for validating the data that is flowing into the storage network.

  • Headers: standard header validation done by eth clients.
  • Blocks: validate transactions and uncles against header
  • Receipts: validate against header receipt root
  • Witness: validate against hash from header
  • Account State: validate against state root from header.

All of headers/blocks/receipts/witness should be kept indefinitely.

Account state, however, may need to behave slightly differently. It would be ideal if this network could operate as an archive node; however, that is complex. Since nodes don’t have a full picture of the state trie, they will need to store proofs about the accounts they are tracking. That means naive storage will involve a lot of duplication across the various proofs.

TODO: can we use a different hash function for account storage so that each network node maps to a single location in the account trie and the node instead stores accounts that are near them. This allows for effective range queries across accounts that can be distributed across multiple nodes.

TODO: how do we do contract storage? Unification of the storage tries would make this a lot easier. The goal would be a simple mechanism that allowed us to map nodes onto ranges of the storage in a manner that automatically balances across the nodes of the network while easily allowing nodes to store ranges of the storage items.

Storage Ranges

This is partially covered in the mapping content to storage locations section.

We want to optimize for certain use cases:

  • Full nodes syncing
    • requests chain data in sequential ranges
    • requests state data in sequential chunks, enumerating the key space.
  • Stateless nodes
    • unknown, but likely has a totally random request pattern?

In order to optimize for efficiency, the syncing node case suggests we want nodes to store ranges of data.

  • Headers are constant size, so they can be chunked into constant size chunks.
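For instance, with a fixed chunk size the chunk holding any header follows directly from its block number; the chunk size here is an assumption:

```python
HEADERS_PER_CHUNK = 256  # assumed constant chunk size


def header_chunk(block_number: int) -> int:
    """Index of the chunk containing the header for `block_number`."""
    return block_number // HEADERS_PER_CHUNK


def chunk_range(chunk_index: int) -> range:
    """Block numbers whose headers live in the given chunk."""
    start = chunk_index * HEADERS_PER_CHUNK
    return range(start, start + HEADERS_PER_CHUNK)
```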

Blocks/Receipts/Witnesses are all variable size. There is no immediately obvious solution for efficient chunking that bounds the chunk sizes…

The various “state” values are also dynamic in size. Tiling is supposed to solve this for accounts. I don’t know how contract storage works but probably the same. Need to figure out how Tiling works.

Discarding old data:

Should/can the network operate as an archive node? or must it discard old state?

The account state data may need a different storage model than the chain data. We only really need to keep around the data for recent tries. It would be ideal if we could just keep everything, but research needs to be done to evaluate feasibility. Under the assumption that keeping everything is too expensive for the network, then nodes would need logic to discard state data from old state roots.

Inclusion Proofs via trie of tries accumulators

Use a trie of tries to maintain light proofs about chain history and efficient reverse lookups.

TODO: explain this in detail…

Data near the head of the chain can be maintained in memory. TODO: full description of what a node would keep in memory and evaluation of how hard it is to maintain those data structures, as well as how a node joining the network would build or acquire those structures.

How data is requested

A full syncing node

  • Enumerates all headers, typically by range
  • Enumerates all blocks, receipts, typically by hash, but often in sequential ranges.
  • Enumerates the account state trie.
    • at specific state root(s)
  • Requests the code for all accounts by their code hash
  • Enumerates the storage tries for all accounts
    • at specific state root(s)

A stateless node

  • Sparse requests for headers, blocks, receipts
  • Random requests for proofs about accounts and their storage
    • at a specific state root(s)
  • Random requests for the code for an account
    • by address?
    • by code-hash?
  • Random requests for index lookups (txn -> block, number -> block)

The volume of requests issued by a stateless node isn’t necessarily bounded, but there’s likely an upper bound we could consider reasonable. Maybe the protocol can take this into account?

Attacks that need to be understood and potentially mitigated

  • Malicious routing?
    • Sybil attacks which eclipse a part of the network to wall off data.
  • Ensure that routing doesn’t result in cycles
  • DOS/flooding/amplification

Things that are dealt with already?

Spam is probably not a big deal since all requests to store content should be verifiable against the header chain.

Thoughts on advertising what you have…

It is unclear whether the protocol needs the ability for nodes to be expressive about what data they have available. Reading some research documents on distributed content networks, it appears that there are formal mechanisms for this that can also be used to do things like provide anonymity; see this paper.

Thoughts on ensuring everyone’s view of nodes is homogeneous

Worth stating that an explicit goal of this network is for nodes to be treated homogeneously. A “full” node could participate in this network alongside a fully stateless node, and, all other things being equal (hardware, bandwidth), the two nodes should be interchangeable from the perspective of any other peer; a peer should not be able to reliably discriminate between the two (ignoring heuristics based on latency or something similar).

“Infinite” Scaling With One Benevolent Full Node

The network provides no guarantees about data persistence. In fact, it is expected that some pieces of data will occasionally go missing. An explicit goal is to have the network operate in such a way that a single “benevolent” node with access to the full node data could monitor the data network for missing data and then push that data back into the network to patch the holes.
