How to build an Ethereum mining pool | by Ivan Bogatyy | Dragonfly Research

Ivan Bogatyy

Mining pools are major power players in the Ethereum ecosystem. With miner-extractable value (“MEV”) growing exponentially, the passing of EIP-1559 and the upcoming merge, they have become louder and increasingly important actors in the ecosystem.

For the uninitiated: mining pools are software providers who enable many mining machines to pool together their mining power and share rewards. Mining pools are essential in PoW mining on two levels: first, because earnings for individual miners are highly volatile, and second, because setting up the software infrastructure around mining is increasingly complex. By pooling resources, individual miners can lower variance and have a more predictable business.

But with this power comes great responsibility, and mining pools hold a lot of power. This is because mining pools ultimately decide which blocks get worked on by their miners and which transactions are included in those blocks. Mining pools decide on what MEV gets extracted and who gets to extract it, they vote on the gas limit, and they take part in major political battles. That’s why it’s essential to Ethereum culture that the barrier to entry for mining pools be as low as possible, to maximize decentralization.

When MiningDAO set out to build our own independent pool, we were surprised to find that it was incredibly challenging! There’s very little open and publicly shared info on how to run a competitive mining pool, and a lot of the open-source software is out of date. So we figured: let’s fix that by releasing an open-source, step-by-step guide.

Building a pool consists of two parts: (1) setting up a full node client with good peer-to-peer networking and fast processing speed, and (2) connecting the full node to pool software that manages hashrate and distributes workload across all the miners. Here, we’ll cover both.

This guide comes from our first-hand experience building the MiningDAO.io pool, and outlines how we brought our uncle rates from 10%-14% down to approximately 4%-5%, on par with or better than some top-10 pools.

Bandwidth is important as well. It is preferable to co-locate as close as possible to other nodes, to receive new blocks as soon as possible. We recommend cloud-hosted dedicated machines on services popular with other pools: OVH and Hetzner in Europe, Alibaba and AWS in Asia.

For comparison, we did some small-scale experimentation with Parity-2.7.2 (latest stable branch before the OpenEthereum refactoring) and OpenEthereum, but both had poor results with block import times and block production times, leading to unacceptably high uncle rates. We welcome anyone to perform a more thorough A/B test and reach out to us with more data, but at the current stage we simply recommend Geth.

Here are the flags we use:

geth --datadir=/ssd/gethdata --syncmode=fast --cache=21000 \
  --maxpeers=250 --txpool.globalslots=1000 \
  --http --http.api=eth \
  --miner.etherbase='0xADDRESS' --mine --miner.threads=0 \
  --miner.extradata='MiningDAO' \
  --miner.notify='http://127.0.0.1:8107' &>> ~/geth-log.txt

Here --cache=21000 allocates 21 GB of RAM for in-memory state storage (the most Geth can handle). The remaining flags are explained below.

More importantly, the modifications to the Geth code we will describe below can be found here as a repo to download, or here as a patch to apply.

When another pool mines a fresh block (say at height N), any other blocks at height N are likely to become uncles. So whenever a new block is found, Geth instantly switches the miners’ jobs to mine an empty block at height N+1. This empty block does not have transaction fees, but that is still better than mining blocks destined to become uncles. Subsequently, Geth constructs a “real” block at height N+1, and switches the miners’ jobs once again. Constructing such a “real” block takes time (0.1-0.3 seconds), hence the two-step process. During that 0.1-0.3 second window, miners are working on an empty block.

It might be tempting to collect all the pending transactions to maximize fees, but getting greedy with --txpool.globalslots substantially increases the amount of processing Geth has to do to construct a “real” block (up to 1 second or more). We recommend values no larger than 1000 or 2000.
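To put numbers on that trade-off, here is a rough sketch of how much of their time miners spend on the empty job for a given block construction time. It assumes the 13-second average block time and exponentially distributed block intervals used elsewhere in this guide; the function name is ours and purely illustrative.

```python
import math

BLOCK_TIME_SEC = 13.0  # average block interval assumed in this guide

def empty_job_share(build_time_sec, block_time_sec=BLOCK_TIME_SEC):
    """Share of mining time spent on the empty template.

    After every new network block, miners work on an empty block until the
    "real" block is ready or the next network block arrives, whichever comes
    first. With exponential block intervals this share is 1 - exp(-b / T).
    """
    return 1.0 - math.exp(-build_time_sec / block_time_sec)

for build_time in (0.1, 0.3, 1.0):
    share = empty_job_share(build_time)
    print(f"build time {build_time:.1f} s -> about {share:.1%} of hashrate on empty blocks")
```

Under these assumptions, stretching the construction time from 0.1-0.3 seconds to a full second moves the empty-block share from roughly 1%-2% to more than 7%, which is why a modest --txpool.globalslots tends to pay off.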

For more details on this, check out https://github.com/ethereum/go-ethereum/issues/21899

  1. when other pools produce new blocks, learn about it as quickly as possible
  2. when your pool produces a new block, propagate it as widely as possible (so others start mining on top of it)

The first step to good p2p is, as explained earlier, running your full node on a cloud server with good bandwidth next to other nodes.

Second, good bandwidth allows the node to handle more direct peers, thus reducing the number of p2p hops necessary to receive new information. The Geth flag for the number of peers is --maxpeers.

Below we will explain a few more nuanced and powerful tricks to maximize block import speed and block propagation speed.

1.4.1) Use bloXroute

Our experiments showed significant improvements as well. On a freshly-synced node with default peer settings, approximately 90% of all new blocks come from bloXroute first (and only 10% from all other peers). Even after our node has been fine-tuned to connect to top peers, still 40%-60% of new blocks come from bloXroute first.

After following the bloXroute setup tutorial, don’t forget to add the bloXroute node to the “trusted peers” set for your Geth; we will need that later. Trusted peers are pre-set nodes that Geth will always connect to, irrespective of the random peer initialization process. Trusted peers also do not count against the connection limit. Adding the bloXroute gateway to trusted peers ensures Geth will not accidentally drop that connection.

We further recommend connecting to Taichi Network. Taichi is a block propagation network developed by Sparkpool. Connecting to it can be done by adding the Taichi endpoints to the same trusted peers file.

1.4.2) Propagate your blocks aggressively

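By default, Geth pushes a newly mined block in full to only a random subset of about sqrt(n_peers) of its peers, and merely announces the block hash to the rest. That is not good enough for a pool. The first thing to do upon mining a new block is to send it to bloXroute, so that it will be forwarded to all the other participating mining pools. If the bloXroute gateway doesn’t end up in that random sqrt(n_peers) subset, your chances of getting uncled go way up!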

Next, you’d want to send the block to the highest-quality peers, and then to all the remaining peers.

We have open-sourced the following Geth patch and recommend applying it to your client. It propagates all newly mined blocks to all trusted nodes (including bloXroute), and then to all remaining peers.
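The actual patch lives in Geth's Go code; the snippet below is only a Python illustration of the ordering idea, not the patch itself. It contrasts the random sqrt(n_peers) subset with a "trusted first, then everyone else" order. Peer identifiers are plain strings here, and both function names are made up for this sketch.

```python
import math
import random

def default_full_block_targets(peers, rng=random):
    """Random sqrt(n) subset that would receive the full block by default."""
    count = int(math.sqrt(len(peers)))
    return rng.sample(peers, count)

def prioritized_order(peers, trusted):
    """Order in which to push a freshly mined block.

    Connected trusted peers come first (in the order they are listed),
    followed by every remaining connected peer.
    """
    connected = set(peers)
    trusted_set = set(trusted)
    first = [p for p in trusted if p in connected]
    rest = [p for p in peers if p not in trusted_set]
    return first + rest

peers = ["peer-a", "peer-b", "bloxroute-gateway", "peer-c"]
print(prioritized_order(peers, trusted=["bloxroute-gateway"]))
```

With 250 peers, the random subset contains only 15 of them, so a specific gateway is more likely to be left out than included; the prioritized order removes that gamble.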

1.4.3) Cultivate the most well-connected peers

In reality, not all peers are equally useful. Some have slow connections and will neither supply new blocks nor help your blocks propagate. Others, especially the nodes of other mining pools, will produce a constant stream of new block data. Following advice from Sparkpool, we tweaked our Geth to log which peers were the first to send us a new block. Collecting that data for several months allowed us to figure out the best peers to always keep connected to (via the “static”/”trusted” node settings in Geth). Here is a Python script we used to process that data and convert it into a trusted_nodes.json list that Geth can ingest.
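For readers who cannot reach the linked script, here is a minimal sketch of the same idea. The log line format is an assumption (stock Geth does not print it; it stands in for whatever your own logging patch emits), and the output is a JSON array of enode URLs, which is the shape Geth's static/trusted node files use.

```python
import json
import re
from collections import Counter

# Hypothetical log line written by a patched Geth (an assumption):
#   first_seen block=13000001 peer=enode://<node-id>@<ip>:<port>
LINE_RE = re.compile(r"first_seen block=(\d+) peer=(enode://\S+)")

def top_peers(log_lines, limit=50):
    """Count how often each peer delivered a block first; return the best ones."""
    wins = Counter()
    for line in log_lines:
        match = LINE_RE.search(line)
        if match:
            wins[match.group(2)] += 1
    return [enode for enode, _ in wins.most_common(limit)]

def to_trusted_nodes_json(enodes):
    """Serialize the peer list as a JSON array of enode URLs."""
    return json.dumps(enodes, indent=2)

sample_log = [
    "first_seen block=13000001 peer=enode://aaaa@10.0.0.1:30303",
    "first_seen block=13000002 peer=enode://bbbb@10.0.0.2:30303",
    "first_seen block=13000003 peer=enode://aaaa@10.0.0.1:30303",
]
print(to_trusted_nodes_json(top_peers(sample_log, limit=2)))
```

In practice you would feed it months of log lines and write the result to the trusted nodes file in your Geth data directory.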

Because MiningDAO is present in each geozone (North America, Europe, Asia), we have data-mined the lists of top peers for each geozone. Unfortunately, we cannot share the node IPs publicly, to avoid the risk that these nodes will be DDoSed. We can probably share them privately in response to serious inquiries with good justification. We are also happy to share our own nodes in each geozone for other pools to connect to!

We had a great experience with Miningcore for two reasons. First, it keeps all past data on disk in an SQL database, unlike open-ethereum-pool, which keeps data only in RAM via Redis. Disk storage offers stability against reboots and the ability to analyze historical data. Second, we enjoyed the highly readable, object-oriented code of Miningcore.

Ultimately, MiningDAO ended up implementing our own pool software, written in Go for speed and modeled after Miningcore. We expect to open-source our implementation soon, but in the meantime we recommend using Miningcore.

For context, there is an easy way to calculate the increase in uncle rates from any processing delay. Block arrivals follow a Poisson process (equivalently, the time between blocks is exponentially distributed), which means that no matter how long it has been since the last mined block, the probability of finding the next block in the next second (or millisecond or whatever) is always the same. For example, Ethereum targets 13-second block times, which means the probability of a block being mined in the next second is always roughly 1 sec / 13 sec ~= 7.7%. So if your mining pool has a 0.1 sec delay anywhere in the pipeline for any reason, it will have 0.1 sec / 13 sec ~= 0.77% extra uncle blocks as a result of that delay. The uncle blocks will come from that 0.1 sec period of time that your miners are working on an outdated job.
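The rule of thumb above is easy to turn into a small calculator. This sketch uses the same linear approximation (delay divided by the 13-second block time) and also inverts it, to show how much total delay you can afford for a given uncle-rate budget. The function names are ours.

```python
BLOCK_TIME_SEC = 13.0  # target block time quoted above

def extra_uncle_rate(delay_sec, block_time_sec=BLOCK_TIME_SEC):
    """Extra uncle rate (absolute, as a fraction) caused by a pipeline delay."""
    return delay_sec / block_time_sec

def max_delay_for(extra_rate, block_time_sec=BLOCK_TIME_SEC):
    """Largest total delay (seconds) that keeps extra uncles under extra_rate."""
    return extra_rate * block_time_sec

for delay in (0.05, 0.1, 0.5):
    print(f"{delay:.2f} s delay -> about {extra_uncle_rate(delay):.2%} extra uncles")

print(f"1% uncle budget -> at most {max_delay_for(0.01):.2f} s of total delay")
```

The approximation is only meant for delays that are small relative to the block time, which is the regime a pool operator cares about.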

Back to Miningcore. Using the formula above, a 0.5 second delay in updating the miners’ job will lead to 0.5 sec / 13 sec ~= 4% extra uncle blocks (absolute, not relative percentage). Naturally, such a high rate of unforced errors is unacceptable. We have experimented extensively with lowering the polling interval from 0.5 seconds down to 50 milliseconds and below, but found that setting rather unreliable: the updates were still significantly delayed.

A better solution is to make use of Geth’s notifyWork feature (the --miner.notify flag shown earlier), so that Geth proactively sends job updates to the mining pool software as soon as they appear. We patched Miningcore to support this option, and released the modification. After transitioning to notifyWork, we found that the communication delays between Geth and Miningcore became practically negligible, and our uncle rates improved significantly.
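To make the push model concrete, here is a minimal Python sketch of a receiver for those notifications. It is not the Miningcore patch. It assumes Geth POSTs a JSON array of hex strings (header hash, seed hash, target, and block number) to the URL from --miner.notify, and push_job_to_miners is a hypothetical placeholder for your pool's own job distribution.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def parse_work(body):
    """Decode a work notification.

    Assumed payload: ["<header hash>", "<seed hash>", "<target>", "<block number>"],
    all hex strings. The block number is optional in this sketch.
    """
    fields = json.loads(body)
    if not isinstance(fields, list) or len(fields) < 3:
        raise ValueError("unexpected work package")
    work = {"header_hash": fields[0], "seed_hash": fields[1], "target": fields[2]}
    if len(fields) > 3:
        work["block_number"] = int(fields[3], 16)
    return work

def push_job_to_miners(work):
    """Hypothetical hook: a real pool would fan the job out over stratum here."""
    print("new job:", work)

class NotifyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            work = parse_work(self.rfile.read(length))
        except (ValueError, TypeError):
            self.send_response(400)
            self.end_headers()
            return
        push_job_to_miners(work)
        self.send_response(200)
        self.end_headers()

def serve(host="127.0.0.1", port=8107):
    """Listen on the address used in the Geth flags above."""
    HTTPServer((host, port), NotifyHandler).serve_forever()
```

Calling serve() starts the listener on 127.0.0.1:8107. The point of the design is that the pool reacts to each new job the moment Geth produces it, instead of discovering it on the next polling tick.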

Our Geth modifications can be found here as a repo and here as a patch. Our Miningcore modifications can be found here, and a corresponding pool config file can be found here.

If you have further ideas on how to improve this setup, please send us a pull request or an email!

Miningcore speedup developed by Alexander Melnikov. Many thanks for the suggestions and ideas that came from Alex Obadia (Flashbots), Eyal Markovich and Shen Chen (bloXroute), Xin Xu and Dr. Yang Ze (Sparkpool), Chris (Flexpool) and Haseeb Qureshi (Dragonfly Capital).

