Testing Sentry Node Architecture on Radix STOKENET

Disclaimer: The Radix codebase currently does not support relaying connections to validators, which a correct sentry node architecture requires. Running a validator in such a topology could lead to missed consensus rounds and a corresponding loss of emissions, and the effect gets worse as more people use the pattern. The Radix team strongly recommends that node-runners use a standard topology with a backup full node pattern to mitigate DDoS attacks until a sentry node-like architecture is supported in the Radix node.

Introduction:

Radix is a new DeFi platform with seemingly unlimited abilities to scale. The multi-sharded distributed ledger applies a new Byzantine Fault Tolerant (BFT) consensus algorithm (“Cerberus”) and recorded an astonishing 1.4M tps in a proof of concept. On 28/07/2021 the network launched v1 of its “Olympia” mainnet, which uses an un-sharded version of the Cerberus codebase; the sharded version is to follow later.

Sentry Node Architecture is an infrastructure pattern for DDoS mitigation on validator nodes. It is used very successfully by the majority of validators on a number of DPoS networks, including Cosmos (and Cosmos SDK chains), Binance Smart Chain and Polygon.

To divert possible direct attack vectors on validator nodes, multiple distributed, non-validating full nodes (sentry nodes) are deployed in cloud environments. These sentry nodes each establish a private, direct connection to the validator node (through VPC or static routing), enabling the validator to be hosted in a very restrictive environment.

Validator nodes that peer solely with sentry nodes can either be hosted in the same remote data center as the sentry nodes (less secure), or be run in the more secure environment of a private data center, connected to the sentries via VPN or direct routing.

basic sentry node topology. source: forum.cosmos.network

Many well-established Tendermint validators use this mixed architecture of private servers in regional data centers, supported by cloud nodes spread across several cloud service providers in different regions. If a sentry node fails or is brought down by a DDoS attack, new sentry nodes can easily be booted to replace the compromised node, making it harder to impact the validator. Furthermore, validator operators can use this topology in various ways, combining the power of physically accessible hardware (key storage, hardware firewalls…) with the high connectivity and availability of distributed data center nodes.

Test:

As part of our effort at CryptoCrew Validators to build the most secure validator systems possible, we recently tested a basic sentry node architecture on an active validator setup on Radix-stokenet (testnet).

For this test we used a simple combination of 2 sentry nodes (non-validating full nodes with port 30000 open to the public for gossiping connections) and one validator node. On the validator, port 30000 was open exclusively to the private (internal) IP addresses of the sentry nodes, and the node software was bound to its internal eth1 interface address.
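The port rules of this setup can be sketched as a small allow-list check. This is only an illustration of the policy, not the firewall configuration we actually used: the function name and the two private sentry addresses are hypothetical placeholders, and only port 30000 comes from the test itself.

```python
import ipaddress

GOSSIP_PORT = 30000  # gossip port used in the test

# Hypothetical private addresses of the two sentry nodes (placeholders only).
SENTRY_PRIVATE_IPS = {
    ipaddress.ip_address("10.0.0.11"),
    ipaddress.ip_address("10.0.0.12"),
}


def accepts_inbound(role: str, source_ip: str, port: int) -> bool:
    """Return True if a node with the given role should accept the connection."""
    if port != GOSSIP_PORT:
        return False  # only the gossip port is considered in this sketch
    if role == "sentry":
        return True  # sentries expose the gossip port to the public
    if role == "validator":
        # the validator only accepts the private addresses of its sentries
        return ipaddress.ip_address(source_ip) in SENTRY_PRIVATE_IPS
    raise ValueError(f"unknown role: {role}")
```

With these rules, a public address such as 203.0.113.7 can reach a sentry on port 30000 but is rejected by the validator, while 10.0.0.11 is accepted by both.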

system monitoring

Tests began with an average stake of 0.56% voting power and ran for 2 days before we increased the voting power of the validator to 3%. The validator node seemed to behave fine, with only one dropped consensus proposal compared to over 2000 proposals made. Even during the stress test with high consensus participation, no further proposal was dropped.

Radix Sentry Test 1: consensus proposals made vs. consensus proposals missed; sync difference
Radix Sentry Test 1: bft metrics

Problem: Currently the Radix core codebase doesn’t support correctly relaying connections through full nodes. A validator node run in such a topology will not allow any inbound connections. Instead, the validator node makes outbound connections to other peers in the network, and those connections are what is used for communication. Other nodes can’t connect to that validator as initiator, but they re-use the outbound connection initialized from the validator’s side.

Peer-connection output of another radix-stokenet node during the test.
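The connection behaviour we observed can be illustrated with a toy model. This is a sketch under our own assumptions, not Radix networking code: the `Node` class and the `dial` and `can_send` helpers are hypothetical names made up for this example.

```python
class Node:
    def __init__(self, name: str, accepts_inbound: bool):
        self.name = name
        self.accepts_inbound = accepts_inbound
        self.channels = set()  # names of peers with an open channel


def dial(initiator: Node, target: Node) -> bool:
    """Try to open a connection; True if a channel exists afterwards."""
    if target.name in initiator.channels:
        return True  # re-use the channel that is already open
    if not target.accepts_inbound:
        return False  # target refuses connections it did not initiate
    initiator.channels.add(target.name)
    target.channels.add(initiator.name)
    return True


def can_send(sender: Node, receiver: Node) -> bool:
    """A channel is bidirectional once it has been established."""
    return receiver.name in sender.channels


validator = Node("validator", accepts_inbound=False)
peer = Node("peer", accepts_inbound=True)

blocked = dial(peer, validator)       # False: peer cannot initiate
opened = dial(validator, peer)        # True: validator dials out
reachable = can_send(peer, validator)  # True: peer re-uses that channel
```

In this model the peer ends up talking to the validator directly over the validator’s own outbound connection, so nothing is relayed through a sentry node.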

Conclusion:

The only apparent way to avoid this is for sentry nodes to act as proxies that relay consensus messages to the protected validator node. In our test, our public full nodes served only as private bootstrapping nodes, which does not fully serve the sentry purpose. A “real” sentry node architecture requires a built-in solution in the node’s core codebase.

During our test, the Radix core dev team considered options for providing a more complete proxy solution, but has not disclosed any concrete plans so far.

In the meantime, it is strongly recommended that node-runners use a standard validator topology with a backup full node pattern to mitigate DDoS attacks until a sentry node-like architecture is supported in the Radix node.

Summary:

It was very interesting to experiment with Radix’ network topology, and we’re happy that we were able to gain some valuable insight. Based on these results, we have chosen a standard topology for our “Olympia” mainnet validator, using fully redundant, double-oversized nodes (32 GB RAM, 8 cores) hosted in Tier 3+ data centers.

Stake your XRD with CryptoCrew Validators ✅

Radix mainnet XRD delegation address:
rv1qtsyl0q7nl0642dp9nehp5579cclskxg6v70yphy5wcfxpmjfqc66s4l9md

author: @Claimens, CryptoCrew Validators

sources: radix, cryptocrew, https://www.radixdlt.com/, https://getradix.com/, https://radixdlt.medium.com/, https://medium.com/figment, https://forum.cosmos.network/

Special thanks: @Matt (Radix), @Lukasz — from Radix, @Shambu — from Radix, @AVaunt, @Fpieper, @Emmoglu.io | magal36

Stake your assets with reliable, community-driven validators. cryptocrew.cc/validators
