Collecting metrics data from decentralized systems

All decentralized networks, including blockchains and other P2P systems, face the technical problem of how to gather metrics and statistics from nodes run by many different parties. Achieving this is not trivial, and there are no established best practices.

We faced this problem ourselves while building the Streamr Network, and ended up using the Network itself to solve it! As collecting metrics is a common need in the cryptosphere, in this post I will outline the problem and describe the practical solution we arrived at, in the hope that it helps other dev teams in the space.

Getting detailed real-time information about the state of nodes in your network is incredibly useful. It allows developers to detect and diagnose problems, and helps publicly showcase what's going on in your network via network explorers, status pages and the like. In typical blockchain networks, you can of course listen in on the broadcast transactions to build block explorers and other views of the ledger itself, but more fine-grained, lower-level data (such as CPU and memory consumption of nodes, disk and network I/O, peer connection counts, and error counts) needs a separate solution.

One simple approach is for the dev team to set up an HTTP server with an endpoint for receiving data from nodes. The address of this endpoint is hard-coded into the node implementation, and nodes are programmed to regularly submit metrics to it. However, authentication can't really be used here: decentralized networks are open and permissionless, so you don't know in advance who will run nodes and can't distribute credentials to them. Exposing an endpoint that anyone can write to is a bad idea, because it is highly vulnerable to abuse, spoofed data, and DDoS attacks.

Another approach is to have each node store metrics data locally and expose it publicly via a read-only API. Then, a separate aggregator script run by the dev team can connect to each node and query the information to get a picture of the whole network. However, this won’t really work if the nodes are behind firewalls, which is usually the case. The solution also scales badly, because in large networks with thousands of nodes, the aggregator script is easily overwhelmed trying to query the data frequently from each node.

However, a fully decentralized approach is certainly possible: one that decouples data producers from data consumers, requires no explicit sign-up, and leverages a decentralized network and protocol for message transport.

Drawing on the problems above, let's list some requirements for a more solid metrics collection architecture for decentralized networks and protocols:

- No centralized, hard-coded collection endpoint that can be abused, spoofed or DDoSed
- No explicit sign-up or credential distribution; anyone may run a node at any time
- Works even when nodes are behind firewalls
- Scales to networks with thousands of nodes
- Lets consumers verify that data is intact and originates from the node it claims to come from

The solution is based on a decentralized pub/sub messaging protocol (in my example, Streamr) to fully decouple the metrics-producing nodes from the metrics consumers. Nodes make data available via topics following a standardized naming convention, and metrics consumers pick and mix what they need by subscribing to the topics they want. In the Streamr protocol, topics are called streams, and their names follow a structure similar to URLs:

domain/path

The path part is arbitrarily chosen by the creator of the stream, while domain is a controlled namespace where streams can be created only if you own the domain. In Streamr, identities are derived from Ethereum key pairs and domain names are ENS names. If your network uses different cryptographic keys, you can still derive an Ethereum key pair from the keys in your network, or generate Ethereum keys for the purpose of metrics.
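
For illustration, here is a minimal sketch of generating a dedicated metrics key pair, assuming ethers.js (the library choice is mine, not part of the protocol):

import { Wallet } from "ethers";

// Generate a fresh Ethereum key pair to serve as the node's metrics identity.
const wallet = Wallet.createRandom();
console.log("metrics domain (address):", wallet.address);
// Keep the private key on the node; it signs the node's metrics messages.
console.log("private key:", wallet.privateKey);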

In our own metrics use case, i.e. to gather metrics from the Streamr Network itself, each node publishes metrics data to a number of predefined paths under a domain they automatically own by virtue of their Ethereum address:

<address>/streamr/node/metrics/sec
<address>/streamr/node/metrics/min
<address>/streamr/node/metrics/hour
<address>/streamr/node/metrics/day

The sec, min, hour and day suffixes denote the reporting interval: each stream carries data points aggregated over the corresponding timeframe.

Data points in the streams are just JSON objects, allowing for any interesting metrics to be communicated:
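
For illustration, a data point in the per-second stream might look like the following. The field names here are hypothetical; the actual schema is up to the node implementation:

{
  "peerCount": 14,
  "cpuUsagePercent": 12.5,
  "memoryUsedBytes": 104857600,
  "bytesInPerSec": 52300,
  "bytesOutPerSec": 48100,
  "errorCount": 0,
  "timestamp": 1610000000000
}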

Additionally, each data point is cryptographically signed, allowing any consumer to verify that the message is intact and originates from the node in question. End-to-end encryption is not used here, as the metrics data is intended to be public in our use case.
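
To illustrate the idea (not the exact Streamr wire format), here is a sketch of the recover-and-compare check, assuming Ethereum-style signed messages and ethers.js:

import { utils } from "ethers";

// Returns true if `payload` was signed by the key behind `expectedAddress`.
// The real protocol signs a specific serialization of the whole message;
// this sketch only demonstrates the principle.
function isFromNode(payload: string, signature: string, expectedAddress: string): boolean {
  const recovered = utils.verifyMessage(payload, signature);
  return recovered.toLowerCase() === expectedAddress.toLowerCase();
}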

With the above, node-specific metrics streams can now be obtained by anyone to power node-specific views, and various aggregate results can also be computed from them. For people in the Streamr community, including those of us working on developing the protocol, aggregate data about the Streamr Network is very interesting.
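
As a sketch, tapping into a single node's per-minute metrics could look like this, assuming the streamr-client JavaScript library (exact API details vary between client versions):

import { StreamrClient } from "streamr-client";

async function main() {
  // Public streams can be read without pre-arranged credentials; recent
  // client versions generate a throwaway identity automatically.
  const client = new StreamrClient();

  // Replace <address> with the Ethereum address of the node of interest.
  await client.subscribe("<address>/streamr/node/metrics/min", (content) => {
    console.log("metrics data point:", content);
  });
}

main().catch(console.error);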

To compute network-wide metrics and publish them as aggregate streams, an aggregator script is used. The script subscribes to each per-node metrics stream, which it finds and identifies by the predefined naming pattern, computes averages or sums for each metric across all nodes, and publishes the results to four streams:

streamr.eth/metrics/network/sec
streamr.eth/metrics/network/min
streamr.eth/metrics/network/hour
streamr.eth/metrics/network/day

The different timeframes serve the same purpose as in the per-node metrics streams. Note that these streams live under the streamr.eth ENS name as their domain, which makes the stream names more human-readable and indicates that they are created by the Streamr team.
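
Here is a hedged sketch of such an aggregator, again assuming the streamr-client library; discoverNodeAddresses() is a hypothetical stand-in for finding the per-node streams by the naming pattern:

import { StreamrClient } from "streamr-client";

// Hypothetical helper: enumerate the node addresses whose metrics streams
// match the predefined naming pattern.
async function discoverNodeAddresses(): Promise<string[]> {
  return []; // placeholder
}

async function main() {
  // Publishing under streamr.eth requires the key that controls that ENS name.
  const client = new StreamrClient({ auth: { privateKey: process.env.PRIVATE_KEY! } });

  const latest = new Map<string, number>(); // node address -> last reported peer count

  for (const address of await discoverNodeAddresses()) {
    await client.subscribe(`${address}/streamr/node/metrics/min`, (content: any) => {
      latest.set(address, content.peerCount); // field name as in the example above
    });
  }

  // Once a minute, average across nodes and publish the network-wide figure.
  setInterval(async () => {
    const values = [...latest.values()];
    const avgPeerCount = values.reduce((a, b) => a + b, 0) / Math.max(values.length, 1);
    await client.publish("streamr.eth/metrics/network/min", {
      avgPeerCount,
      nodeCount: values.length,
      timestamp: Date.now(),
    });
  }, 60_000);
}

main().catch(console.error);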

We were able to solve our own metrics collection problem using the Streamr Network and protocol, and a similar approach might come in handy for other projects too. Most of the developer tooling in the crypto space is still new and immature, and many problems in decentralized devops, including metrics and monitoring, still lack proper solutions. I hope this post helps outline some best practices, gives an example of how to model metrics streams, and shows how to derive network-wide aggregate metrics from the per-node streams.
