Difference between revisions of "Regional Multipliers for MainNet"

From xx network wiki
Jump to navigation Jump to search
m
Line 1: Line 1:
[[File:Bins WorldMap-1030x612.png|thumb|right|Bins World Map]]
Two weeks ago, we announced updates to the geographic bins and that this would feed into a test for the regional multipliers—that time has come. Using last week’s data, we computed new multipliers for every region:
Two weeks ago, we announced updates to the geographic bins and that this would feed into a test for the regional multipliers—that time has come. Using last week’s data, we computed new multipliers for every region:



Revision as of 20:35, 24 June 2022

Bins World Map

Two weeks ago, we announced updates to the geographic bins and that this would feed into a test for the regional multipliers—that time has come. Using last week’s data, we computed new multipliers for every region:

Bin Multiplier
Oceania 1.457
EasternAsia 1.402
SouthernAfrica 1.381
SouthAndCentralAmerica 1.267
WesternAsia 1.211
NorthAmerica 1.023
CentralEurope 0.962
Russia 0.944
EasternEurope 0.905
MiddleEast 0.899
Western Europe 0.873

With these multipliers, we should offset the latency factors for running nodes in various regions.

This is an imperfect system. Regions like the Middle East, which only have one node, are reading results that are properly not really descriptive of their general performance. Other regions, like North Africa, have no nodes at all and, as a result, will have to have their values guessed at.

But, an even bigger challenge is that the correct multiplier is a function of how many nodes a region has. When designing the network, we made it so that every node in a region experienced the highest multiplier of any node in the team. This is to ensure there never is a case where node operators are incentivized not to participate in a round. As a result, the correct multiplier for a given region is a function of the number of nodes in that region.

This means that multipliers will need to be updated on a regular basis as the network grows and shrinks. As seen below, the math for multipliers is relatively regular, so it may be possible to integrate it on-chain at some point and automatically calculate it every era.

This solution also requires adjusting how multipliers are handled. Initially, all nodes in a team inherited the same multipliers. This resulted in some very odd multipliers, so we are adjusting the algorithm to give all nodes the average of their own multiplier and the highest in the team.

Calculating the Multipliers

When calculating the multipliers, we started with the raw numbers, how many points each node got in every era, excluding the multiplier. We want everything to be fair, and we want each node runner to be incentivized to team with every other node. So, our goal is to ensure full participation gets you full “points” in the network.

Before we continue, some definitions:

  1. Mi – The multiplier for all nodes in Bin i.
  2. Ai – The adjusted point value for Bin i.
  3. Pi – The probability that at least one node from bin i is included in a team, and a node from any bin < i is not included.

The calculation is seeded by real network data. Over the last week, we got the average points earned for cMix operations in an era per node. This was then “normalized” to produce the adjusted point value for the bin, known as Ai.

The goal of this system is to create Multipliers such that future As will all be 1.

Bins are ordered minimum to maximum by A. When calculating points, every node’s multiplier is the average of their multiplier and the highest in the team (also the lowest i). 

Because the region with the lowest average, Bin 0, overwrites all others, Bin 0’s multiplier can be calculated as:

1 = (M0 + M0) A0/2 = M0 A0

Our goal is to ensure that the multipliers and adjustments for all nodes in all teams are also the maximum of 1 on average. Since Mn is the multiplier for the nth ordered bin and An is the normalized average for the nth ordered bin. This can be easily solved:

M0 = 1/A0

The next multiplier for Bin 1 can be calculated as the probability a node from Bin 1 is selected with a node from Bin 0 (which uses the M0 bin multiplier) and the probability that a node from Bin 1 is not teamed with a node from Bin 0 (which uses the M1 team multiplier):

1 = (M0 + M1)/2 A1 P0 + (M1 + M1)/2 A1(1 − P0)

We can do the same calculation for Bin 2:

1 = (M0 + M2)/2 A2 P0 + (M1 + M2)/2 A2 P1 + (M2 + M2)/2 A2 (1 − P0P1)


2 = M0 A2 P0 + M2 A2 P0 + M1 A2 P1 + M2 A2 P1 + 2M2 A2 (1 − P0P1)


2/A2 = M0 P0 + M1 P1 + M2 (P0 + P1 + 2(1 − P0P1))


2/A2M0 P0M1 P1 = M2 (P0 + P1 + 2 − 2P0 − 2P1)


2/A2(M0 P0M1 P1)/2 − (P0 + P1) = M2 P0


We can generalize this using summations for any team multiplier Bin n:

2/An Mi Pi/2 − Pi = Mn

In addition to the recursive definition, the probabilities are tricky to get right. Each selection of a team of 5 nodes consists of a random draw, without replacement, from the total of nodes in the network. This sort of selection is described by a hypergeometric distribution. Most frequently, this distribution is applied to a simple case of counting objects (nodes in our case) with a binary feature (for example, in the BFT consensus realm: byzantine/honest nodes). However, in cMix, we have split the nodes into 12 bins, which turns the problem into a multidimensional one, meaning that we need to calculate a multivariate hypergeometric distribution. Luckily, due to the nature of how multipliers work, when we select a node from say, Bin 2, we don’t care if any other node in the team is of a bin with a higher multiplier. This means that all the probabilities that we need to compute are similar: we want to have a team with at least one node from bin i, without any nodes from all bins < i. We include a spreadsheet in our resources below which show how to do this in detail.

Error

This solution has some errors due to the fact that discrepancies in A values are largely caused by variations in how long rounds take, which impacts selection probabilities. This solution, as described by our simulation, is correct to within 3%. Future work can deconstruct the causes of point variations to model them better and reduce this error.

Resources