Skip to content

Endpoint weighted selection algorithm fails if weights are 0 #153

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
chlily1 opened this issue Apr 1, 2019 · 7 comments
Open

Endpoint weighted selection algorithm fails if weights are 0 #153

chlily1 opened this issue Apr 1, 2019 · 7 comments
Assignees
Labels
network reporting Issues relating specifically to the network reporting spec report-delivery Covers the core delivery of reports via HTTP

Comments

@chlily1
Copy link

chlily1 commented Apr 1, 2019

An endpoint's weight is a non-negative integer, i.e. it is allowed to be 0. If all the endpoints with the minimum priority value have weight 0, then in step 9 of the Choose an endpoint from group algorithm, it will be impossible to choose a random weight value that is >= 0 and also < total_weight (i.e. 0).

Also, the algorithm as written does not give endpoints with weight 0 any chance of being selected. This differs from the cited algorithm in RFC 2782, which does give endpoints with weight 0 a (small) chance of being selected.

(Credit to Eric Orth for pointing this out.)

sburnett pushed a commit to sburnett/reporting that referenced this issue May 7, 2019
…gorithm.

This clarifies language as mentioned in w3c#153.
dcreager pushed a commit that referenced this issue May 14, 2019
…lgorithm (#154)

This clarifies language as mentioned in #153.
@dcreager
Copy link
Member

The change in #154 brings us closer to the semantics of a SRV record (still need to give 0-weight endpoints a chance when there are any nonzero-weight endpoints).

But this makes me realize that SRV weights aren't as intuitive as I first supposed!

Let's say you have two endpoints, A and B, with weights 1 and 2. My intuition would be that endpoint B is always chosen twice as often as endpoint A.

But RFC 2782's algorithm would mean that each client is either (a) three times more likely to choose endpoint B, or (b) equally likely to choose A or B! And both (a) and (b) happen with equal probability:

        (a)                  (b)

1. Arrange each endpoint in any order (with 0-weight endpoints first).
       A, B                 B, A

2. Compute the sum of the weights.
        3                    3

3. Calculate the running sum for each endpoint.
     A=1, B=3             B=2, A=3

4. Choose $RANDOM between 0 and 3 inclusive.  Pick
   the endpoint whose running sum is ≥ $RANDOM.
    0: A, 1: A        0: B, 1: B, 2: B
    2: B, 3: B        3: A

I'm not sure how I feel about this. If we want to strictly follow the SRV record semantics, this is what it is. An alternative would be to revert #154, and to instead require that our weights be strictly positive.

@dcreager
Copy link
Member

/cc @sburnett

@chlily1
Copy link
Author

chlily1 commented May 14, 2019

Is there a good reason to allow endpoints with weight 0 to be selected in the presence of non-0 endpoints? As I commented on @sburnett's Chromium patch, it seems like the SRV algorithm does this because "Domain administrators SHOULD use Weight 0 when there isn't any server selection to do, to make the RR easier to read for humans (less noisy)." Given our non-0 default, sites have to go out of their way to select weight 0, so it doesn't seem to make much sense to me. I think it's just simpler to not allow 0 weight endpoints to be selected in the presence of non-0 weight endpoints.

@sburnett
Copy link
Contributor

I think the SRV algorithm makes more sense if you use a floating point RNG rather than an integer RNG. But there's no mention of which one you're supposed to use.

@sburnett
Copy link
Contributor

At this point perhaps it makes more sense to remove references to SRV from the spec and spell out the algorithm directly in the Reporting spec instead.

@gapple
Copy link

gapple commented May 28, 2019

My understanding with the example weights of A: 1 and B: 2, if the arrangement of endpoints is randomly chosen the effective probabilities are:

P(A) = 1/2 * 2/4 + 1/2 * 1/4 = 0.375
P(B) = 1/2 * 2/4 + 1/2 * 3/4 = 0.625

which is close, but not quite the expected P(A) = 0.333, P(B) = 0.666

@dcreager dcreager added the report-delivery Covers the core delivery of reports via HTTP label Jun 26, 2019
@clelland clelland added the network reporting Issues relating specifically to the network reporting spec label Jun 20, 2022
@clelland clelland self-assigned this Jul 29, 2022
@clelland
Copy link
Contributor

I'll remove the reference to SRV from the spec; given the subtle differences it's not correct to say that the fields have the same meaning, or that the algorithm is the same.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
network reporting Issues relating specifically to the network reporting spec report-delivery Covers the core delivery of reports via HTTP
Projects
None yet
Development

No branches or pull requests

5 participants