Let browsers have different privacy settings #11
Comments
Hi @michaelkleber! Sorry for our delayed response, @johnwilander and I are tied up with WWDC this week. Our read of the comment -- that we would appreciate your confirmation on -- is that there are three pieces here regarding making the API generalizable:
|
Thanks @jasonanovak, sorry to intrude on your busy week, and happy to wait for when you're less encumbered. I'm not sure I would split up the questions quite that way, because bits of entropy aren't simply additive — after all, the conversion itself is an example where the browser is happy to report zero bits of entropy from the So I think my pieces are more like:
|
Can you clarify this:
@johnwilander and I read the "infinite bits" piece differently. Are you referring to the unlimited data that can be sent in pixel request URLs or expanding the number of bits beyond 12? |
I meant that you could get unlimited data from just one side of the click via a pixel request URL, not using the API at all. |
Thanks. Our thinking is that in the future with the JS API, there wouldn't be unlimited data from the pixel request URL. |
I look forward to hearing about your JS API thoughts! But apologies for that digression into "6+6"-vs-"0+∞" bits. I still don't know your feelings on the key questions:
|
It’s harder for us to answer in your reframing because we have different feelings about changing the split and increasing the budget.
Flexing from 6+6 to 4+8 seems in the spirit of the proposal, even if under publisher or advertiser control (rather than browser control). It’s not really clear why budget split would be useful as a browser choice, rather than publisher or advertiser.
Increasing the total to 32 would make the proposal no longer privacy preserving, since that’s enough for a globally unique user ID. Maybe lower totals could make the name still accurate, and we are not necessarily stuck on 12 or on a single fixed value, based on evidence.
I think your example of 0+infinity is not on point. With a shot in the dark tracking pixel, the publisher has no way to know the same user saw an ad for the advertiser on their site (at least under ITP, which generally prevents making this association using cookies). So such a mechanism can’t be used for attribution or for cross-site tracking. It’s associating some bits of information with the publisher-advertiser pair and knowledge that some user interacted with both that provides attribution as well as tracking risk. |
Budget Split
Building on what @othermaciej said, maybe there’s a way to do a split like 4+4+4 where the middle 4 bits in a 4+4+4 split can be used by either website and could be negotiated by the two first parties as part of their business agreements. We cannot let the click source signal to the click destination how many bits it has used since that in itself is a carrier of entropy. So a dynamic budget split will require the click destination to signal the conversion ID as a 4+4 value where only the first 4 bits are guaranteed to be used.
Increased Budget
Key to the “Privacy Preserving” nature of “Privacy Preserving Ad Click Attribution” is the limitation on the number of bits stored and sent to prevent cross-site tracking. The existing budget of 12 bits allows for the unique identification of 4,096 individuals if the ad click source and the ad click destination can work out a scheme to tie both 6-bit parts to an individual user. As an alternative to increasing the 12 bit budget, we could also add a new field for additional bits:
Budget Representation
Budget representation, or the way the bits are expressed (arbitrary string, hex, decimal …), becomes less of an issue if we split out the |
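A rough sketch of how the 4+4+4 idea above might play out on the conversion side. Everything here is an assumption for illustration, not part of the draft: the function names are hypothetical, the "first" 4 bits are taken to be the high-order bits, and the browser is assumed to already know which side was given the shared middle bits.

```python
GUARANTEED_BITS = 4  # bits each side always keeps (assumed from the 4+4+4 idea)
SHARED_BITS = 4      # middle bits, assigned to one side by prior agreement


def kept_conversion_bits(conversion_value: int, source_uses_shared: bool) -> int:
    """Return the part of a 4+4 conversion value that would be reported.

    Assumption: the "first" 4 bits are the high-order bits. The low-order
    4 bits are dropped when the click source has claimed the shared bits.
    """
    if not 0 <= conversion_value < 2 ** (GUARANTEED_BITS + SHARED_BITS):
        raise ValueError("conversion value must fit in 8 bits")
    if source_uses_shared:
        return conversion_value >> SHARED_BITS  # keep only the guaranteed bits
    return conversion_value


def total_report_bits(source_uses_shared: bool) -> int:
    """Total bits in a report; 12 either way under this split."""
    source_bits = GUARANTEED_BITS + (SHARED_BITS if source_uses_shared else 0)
    destination_bits = GUARANTEED_BITS + (0 if source_uses_shared else SHARED_BITS)
    return source_bits + destination_bits
```

In this sketch the click destination always sends an 8-bit value and never learns from the API how many of those bits survived, which matches the concern that the split itself must not carry entropy.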
Just to explain our draft process here, what Jason posted above reflects our joint view. Both of us are editors so we try to coordinate when responding to concrete change proposals such as this. In short:
|
I fully agree with any extra bits being clearly marked as optional; @jasonanovak your "not only compliant but also perceived as compliant" point is well said. I still would hope to find something more tunable than a second I feel like @johnwilander your |
I think the two are different in that Also, in the algorithm case of Are you thinking of an |
I agree the whole point is for there to be verifiably no user tracking possible, which means the entropy of any ad/campaign identifier should be kept as low as possible. Even a combined 32 bits is enough to single out a user from most of the planet's population. I suggested in the Web Advertising BG issue 19 w3c/web-advertising#19 The "alphabet" could be quite small as it would only need to be current for some shortish time. If there were 1024 ad/campaign ids in use for a particular publisher (or advertiser) then the report would only need 10 bits and so on.
Each ad/campaign identifier can be an arbitrarily long string, but there are a finite number in the dictionary. The dictionary would always be accessible say as JSON at a .well-known location e.g. in |
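To make the dictionary idea above concrete, here is a small sketch of how a report could carry an index into a published list instead of the identifier itself. The function names and the sample dictionary are hypothetical; the comment does not specify a file format or an encoding beyond "JSON at a .well-known location".

```python
def bits_needed(num_ids: int) -> int:
    """Smallest number of bits that can index num_ids dictionary entries."""
    if num_ids < 1:
        raise ValueError("dictionary must contain at least one id")
    return (num_ids - 1).bit_length()


def encode_campaign(campaign_id: str, dictionary: list[str]) -> int:
    """Replace an arbitrarily long id with its position in the dictionary."""
    return dictionary.index(campaign_id)


def decode_campaign(index: int, dictionary: list[str]) -> str:
    """Recover the full id from the reported index."""
    return dictionary[index]


# Hypothetical dictionary; in the proposal it would be fetched as JSON.
example_dictionary = ["spring-sale-banner", "newsletter-footer", "homepage-hero"]
```

With 1024 entries `bits_needed` gives 10, matching the figure in the comment; the three-entry example above needs only 2 bits, however long the identifier strings are.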
@johnwilander If we remain at only two spec'd implementation options (with or without one blob of |
I see. One way would be to give the third attribute in a more general name so that it can cover all the optional cases. Then it would have to support some more intricate syntax of course, for instance: However, given the constraint of significantly less than 32 bits of total entropy, can we really put much more than an event ID in there? Are you thinking of instructions to the browser that will not be part of the attribution report and thus not count toward the entropy? |
Looking further into the future, @csharrison and I are definitely interested in expanding the amount of metadata that can be associated with a click or conversion event to allow aggregate reporting; this only got a brief mention in https://github.com/csharrison/conversion-measurement-api#browser-control-of-information but seems to have a lot of potential. But I don't think we should try to design for that here. |
The more complex things we want to support, the harder it's going to be to squeeze it into element attributes. The anchor tag is a regular HTML element which can have an ID. Maybe we should stick with what's already in the draft and consider a version 2 with a JavaScript API that can push complex data associated with the ad based on its element ID? That would allow you to "talk to the browser" in a much broader way. |
With an extra attribute like |
I did give this a quick thought since support for attributes isn’t feature detectable afaik. Nothing’s going to break though, the data will just be ignored in browsers that don’t support it. But this situation makes me lean toward a programmatic/JavaScript way of expanding with optional data over extra attributes. That way the markup is supported if the feature is supported at all and the rest can be feature detectable. |
@johnwilander and I talked and it seems like there's agreement on Budget Split whereby the 12 bits should be redistributed as 4/4/4 where the middle four bits are negotiated by the two first parties as part of their offline contractual arrangements. We'll work on a PR to clarify that. Regarding Increased Budget, it seems like there are still opens on:
The representation of those bits seems like it has two paths: (1) the In terms of the number of bits, we think that this has to be low, as currently only 4096 browsers can be uniquely identified by Privacy Preserving Ad Click Attribution. Adding just six more bits would increase that number 64 fold to 262,144 browsers that can be uniquely identified, so an optional six bits seems like a cap of entropy we would want to add to the spec for browsers that are willing to take the risk on that high of an entropy (Safari will not). |
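The arithmetic behind those numbers is just powers of two; a quick check, using a hypothetical helper name:

```python
def distinguishable_browsers(bits: int) -> int:
    """Upper bound on distinct values that fit in the given number of bits."""
    return 2 ** bits


current = distinguishable_browsers(12)            # 6+6 bits in the draft
with_optional = distinguishable_browsers(12 + 6)  # plus six optional bits
print(current, with_optional, with_optional // current)  # prints: 4096 262144 64
```

This is only the ceiling on distinct values a report can carry; whether that many browsers can actually be identified depends on the two sites being able to coordinate the values per user.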
Thanks for the discussion; this has helped me understand your underlying philosophical positions. My mental model still feels a little shaky when I read "currently only 4096 browsers can be uniquely identified", though: isn't that how many browsers could be uniquely named in the 12 bits of a conversion report if the publisher and advertiser already had a shared ID for the user? (But in that case they wouldn't use this API at all.) Regarding bit counts, I would love to hear thoughts from potential API users, but it doesn't look like any of them are chiming in on this issue. |
I'd like to resolve this issue since we seem to have agreement on a change. Would a PR covering:
… be good enough to close this? |
Sounds good to me, thanks. |
The specifics of how to layer the extra data on top of PCM will be discussed in #26. |
#28 goes into detail on how to split the entropy budget and how to layer the two similar proposals. I think that's where this discussion should continue. Thanks for all the feedback above. It has all gone into where we're taking this. |
Hi John: I'd like to explore ways to make the developer-facing API surface more flexible, so that multiple browsers can provide the same API but still make their own independent decisions on details like the adcampaignid="..." value being 6 bits.
You currently build the limitation directly into the ad HTML with the requirement "If any of the conditions do not hold, such as the ad campaign id being larger than 6-bit, the request for ad click attribution is ignored." This seems to make it hard to change, both for other implementers now, and for any browser which ever wants to change the limits it places on the API's information flow.
To propose a concrete alternative, how would you feel about the click-time ad-campaign-id and the conversion-time ad-attribution-data both being arbitrary strings over some specified alphabet, where the browser limited information flow by truncating the string to some chosen length? If you were to allow 256 or 100 campaign IDs then we could use hex or decimal; if you need to stick to 64 then I guess this API could use octal (chmod fans will love us!). I'm saying truncation, rather than e.g. mod-n, so that a server receiving an ad click attribution HTTP request can easily tell the granularity to which it was limited; of course there are other ways to achieve that goal.
To be transparent, the kind of choice I'd like Chrome to be able to make is to allow more information in the (click-time) campaign ID, but compensate for it by restricting to less data on the ad-attribution-data (conversion-time) value. The use cases we've been hearing about in the W3C web-advertising Business Group rely on having more data about what click it was that did or didn't lead to a conversion, and from that point of view the 6+6 bits proposed here isn't the balance we would choose.
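As an illustration of the truncation idea proposed above (strings over a fixed alphabet, cut to a browser-chosen length), here is a minimal sketch. The function names and the choice to reject out-of-alphabet input are assumptions made for the example, not anything specified in the proposal.

```python
OCTAL = "01234567"
DECIMAL = "0123456789"
HEX = "0123456789abcdef"


def truncate_id(value: str, alphabet: str, max_len: int) -> str:
    """Keep only the leading max_len characters of an id string.

    The length of the result tells the receiving server the granularity
    the browser allowed, which a mod-n reduction would not reveal.
    """
    if any(ch not in alphabet for ch in value):
        raise ValueError("value contains characters outside the alphabet")
    return value[:max_len]


def distinct_values(alphabet: str, max_len: int) -> int:
    """Number of distinct full-length strings a browser would let through."""
    return len(alphabet) ** max_len
```

With a two-character limit this gives 256 values for hex, 100 for decimal, and 64 for octal, the three cases mentioned in the proposal. A browser with a different privacy stance could change `max_len` without changing the markup.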
Anyway, while I'm happy to debate the specifics of the privacy settings in some other forum, that's not my goal here — I just want us to offer developers a single API across multiple browsers, each of which can come to their own conclusions on thornier questions.