Supporting multiple conversions for one click, a.k.a. reporting windows #95

Open
johnwilander opened this issue Oct 27, 2021 · 4 comments
@johnwilander
Collaborator

johnwilander commented Oct 27, 2021

The Attribution Reporting for Click-Through Measurement proposal has the concept of reporting windows.

The spec authors propose that up to three attribution reports can be sent for a single source. My interpretation is that this means up to three reports for a single source click.

They also provide an example of reporting windows a browser could use:

  • 2 days minus 1 hour: Reports will be sent 2 days from source registration time
  • 7 days minus 1 hour: Reports will be sent 7 days from source registration time

I'd like to explore the idea of reporting windows for PCM, where the privacy restrictions are quite different from those of the Attribution Reporting API. PCM does not support user IDs on either side.

Currently, PCM's triggering event is allowed a 4-bit value, which goes into the subsequent attribution report. If we were to support two or three reporting windows, there would be more opportunity to cherry-pick which users to fire a triggering event for at all, and with it a greater risk of linking a report to a specific user. The risk of being able to link attribution reports from multiple windows also increases. I therefore think the triggering event value needs to be much smaller for reporting windows beyond the first.

Here's a straw man to get us started:

  • Day 0-2, first reporting window: Triggering event gets 4 bits, like today. This window is 7 days today but could be shortened to 2 days as exemplified here.
  • Day 3-7, second reporting window: Triggering event gets 1 bit (high/low) or 1.5 bits (high/mid/low).
  • Day 8-35, third reporting window: Triggering event gets 1 bit (high/low) or 1.5 bits (high/mid/low).
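
As a rough illustration of the straw man above, the following sketch maps the number of days since the click to a reporting window and the number of distinct trigger values allowed in it. The names `ReportingWindow`, `STRAW_MAN`, and `window_for_day` are hypothetical and not part of PCM; the day boundaries come from the list, and the three-value (high/mid/low) variant is assumed for the second and third windows. Three values carry log2(3) ≈ 1.58 bits, which the list rounds to 1.5.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReportingWindow:
    first_day: int       # inclusive, in days since the click
    last_day: int        # inclusive, in days since the click
    trigger_values: int  # number of distinct trigger values allowed

    @property
    def bits(self) -> float:
        # Information carried by one trigger value in this window.
        return math.log2(self.trigger_values)


# Straw-man schedule (assumption: high/mid/low variant for windows 2 and 3).
STRAW_MAN = (
    ReportingWindow(0, 2, 16),  # 4 bits, like today
    ReportingWindow(3, 7, 3),   # high/mid/low
    ReportingWindow(8, 35, 3),  # high/mid/low
)


def window_for_day(day: int) -> Optional[ReportingWindow]:
    """Return the window covering `day`, or None if the click has expired."""
    for window in STRAW_MAN:
        if window.first_day <= day <= window.last_day:
            return window
    return None
```

For example, `window_for_day(1).bits` is 4.0, while `window_for_day(20).bits` is about 1.58 and `window_for_day(36)` is `None`.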

The reason why I bring up such a long window as day 8 to 35 is to open up the conversation on return on ad spend (ROAS). A month's worth of measurement is valuable when the advertising is not about a single purchase but about the longer term value of the acquired customer.

The delay in reporting must scale with the length of the reporting window. Otherwise a bad actor can cherry-pick users to trigger events for in distinct subwindows and know exactly for whom the report is sent when it arrives later. With scale I don't necessarily mean 1:1 scale but for an 8 to 35-day window, the delay would have to be 24 to 168 hours (1 to 7 days) or something like that to deter cherry-picking.
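
One way to picture this scaling is as a per-window range from which the browser draws a random delay. The sketch below is only an illustration under stated assumptions: it assumes a uniform draw (the thread does not say how the delay would be chosen), it uses hypothetical names (`DELAY_RANGE_HOURS`, `schedule_report`), and it covers only the two ranges mentioned in this thread, namely the 24 to 48 hours PCM uses today and the suggested 24 to 168 hours for the day 8-35 window.

```python
import random
from datetime import datetime, timedelta

# Delay ranges in hours. Both window labels are hypothetical identifiers.
DELAY_RANGE_HOURS = {
    "current": (24, 48),     # today's PCM delay
    "day_8_35": (24, 168),   # suggested range for the long window
}

# Assumption: delays should not be predictable, so use the OS entropy source.
_SECURE_RNG = random.SystemRandom()


def schedule_report(trigger_time: datetime, window: str, rng=_SECURE_RNG) -> datetime:
    """Return the time at which the report for this trigger would be sent."""
    min_hours, max_hours = DELAY_RANGE_HOURS[window]
    delay = timedelta(hours=rng.uniform(min_hours, max_hours))
    return trigger_time + delay
```

A trigger on December 17 in the day 8-35 window would then produce a report somewhere between December 18 and December 24.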

This relationship between length of reporting window and necessary length of delay means diminishing returns when considering even longer report windows.

Let me know what you think, both of the usefulness for advertisers and the privacy implications of such a scheme.

(Also, @csharrison and @johnivdel, please check that I've understood your spec right.)

johnwilander self-assigned this on Oct 27, 2021
johnwilander added the agenda+ label (Request to add this issue to the agenda of our next telcon or F2F) on Oct 27, 2021
@dialtone

dialtone commented Nov 1, 2021

From a buyer's perspective, this doesn't change much in the incentive structure. You can see from Facebook's and Snapchat's earnings announcements that, due to the lack of measurement ability, spend is moving away from their platforms and onto others. This proposal doesn't seem to change much compared to where we are today regarding measurement with PCM. The greater reason to allow longer windows is also to remove the incentive to go after last-click measurement and optimization, which is the kind that seeks out and incentivizes the use of personal data and linkages, and one in which contextual-type campaigns typically don't perform as well.

I understand the goal of avoiding entropy for tracking purposes, but a 1-7 day delay on conversion already means that a buyer would need to wait a full 7 days while purchasing traffic before they know anything meaningful about how it performed. On top of that, any change made to the purchase parameters would need to wait at least a further 7 days before anything can be said about its performance. On top of that again, the buyer only gets access to a fairly coarse 1 bit or 1.5 bits after 2 days. That seems pretty rough for whoever is optimizing on the other side: it becomes virtually impossible to reconstruct what drove the purchase of ads, which means that changes to a campaign really need to be made one at a time, at the slowest pace possible, making all of this optimization (manual or not) impractical.

I don't think I fully understand your attack vector about cherry-picking windows and users, but I'd like to understand better what the data leakage is there, mostly because on one side you are setting quite severe limitations and thresholds but on the other side I've not seen an impact assessment to go with them.

These are my $0.02, and it's possible my lack of understanding of the threat you explained clouds my judgement here a bit.

@johnwilander
Collaborator Author

johnwilander commented Dec 8, 2021

Sorry for the super long delay here. I was expecting that we'd talk about this on a Privacy CG call but the calls kept getting cancelled. Now it's up for tomorrow at least.

> From a buyer's perspective, this doesn't change much in the incentive structure. You can see from Facebook's and Snapchat's earnings announcements that, due to the lack of measurement ability, spend is moving away from their platforms and onto others. This proposal doesn't seem to change much compared to where we are today regarding measurement with PCM. The greater reason to allow longer windows is also to remove the incentive to go after last-click measurement and optimization, which is the kind that seeks out and incentivizes the use of personal data and linkages, and one in which contextual-type campaigns typically don't perform as well.

> I understand the goal of avoiding entropy for tracking purposes, but a 1-7 day delay on conversion already means that a buyer would need to wait a full 7 days while purchasing traffic before they know anything meaningful about how it performed. On top of that, any change made to the purchase parameters would need to wait at least a further 7 days before anything can be said about its performance. On top of that again, the buyer only gets access to a fairly coarse 1 bit or 1.5 bits after 2 days. That seems pretty rough for whoever is optimizing on the other side: it becomes virtually impossible to reconstruct what drove the purchase of ads, which means that changes to a campaign really need to be made one at a time, at the slowest pace possible, making all of this optimization (manual or not) impractical.

I don't understand what you're saying here. What I outlined was up to three attribution reports per measured click, one for day 0-2 after the click, one for day 3-7 after the click, and one for day 8-35 after the click.

The proposed delay of 1 to 7 days before the attribution report would go out would only apply to triggering events in the day 8-35 window. That means that the advertiser a) has potentially already received two attribution reports for this click, and b) has already waited 8-35 days after the click. The 8-35 day window is not about short measurement cycles but about longer term measurement of return on ad spend. Some acquired users will only prove valuable after some time, for instance after a 30-day try-before-you-buy period. Those are the kind of measurements that would have the 1-7 day delay on their attribution reports to deter trying to track individual users by cherry-picking who to call the API for.

> I don't think I fully understand your attack vector about cherry-picking windows and users, but I'd like to understand better what the data leakage is there, mostly because on one side you are setting quite severe limitations and thresholds but on the other side I've not seen an impact assessment to go with them.

Imagine we did not increase the time delay for a time window like day 8-35 but kept the 24-48 hour delay we have today. In such an 8-35 day window, the destination website may have learned a lot about the user. That gives a bad actor the opportunity to only trigger a conversion for a small set of users, for instance only the ones who've purchased the gold package, only the ones who've reached level 20 in the game, or only the ones who've linked their brokerage account. By doing such cherry-picking and doing it on specific days, the bad actor could know that a specific attribution report is connected to a specific user.

Example: "I only triggered a conversion for Peter, John, and Amanda on December 17 so any reports on December 16-18 will be for them and I'll learn which of those three I acquired through advertising, from which publisher sites, and for which ad campaign."

A key part of the opportunity to do so is the long window of day 8-35 in which the bad actor can trigger the conversion.

By increasing the delay, we significantly limit the opportunity to leverage the long conversion time window for such cherry-picking.
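
The effect can be illustrated with a small calculation. Given only a report's arrival time, an observer can narrow the trigger time down to the interval from (arrival minus maximum delay) to (arrival minus minimum delay), and no further. The sketch below uses hypothetical function names and made-up trigger times; Dana and Lee are invented users who converted on other days. It lists which users remain consistent with a single observed report under the two delay ranges discussed in this thread.

```python
from datetime import datetime, timedelta


def candidate_trigger_interval(arrival, min_delay_h, max_delay_h):
    """Earliest and latest trigger times consistent with a report's arrival."""
    return (arrival - timedelta(hours=max_delay_h),
            arrival - timedelta(hours=min_delay_h))


def consistent_users(arrival, triggers, min_delay_h, max_delay_h):
    """Users whose trigger time could have produced a report at `arrival`."""
    earliest, latest = candidate_trigger_interval(arrival, min_delay_h, max_delay_h)
    return sorted(user for user, t in triggers.items() if earliest <= t <= latest)


# Made-up data: three cherry-picked users on December 17, two others earlier.
triggers = {
    "Peter": datetime(2021, 12, 17, 12),
    "John": datetime(2021, 12, 17, 12),
    "Amanda": datetime(2021, 12, 17, 12),
    "Dana": datetime(2021, 12, 14, 12),
    "Lee": datetime(2021, 12, 12, 12),
}
arrival = datetime(2021, 12, 19, 0)

short_delay = consistent_users(arrival, triggers, 24, 48)
# -> ['Amanda', 'John', 'Peter']
long_delay = consistent_users(arrival, triggers, 24, 168)
# -> ['Amanda', 'Dana', 'John', 'Lee', 'Peter']
```

With the 24-48 hour delay the report can only belong to the cherry-picked group, whereas with the 24-168 hour delay it is also consistent with conversions triggered up to a week earlier, so cherry-picked batches would have to be spaced much further apart to stay distinguishable.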

> These are my $0.02, and it's possible my lack of understanding of the threat you explained clouds my judgement here a bit.

I hope I managed to clarify. As always, thanks for commenting!

@dialtone

dialtone commented Dec 9, 2021

Isn't the example you provide with Peter, John and Amanda inevitable in any case? The window of conversion doesn't matter if you only trigger it for a carefully selected subset of users. You can just extend that December 16-18 to 16-22 and your reasoning would be unchanged.

It also doesn't seem to be a particular escalation to go from having actual sensitive information, like your brokerage account, to knowing that you have it now because you clicked an ad 8-35 days ago.

On the other hand, the case for all products that aren't impulse purchases, that you will finalize within 7 days of clicking an ad, is in worse shape than all of those that are impulse purchases because of the additional delay introduced in each subsequent reporting window.

Anyway, I suppose we can chat on the call today :). cheers and thanks for clarifying.

@dialtone

dialtone commented Dec 9, 2021

I think I understand the misunderstanding I had on this. I rest my case on the delay, but the decrease in available bits seems excessive; there already aren't enough of them to do much with.

TanviHacks removed the agenda+ label (Request to add this issue to the agenda of our next telcon or F2F) on Jan 11, 2022

3 participants