The proposed ePrivacy Regulation, like the ePrivacy Directive before it, requires that user consent be obtained before tracking cookies are stored or used. The associated General Data Protection Regulation (GDPR) requires that no personal data is processed unless at least one of a small number of legal bases applies, the primary one being that the user has given consent. Under the GDPR, even if another legal basis is claimed, such as the “legitimate interest” of the data controller, processing must not override the fundamental privacy rights of individuals, and they must always be offered a “right to object” using “automated means”. These laws are due to be in effect in May 2018 and will be enforced in courts with very large fines, or class-action scale damages.
It is easier for first-party websites to obtain their users’ consent, because they have the wherewithal to present the necessary information and a suitable user interface to register agreement. Even so, users are far more likely to agree to having their online activity tracked if they know it is restricted to particular sites, i.e. the request is not seen as just a Trojan Horse for exposure, often to unknown companies, across the entire web.
When we talk about data sharing in the context of a webpage we usually mean UID sharing, i.e. the sharing of an origin-specific unique value encoded in user agent storage, e.g. in an HTTP cookie.
Unfortunately, the current specification for HTTP cookies does not differentiate between first-party and third-party accesses. Irrespective of context, once a cookie is associated with a particular domain (“origin”) then it will be communicated in all requests to that origin. The same third-party sub-resources are often embedded on very many separate websites so once such a “third-party” cookie is stored it can be used to report on subsequent visits to any of these websites.
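The behaviour described above can be illustrated with a toy model of a cookie jar that is keyed only by the cookie's own origin. This is a minimal sketch, not browser code: the class name, method names and the `.example` domains are all invented for illustration.

```typescript
// Toy model of a conventional cookie jar, keyed only by the cookie's origin.
// All names here are illustrative; none of them is a real browser API.
class SingleKeyedCookieJar {
  private cookies = new Map<string, string>(); // cookie origin -> value

  // Store a cookie for `cookieOrigin`; the embedding site is not recorded.
  set(cookieOrigin: string, value: string): void {
    this.cookies.set(cookieOrigin, value);
  }

  // Return the cookie that would accompany a request to `cookieOrigin`.
  // `topLevelSite` is accepted but deliberately ignored: that is the problem.
  get(cookieOrigin: string, topLevelSite: string): string | undefined {
    void topLevelSite;
    return this.cookies.get(cookieOrigin);
  }
}

const jar = new SingleKeyedCookieJar();
jar.set("tracker.example", "uid=abc123"); // set while the user visits news.example

// The same value is sent whichever site embeds the tracker.
const onNews = jar.get("tracker.example", "news.example");
const onShop = jar.get("tracker.example", "shop.example");
```

Because `onNews` and `onShop` hold the same value, the third party can link the two visits even though the user only ever interacted with it on one site.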
This “bug” in the specification of HTTP cookies enabled the explosion of invisible cross-origin tracking of users. To mitigate this risk, some browsers allow the user to restrict the storage of, or block, “third-party” cookies, but this is both inadequate as a privacy measure and seriously diminishes the capabilities of the web platform. Not only can third-party cookie blocking break applications such as federated login or the use of localStorage for non-identifying audience categorisation, it is also easily circumvented through the inclusion of third-party script on sites. This script can use standard APIs to arrange for values encoded in first-party cookies to be communicated to third-party servers, using link redirection or dynamic URI insertion to synchronise with third-party cookies (“cookie synching”).
The Do-Not-Track Tracking Preference Expression (TPE) document allows for the registration and communication of site-specific consent, i.e. a user is asked for consent to their data being shared with particular third-party sub-resources, but only within the context of a particular web site. The problem is that there is no easy way to implement this using HTTP cookies. Although servers can stop using UID cookies when the DNT:1 header exists (or perhaps in Europe if the DNT:0 header does not exist), a server receiving DNT:0 could theoretically store a UID which would then be capable of tracking the user across the entire web, including on sites where user consent has not been obtained.
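The server-side decision described here might look like the following sketch. The function name and the `requireExplicitConsent` flag are assumptions made for illustration; the flag models the stricter European reading, in which only an explicit DNT:0 counts as consent.

```typescript
// Value of the DNT request header: "1" (do not track), "0" (consent given),
// or undefined when the header is absent.
type DntValue = "0" | "1" | undefined;

// Decide whether a server may set or read a UID cookie for this request.
// `requireExplicitConsent` is a hypothetical policy switch: when true, a
// missing header is treated the same as DNT:1.
function mayUseUidCookie(dnt: DntValue, requireExplicitConsent: boolean): boolean {
  if (dnt === "1") return false; // user has objected
  if (dnt === "0") return true;  // user has consented in this context
  return !requireExplicitConsent; // header absent: depends on jurisdiction
}
```

Note what the check cannot do: once `mayUseUidCookie` returns true on one site and a conventional cookie is stored, nothing prevents that cookie from being sent on other sites where the answer would have been false.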
What is needed is an efficient way to communicate a UID amongst a group of origins, but only within the context of a first-party site.
There has been talk of "double keyed" cookies, i.e. the ability to segregate sub-resource cookies into first-party scoped cookie stores, but while there is some activity in Mozilla on this, it has not yet become a W3C issue. There is some discussion about referring to them in the replacement for the IETF cookie standard (RFC 6265bis).
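To make the idea concrete, a double-keyed store can be modelled as a map whose key combines the top-level site with the cookie's origin. This is only an illustrative sketch with invented names, not a description of how any browser implements it.

```typescript
// Toy model of a "double keyed" cookie store: every cookie is filed under
// the pair (top-level site, cookie origin). Names are illustrative only.
class DoubleKeyedCookieJar {
  private cookies = new Map<string, string>();

  // Build a composite key; JSON encoding avoids ambiguity between the parts.
  private key(topLevelSite: string, cookieOrigin: string): string {
    return JSON.stringify([topLevelSite, cookieOrigin]);
  }

  set(topLevelSite: string, cookieOrigin: string, value: string): void {
    this.cookies.set(this.key(topLevelSite, cookieOrigin), value);
  }

  get(topLevelSite: string, cookieOrigin: string): string | undefined {
    return this.cookies.get(this.key(topLevelSite, cookieOrigin));
  }
}

const partitionedJar = new DoubleKeyedCookieJar();
partitionedJar.set("news.example", "tracker.example", "uid=abc123");

// Visible only in the first-party context where it was set.
const seenOnNews = partitionedJar.get("news.example", "tracker.example");
const seenOnShop = partitionedJar.get("shop.example", "tracker.example");
```

Here `seenOnNews` holds the stored value while `seenOnShop` is `undefined`, so the embedded origin cannot link the user's visits across the two sites.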
I propose that we describe the format of a unique user identifier as a DNT extension, a mechanism already envisaged in the TPE for unspecified future uses, and allow it to be set at the same time as the site-specific exception is stored. The UID would only exist when consent has been given (i.e. DNT is “0”) for that origin, and would be automatically erased on expiry or when consent is revoked.
The UID would be created by the user agent when first-party script calls the site-specific API, i.e. there would be no way for the caller to specify the identifier value. The value, calculated by the browser, would be returned to the calling first-party script in the resolution of the returned Promise.
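The intended lifecycle can be sketched with a simulated user agent. Everything below is hypothetical: the class, the method names and the parameters are invented for illustration and are not the TPE API, and the explicit `now` arguments exist only to make expiry easy to demonstrate.

```typescript
interface ConsentGrant {
  uid: string;       // generated by the user agent, never by the caller
  expiresAt: number; // milliseconds since the epoch
}

// Simulated user agent holding one grant per first-party site.
class SimulatedUserAgent {
  private grants = new Map<string, ConsentGrant>();

  // Hypothetical analogue of storing a site-specific exception. The caller
  // cannot supply the identifier; it only receives it via the Promise.
  storeSiteSpecificException(site: string, maxAgeMs: number, now: number = Date.now()): Promise<string> {
    const uid = SimulatedUserAgent.generateUid();
    this.grants.set(site, { uid, expiresAt: now + maxAgeMs });
    return Promise.resolve(uid);
  }

  // Revoking consent erases the UID.
  removeSiteSpecificException(site: string): void {
    this.grants.delete(site);
  }

  // UID that would accompany DNT:0 for requests made in the context of `site`.
  uidFor(site: string, now: number = Date.now()): string | undefined {
    const grant = this.grants.get(site);
    if (grant === undefined) return undefined;
    if (now >= grant.expiresAt) {
      this.grants.delete(site); // erased automatically on expiry
      return undefined;
    }
    return grant.uid;
  }

  private static generateUid(): string {
    // Math.random keeps the sketch dependency-free; a real user agent
    // would draw from a cryptographically secure source.
    let hex = "";
    for (let i = 0; i < 32; i++) {
      hex += Math.floor(Math.random() * 16).toString(16);
    }
    return hex;
  }
}

const userAgent = new SimulatedUserAgent();
const pendingUid: Promise<string> = userAgent.storeSiteSpecificException("news.example", 60000);
```

First-party script would read the identifier with `pendingUid.then(...)`. In every other first-party context `uidFor` returns `undefined`, so the identifier cannot follow the user to sites where consent was never given.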
Additionally, there could be a low-entropy form of the identifier for communicating audience categorisation data. Its value could be specified by the API caller but would be restricted to a small number of bits (say fewer than 16), so that it cannot be used for arbitrary tracking.
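A user agent could enforce such a limit with a simple range check. The sketch below assumes "fewer than 16 bits" means at most 15; that cap, like the function and constant names, is an assumption for illustration rather than part of the proposal.

```typescript
// Assumed cap: "fewer than 16 bits" is read here as at most 15 bits.
const MAX_CATEGORY_BITS = 15;

// Accept a caller-supplied categorisation value only if it is a
// non-negative integer that fits in `bits` bits; otherwise throw.
function validateCategoryValue(value: number, bits: number = MAX_CATEGORY_BITS): number {
  if (!Number.isInteger(bits) || bits < 1 || bits > MAX_CATEGORY_BITS) {
    throw new RangeError("bits must be an integer between 1 and " + MAX_CATEGORY_BITS);
  }
  if (!Number.isInteger(value) || value < 0 || value >= Math.pow(2, bits)) {
    throw new RangeError("value does not fit in " + bits + " bits");
  }
  return value;
}
```

With 15 bits there are only 32,768 possible values, far too few to give each user a distinct identifier across a large audience.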
The obvious use case is online publishing. Advertising is an important revenue stream for publishers. Though advertising does not have to be behavioural, there are still personal data protection issues with attribution and viewability. If the data sharing needed for this could be better explained, i.e. that it would only ever apply within the first-party context, users (i.e. subscribers) would be more likely to give their consent. This could be a boon to the online publishing ecosystem, in that it both increases the value to advertisers of the publisher's subscription base and stops the exfiltration of user-identifying data that enables users to be targeted on lower-CPM sites.