Editorial: use the term "IP geolocation" #28

Merged: 2 commits, Oct 30, 2023
22 changes: 11 additions & 11 deletions Explainer-GeoIP.md → Explainer-IP-Geolocation.md
@@ -1,22 +1,22 @@
# Explainer: GeoIP Approach for IP Protection
# Explainer: IP Geolocation Approach for IP Protection

## Context and Goals

Location can be a key identifier for understanding what content is relevant for users. As Chrome prepares to [deprecate third-party cookies](https://privacysandbox.com/), the Chrome team wants to ensure that we are taking appropriate steps to improve privacy on the web while also maintaining key use cases. We have [shared our proposal](https://github.com/GoogleChrome/ip-protection) for masking IP addresses for certain third-party domains on the web. This comes with additional implications, like how IP addresses are used to understand the location of web users. This document outlines our initial proposal for how IP Protection will implement GeoIP mappings.
Location can be a key identifier for understanding what content is relevant for users. As Chrome prepares to [deprecate third-party cookies](https://privacysandbox.com/), the Chrome team wants to ensure that we are taking appropriate steps to improve privacy on the web while also maintaining key use cases. We have [shared our proposal](https://github.com/GoogleChrome/ip-protection) for masking IP addresses for certain third-party domains on the web. This comes with additional implications, like how IP addresses are used to understand the location of web users. This document outlines our initial proposal for how IP Protection will implement IP geolocation mappings.

It’s worth noting that GeoIP —with or without IP Protection— only provides approximate and coarse location information and that there are ways in which users can obfuscate this data. With this proposal, we aim to improve user privacy while maintaining most of the existing uses of GeoIP as a coarse location signal. Addressing pre-existing accuracy issues of GeoIP is not a goal and generally, this proposal will inherit the prior limitations of IP as a source of location information.
It’s worth noting that IP geolocation—with or without IP Protection—only provides approximate and coarse location information and that there are ways in which users can obfuscate this data. With this proposal, we aim to improve user privacy while maintaining most of the existing uses of IP geolocation as a coarse location signal. Addressing pre-existing accuracy issues of IP geolocation is not a goal and generally, this proposal will inherit the prior limitations of IP as a source of location information.

## How GeoIP information will be shared
## How IP geolocation information will be shared

Geo assignments of the IP addresses exposed by IP Protection will be shared publicly via a geofeed file. An example of what this file will look like can be found at [this link](https://www.gstatic.com/ipprotection/geofeed_template). The geofeed will use the format defined in [[RFC 8805](https://datatracker.ietf.org/doc/html/rfc8805)], and will provide city-level mappings. These city-level mappings correspond to top cities, each representing a geographic area around that city.
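
To make the geofeed format concrete, here is a minimal sketch of how a consumer might parse such a file. It assumes the RFC 8805 layout (comma-separated `ip_prefix,alpha2code,region,city,postal_code`, with `#` starting a comment line). The sample rows use documentation-only prefixes and made-up city assignments; they are not taken from the published template, and `parse_geofeed` is a hypothetical helper name.

```python
import csv
import ipaddress
from typing import Iterable, List, NamedTuple, Union

IPNetwork = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


class GeofeedEntry(NamedTuple):
    prefix: IPNetwork
    country: str      # ISO 3166-1 alpha-2 code, e.g. "US"
    region: str       # ISO 3166-2 code, e.g. "US-MI"
    city: str         # free-text city name
    postal_code: str  # optional field, often left empty


def parse_geofeed(lines: Iterable[str]) -> List[GeofeedEntry]:
    """Parse RFC 8805-style geofeed lines, skipping blanks and comments."""
    data_lines = (
        ln for ln in lines if ln.strip() and not ln.lstrip().startswith("#")
    )
    entries = []
    for row in csv.reader(data_lines):
        # Pad short rows so trailing optional fields default to "".
        fields = [f.strip() for f in row] + [""] * 5
        prefix, country, region, city, postal = fields[:5]
        entries.append(
            GeofeedEntry(
                ipaddress.ip_network(prefix),
                country.upper(),
                region.upper(),
                city,
                postal,
            )
        )
    return entries


# Hypothetical sample: documentation prefixes, illustrative cities only.
SAMPLE = """\
# ip_prefix,alpha2code,region,city,postal_code
192.0.2.0/24,US,US-MI,Detroit,
2001:db8::/32,CA,CA-ON,Windsor,
"""

entries = parse_geofeed(SAMPLE.splitlines())
for entry in entries:
    print(entry.prefix, entry.country, entry.region, entry.city)
```

Parsing prefixes with `ipaddress.ip_network` rejects malformed entries early, which may be preferable to storing raw strings when the feed is refreshed automatically.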

## Defining and Dividing Geographies

We have divided the entire geographic area where IP Protection might be available into areas with large enough populations of Internet users to ensure individual users remain anonymous. We aimed to define areas where we observe at least one million users over a two-week period across Google properties, which we use as a proxy for the number of Internet users in that region. Note that this estimate can differ significantly from census population data or other sources due to a range of factors, such as the presence of temporary visitors or a person using multiple digital profiles or accounts. In the US, for example, this leads to a subdivision of the country into ~700 geographic areas. Since the US has approximately 330 million people, this would equate to roughly 470,000 people per geo on average.
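
The per-area average quoted above can be checked with a short calculation. Both inputs are the approximate figures given in the paragraph, not exact counts.

```python
US_POPULATION = 330_000_000  # approximate census population from the text
US_AREAS = 700               # approximate number of geographic areas

# Average census population per geographic area.
people_per_area = US_POPULATION / US_AREAS
print(f"{people_per_area:,.0f} people per area on average")
```

The result comes in below the one-million threshold because the threshold counts users observed across Google properties, which, as noted, can differ significantly from census population.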

Each geographic area is represented in the geofeed by its most populous city. We have aimed to preserve the most popular cities across the globe by Internet population, in order to maximize the utility of the GeoIP data while improving user privacy. Note that, since we aim for the areas to have a minimum size in terms of Internet users, the areas will be geographically smaller in size in very densely populated areas and larger in sparse areas.
Each geographic area is represented in the geofeed by its most populous city. We have aimed to preserve the most popular cities across the globe by Internet population, in order to maximize the utility of the IP geolocation data while improving user privacy. Note that, since we aim for the areas to have a minimum size in terms of Internet users, the areas will be geographically smaller in size in very densely populated areas and larger in sparse areas.

Note that the assigned GeoIP for a user would maintain country borders to our best knowledge based on the user’s original IP. For example, a user that appears in Windsor (Canada) according to their original IP address would not be assigned to Detroit (US), despite the geographical proximity. This rule applies for all countries, including those that may have a total population below the established threshold.
Note that the assigned IP geolocation for a user would maintain country borders to our best knowledge based on the user’s original IP. For example, a user that appears in Windsor (Canada) according to their original IP address would not be assigned to Detroit (US), despite the geographical proximity. This rule applies for all countries, including those that may have a total population below the established threshold.
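
The country-border rule can be illustrated with a small sketch. The explainer does not specify how area boundaries are drawn, so this example substitutes a simple stand-in: pick the nearest top city, but only among cities in the user's own country. The city list, coordinates, and the `assign_city` helper are all hypothetical.

```python
import math
from typing import List, NamedTuple


class TopCity(NamedTuple):
    name: str
    country: str  # ISO 3166-1 alpha-2 code
    lat: float
    lon: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometres."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * radius_km * math.asin(math.sqrt(a))


def assign_city(country: str, lat: float, lon: float,
                cities: List[TopCity]) -> TopCity:
    """Nearest top city restricted to the user's country (assumed rule)."""
    candidates = [c for c in cities if c.country == country]
    if not candidates:
        raise ValueError(f"no top city defined for country {country!r}")
    return min(candidates, key=lambda c: haversine_km(lat, lon, c.lat, c.lon))


# Hypothetical top cities; the real area definitions are not published here.
CITIES = [
    TopCity("Detroit", "US", 42.3314, -83.0458),
    TopCity("Toronto", "CA", 43.6532, -79.3832),
]

# A user whose original IP geolocates to Windsor, Canada.
assigned = assign_city("CA", 42.3149, -83.0364, CITIES)
print(assigned.name)
```

Filtering by country before measuring distance is what keeps the Windsor user out of the Detroit area, even though Detroit is by far the closer city.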

!["The map is subdivided into areas delimited by the blue boundaries in the US and green boundaries in Canada. Users within a certain area will be assigned an IP address that is mapped to the top city of that area, marked with the pin. An area will never cross a country border."](./geo-example.png)

@@ -26,7 +26,7 @@ _Image 1. Illustrative mapping of Detroit area. The image shows how country boun

## Country, State, and Sub-Country Mapping

As mentioned earlier, GeoIP can be useful as a coarse location signal but it is not exactly precise and there have always been mechanisms for users to obfuscate this data. As a result, GeoIP can’t be used to guarantee country, state or other sub-country divisions with 100% accuracy. This limitation is also true of IP Protection, and it remains the responsibility of companies to ensure they are meeting any applicable regulations or other obligations in each jurisdiction.
As mentioned earlier, IP geolocation can be useful as a coarse location signal but it is not exactly precise and there have always been mechanisms for users to obfuscate this data. As a result, IP geolocation can’t be used to guarantee country, state or other sub-country divisions with 100% accuracy. This limitation is also true of IP Protection, and it remains the responsibility of companies to ensure they are meeting any applicable regulations or other obligations in each jurisdiction.

Having said that, we have designed our geo mappings to provide best-effort accuracy in terms of country and region mappings, while preserving privacy by ensuring sufficiently large geographic areas at the sub-country level.

@@ -38,14 +38,14 @@ Users will be assigned to a geographic area based on their pre-proxy IP address

## Preserving Privacy and Utility

We have designed our GeoIP approach to balance privacy expectations while trying to preserve the maximum utility for the various uses of GeoIP, such as ads personalization, analytics or compliance.
We have designed our IP geolocation approach to balance privacy expectations while trying to preserve the maximum utility for the various uses of IP geolocation, such as ads personalization, analytics or compliance.

In order to ensure a highly privacy-preserving approach, we propose a threshold of one million unique web cookies (over a two-week period) to determine the geographic areas that users will be mapped to. We estimate that, at this size, the regions are big enough to preserve privacy such that individual users can’t be tracked or identified based on the IPs that are being assigned to their requests when using the IP Protection system. We’ve also taken into account location privacy considerations, such that the precision revealed by GeoIP aligns with users’ expectations, even in densely populated areas.
In order to ensure a highly privacy-preserving approach, we propose a threshold of one million unique web cookies (over a two-week period) to determine the geographic areas that users will be mapped to. We estimate that, at this size, the regions are big enough to preserve privacy such that individual users can’t be tracked or identified based on the IPs that are being assigned to their requests when using the IP Protection system. We’ve also taken into account location privacy considerations, such that the precision revealed by IP geolocation aligns with users’ expectations, even in densely populated areas.


## Have feedback?

We welcome your feedback on this proposal. Please use the following links to provide your input in our GitHub repository:
* [Impacts of proposed GeoIP granularity](https://github.com/GoogleChrome/ip-protection/issues/3)
* [Impacts of proposed GeoIP granularity for regulatory & contractual use cases](https://github.com/GoogleChrome/ip-protection/issues/2)
* [Impacts of proposed IP geolocation granularity](https://github.com/GoogleChrome/ip-protection/issues/3)
* [Impacts of proposed IP geolocation granularity for regulatory & contractual use cases](https://github.com/GoogleChrome/ip-protection/issues/2)