Surrogates (fixes #400) #935

Merged
merged 39 commits into from
Oct 6, 2016
Commits
ad30f20
Refactor common property accessors into variables.
ghostwords Aug 25, 2016
08f9cf5
Update copyright year for src/webrequest.js
ghostwords Aug 25, 2016
dbc3773
Add stub surrogate lookup.
ghostwords Aug 25, 2016
75888cc
Fix background script loading order.
ghostwords Sep 8, 2016
0772489
Switch to more compact data URI encoding.
ghostwords Sep 8, 2016
23ea583
Add surrogate for legacy Google Analytics (ga.js).
ghostwords Sep 9, 2016
d3f67b0
Add a unit test.
ghostwords Sep 9, 2016
2302469
Fix indentation.
ghostwords Sep 9, 2016
6bcad20
Fix nit (related to b6c34272).
ghostwords Sep 9, 2016
a1a997e
Add note to fix synchronous XHR in main thread.
ghostwords Sep 14, 2016
3f814a9
Move socialwidgets.json from src/ to data/
ghostwords Sep 15, 2016
d084db4
Split out surrogate definitions into own file.
ghostwords Sep 19, 2016
52c37b7
Fix indentation.
ghostwords Sep 19, 2016
4f27efe
Upgrade to hostname + suffix token-based checking.
ghostwords Sep 19, 2016
0910a93
Replace repeated calls to get request hostname.
ghostwords Sep 19, 2016
773091e
ES6 tweaks.
ghostwords Sep 19, 2016
156a2ab
Add more unit tests.
ghostwords Sep 26, 2016
1e1ac8d
Ignore querystrings when suffix matching.
ghostwords Sep 26, 2016
8bd1a08
Add some documentation.
ghostwords Sep 26, 2016
733eb8f
Improve variable name.
ghostwords Sep 26, 2016
8b81fea
Add TODO.
ghostwords Sep 26, 2016
4158e9d
Make tests more readable.
ghostwords Sep 27, 2016
19f54f4
Add test for returned string being a JS data URI.
ghostwords Sep 27, 2016
0c02006
Update ga.js surrogate.
ghostwords Sep 27, 2016
a88a76a
Add b.scorecardresearch.com surrogates.
ghostwords Sep 27, 2016
f9ab56d
Merge branch 'master' into surrogates
ghostwords Sep 27, 2016
3864d77
Fix typo.
ghostwords Sep 27, 2016
44f5b97
Add WIP integration test.
ghostwords Sep 27, 2016
636c06d
Integration test WIP.
ghostwords Sep 27, 2016
0438f8d
Finish integration test.
ghostwords Sep 30, 2016
23f0e60
Rename integration test file.
ghostwords Sep 30, 2016
7adb3f0
Merge branch 'master' into surrogates
ghostwords Oct 3, 2016
d5abfc6
Merge branch 'master' into surrogates
ghostwords Oct 3, 2016
1a94f69
Add attribution link to comScore surrogate.
ghostwords Oct 4, 2016
7d15ebb
Merge branch 'master' into surrogates
ghostwords Oct 6, 2016
95ea557
Move integration test to own servers.
ghostwords Oct 6, 2016
f11cdf6
Tweak global var declaration.
ghostwords Oct 6, 2016
8e290cd
Merge branch 'master' into surrogates
ghostwords Oct 6, 2016
40d0a84
Fix integration test.
ghostwords Oct 6, 2016
File renamed without changes.
183 changes: 183 additions & 0 deletions data/surrogates.js
@@ -0,0 +1,183 @@
/*
*
* This file is part of Privacy Badger <https://www.eff.org/privacybadger>
* Copyright (C) 2016 Electronic Frontier Foundation
*
* Privacy Badger is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 3 as
* published by the Free Software Foundation.
*
* Privacy Badger is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Privacy Badger. If not, see <http://www.gnu.org/licenses/>.
*/

require.scopes.surrogatedb = (function() {

// "hostnames" maps hostnames to arrays of surrogate pattern tokens.
//
// A hostname can have one or more surrogate scripts.
//
// Surrogate pattern tokens are used to look up the actual
// surrogate script code (stored in "surrogates" object below).
const hostnames = {
  'b.scorecardresearch.com': [
    '/beacon.js',
    '/c2/plugins/streamsense_plugin_html5.js',
  ],
  'ssl.google-analytics.com': [
    '/ga.js',
  ],
  'www.google-analytics.com': [
    // Review comment (Contributor): This data structure should be better commented.
    '/ga.js',
  ],
};

// "surrogates" maps surrogate pattern tokens to surrogate script code.
//
// There is currently one type of surrogate pattern token: suffix.
// Does the script URL (querystring excluded) end with the token?
const surrogates = {
  // Review comment (Contributor): More documentation needed here as well.

  /* eslint-disable no-extra-semi */

  // Google Analytics (legacy ga.js)
  //
  // sourced from https://github.com/uBlockOrigin/uAssets/ under GPLv3
  // https://github.com/uBlockOrigin/uAssets/blob/f79f3e69c1e20c47df1876efe2dd43027bf05b89/filters/resources.txt#L162-L256
  //
  // test cases:
  // http://checkin.avianca.com/
  // https://www.vmware.com/support/pubs/ws_pubs.html (release notes links)
  //
  // API reference:
  // https://developers.google.com/analytics/devguides/collection/gajs/methods/
  '/ga.js': '(' +
    function() {
      var noopfn = function() {
        ;
      };
      //
      var Gaq = function() {
        ;
      };
      Gaq.prototype.Na = noopfn;
      Gaq.prototype.O = noopfn;
      Gaq.prototype.Sa = noopfn;
      Gaq.prototype.Ta = noopfn;
      Gaq.prototype.Va = noopfn;
      Gaq.prototype._createAsyncTracker = noopfn;
      Gaq.prototype._getAsyncTracker = noopfn;
      Gaq.prototype._getPlugin = noopfn;
      Gaq.prototype.push = function(a) {
        if ( typeof a === 'function' ) {
          a(); return;
        }
        if ( Array.isArray(a) === false ) {
          return;
        }
        // https://twitter.com/catovitch/status/776442930345218048
        // https://developers.google.com/analytics/devguides/collection/gajs/methods/gaJSApiDomainDirectory#_gat.GA_Tracker_._link
        if ( a[0] === '_link' && typeof a[1] === 'string' ) {
          window.location.assign(a[1]);
        }
      };
      //
      var tracker = (function() {
        var out = {};
        var api = [
          '_addIgnoredOrganic _addIgnoredRef _addItem _addOrganic',
          '_addTrans _clearIgnoredOrganic _clearIgnoredRef _clearOrganic',
          '_cookiePathCopy _deleteCustomVar _getName _setAccount',
          '_getAccount _getClientInfo _getDetectFlash _getDetectTitle',
          '_getLinkerUrl _getLocalGifPath _getServiceMode _getVersion',
          '_getVisitorCustomVar _initData _link _linkByPost',
          '_setAllowAnchor _setAllowHash _setAllowLinker _setCampContentKey',
          '_setCampMediumKey _setCampNameKey _setCampNOKey _setCampSourceKey',
          '_setCampTermKey _setCampaignCookieTimeout _setCampaignTrack _setClientInfo',
          '_setCookiePath _setCookiePersistence _setCookieTimeout _setCustomVar',
          '_setDetectFlash _setDetectTitle _setDomainName _setLocalGifPath',
          '_setLocalRemoteServerMode _setLocalServerMode _setReferrerOverride _setRemoteServerMode',
          '_setSampleRate _setSessionTimeout _setSiteSpeedSampleRate _setSessionCookieTimeout',
          '_setVar _setVisitorCookieTimeout _trackEvent _trackPageLoadTime',
          '_trackPageview _trackSocial _trackTiming _trackTrans',
          '_visitCode'
        ].join(' ').split(/\s+/);
        var i = api.length;
        while ( i-- ) {
          out[api[i]] = noopfn;
        }
        out._getLinkerUrl = function(a) {
          return a;
        };
        return out;
      })();
      //
      var Gat = function() {
        ;
      };
      Gat.prototype._anonymizeIP = noopfn;
      Gat.prototype._createTracker = noopfn;
      Gat.prototype._forceSSL = noopfn;
      Gat.prototype._getPlugin = noopfn;
      Gat.prototype._getTracker = function() {
        return tracker;
      };
      Gat.prototype._getTrackerByName = function() {
        return tracker;
      };
      Gat.prototype._getTrackers = noopfn;
      Gat.prototype.aa = noopfn;
      Gat.prototype.ab = noopfn;
      Gat.prototype.hb = noopfn;
      Gat.prototype.la = noopfn;
      Gat.prototype.oa = noopfn;
      Gat.prototype.pa = noopfn;
      Gat.prototype.u = noopfn;
      var gat = new Gat();
      window._gat = gat;
      //
      var gaq = new Gaq();
      (function() {
        var aa = window._gaq || [];
        if ( Array.isArray(aa) ) {
          while ( aa[0] ) {
            gaq.push(aa.shift());
          }
        }
      })();
      window._gaq = gaq.qf = gaq;
    } + ')();',

  // https://github.com/gorhill/uBlock/issues/1265
  // https://github.com/uBlockOrigin/uAssets/blob/581f2c93eeca0e55991aa331721b6942f3162615/filters/resources.txt#L736-L746
  '/beacon.js': '(' +
    function() {
      window.COMSCORE = {
        purge: function() {
          _comscore = []; // eslint-disable-line no-undef
        },
        beacon: function() {
          ;
        }
      };
    } + ')();',

  // http://www.dplay.se/ett-jobb-for-berg/ (videos)
  '/c2/plugins/streamsense_plugin_html5.js': '(' +
    function() {
    } + ')();',

  /* eslint-enable no-extra-semi */
};

const exports = {
  hostnames: hostnames,
  surrogates: surrogates,
};

return exports;
})();
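
Each surrogate above is written as `'(' + function() {...} + ')();'`. This works because concatenating a function with a string converts the function to its source text. The sketch below illustrates the idea outside a browser; `fakeWindow` and the `DEMO` property are hypothetical names for illustration only, and running the string through `new Function` is just a stand-in for the browser executing the surrogate as a script.

```javascript
// Hypothetical stand-in for the page's global object, so the sketch runs outside a browser.
const fakeWindow = {};

// Concatenating a function with strings invokes Function.prototype.toString,
// which yields the function's source text. Wrapping that text in "(" and ")();"
// produces the source of an immediately invoked function expression.
const surrogateSource = '(' +
  function() {
    window.DEMO = {
      beacon: function() {}
    };
  } + ')();';

// Execute the generated source with "window" bound to the stand-in object.
new Function('window', surrogateSource)(fakeWindow);

console.log(typeof surrogateSource);        // string
console.log(typeof fakeWindow.DEMO.beacon); // function
```

Writing surrogates as real functions rather than string literals presumably keeps them readable and lintable, which would explain the `eslint-disable` markers around the block.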
2 changes: 1 addition & 1 deletion doc/DESIGN-AND-ROADMAP.md
@@ -184,7 +184,7 @@ by third party origins with local, static equivalents that either replace the
original widget faithfully, or create a click-through step before the widget
is loaded and tracks the user.

- The widget replacement table lives in the [socialwidgets.json file](https://github.com/EFForg/privacyBadgerchrome/blob/master/src/socialwidgets.json).
+ The widget replacement table lives in the [socialwidgets.json file](https://github.com/EFForg/privacyBadgerchrome/blob/master/data/socialwidgets.json).
Widgets are replaced unless the user has chosen to specifically allow that third party
domain (by moving the slider to 'green' in the UI), so users can selectively
disable this functionality if they wish. The code for social media widgets is
2 changes: 2 additions & 0 deletions manifest.json
@@ -32,6 +32,8 @@
"lib/jsbn.js",
"lib/rsa.js",
"lib/underscore-min.js",
"data/surrogates.js",
"src/surrogates.js",
"src/multiDomainFirstParties.js",
"src/incognito.js",
"src/constants.js",
2 changes: 1 addition & 1 deletion src/background.js
@@ -28,7 +28,7 @@ var HeuristicBlocking = require("heuristicblocking");
var webrequest = require("webrequest");

var SocialWidgetLoader = require("socialwidgetloader");
- var SocialWidgetList = SocialWidgetLoader.loadSocialWidgetsFromFile("src/socialwidgets.json"); // eslint-disable-line no-unused-vars
+ window.SocialWidgetList = SocialWidgetLoader.loadSocialWidgetsFromFile("data/socialwidgets.json");

var Migrations = require("migrations").Migrations;
var incognito = require("incognito");
2 changes: 2 additions & 0 deletions src/socialwidgetloader.js
@@ -75,6 +75,8 @@ function getFileContents(filePath) {
var url = chrome.extension.getURL(filePath);

var request = new XMLHttpRequest();
// TODO replace synchronous main thread XHR with async
// TODO https://xhr.spec.whatwg.org/#sync-warning
request.open("GET", url, false);
request.send();
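
One possible direction for the TODO above, sketched under stated assumptions: the `getFileContentsAsync` function below is hypothetical and not part of this pull request, and it assumes a `fetch`-style API is available to the calling context and can read the requested URL. The `fetchImpl` parameter and `stubFetch` helper exist only so the sketch can run without a browser.

```javascript
// Hypothetical asynchronous counterpart to getFileContents(); not part of this PR.
// "fetchImpl" falls back to the global fetch (assumed available) and can be
// replaced with a stub when running outside a browser.
function getFileContentsAsync(url, fetchImpl) {
  const doFetch = fetchImpl || fetch;
  return doFetch(url).then(function(response) {
    if (!response.ok) {
      throw new Error('Failed to load ' + url + ': HTTP ' + response.status);
    }
    return response.text();
  });
}

// Stub that mimics the small part of the Response interface used above.
function stubFetch(url) {
  return Promise.resolve({
    ok: true,
    status: 200,
    text: function() {
      return Promise.resolve('{"stub": "' + url + '"}');
    }
  });
}

getFileContentsAsync('data/socialwidgets.json', stubFetch).then(function(text) {
  console.log(JSON.parse(text).stub); // data/socialwidgets.json
});
```

Adopting something like this would also mean changing callers such as `loadSocialWidgetsFromFile`, which currently returns its result synchronously, to wait on the returned promise.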

77 changes: 77 additions & 0 deletions src/surrogates.js
@@ -0,0 +1,77 @@
/*
*
* This file is part of Privacy Badger <https://www.eff.org/privacybadger>
* Copyright (C) 2016 Electronic Frontier Foundation
*
* Privacy Badger is free software: you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 3 as
* published by the Free Software Foundation.
*
* Privacy Badger is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with Privacy Badger. If not, see <http://www.gnu.org/licenses/>.
*/

require.scopes.surrogates = (function() {

const db = require('surrogatedb');

/**
* Blocking tracking scripts (trackers) can cause parts of webpages to break.
* Surrogate scripts are dummy pieces of JavaScript meant to supply just enough
* of the original tracker's functionality to allow pages to continue working.
*
* This method gets called within request-blocking listeners:
* It needs to be fast!
*
* @param {String} script_url The full URL of the script resource being requested.
*
* @param {String} script_hostname The hostname component of the script_url
* parameter. This is an optimization: the calling context should already have
* this information.
*
* @return {String|Boolean} The surrogate script as a data URI when there is a
* match, or boolean false when there is no match.
*/
function getSurrogateURI(script_url, script_hostname) {
  // do we have an entry for the script hostname?
  if (db.hostnames.hasOwnProperty(script_hostname)) {
    const tokens = db.hostnames[script_hostname];

    // do any of the pattern tokens for that hostname match the script URL?
    for (let i = 0; i < tokens.length; i++) {
      const token = tokens[i],
        qs_start = script_url.indexOf('?');

      let match = false;

      if (qs_start == -1) {
        if (script_url.endsWith(token)) {
          match = true;
        }
      } else {
        if (script_url.endsWith(token, qs_start)) {
          match = true;
        }
      }

      if (match) {
        // there is a match, return the surrogate code
        return 'data:application/javascript;base64,' + btoa(db.surrogates[token]);
      }
    }
  }

  return false;
}

const exports = {
  getSurrogateURI: getSurrogateURI,
};

return exports;
})();
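
To illustrate the lookup above, here is a self-contained sketch with a tiny stand-in database. `demoDb`, `lookupSurrogate`, and `toBase64` are hypothetical names, the example URLs are made up, and the logic is a condensed restatement of `getSurrogateURI` rather than the code merged here.

```javascript
// Stand-in for the "surrogatedb" module (hypothetical, trimmed down for the sketch).
const demoDb = {
  hostnames: {
    'www.google-analytics.com': ['/ga.js'],
  },
  surrogates: {
    '/ga.js': '(function() { window._gat = {}; })();',
  },
};

// btoa exists in browsers; fall back to Buffer where it is missing (assumption: Node.js).
const toBase64 = (typeof btoa === 'function') ?
  (s) => btoa(s) :
  (s) => Buffer.from(s, 'binary').toString('base64');

function lookupSurrogate(scriptUrl, scriptHostname) {
  if (!Object.prototype.hasOwnProperty.call(demoDb.hostnames, scriptHostname)) {
    return false;
  }
  // Ignore the querystring: only consider the URL up to the first "?".
  const qsStart = scriptUrl.indexOf('?');
  const end = (qsStart === -1) ? scriptUrl.length : qsStart;
  for (const token of demoDb.hostnames[scriptHostname]) {
    // endsWith(search, length) treats the string as if it were "length" characters long.
    if (scriptUrl.endsWith(token, end)) {
      return 'data:application/javascript;base64,' + toBase64(demoDb.surrogates[token]);
    }
  }
  return false;
}

const host = 'www.google-analytics.com';
console.log(lookupSurrogate('https://' + host + '/ga.js?v=1', host) !== false); // true
console.log(lookupSurrogate('https://' + host + '/analytics.js', host));        // false
console.log(lookupSurrogate('https://example.com/ga.js', 'example.com'));       // false
```

The third call shows why the hostname check comes first: a matching path on an unlisted host never reaches the suffix comparison.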