-
Notifications
You must be signed in to change notification settings - Fork 346
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix for floating point representation attack #260
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this!
Question on the benchmarking - do you have any intuition around vision_benchmark results? How many runs did you do - could it just be random fluctuations?
As a general comment, we now have a test for the noise lever (privacy_engine_test.py:test_noise_level), can you pls update it to cover secure_mode?
generator=generator, | ||
) # throw away first generated random number | ||
sum = zeros | ||
for i in range(4): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I trust you and Ilya that it solves the problem, but could you pls do ELI5 why this works?
I remember "Option 3" from your doc, but I'm not sure I understand how this relates to it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I understand, the sum from 1 to 4 gives you a Gaussian with variance 4 std^2, thus sum/2 is a gaussian with variance std^2. @ashkan-software : shouldn't you loop over only 2 samples as per the docstring?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question @ffuuugor.
This approach is actually not any of those 3 options we had. I found this approach in a recent paper and me and Ilya think this is an intelligent way of fixing the problem. The idea is to invert the Gaussian mechanism and guess what values used as input to the mechanism. This is possible if Gaussian method is used once. But if we use the Gaussian more than once (in this fix, we call it 4 times), it becomes exponentially harder to guess those values. This is in very simple words, but the fix is a bit more involved and is explained in the paper I listed on the PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for having number 4 and 2 in the code, is that when n=2
, we get those values (Section 5.1 in the paper):
sum(gauss(0, 1) for i in range(2 * n)) / sqrt(2 * n)
opacus/tests/randomness_test.py
Outdated
@@ -178,6 +178,7 @@ def _init_training(self, generator, noise: float = 1.0): | |||
max_grad_norm=1.0, | |||
expected_batch_size=8, | |||
generator=generator, | |||
secure_mode=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: You can just leave it as it is, since False
is default value
@ffuuugor the only intuition I have for the vision dataset is that I have made a mistake in reporting :) haha. The correct reporting should be: Before the change: Before the change: |
I am now working on the This is the output of the test here in this PR
|
I debugged this issue further, and it seems like this test is flaky: it passes some times and fails some other times. I am simply rerunning it on my computer - it sometimes fails, and sometimes pass. To me it also makes sense, because it is written to validate the output of a random number generator. So it depends on what value is. I also see here 162e7d0 that @ffuuugor has change the test tolerance from 0.05 to 0.1. What's the logic here? |
Note, that we fix the seed in the beginning of the test - so the noise generated should be consistent across machines. Also note, that sometimes (especially on CircleCI) tests fail due to timeout (you should set 0.05 -> 0.1 change was to make it less flaky. The main point of this test is not to check we're doing it correctly now, but to keep checking it in the future so that we don't accidentally break it. The intuition is that we know, that we're adding the correct amount of noise now -> flaky test results indicate problems with the test, not the code |
Update: I added a new test to ensure the sum of 4 gaussians and division by 2 is not changing in the future. I also changed |
I now changed 0.1 > 0.15 😂 so that the flaky test passes. It failed sometimes and sometimes it passed 😂
but seems like we are not really checking strictly, so in the future we may actually break this! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, awesome stuff!
Types of changes
Motivation and Context / Related issue
Gaussian mechanism for differential privacy is susceptible to a similar floating point attack against Gaussian noise sampling as the one for Laplace noise as proposed by Mironov in CCS 2011 (https://www.microsoft.com/en-us/research/wp-content/uploads/2012/10/lsbs.pdf)).
The attack is possible since not all real numbers can be represented in floating points, hence not all can be sampled either. The attack is possible since normal noise distribution implementation of pytorch is used. If one observes an output that is protected by Gaussian mechanism, they can determine the noiseless value, hence, invalidating differential private guarantees.
This PR fixes this issue by calling the Gaussian noise function 2*n times, when n=2 (see section 5.1 in https://arxiv.org/abs/2107.10138)
How Has This Been Tested (if it applies)
I ran 3 benchmarks to see if this change impacts the running time, as we call the Gaussian noise 4 times. It has a mild impact on the running time 😁 I ran IMDB on my laptop. I ran MNIST on Google collab. I ran vision benchmark on Google collab with GPU. Here are the summary of those runs (I average over 5 runs each):
Before the change:
MNIST: 2m17s
IMDB: 3m0s
Vision benchmark: 2m9s
Before the change:
MNIST: 2m27s
IMDB: 3m16s
Vision benchmark: 1m48s !!!
commands used to run these:
Checklist