Double charging privacy in DCGAN example

alexbie · April 14, 2022, 2:28am

Hi, Opacus is a bit of a black box to me, so it would be great if you could clarify a few things about its internals.

First, in the DCGAN example:

github.com

pytorch/opacus/blob/main/examples/dcgan.py

After privatizing the dataloader, the discriminator D, and D’s optimizer, each training iteration Poisson-samples ‘real’ with rate q and looks like this:

# (1) train on generated data
fake_loss = criterion(D(fake))
fake_loss.backward()
optimizerD.step()
optimizerD.zero_grad()

# (2) train on real (private) data
real_loss = criterion(D(real))
real_loss.backward()
optimizerD.step()

When I vary the number of times I do (1), the privacy cost increases, although it shouldn’t, because training D on fake data has no dependence on the private real data. By the post-processing property of DP, it should be a free operation.
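To make the overcharge concrete, here is a rough sketch of how the reported epsilon grows if the accountant charges one Gaussian-mechanism step for every `optimizerD.step()` call, including the calls on fake data. The function `gaussian_rdp_epsilon` is a hypothetical helper, not part of Opacus, and it uses plain Rényi DP composition without the subsampling amplification a real accountant applies, so the numbers are only illustrative.

```python
import math


def gaussian_rdp_epsilon(steps, noise_multiplier, delta, orders=range(2, 129)):
    """Epsilon after `steps` Gaussian mechanisms with sensitivity 1.

    Hypothetical helper: composes the RDP of the plain Gaussian mechanism
    (alpha / (2 * sigma^2) per step) and converts to (epsilon, delta)-DP.
    Subsampling amplification is deliberately ignored.
    """
    best = math.inf
    for alpha in orders:
        rdp = steps * alpha / (2.0 * noise_multiplier ** 2)
        eps = rdp + math.log(1.0 / delta) / (alpha - 1)
        best = min(best, eps)
    return best


real_steps = 100              # iterations that actually touch private data
fake_steps_per_iteration = 3  # extra optimizerD.step() calls on fake data

# What should be charged: only the steps on real data.
eps_needed = gaussian_rdp_epsilon(real_steps, noise_multiplier=4.0, delta=1e-5)

# What gets charged if every optimizer step is counted.
charged_steps = real_steps * (1 + fake_steps_per_iteration)
eps_charged = gaussian_rdp_epsilon(charged_steps, noise_multiplier=4.0, delta=1e-5)

print(f"needed: {eps_needed:.2f}, charged: {eps_charged:.2f}")
```

Under these assumptions the charged epsilon grows with the number of fake-data steps, even though those steps never read the private data.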

An alternative approach is to do the following, in each iteration

loss = criterion(D(fake)) + criterion(D(real))
loss.backward()
optimizerD.step()

Now I am curious about the semantics of what happens here. Are we clipping all the per-sample gradients (assume a clipping norm of 1), summing them up, adding noise with sigma = noise_multiplier, and then finally dividing everything by the expected batch size (qN)? If so, I believe this approach satisfies DP without overcharging privacy.
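The sequence of operations asked about here can be sketched in plain Python as follows. This is a generic DP-SGD update written for illustration, not Opacus's actual implementation; `dp_sgd_step` and its parameters are hypothetical names, and gradients are represented as flat lists of floats.

```python
import math
import random


def dp_sgd_step(per_sample_grads, dim, max_grad_norm, noise_multiplier,
                expected_batch_size, rng):
    """Clip each per-sample gradient, sum, add Gaussian noise, then average.

    Hypothetical sketch of one noisy gradient computation. `dim` is passed
    explicitly because a Poisson-sampled batch may be empty.
    """
    summed = [0.0] * dim
    for grad in per_sample_grads:
        norm = math.sqrt(sum(g * g for g in grad))
        # Scale down so the clipped gradient has L2 norm <= max_grad_norm.
        factor = min(1.0, max_grad_norm / (norm + 1e-6))
        for i, g in enumerate(grad):
            summed[i] += g * factor
    # Noise standard deviation is noise_multiplier * max_grad_norm.
    std = noise_multiplier * max_grad_norm
    noisy = [s + rng.gauss(0.0, std) for s in summed]
    # Divide by the expected batch size (qN), not the realized batch size.
    return [n / expected_batch_size for n in noisy]


# Two per-sample gradients: one with norm 5 (gets clipped), one with norm 0.5.
grads = [[3.0, 4.0], [0.3, 0.4]]
update = dp_sgd_step(grads, dim=2, max_grad_norm=1.0, noise_multiplier=1.0,
                     expected_batch_size=2, rng=random.Random(0))
```

In this sketch, every gradient that passes through the function is clipped, including any that come from fake samples, which matches the combined-loss proposal only if the fake samples are treated like real ones.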

ffuuugor · April 25, 2022, 9:44am

Hey
Come to think of it, you’re right: (1) should be free from the privacy standpoint, and we do indeed overcharge the privacy budget.
As for solutions, my first instinct is to have two separate optimizers for the discriminator. Both would cover the same set of parameters, but one would be private (DPOptimizer) and the other a regular optimizer.
The proposal you’re describing could work as well, assuming you actually want to clip gradients for the fake data. You don’t have to from the privacy perspective, but I’m not sure what the best approach is for model quality.

I’ve created a GitHub issue to track this: #418. If you want, feel free to send a PR with your approach. Otherwise, someone else will pick it up later.

Thanks for reporting!

alexbie · April 29, 2022, 2:12am

So the second approach I suggested is indeed private?