Hi Alex,
I think I have figured out how to use Opacus in federated learning. Thanks for your helpful information.
I created a client class and attached a privacy_engine to each client; at each FL training round I just load the model state from the global model, keeping the privacy_engine untouched:
class client(object):
    def __init__(self, cid, args, global_model, train_loader, test_loader):
        self.cid = cid
        self.args = args
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self.train_loader = train_loader
        self.test_loader = test_loader
        self.eps = {}
        self.delta = 1 / (1.1 * len(train_loader.dataset))
        # copy model parameters from global_model
        model = copy.deepcopy(global_model)
        # create a per-client privacy_engine; make_private wraps the
        # model, optimizer, and data loader for DP-SGD
        self.privacy_engine = PrivacyEngine()
        self.model, self.optimizer, self.train_loader = self.privacy_engine.make_private(
            module=model,
            optimizer=torch.optim.SGD(model.parameters(), self.args.lr, momentum=0.5),
            data_loader=self.train_loader,
            noise_multiplier=self.args.noise_multiplier,
            max_grad_norm=self.args.max_grad_norm)
    def update_model(self, global_round, model_weights):
        self.model.load_state_dict(model_weights)
        self.model.train()
        with BatchMemoryManager(data_loader=self.train_loader,
                                max_physical_batch_size=self.args.max_physical_batch_size,
                                optimizer=self.optimizer) as memory_safe_data_loader:
            train_results = self.train(self.model, self.optimizer, memory_safe_data_loader)
        # record the privacy cost spent so far
        self.eps[global_round] = self.privacy_engine.get_epsilon(delta=self.delta)
        return train_results
    def train(self, model, optimizer, dataloader):
        # nn.NLLLoss is a module: instantiate it once, then call it on (output, target)
        criterion = nn.NLLLoss()
        epoch_loss, epoch_acc = [], []
        for epoch in range(self.args.local_ep):
            batch_loss, batch_acc = [], []
            for batch_idx, (images, target) in enumerate(dataloader):
                optimizer.zero_grad()
                images, target = images.to(self.device), target.to(self.device)
                output = model(images)
                loss = criterion(output, target)
                loss.backward()
                optimizer.step()
                batch_loss.append(loss.item())
            epoch_loss.append(sum(batch_loss) / len(batch_loss))
        return model.state_dict()
In the main function, I first instantiate the clients, so each client has its own privacy_engine:
if __name__ == '__main__':
    <.... some other codes ...>
    # instantiate clients
    client_lst = []
    for cid in range(args.num_clients):
        client_lst.append(client(cid, args, global_model, train_loader, test_loader))
    # initialize the global weights before the first round
    global_weights = global_model.state_dict()
    # the server selects clients to collaboratively train the model
    for epoch in range(args.epochs):
        local_weights_lst = []
        m = max(int(args.frac * args.num_clients), 1)
        selected_clients = np.random.choice(range(args.num_clients), m, replace=False)
        for cid in selected_clients:
            local_weights = client_lst[cid].update_model(epoch, global_weights)
            local_weights_lst.append(copy.deepcopy(local_weights))
        # aggregate local model weights to get the new global model weights
        global_weights = average_weights(local_weights_lst)
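For completeness, my average_weights() is just a FedAvg-style element-wise average of the collected state dicts. Here is a minimal sketch with plain Python lists standing in for tensors (with real PyTorch state dicts you would sum the tensors and divide instead):

```python
import copy

def average_weights(weights_lst):
    # element-wise (unweighted) average of a list of "state dicts";
    # plain Python lists stand in for tensors in this sketch
    avg = copy.deepcopy(weights_lst[0])
    for key in avg:
        for w in weights_lst[1:]:
            avg[key] = [a + b for a, b in zip(avg[key], w[key])]
        avg[key] = [v / len(weights_lst) for v in avg[key]]
    return avg

# two toy "state dicts" from two selected clients
w1 = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
w2 = {"layer.weight": [3.0, 4.0], "layer.bias": [2.0]}
print(average_weights([w1, w2]))  # {'layer.weight': [2.0, 3.0], 'layer.bias': [1.0]}
```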
And the code runs well with no errors. Regarding your suggestion of get_noise_multiplier(), I think it is worth trying. The original question is solved, and thanks again for your help.
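My understanding of such a helper is that it calibrates sigma by searching for the value whose accounted epsilon hits a target. A toy binary-search sketch of that idea (the eps_of_sigma function below is a made-up monotone stand-in, not the actual Opacus accountant):

```python
def calibrate_noise(target_epsilon, eps_of_sigma, lo=0.1, hi=64.0, tol=0.01):
    # binary-search sigma so that eps_of_sigma(sigma) hits target_epsilon;
    # assumes eps_of_sigma is monotone decreasing in sigma
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if eps_of_sigma(mid) > target_epsilon:
            lo = mid  # too little noise -> epsilon too large
        else:
            hi = mid
    return hi  # smallest bracketed sigma meeting the target

# toy accountant: eps ~ c / sigma (illustrative only)
sigma = calibrate_noise(2.0, lambda s: 8.0 / s)
print(round(sigma, 1))  # ~4.0, since 8.0 / 4.0 == 2.0
```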
Actually, another question has surfaced. I saw the thread How to store the state and resume the state of the PrivacyEngine? - #2 by ffuuugor, which is about storing and resuming the state of the privacy_engine, and I'm wondering the following:
If I first set noise_multiplier = 1.1, run training with DP-SGD, and store the state of the privacy_engine, and then resume that state but set noise_multiplier = 1.2 (a value different from the previous choice) and continue training with DP-SGD, will Opacus be able to calculate the privacy cost spent so far?
From Algorithm 1 of the original paper, Deep Learning with Differential Privacy, I think the noise_multiplier is fixed for the whole run, so I assume the answer is no?
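I can't say what Opacus itself supports here, but for what it's worth, at the math level Rényi-DP guarantees compose additively across steps even when sigma differs between phases. A toy sketch for the un-subsampled Gaussian mechanism (a deliberate simplification; DP-SGD's real analysis also accounts for subsampling amplification):

```python
import math

# Toy RDP accounting for the plain Gaussian mechanism (no subsampling):
# RDP at order alpha is steps * alpha / (2 * sigma^2) per phase, and
# RDP composes additively across phases, even with different sigmas.
def rdp_gaussian(sigma, steps, alpha):
    return steps * alpha / (2 * sigma ** 2)

def rdp_to_eps(rdp, alpha, delta):
    # standard conversion from RDP at order alpha to (eps, delta)-DP
    return rdp + math.log(1.0 / delta) / (alpha - 1)

alpha, delta = 10.0, 1e-5
phase1 = rdp_gaussian(sigma=1.1, steps=10, alpha=alpha)  # before the checkpoint
phase2 = rdp_gaussian(sigma=1.2, steps=5, alpha=alpha)   # after resuming
total_eps = rdp_to_eps(phase1 + phase2, alpha, delta)
# the composed cost is strictly larger than either phase alone
print(total_eps > rdp_to_eps(phase1, alpha, delta))  # True
```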