Dear Opacus community,
I’ve been looking into 3D segmentation models for medical imaging. I use several architectures, half of them from the MONAI library.
When I try to run Opacus in combination with these architectures, I get the following error:
Traceback (most recent call last):
  File "/vol/aimspace/users/viulapir/Documents/dp-thesis/test2.py", line 473, in main
    training_losses, validation_losses, lr_rates = trainer.run_trainer()
  File "/vol/aimspace/users/viulapir/Documents/dp-thesis/trainer.py", line 48, in run_trainer
    self._train()
  File "/vol/aimspace/users/viulapir/Documents/dp-thesis/trainer.py", line 106, in _train
    self.optimizer.step()  # update the parameters
  File "/u/home/viulapir/.conda/envs/torchsegmentation2/lib/python3.10/site-packages/opacus/optimizers/optimizer.py", line 553, in step
    if self.pre_step():
  File "/u/home/viulapir/.conda/envs/torchsegmentation2/lib/python3.10/site-packages/opacus/optimizers/optimizer.py", line 536, in pre_step
    if self.grad_samples is None or len(self.grad_samples) == 0:
  File "/u/home/viulapir/.conda/envs/torchsegmentation2/lib/python3.10/site-packages/opacus/optimizers/optimizer.py", line 342, in grad_samples
    ret.append(self._get_flat_grad_sample(p))
  File "/u/home/viulapir/.conda/envs/torchsegmentation2/lib/python3.10/site-packages/opacus/optimizers/optimizer.py", line 279, in _get_flat_grad_sample
    raise ValueError(
ValueError: Per sample gradient is not initialized. Not updated in backward pass?
The model that generated the error above was the VNet, but the same thing happens with other architectures too.
I know that MONAI uses batch normalization layers, so I used ModuleValidator.fix(model) to convert them to group normalization layers. I also tried initializing the weights with torch.nn.init, but that produced the same error.
A simplified version of the code is below.
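import torch
from monai.data import list_data_collate
from monai.networks.nets import VNet
from opacus import PrivacyEngine
from opacus.data_loader import wrap_collate_with_empty
from opacus.validators import ModuleValidator

# learning_rate, epochs and train_dl are defined elsewhere in the script.
model = VNet(spatial_dims=3, in_channels=1, out_channels=3)
# Replace BatchNorm layers with GroupNorm so the model passes Opacus validation.
model = ModuleValidator.fix(model)
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

privacy_engine = PrivacyEngine(accountant="gdp")
model, optimizer, train_dl = privacy_engine.make_private_with_epsilon(
    module=model,
    optimizer=optimizer,
    data_loader=train_dl,
    target_epsilon=8,
    target_delta=1e-2,
    epochs=epochs,
    max_grad_norm=1,
)
# Handle the empty batches that Poisson sampling can produce.
train_dl.collate_fn = wrap_collate_with_empty(
    collate_fn=list_data_collate,
    sample_empty_shapes={x: train_dl.dataset[0][x].shape for x in ["img", "seg"]},
    dtypes={x: train_dl.dataset[0][x].dtype for x in ["img", "seg"]},
)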
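One check that might help narrow this down, before involving Opacus at all, is whether every trainable parameter actually takes part in the backward pass. A parameter that never receives a gradient in plain PyTorch would presumably not get a per-sample gradient either. The sketch below is only an illustration under that assumption: TinyNet and params_without_grad are hypothetical names, and the stand-in model deliberately contains a layer that is never called. It is not the VNet from the code above.

```python
import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical stand-in model; `unused` is registered but never called."""

    def __init__(self):
        super().__init__()
        self.used = nn.Conv3d(1, 3, kernel_size=3, padding=1)
        self.unused = nn.Conv3d(1, 3, kernel_size=3, padding=1)

    def forward(self, x):
        return self.used(x)


def params_without_grad(model, batch):
    """Run one forward/backward pass and return the names of trainable
    parameters that received no gradient."""
    model.zero_grad(set_to_none=True)
    model(batch).sum().backward()
    return [
        name
        for name, p in model.named_parameters()
        if p.requires_grad and p.grad is None
    ]


# Dummy 3D batch: (batch, channels, depth, height, width).
missing = params_without_grad(TinyNet(), torch.randn(2, 1, 8, 8, 8))
print(missing)  # ['unused.weight', 'unused.bias']
```

Running the same helper on the unwrapped VNet with a dummy batch of the right shape should show whether any of its parameters are left out of the backward pass. If some are, setting requires_grad = False on them before calling make_private_with_epsilon would be one thing to try.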
Does anyone have an idea of what I am missing here? Or is Opacus not compatible with these models, in which case I should raise a feature request on GitHub?
Thank you!