Opacus for 3D Segmentation

Ziva1011 · September 9, 2024, 8:34pm

Dear Opacus community,

I’ve been looking into 3D segmentation models for medical imaging. I use several architectures, half of them from the Monai library.

When I try to run Opacus in combination with these architectures I get the following error:

Traceback (most recent call last):
  File "/vol/aimspace/users/viulapir/Documents/dp-thesis/test2.py", line 473, in main
    training_losses, validation_losses, lr_rates = trainer.run_trainer()
  File "/vol/aimspace/users/viulapir/Documents/dp-thesis/trainer.py", line 48, in run_trainer
    self._train()
  File "/vol/aimspace/users/viulapir/Documents/dp-thesis/trainer.py", line 106, in _train
    self.optimizer.step()  # update the parameters
  File "/u/home/viulapir/.conda/envs/torchsegmentation2/lib/python3.10/site-packages/opacus/optimizers/optimizer.py", line 553, in step
    if self.pre_step():
  File "/u/home/viulapir/.conda/envs/torchsegmentation2/lib/python3.10/site-packages/opacus/optimizers/optimizer.py", line 536, in pre_step
    if self.grad_samples is None or len(self.grad_samples) == 0:
  File "/u/home/viulapir/.conda/envs/torchsegmentation2/lib/python3.10/site-packages/opacus/optimizers/optimizer.py", line 342, in grad_samples
    ret.append(self._get_flat_grad_sample(p))
  File "/u/home/viulapir/.conda/envs/torchsegmentation2/lib/python3.10/site-packages/opacus/optimizers/optimizer.py", line 279, in _get_flat_grad_sample
    raise ValueError(
ValueError: Per sample gradient is not initialized. Not updated in backward pass?

The model that produced the error above was VNet, but the same thing happens with other architectures too.

I know that Monai uses batch normalization layers, so I used ModuleValidator.fix(model) to convert them to group normalization layers. I also tried initializing the weights with torch.nn.init, but that produced the same error.

A simplified version of the code is below.

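import torch
from monai.data import list_data_collate
from monai.networks.nets import VNet
from opacus import PrivacyEngine
from opacus.data_loader import wrap_collate_with_empty
from opacus.validators import ModuleValidator

# learning_rate, epochs and train_dl are defined elsewhere in the script.
model = VNet(spatial_dims=3, in_channels=1, out_channels=3)
# Replace layers Opacus does not support (e.g. batch norm -> group norm).
model = ModuleValidator.fix(model)
# Create the optimizer only after fix(), so it tracks the final parameters.
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)

privacy_engine = PrivacyEngine(accountant="gdp")

model, optimizer, train_dl = privacy_engine.make_private_with_epsilon(
    module=model,
    optimizer=optimizer,
    data_loader=train_dl,
    target_epsilon=8,
    target_delta=1e-2,
    epochs=epochs,
    max_grad_norm=1,
)
# Let the collate function handle the empty batches Poisson sampling can produce.
train_dl.collate_fn = wrap_collate_with_empty(
    collate_fn=list_data_collate,
    sample_empty_shapes={x: train_dl.dataset[0][x].shape for x in ["img", "seg"]},
    dtypes={x: train_dl.dataset[0][x].dtype for x in ["img", "seg"]},
)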
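
One possible way to narrow this down, as a minimal sketch rather than a confirmed fix: the ValueError in the traceback appears to be raised when a trainable parameter's grad_sample attribute is still None at optimizer.step(), so listing those parameters after the backward pass may show which layers are involved. The helper name find_params_without_grad_sample is hypothetical (it is not part of Opacus), and the stand-in objects below are only there so the sketch runs without torch installed.

```python
from types import SimpleNamespace


def find_params_without_grad_sample(named_params):
    """Return names of trainable parameters whose `grad_sample` is missing or None.

    `named_params` is any iterable of (name, parameter) pairs, for example
    `model.named_parameters()` called after `loss.backward()`.
    Hypothetical helper for illustration; not an Opacus API.
    """
    missing = []
    for name, param in named_params:
        if not getattr(param, "requires_grad", False):
            continue  # frozen parameters are not expected to carry per-sample gradients
        if getattr(param, "grad_sample", None) is None:
            missing.append(name)
    return missing


# Stand-in objects (assumption: real parameters expose the same two attributes).
demo_params = [
    ("conv.weight", SimpleNamespace(requires_grad=True, grad_sample=[0.1])),
    ("unused.weight", SimpleNamespace(requires_grad=True, grad_sample=None)),
    ("frozen.bias", SimpleNamespace(requires_grad=False)),
]
print(find_params_without_grad_sample(demo_params))  # ['unused.weight']
```

In the training loop this would be called as find_params_without_grad_sample(model.named_parameters()) right after loss.backward() and before optimizer.step(). Any names it reports are candidates for layers that did not take part in the forward pass, or for layer types that have no per-sample gradient computed for them.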

Does anyone have an idea of what I am missing here? Or is Opacus not compatible with these models, and should I raise a feature request on GitHub?

Thank you!