PyTorch Lightning Support?

I’m trying to utilise opacus with the PyTorch Lightning framework, which we use as a wrapper around a lot of our models. I can see that there was an effort to partially integrate this into PyTorch Lightning late last year, but it seems to have stalled due to lack of bandwidth.

I’ve created a simple MVP, but there seems to be a compatibility problem even with this simple model; it throws AttributeError: 'Parameter' object has no attribute 'grad_sample' as soon as it hits the optimization step.

What’s the likely underlying cause of this? I can see on the Opacus GitHub that similar errors have been encountered before, caused by unsupported layers, but as the gist shows, this model is incredibly simple, so I don’t think it’s any of the layers.
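For context, my understanding of how grad_sample normally gets populated in plain PyTorch, as a minimal sketch (the constructor arguments are assumptions based on the 0.x tutorial API); the error would suggest optimizer.step() ran before the engine’s backward hooks filled grad_sample in:

import torch
import torch.nn.functional as F
from opacus import PrivacyEngine

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

privacy_engine = PrivacyEngine(
    model,
    batch_size=32,           # logical batch size
    sample_size=3200,        # size of the training set
    alphas=range(2, 32),     # RDP orders for accounting
    noise_multiplier=1.0,
    max_grad_norm=1.0,
)
privacy_engine.attach(optimizer)

x, y = torch.randn(32, 10), torch.randint(0, 2, (32,))
loss = F.cross_entropy(model(x), y)
loss.backward()    # the engine's hooks fill p.grad_sample for every parameter
optimizer.step()   # per-sample grads are clipped, noised, and aggregated here
optimizer.zero_grad()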

This is with:

opacus==0.11.0
pytorch-lightning==1.2.1
torch==1.7.1
torchaudio==0.7.2
torchvision==0.8.2

Hi James! Would you have time to file a bug and share a colab so I can take a look? Integrating with Lightning is indeed on our plate :slight_smile:

Sure thing; I didn’t want to go straight to a bug report as it didn’t feel like the right first step.


Hi @Darktex @James_M,
has there been any progress on the integration with PyTorch Lightning?
I wrote a simple integration myself using the LightningCLI. I’m initializing the PrivacyEngine in the before_fit hook of a custom LightningCLI and attaching it to the optimizer in the configure_optimizers function of a typical LightningModule.
It seems to work, but I would be curious if there’s a best practice way of integration.
Regards

Hello @NiWaRe @James_M, we have not yet worked on integrating PyTorch Lightning with Opacus. As mentioned, it is on our roadmap. Meanwhile, if you could share your changes either in a Google Colab or send out a pull request on GitHub, we would consider them when we are ready to start work on this.

Thanks for describing your solution @NiWaRe, do you have a code snippet you could share?

@sayanghosh @amin-nejad sorry, I was busy the last few weeks. I can gladly share the snippets. Concerning the PR: should I rather contribute a tutorial in a Jupyter Notebook, or think about how to integrate it directly into the framework without the need to overwrite different hooks?

What I have for now (prototyping code; the hparams are the params defined in my PL model):

from opacus import PrivacyEngine
from pytorch_lightning.utilities.cli import LightningCLI  # PL >= 1.3

class LightningCLI_Custom(LightningCLI):
    [...]

    def before_fit(self):
        """Hook to run some code before fit is started."""
        # possible because self.datamodule and self.model are instantiated beforehand
        # in LightningCLI.instantiate_trainer(self) -- see docs

        # TODO: why do I have to call these explicitly here?
        #       not mentioned in the docs (not found in trainer.fit())
        self.datamodule.prepare_data()
        self.datamodule.setup()

        if self.model.hparams.dp:
            if self.model.hparams.dp_tool == "opacus":
                # NOTE: for now, attaching to the optimizer happens in
                # model.configure_optimizers(), because at this point
                # model.configure_optimizers() hasn't been called yet.
                # That's also why we save n_accumulation_steps as a model parameter.
                sample_rate = self.datamodule.batch_size / len(self.datamodule.dataset_train)
                if self.model.hparams.virtual_batch_size >= self.model.hparams.batch_size:
                    # e.g. virtual_batch_size=128 and batch_size=32 -> 4 accumulation steps
                    self.model.n_accumulation_steps = int(
                        self.model.hparams.virtual_batch_size / self.model.hparams.batch_size
                    )
                else:
                    self.model.n_accumulation_steps = 1  # neutral
                    print("Virtual batch size has to be at least as big as the real batch size!")

                # NOTE: for multi-GPU support see the PL code.
                # For now we only shift to CUDA if there's at least one GPU ('gpus' > 0).
                self.model.privacy_engine = PrivacyEngine(
                    self.model.model,
                    sample_rate=sample_rate * self.model.n_accumulation_steps,
                    target_delta=self.model.hparams.target_delta,
                    target_epsilon=self.model.hparams.target_epsilon,
                    epochs=self.trainer.max_epochs,
                    max_grad_norm=self.model.hparams.L2_clip,
                ).to("cuda:0" if self.trainer.gpus else "cpu")
                # necessary if noise_multiplier is dynamically calculated by opacus,
                # to ensure that the param is tracked
                self.model.hparams.noise_multiplier = self.model.privacy_engine.noise_multiplier
                print(f"Noise Multiplier: {self.model.privacy_engine.noise_multiplier}")

            else:
                print("Use either 'opacus' or 'deepee' as DP tool.")

            # self.fit_kwargs is passed to self.trainer.fit() in LightningCLI.fit(self)
            self.fit_kwargs.update({'model': self.model})

            # in addition to the params saved through the model, save some from the trainer
            important_keys_trainer = ['gpus', 'max_epochs', 'deterministic']
            self.trainer.logger.experiment.config.update(
                {
                    important_key: self.config['trainer'][important_key]
                    for important_key in important_keys_trainer
                }
            )
            # the rest is stored as part of the SaveConfigCallbackWandB
            # (too big to store every metric as part of the above config)

        # track gradients, etc.
        self.trainer.logger.experiment.watch(self.model)

Then in my model:

class LitModelDP(LightningModule):
    def __init__(...):
        [...]
        # disable automatic optimization to be able to add noise and track
        # the global grad norm (also in the non-DP case; Lightning
        # only does per-param grad tracking)
        self.automatic_optimization = False

    # manual training step, eval, etc.

    def configure_optimizers(self):
        optims = {}
        # DeePee: we want params from the wrapped model,
        # i.e. self.parameters() -> self.model.wrapped_model.parameters()
        if self.hparams.optimizer == 'sgd':
            optimizer = torch.optim.SGD(
                self.model.parameters(),
                **self.hparams.opt_kwargs,
            )
        elif self.hparams.optimizer == 'adam':
            optimizer = torch.optim.Adam(
                self.model.parameters(),
                **self.hparams.opt_kwargs,
            )

        if self.hparams.dp_tool == 'opacus' and self.hparams.dp:
            self.privacy_engine.attach(optimizer)

        optims.update({'optimizer': optimizer})
        [...]
        return optims
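Since automatic optimization is disabled above, the elided manual training step has to drive the optimizer itself. A possible sketch of how n_accumulation_steps could be consumed there; this assumes the opacus 0.x behaviour where the attached optimizer gains a virtual_step() method, F is torch.nn.functional, and Lightning’s manual-optimization calls vary slightly between versions:

def training_step(self, batch, batch_idx):
    # fetch the raw torch optimizer (with the PrivacyEngine attached) rather
    # than Lightning's wrapper, so that virtual_step() is reachable
    opt = self.optimizers(use_pl_optimizer=False)
    x, y = batch
    loss = F.cross_entropy(self(x), y)
    self.manual_backward(loss)
    if (batch_idx + 1) % self.n_accumulation_steps == 0:
        opt.step()          # clip, noise, and apply the accumulated grads
        opt.zero_grad()
    else:
        opt.virtual_step()  # accumulate per-sample grads without stepping
    return loss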

Calling the LightningCLI at the end:

cli = LightningCLI_Custom(model_class=LitModelDP, datamodule_class=[...])
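With that in place, DP can be toggled and configured from the command line. A hypothetical invocation (the script name is a placeholder; the flag names follow the hparams used above):

python train.py --model.dp true --model.dp_tool opacus \
    --model.target_epsilon 10.0 --model.L2_clip 1.0 \
    --trainer.max_epochs 10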

I’m very open to feedback and would also gladly help to integrate that into PL. :+1:


Hello @NiWaRe,
Apologies for the delay in response. Adding Lightning support to Opacus will certainly electrify it. Thank you so much for working on it.

What took us so long to respond, you ask? Well, we have been working on a major refactor of Opacus (v1.0), and this means breaking API changes and code movement. Would you mind sending a PR on top of the experimental_v1.0 branch of pytorch/opacus?

Hi @karthikprasad,
Great, yes, I’ll take a look at the new version over the weekend!
Best,
Nicolas


Hi everyone!
The problem reported by @James_M is fixed in this pull request. You can now add PrivacyEngine to a LightningModule (see the demo in examples/mnist_lightning.py).

As we are working on the new API in experimental_v1.0 branch, I’m exploring ways to implement tighter (ideally seamless) integration between PyTorch Lightning and Opacus. In this RFC I’ve shared some thoughts on how this integration can be implemented. Feedback and comments are much appreciated. Actually, I need help there.

@NiWaRe Many thanks for your prototype!
In LitModelDP we can now use automatic_optimization=True.
In my demos I simply put all the DP parameters in LightningModule.__init__ so that we can enable DP and configure it via --model.xxx command-line args. Extending LightningCLI is indeed a good next step.
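For reference, a minimal sketch of that pattern with hypothetical parameter names; LightningCLI exposes constructor arguments as --model.* flags automatically:

from pytorch_lightning import LightningModule

class LitClassifier(LightningModule):
    def __init__(
        self,
        enable_dp: bool = True,         # --model.enable_dp
        noise_multiplier: float = 1.0,  # --model.noise_multiplier
        max_grad_norm: float = 1.0,     # --model.max_grad_norm
    ):
        super().__init__()
        self.save_hyperparameters()  # values then available via self.hparams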


Hi @Peter_Romov and @karthikprasad,
I’m sorry I didn’t respond; I was extremely busy with my thesis, which is due in three weeks.
Thanks for implementing this! I’ll take a look in three weeks at the latest, when I’m done with my project :slight_smile: