AttributeError: 'GradSampleModule' object has no attribute for method

I am using Flower as an FL framework and am trying to add DP support with Opacus.
Here is the problem I ran into:

  1. I am using a very common MNIST model and subclassing it to get a new class:
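from collections import OrderedDict
from typing import List

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):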
    def __init__(self) -> None:
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout(0.25)
        self.dropout2 = nn.Dropout(0.5)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x) -> torch.Tensor:
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output

class MNISTNet(Net):
    """inheritance from Net (MNIST Model) """
    def get_weights(self): #-> List[np.ndarray]:
        """Get model weights as a list of NumPy ndarrays."""
        return [val.cpu().numpy() for _, val in self.state_dict().items()]

    def set_weights(self, weights: List[np.ndarray]) -> None:
        """Set model weights from a list of NumPy ndarrays.

        weights: fl.common.Weights
            Weights received by the server and set to local model
        """
        # Pair each state_dict key with the matching array, in order.
        state_dict = OrderedDict(
            {
                k: torch.Tensor(v)
                for k, v in zip(self.state_dict().keys(), weights)
            }
        )
        self.load_state_dict(state_dict, strict=True)

Then I simply apply the DP protection before training:

        model, optimz, train_loader = privacy_engine.make_private(
            module=model,
            optimizer=optimz,
            data_loader=train_loader,
            noise_multiplier=1.0,
            max_grad_norm=1.0,
        )

         <... training ...>

The error report is as follows:

  File "/home/xeniro/prj/flower_framework/client_zone/", line 194, in fit
  File "/home/xeniro/miniconda3/envs/xflwr/lib/python3.10/site-packages/opacus/grad_sample/", line 140, in __getattr__
    raise e
  File "/home/xeniro/miniconda3/envs/xflwr/lib/python3.10/site-packages/opacus/grad_sample/", line 135, in __getattr__
    return super().__getattr__(item)
  File "/home/xeniro/miniconda3/envs/xflwr/lib/python3.10/site-packages/torch/nn/modules/", line 1185, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'GradSampleModule' object has no attribute 'set_weights'

Please let me know if there is a solution available.


I’m not familiar with the libs you are using and what privacy_engine.make_private does, but it seems the error is raised in self.model.set_weights(parameters) after manipulating the model in make_private.
Would it be possible to call set_weights before make_private? Alternatively, does the model now contain new attributes (e.g. was the original model moved to model.module)?

@ptrblck Thanks a lot for the quick response.
Yes, if I move “set_weights” before “privacy_engine.make_private”, I can move forward (I hit another issue, but it is probably unrelated to this).
But why is that? set_weights only fills in parameters. I think this needs the Opacus team’s help.
I found this post, but I am not sure it explains my problem (“Opacus does not yet support advanced computation graph manipulations (such as torch.autograd.grad())”),
as I believe the code above does not involve computation graph manipulation. Or does it?

BTW, I found that after calling “privacy_engine.make_private” some attributes are removed, for example:

  • before that, I can obtain the dataloader batch size by calling dataloader.batch_size.
  • after that, dataloader.batch_size returns None.

@alexandresablayrolle I hope you can help.

I guess model.set_weights() is supposed to manipulate internal parameters and the model returned by make_private is indeed not a plain nn.Module anymore (with all your function definitions) but is a GradSampleModule object now providing these methods. This would be the reason why your custom set_weights function is not available anymore.

Then that is a problem. Even if I get through the 1st iteration (by calling my custom set_weights function before the Opacus function), FL requires multiple rounds of training: as soon as I get the aggregated parameters from the server, I still need to call set_weights, and the failure will happen in the 2nd round.

Some more thoughts:
Why does the Opacus privacy engine need to manipulate the model? In example code for some older versions, the privacy engine only deals with the optimizer (i.e. DP-SGD).

I don’t know as I’m not familiar enough with Opacus and was only reading a bit through the code, so you would have to wait for an Opacus expert to chime in. :wink:

@Leonmac The problem here is that privacy_engine.make_private wraps your model object with GradSampleModule(model). The latter is an instance of nn.Module which can do forward/backward passes. The difference from the original model is that 1) it computes per-sample gradients (this is key for dp-sgd) 2) it doesn’t inherit the custom methods you implemented in the original module.

The best way is to load the weights before making the model private. If you set the weights before calling make_private, it will work.

You can actually access your module via the private field, like self.model._module.set_weights(..). This is dangerous, as it may break privacy accounting and DP-SGD itself, but it may help if you understand what you are doing.
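
To illustrate why the custom method disappears after wrapping, here is a plain-Python analogy. TinyModel and Wrapper are hypothetical stand-ins, not Opacus or PyTorch code; the only assumption is that the wrapper keeps the original object in a `_module` field and forwards calls to it, as described above.

```python
class TinyModel:
    """Hypothetical stand-in for the user's model with a custom method."""

    def __init__(self):
        self.weights = [0.0, 0.0]

    def __call__(self, x):
        # "Forward pass": a weighted sum, just so there is something to call.
        return sum(w * x for w in self.weights)

    def set_weights(self, weights):
        self.weights = list(weights)


class Wrapper:
    """Hypothetical stand-in for a wrapper that keeps the model in `_module`."""

    def __init__(self, module):
        self._module = module

    def __call__(self, x):
        # Only the forward call is delegated; custom methods are not.
        return self._module(x)


model = TinyModel()
model.set_weights([1.0, 2.0])  # before wrapping: the method is available
wrapped = Wrapper(model)

try:
    wrapped.set_weights([3.0, 4.0])  # the wrapper class has no such method
    raised = False
except AttributeError:
    raised = True

# Reaching through the private field works, with the caveats mentioned above.
wrapped._module.set_weights([3.0, 4.0])
result = wrapped(2.0)
```

The wrapper still runs the forward pass, but anything defined only on the original class has to be reached through the wrapped object or called before wrapping.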

@pstock maybe you know what’s the best way to implement FL with Opacus?

@Peter_Romov Thanks a lot for the explanation.
I changed my design: I moved set_weights() out of the model class and now wrap the model outside of the training loop. That solved the problem.
I still have some other questions regarding Opacus:

  1. The dataloader returned by “privacy_engine.make_private” no longer has a working batch_size attribute (dataloader.batch_size returns None). This is not a big deal and a workaround is available, but is this the correct behaviour?
    Does make_private modify the DataLoader as well (add noise?)?

  2. FL training is different from local centralized training (the training happens on the client side, and model aggregation happens on the server side). In this case, do I NEED to call make_private each time I get the updated aggregated model from the server?

  3. In my test, I see such log many times:

07/28/2022 14:41:49:INFO:Despite set_to_none is set to False, opacus will set p.grad_sample and p.summed_grad to None due to non-trivial gradient accumulation behaviour

I found that this is related to DPOptimizer.zero_grad(set_to_none=False).
I am not sure what this message means. Does it matter? And can I simply disable these INFO log messages?

Hi @Leonmac,
Apologies for the delay in response. Since this question was marked as resolved, this escaped our radar.

  1. The dataloader returned by “privacy_engine.make_private” no longer has a working batch_size attribute (dataloader.batch_size returns None). This is not a big deal and a workaround is available, but is this the correct behaviour?
    Does make_private modify the DataLoader as well (add noise?)?

make_private does modify the DataLoader and returns a DPDataLoader (similar to modifying Module → GradSampleModule). The main change is to the batch sampler, which enables Poisson sampling and therefore leads to a variable batch_size. More on this can be found in the documentation: Opacus · Train PyTorch models with Differential Privacy
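
As a rough illustration of why Poisson sampling gives a variable batch size, here is a plain-Python sketch. It is not the DPDataLoader implementation; the function name poisson_batches and the numbers are made up for the example. Each sample is included in a batch independently with probability sample_rate, so only the expected batch size is fixed.

```python
import random


def poisson_batches(dataset_size, sample_rate, num_batches, seed=0):
    """Draw batches of indices; each index joins a batch with prob. sample_rate."""
    rng = random.Random(seed)
    batches = []
    for _ in range(num_batches):
        batch = [i for i in range(dataset_size) if rng.random() < sample_rate]
        batches.append(batch)
    return batches


# Expected batch size is dataset_size * sample_rate = 64, but actual sizes vary.
batches = poisson_batches(dataset_size=1000, sample_rate=64 / 1000, num_batches=200)
sizes = [len(b) for b in batches]
mean_size = sum(sizes) / len(sizes)
print(min(sizes), max(sizes), mean_size)
```

Since no single number describes every batch, it is plausible that a fixed batch_size attribute is no longer meaningful on the returned loader.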

do I NEED to call the make_private each time when I get the updated aggregated model from server side?

I am not familiar with Flower, but I suspect you might need to make some tweaks. Opacus’s make_private creates a model that takes care of per-sample gradient computation and noise addition during SGD. In the FL setting, you don’t need per-sample gradients to preserve user-level privacy (as discussed in [1710.06963] Learning Differentially Private Recurrent Language Models). Rather, each client update is itself a gradient, and you can simply aggregate them all, add noise, and update the central model.
A sample implementation of this can be found at FLSim/ at main · facebookresearch/FLSim · GitHub. Note that this doesn’t use opacus to add noise, or to wrap the model, but still achieves DP.
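
A minimal sketch of that server-side idea in plain Python, to make the steps concrete. This is not FLSim or Opacus code; clip_update and dp_aggregate are hypothetical names. It also assumes each client update is clipped to a maximum L2 norm before aggregation, so that a single client's influence is bounded and the noise scale is meaningful.

```python
import math
import random


def clip_update(update, max_norm):
    """Scale a client update down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [v * scale for v in update]


def dp_aggregate(client_updates, max_norm, noise_multiplier, rng):
    """Clip each update, sum them, add Gaussian noise, and average."""
    clipped = [clip_update(u, max_norm) for u in client_updates]
    dim = len(clipped[0])
    summed = [sum(u[j] for u in clipped) for j in range(dim)]
    std = noise_multiplier * max_norm  # noise scaled to one client's max influence
    noisy = [s + rng.gauss(0.0, std) for s in summed]
    return [v / len(clipped) for v in noisy]


rng = random.Random(0)
client_updates = [[3.0, 4.0], [0.3, 0.4], [-1.0, 0.0]]
noisy_mean = dp_aggregate(client_updates, max_norm=1.0, noise_multiplier=1.0, rng=rng)
print(noisy_mean)
```

The server would then apply noisy_mean to the central model. The privacy accounting over rounds is a separate step and is not shown here.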

In my test, I see such log many times

As discussed in the previous answer, Opacus needs to compute per-sample gradients to clip them before adding noise. Internally this is done via an additional .grad_sample attribute, which is then used to update .grad. With regular PyTorch, set_to_none=False zeroes .grad instead of setting it to None (this is true even in Opacus), BUT Opacus always has to clear .grad_sample. Not doing so would interfere with gradient accumulation.
Unless your implementation depends on grad not being None, you can ignore this warning.
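
If the repeated INFO lines are just noise for you, one way to hide them is through the standard logging module. The sketch below assumes the message is emitted by a logger somewhere under the "opacus" name (the child name "opacus.optimizers.optimizer" is a guess used only to show the effect); check the logger name in your installed version before relying on it.

```python
import logging

# Application-wide default: show INFO and above.
logging.basicConfig(level=logging.INFO)

# Assumption: Opacus loggers live under the "opacus" hierarchy.
# Raising the parent level hides INFO from children without their own level.
logging.getLogger("opacus").setLevel(logging.WARNING)

# Hypothetical child logger name, used here only to demonstrate the effect.
optimizer_logger = logging.getLogger("opacus.optimizers.optimizer")
info_enabled = optimizer_logger.isEnabledFor(logging.INFO)
warning_enabled = optimizer_logger.isEnabledFor(logging.WARNING)
print(info_enabled, warning_enabled)
```

Warnings and errors from the same loggers still come through, so real problems stay visible.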


@karthikprasad Great, thanks so much. I still have some other questions, but it may be better to put them in a separate post.