I tried the tutorial notebook on Text Classification, and it works well. However, I don’t understand why training fails if I don’t freeze any layers. More specifically:
/usr/local/lib/python3.7/dist-packages/opacus/optimizers/optimizer.py in clip_and_accumulate(self)
397 g.view(len(g), -1).norm(2, dim=-1) for g in self.grad_samples
398 ]
--> 399 per_sample_norms = torch.stack(per_param_norms, dim=1).norm(2, dim=1)
400 per_sample_clip_factor = (self.max_grad_norm / (per_sample_norms + 1e-6)).clamp(
401 max=1.0
RuntimeError: stack expects each tensor to be equal size, but got [8] at entry 0 and [1] at entry
Any ideas? Thanks.
Hello @long21wt
Thank you for reporting this. This is likely a bug in our tutorial. Would you mind reproducing the error in our template Colab and posting the link here, along with the full stack trace?
Please paste your colab link here. Remember: SET IT TO PUBLIC
Thank you. Here is the link:
As far as I know, you would need to modify the forward method of BERT (see lxuechen/private-transformers: make differentially private training of transformers easy (github.com)).
RoBERTa, on the other hand, works out of the box with Opacus in other experiments.
Thanks for creating this. We are looking into this!
After a while, I’m back to this issue. I printed the per-sample gradient shape of each of the model’s parameters:
for n, p in model.named_parameters():
    print("{:50s} {}".format(n, list(p.grad_sample.shape) if hasattr(p, "grad_sample") else None))
I found that position_embeddings is what causes the problem in the optimizer: its grad_sample has a leading dimension of 1 instead of the batch size. Do you have any idea how to fix this?
_module.bert.embeddings.word_embeddings.weight [7, 28996, 768]
_module.bert.embeddings.position_embeddings.weight [1, 512, 768]
_module.bert.embeddings.token_type_embeddings.weight [7, 2, 768]
_module.bert.embeddings.LayerNorm.weight [7, 768]
_module.bert.embeddings.LayerNorm.bias [7, 768]
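The printout lines up with the traceback: every grad_sample has a leading dimension equal to the batch size except the one for position_embeddings, so the per-parameter norms cannot be stacked. Below is a minimal sketch of that failure using made-up small tensors in place of the real p.grad_sample values (the names, the batch size of 7, and the shrunken shapes are illustrative assumptions). The helper find_mismatched is hypothetical and not part of Opacus.

```python
import torch

# Hypothetical stand-ins for p.grad_sample. The leading dimensions mimic the
# printout above (batch size 7); the other dimensions are shrunk for speed.
batch_size = 7
grad_samples = {
    "word_embeddings.weight": torch.randn(batch_size, 10, 4),
    "position_embeddings.weight": torch.randn(1, 5, 4),  # leading dim is 1, not 7
    "LayerNorm.weight": torch.randn(batch_size, 4),
}


def find_mismatched(grad_samples, batch_size):
    """Return the names whose grad_sample lacks a leading batch dimension."""
    return [n for n, g in grad_samples.items() if g.shape[0] != batch_size]


def per_sample_norms(grad_samples):
    """Same computation as the optimizer lines shown in the traceback."""
    per_param_norms = [
        g.view(len(g), -1).norm(2, dim=-1) for g in grad_samples.values()
    ]
    return torch.stack(per_param_norms, dim=1).norm(2, dim=1)


mismatched = find_mismatched(grad_samples, batch_size)
print(mismatched)

try:
    per_sample_norms(grad_samples)
    stack_failed = False
except RuntimeError as err:
    # torch.stack refuses to combine a [7] tensor with a [1] tensor
    stack_failed = True
    print(err)
```

On a real model, the same check can be run over model.named_parameters() after a backward pass, reading p.grad_sample as in the loop above. Since the tutorial works when layers are frozen, one possible workaround is to set requires_grad = False on the offending parameter before wrapping the model; that is an assumption and has not been confirmed in this thread.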
I will try to take a look. In the meantime, I believe that functorch can alleviate the issue because it computes per-sample gradients in a different way (using the “no_op” version of the grad sample module, see e.g. https://github.com/pytorch/opacus/blob/main/examples/cifar10.py)
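To illustrate what computing per-sample gradients "in a different way" can look like, here is a minimal sketch using torch.func (the successor to functorch in PyTorch 2.x) on a toy model. It is not the Opacus integration itself and not the code from the linked example. TinyModel, its sizes, and the way it looks up positions (ids of shape [1, seq_len] shared across the batch, which is assumed to resemble what BERT does) are all illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call, grad, vmap


class TinyModel(nn.Module):
    """Toy classifier with word and position embeddings (hypothetical)."""

    def __init__(self, vocab=20, max_len=6, dim=4, classes=2):
        super().__init__()
        self.word = nn.Embedding(vocab, dim)
        self.pos = nn.Embedding(max_len, dim)
        self.out = nn.Linear(dim, classes)

    def forward(self, input_ids):
        seq_len = input_ids.shape[1]
        # Position ids have shape [1, seq_len] and are shared by all samples.
        position_ids = torch.arange(seq_len).unsqueeze(0)
        h = self.word(input_ids) + self.pos(position_ids)
        return self.out(h.mean(dim=1))


torch.manual_seed(0)
model = TinyModel()
batch_size, seq_len = 7, 6
input_ids = torch.randint(0, 20, (batch_size, seq_len))
labels = torch.randint(0, 2, (batch_size,))

# Detached copies of the parameters, passed explicitly to the functional call.
params = {name: p.detach() for name, p in model.named_parameters()}


def sample_loss(params, ids, label):
    """Loss for ONE sample; a batch dimension of size 1 is added back."""
    logits = functional_call(model, params, (ids.unsqueeze(0),))
    return F.cross_entropy(logits, label.unsqueeze(0))


# grad differentiates w.r.t. params; vmap maps over the samples (dim 0 of
# input_ids and labels) while sharing params across them.
per_sample_grads = vmap(grad(sample_loss), in_dims=(None, 0, 0))(
    params, input_ids, labels
)

for name, g in per_sample_grads.items():
    print("{:15s} {}".format(name, list(g.shape)))
```

Because each sample is differentiated separately, every gradient in this sketch, including the one for the position embeddings, gets a leading dimension equal to the batch size, which is the shape the clipping step expects.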