This is the key part of the `configure_optimizers()` function in PyTorch Lightning (the `params` and `train_names` assignments were truncated when posting; they are lists that collect the attention parameters for the optimizer):

```python
params = []
train_names = []
print("Training only unet attention layers")
for name, module in self.model.diffusion_model.named_modules():
    if isinstance(module, CrossAttention) and name.endswith('attn2'):
        params.extend(module.parameters())
        train_names.append(name)
        # Set requires_grad=False for all other parameters
        for param in module.parameters():
            param.requires_grad = False
opt = torch.optim.AdamW(params, lr=lr)
```
Guessing without a repro, but can you try explicitly setting `param.requires_grad = True` in your `if` condition? As written, the loop freezes exactly the `attn2` parameters you are passing to the optimizer.
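To illustrate the intended pattern on a toy model (a standalone sketch, not the actual LDM code: `nn.Sequential` and `model[1]` stand in for the UNet and its `attn2` cross-attention layers):

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the diffusion model; the real code would filter
# CrossAttention modules whose names end with 'attn2' instead of model[1].
model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))

# Freeze everything first, then turn gradients back on for the target layer
for p in model.parameters():
    p.requires_grad = False
for p in model[1].parameters():
    p.requires_grad = True

# Hand the optimizer only the trainable parameters
params = [p for p in model.parameters() if p.requires_grad]
opt = torch.optim.AdamW(params, lr=1e-3)

loss = model(torch.randn(3, 4)).sum()
loss.backward()  # gradients land only on model[1]
opt.step()
```

Freezing everything up front and then un-freezing the selected modules avoids having to enumerate "all other parameters" by hand.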
Thank you for replying.
What I want to do here is set `requires_grad = True` for only the attention parameters and pass just those to the optimizer.
It did work when I set `requires_grad = True` for all parameters.
I just found out that the error seems to be caused by the custom-defined class `CheckpointFunction(torch.autograd.Function)`,
and I made the code work by simply avoiding that class.
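For anyone hitting this later: the custom `CheckpointFunction` in the LDM repo behaves like reentrant gradient checkpointing, which runs the forward pass under `no_grad` and only marks the output as requiring grad if one of the *input tensors* does. When every layer before the attention block is frozen, no input requires grad, so `backward()` fails. A standalone sketch of the failure mode and a workaround using the stock `torch.utils.checkpoint` (the module names here are illustrative, not the repo's code):

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

frozen = nn.Linear(4, 4)     # stands in for the frozen non-attention layers
trainable = nn.Linear(4, 4)  # stands in for a trainable attention layer
for p in frozen.parameters():
    p.requires_grad = False

h = frozen(torch.randn(2, 4))  # h.requires_grad is False

# Reentrant checkpointing (what the custom CheckpointFunction emulates):
# the output does not require grad here, so calling backward() would fail.
out_reentrant = checkpoint(trainable, h, use_reentrant=True)

# The non-reentrant variant tracks parameter usage and works:
out = checkpoint(trainable, h, use_reentrant=False)
out.sum().backward()
```

So instead of deleting the checkpointing entirely, switching to non-reentrant checkpointing (or disabling `use_checkpoint` for the frozen blocks) should also unblock training.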
Hi Yuanzhi, I met the same problem when I set only the attention layers trainable. Could you clarify how you changed `CheckpointFunction(torch.autograd.Function)`?
I also encountered the same situation when debugging the code. Did you solve this problem? I look forward to your reply.