Why do my ReLU activation functions have requires_grad = True? What happens if I set it to False?
My model:
ConvAE(
  (encoder): Sequential(
    (0): Conv1d(1, 32, kernel_size=(220,), stride=(2,))
    (1): ReLU()
    (2): Conv1d(32, 21, kernel_size=(88,), stride=(2,))
    (3): ReLU()
    (4): Conv1d(21, 11, kernel_size=(35,), stride=(2,))
    (5): ReLU()
    (6): Conv1d(11, 1, kernel_size=(14,), stride=(2,))
    (7): ReLU()
  )
  (decoder): Sequential(
    (0): ConvTranspose1d(1, 11, kernel_size=(14,), stride=(2,))
    (1): ReLU()
    (2): ConvTranspose1d(11, 21, kernel_size=(35,), stride=(2,))
    (3): ReLU()
    (4): ConvTranspose1d(21, 32, kernel_size=(88,), stride=(2,), output_padding=(1,))
    (5): ReLU()
    (6): ConvTranspose1d(32, 1, kernel_size=(220,), stride=(2,), output_padding=(1,))
  )
)
When I run the code:
for param in model.encoder.parameters():
    print(param.shape)
    print(param.requires_grad)
the output is:
torch.Size([32, 1, 220])
True
torch.Size([32])
True
torch.Size([21, 32, 88])
True
torch.Size([21])
True
torch.Size([11, 21, 35])
True
torch.Size([11])
True
torch.Size([1, 11, 14])
True
torch.Size([1])
True
I guess the torch.Size(["only one value"]) entries are my ReLU activation functions for the filters. But why do they have requires_grad=True? And what happens if I set requires_grad=False for them? Will training proceed the same way, or will this somehow break the gradient chain?
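To make it easier to see which module each of those tensors belongs to, here is a minimal sketch that rebuilds just the first two encoder layers (same shapes as in the model above) and prints the parameter names via named_parameters():

```python
import torch.nn as nn

# Rebuild only the first encoder conv + ReLU (shapes from the model above)
encoder = nn.Sequential(
    nn.Conv1d(1, 32, kernel_size=220, stride=2),
    nn.ReLU(),
)

# named_parameters() labels each tensor with the index of the module it
# belongs to, e.g. "0.weight" / "0.bias" for the Conv1d at index 0
for name, param in encoder.named_parameters():
    print(name, tuple(param.shape), param.requires_grad)
# 0.weight (32, 1, 220) True
# 0.bias (32,) True
```

Note that the ReLU at index 1 contributes no entries at all, since it has no learnable parameters.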
The idea is that I want to train single conv layers/kernels while keeping all the other layers fixed.
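A sketch of what I mean (the layer index 2 is just an example, and the encoder here is a cut-down stand-in for my model): freeze every parameter, then unfreeze only the one conv layer I want to train.

```python
import torch.nn as nn

# Cut-down stand-in for the encoder above
encoder = nn.Sequential(
    nn.Conv1d(1, 32, kernel_size=220, stride=2),
    nn.ReLU(),
    nn.Conv1d(32, 21, kernel_size=88, stride=2),
    nn.ReLU(),
)

# Freeze all parameters first...
for param in encoder.parameters():
    param.requires_grad = False

# ...then unfreeze only the conv layer at index 2
for param in encoder[2].parameters():
    param.requires_grad = True

print([n for n, p in encoder.named_parameters() if p.requires_grad])
# ['2.weight', '2.bias']
```

Is this the right way to do it, or does freezing the other layers interfere with backpropagating through them to the layer I want to train?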