The parameters of my deep learning model are listed below, with their shapes to the right. I want the learning rate of the rho parameters in each layer to be 0.01 initially, and that of all other parameters to be 0.001 initially. How can I do that? I have looked at other forum threads, but most only explain how to set a different initial learning rate for specific layers.
Here are the optimizer and scheduler I'm using:
import torch.optim as optim
from torch.optim.lr_scheduler import ReduceLROnPlateau
optimizer = optim.Adam(model.parameters(), lr=0.001, weight_decay=0.0001)
scheduler = ReduceLROnPlateau(optimizer, mode='min', factor=0.1, patience=5)
ResNet_arch.conv0.weight torch.Size([16, 3, 3, 3])
ResNet_arch.norm0.rho torch.Size([1, 16, 1, 1])
ResNet_arch.norm0.weight torch.Size([1, 16, 1, 1])
ResNet_arch.norm0.bias torch.Size([1, 16, 1, 1])
ResNet_arch.block1.s_layers.conv1.weight torch.Size([16, 16, 3, 3])
ResNet_arch.block1.s_layers.norm1.rho torch.Size([1, 16, 1, 1])
ResNet_arch.block1.s_layers.norm1.weight torch.Size([1, 16, 1, 1])
ResNet_arch.block1.s_layers.norm1.bias torch.Size([1, 16, 1, 1])
ResNet_arch.block1.s_layers.conv2.weight torch.Size([16, 16, 3, 3])
ResNet_arch.block1.s_layers.norm2.rho torch.Size([1, 16, 1, 1])
ResNet_arch.block1.s_layers.norm2.weight torch.Size([1, 16, 1, 1])
ResNet_arch.block1.s_layers.norm2.bias torch.Size([1, 16, 1, 1])
ResNet_arch.block2.s_layers.conv1.weight torch.Size([16, 16, 3, 3])
ResNet_arch.block2.s_layers.norm1.rho torch.Size([1, 16, 1, 1])
ResNet_arch.block2.s_layers.norm1.weight torch.Size([1, 16, 1, 1])
ResNet_arch.block2.s_layers.norm1.bias torch.Size([1, 16, 1, 1])
ResNet_arch.block2.s_layers.conv2.weight torch.Size([16, 16, 3, 3])
ResNet_arch.block2.s_layers.norm2.rho torch.Size([1, 16, 1, 1])
ResNet_arch.block2.s_layers.norm2.weight torch.Size([1, 16, 1, 1])
ResNet_arch.block2.s_layers.norm2.bias torch.Size([1, 16, 1, 1])
ResNet_arch.block3.d_layers.conv1.weight torch.Size([32, 16, 3, 3])
ResNet_arch.block3.d_layers.norm1.rho torch.Size([1, 32, 1, 1])
ResNet_arch.block3.d_layers.norm1.weight torch.Size([1, 32, 1, 1])
ResNet_arch.block3.d_layers.norm1.bias torch.Size([1, 32, 1, 1])
ResNet_arch.block3.d_layers.conv2.weight torch.Size([32, 32, 3, 3])
ResNet_arch.block3.d_layers.norm2.rho torch.Size([1, 32, 1, 1])
ResNet_arch.block3.d_layers.norm2.weight torch.Size([1, 32, 1, 1])
ResNet_arch.block3.d_layers.norm2.bias torch.Size([1, 32, 1, 1])
ResNet_arch.block3.d_downsample.convd.weight torch.Size([32, 16, 1, 1])
ResNet_arch.block3.d_downsample.normd.rho torch.Size([1, 32, 1, 1])
ResNet_arch.block3.d_downsample.normd.weight torch.Size([1, 32, 1, 1])
ResNet_arch.block3.d_downsample.normd.bias torch.Size([1, 32, 1, 1])
ResNet_arch.block4.s_layers.conv1.weight torch.Size([32, 32, 3, 3])
ResNet_arch.block4.s_layers.norm1.rho torch.Size([1, 32, 1, 1])
ResNet_arch.block4.s_layers.norm1.weight torch.Size([1, 32, 1, 1])
ResNet_arch.block4.s_layers.norm1.bias torch.Size([1, 32, 1, 1])
ResNet_arch.block4.s_layers.conv2.weight torch.Size([32, 32, 3, 3])
ResNet_arch.block4.s_layers.norm2.rho torch.Size([1, 32, 1, 1])
ResNet_arch.block4.s_layers.norm2.weight torch.Size([1, 32, 1, 1])
ResNet_arch.block4.s_layers.norm2.bias torch.Size([1, 32, 1, 1])
ResNet_arch.block5.d_layers.conv1.weight torch.Size([64, 32, 3, 3])
ResNet_arch.block5.d_layers.norm1.rho torch.Size([1, 64, 1, 1])
ResNet_arch.block5.d_layers.norm1.weight torch.Size([1, 64, 1, 1])
ResNet_arch.block5.d_layers.norm1.bias torch.Size([1, 64, 1, 1])
ResNet_arch.block5.d_layers.conv2.weight torch.Size([64, 64, 3, 3])
ResNet_arch.block5.d_layers.norm2.rho torch.Size([1, 64, 1, 1])
ResNet_arch.block5.d_layers.norm2.weight torch.Size([1, 64, 1, 1])
ResNet_arch.block5.d_layers.norm2.bias torch.Size([1, 64, 1, 1])
ResNet_arch.block5.d_downsample.convd.weight torch.Size([64, 32, 1, 1])
ResNet_arch.block5.d_downsample.normd.rho torch.Size([1, 64, 1, 1])
ResNet_arch.block5.d_downsample.normd.weight torch.Size([1, 64, 1, 1])
ResNet_arch.block5.d_downsample.normd.bias torch.Size([1, 64, 1, 1])
ResNet_arch.block6.s_layers.conv1.weight torch.Size([64, 64, 3, 3])
ResNet_arch.block6.s_layers.norm1.rho torch.Size([1, 64, 1, 1])
ResNet_arch.block6.s_layers.norm1.weight torch.Size([1, 64, 1, 1])
ResNet_arch.block6.s_layers.norm1.bias torch.Size([1, 64, 1, 1])
ResNet_arch.block6.s_layers.conv2.weight torch.Size([64, 64, 3, 3])
ResNet_arch.block6.s_layers.norm2.rho torch.Size([1, 64, 1, 1])
ResNet_arch.block6.s_layers.norm2.weight torch.Size([1, 64, 1, 1])
ResNet_arch.block6.s_layers.norm2.bias torch.Size([1, 64, 1, 1])
ResNet_arch.fc__.weight torch.Size([25, 64])
ResNet_arch.fc__.bias torch.Size([25])
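
One way this could be approached is with the optimizer's per-parameter groups: collect every parameter whose name ends in rho into one group with lr = 0.01, and leave everything else in a second group at the default lr = 0.001. The sketch below is only an illustration under assumptions. ToyNorm, ToyNet, and split_rho_params are hypothetical names standing in for the real ResNet_arch and its custom norm layer, which are not shown in the post; the only thing taken from the post is that the rho parameters can be recognised by name.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim.lr_scheduler import ReduceLROnPlateau


class ToyNorm(nn.Module):
    """Hypothetical stand-in for the custom norm layer (rho, weight, bias)."""

    def __init__(self, channels):
        super().__init__()
        self.rho = nn.Parameter(torch.ones(1, channels, 1, 1))
        self.weight = nn.Parameter(torch.ones(1, channels, 1, 1))
        self.bias = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def forward(self, x):
        # Placeholder computation; the real layer's use of rho is not known.
        return self.rho * x * self.weight + self.bias


class ToyNet(nn.Module):
    """Hypothetical miniature model: one conv followed by one norm layer."""

    def __init__(self):
        super().__init__()
        self.conv0 = nn.Conv2d(3, 16, 3, padding=1, bias=False)
        self.norm0 = ToyNorm(16)

    def forward(self, x):
        return self.norm0(self.conv0(x))


def split_rho_params(model):
    """Split parameters into (rho, everything else) by the last name component."""
    rho_params, other_params = [], []
    for name, param in model.named_parameters():
        if name.split(".")[-1] == "rho":
            rho_params.append(param)
        else:
            other_params.append(param)
    return rho_params, other_params


model = ToyNet()
rho_params, other_params = split_rho_params(model)

# Two parameter groups: the first inherits the default lr, the second overrides it.
optimizer = optim.Adam(
    [
        {"params": other_params},
        {"params": rho_params, "lr": 0.01},
    ],
    lr=0.001,
    weight_decay=0.0001,
)
scheduler = ReduceLROnPlateau(optimizer, mode="min", factor=0.1, patience=5)

# Initial learning rate of each group, in the order the groups were passed.
group_lrs = [group["lr"] for group in optimizer.param_groups]
print(group_lrs)
```

With this setup, ReduceLROnPlateau multiplies the learning rate of every group by factor when it triggers, so the rho group stays at ten times the rate of the other parameters unless a min_lr setting clamps one of them. Note that weight_decay here also applies to the rho group; it can be overridden per group in the same way as lr if that is not wanted.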