Hello,
I have a large neural network with about 100 layers. It is built on a network pre-trained for object detection, YOLO, to which I added some extra layers for an additional task. I would like to freeze the original YOLO weights and train only the extra layers' parameters. Based on some discussion topics here, I wrote the code below:
def release_weight(yolo):
    for param in yolo.parameters():
        param.requires_grad = True
    return yolo

def freeze_weight(yolo):
    frozen_modules = [yolo.stage1, yolo.stage4, yolo.stage5, yolo.stage5_1,
                      yolo.parallel1, yolo.parallel2, yolo.stage7, yolo.final,
                      yolo.d]
    for module in frozen_modules:
        for param in module.parameters():
            param.requires_grad = False
    return yolo
yolo = YoloV2(model_path, 425)
yolo = release_weight(yolo)
yolo = freeze_weight(yolo)
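For reference, here is a minimal, self-contained version of what I am doing. TinyYolo and its layer names are just a toy stand-in for my real YoloV2 class, and freeze_by_prefix is an illustrative helper that freezes parameters by name prefix instead of by module; on this small model the freezing itself behaves as I expect:

```python
import torch.nn as nn

# Toy stand-in for the real YoloV2 (class and layer names are illustrative only).
class TinyYolo(nn.Module):
    def __init__(self):
        super().__init__()
        self.stage1 = nn.Conv2d(3, 8, 3)   # "pre-trained" part, to be frozen
        self.final = nn.Conv2d(8, 5, 1)    # "pre-trained" part, to be frozen
        self.extra = nn.Linear(5, 2)       # newly added head, to be trained

def freeze_by_prefix(model, prefixes):
    # Freeze every parameter whose dotted name starts with one of the prefixes.
    for name, param in model.named_parameters():
        if name.startswith(tuple(prefixes)):
            param.requires_grad = False
    return model

yolo_toy = TinyYolo()
freeze_by_prefix(yolo_toy, ("stage1", "final"))
trainable = [n for n, p in yolo_toy.named_parameters() if p.requires_grad]
print(trainable)  # → ['extra.weight', 'extra.bias']
```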
Then I selected the parameters with the requires_grad = True flag. Here is my code:
parameters = itertools.filterfalse(lambda p: not p.requires_grad, yolo.parameters())
optimizer = optim.SGD(parameters, lr=baseLR, weight_decay=0.001, momentum=0.9)
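(The filterfalse call with its double negation should be equivalent to a plain filter on requires_grad. Here is a small self-contained check I did on a toy two-layer model, hypothetical names, confirming that only the unfrozen tensors end up registered with the optimizer:)

```python
import torch.nn as nn
import torch.optim as optim

# Toy stand-in model (hypothetical): two layers, the first one frozen.
model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))
for p in model[0].parameters():
    p.requires_grad = False

# Same selection as itertools.filterfalse(lambda p: not p.requires_grad, ...)
# but without the double negation:
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = optim.SGD(trainable, lr=0.01, weight_decay=0.001, momentum=0.9)

# Only the second Linear's weight and bias are in the optimizer's param groups.
n_opt = sum(len(g["params"]) for g in optimizer.param_groups)
print(n_opt)  # → 2
```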
I then checked the names of the parameters selected by the code above, and the code below assured me that the selection was correct:
print("Names of training parameters")
for name, param in yolo.named_parameters():
    if param.requires_grad:
        print(name)
But when I started training, I somehow found that the pre-trained weights were changing: after running some epochs, I compared the outputs of the frozen parts and they were different.
So could you please help me find the source of the problem? I do explicitly determine which parameters should be trained and which should not!
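One thing I am not sure how to rule out: did the frozen weights themselves change, or only the frozen layers' outputs? As far as I understand (this is an assumption on my part), BatchNorm layers keep updating their running_mean / running_var buffers in train() mode even when all of their parameters have requires_grad = False, so outputs can drift while the weights stay identical. Here is a minimal sketch of the check I would run, on a toy model:

```python
import torch
import torch.nn as nn

# Diagnostic sketch: distinguish "frozen weights changed" from "frozen layers'
# outputs changed". Toy model: a frozen BatchNorm followed by a Linear layer.
model = nn.Sequential(nn.BatchNorm1d(4), nn.Linear(4, 2))
for p in model[0].parameters():
    p.requires_grad = False

# Snapshot all parameters and all buffers before any forward pass.
before_params = {n: t.clone() for n, t in model.named_parameters()}
before_bufs = {n: t.clone() for n, t in model.named_buffers()}

model.train()
model(torch.randn(8, 4))  # one forward pass in train mode, no backward at all

params_changed = [n for n, t in model.named_parameters()
                  if not torch.equal(t, before_params[n])]
bufs_changed = [n for n, t in model.named_buffers()
                if not torch.equal(t, before_bufs[n])]
print(params_changed)  # → []
print(bufs_changed)    # the BatchNorm running statistics have changed
```

If my real network shows the same pattern (empty params_changed, non-empty bufs_changed), the drift would come from BatchNorm statistics rather than from the optimizer touching frozen weights.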