from torch.nn import Module

class modelA(Module):
    def __init__(self):
        super().__init__()
        self.base = ...    # feature extractor (details elided)
        self.headA = ...   # task-A head (details elided)

    def forward(self, input):
        x = self.base(input)
        outA = self.headA(x)
        return outA
class modelB(Module):
    def __init__(self):
        super().__init__()
        self.base = ...    # same base architecture; its weights should be shared with modelA
        self.headB = ...   # task-B head (details elided)

    def forward(self, input):
        x = self.base(input)
        outB = self.headB(x)
        return outB
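Note that, defined this way, `modelA` and `modelB` each build their own `base`, so the base weights are not shared automatically. A quick check with hypothetical `nn.Linear` placeholders (not my actual layers) illustrates this:

import torch.nn as nn

base_a = nn.Linear(10, 20)   # stands in for modelA's base
base_b = nn.Linear(10, 20)   # stands in for modelB's base
print(base_a.weight is base_b.weight)  # False: two separate parameter tensors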
These two models are supposed to share the weights of the `base` layer, so following the suggestion here by @apaszke I defined a combined model as follows:
class model_combined(Module):
    def __init__(self):
        super().__init__()
        self.base = ...
        self.headA = ...
        self.headB = ...

    def forward(self, input):
        x = self.base(input)   # this pass needs to be frozen (no update to base from loss(outA))
        outA = self.headA(x)
        y = self.base(input)   # this pass needs to backpropagate (base updated by loss(outB))
        outB = self.headB(y)
        return outA, outB
In the training loop I compute the loss as

total_loss = loss(outA) + loss(outB)
total_loss.backward()
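For reference, the surrounding training loop looks roughly like the sketch below (the loss functions, optimizer, and data loader here are hypothetical placeholders, not my actual code):

import torch
import torch.nn as nn

model = model_combined()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterionA = nn.CrossEntropyLoss()  # placeholder loss for headA
criterionB = nn.CrossEntropyLoss()  # placeholder loss for headB

for input, targetA, targetB in dataloader:  # placeholder dataloader
    optimizer.zero_grad()
    outA, outB = model(input)
    total_loss = criterionA(outA, targetA) + criterionB(outB, targetB)
    total_loss.backward()  # gradients from both heads flow into base
    optimizer.step()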
But after some epochs I need to freeze `modelA` (i.e. `self.headA` in `model_combined`).
My questions are:
- Since `base` is the common layer here, how can I freeze `headA` (the weights of `headA` should not change) in such a way that `base` is still updated by the backpropagation of `loss(outB)`, but not by `loss(outA)`?
- Would it be bad practice if, instead of freezing, I simply did not backpropagate the loss through `headA`, i.e. changed the loss to `total_loss = loss(outB)`?
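For the first question, what I had in mind is roughly the sketch below: turn off `requires_grad` for `headA`'s parameters and detach the base output before `headA`, so that `loss(outA)` contributes no gradient to `base`, while the second pass through `base` (feeding `headB`) still backpropagates. I am not sure whether this is the recommended approach:

# freeze headA: the optimizer will no longer update its weights
for p in model.headA.parameters():
    p.requires_grad = False

# a modified forward for model_combined along these lines
def forward(self, input):
    x = self.base(input).detach()  # block gradients from loss(outA) into base
    outA = self.headA(x)
    y = self.base(input)           # loss(outB) still backpropagates into base
    outB = self.headB(y)
    return outA, outB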