Hi,
I am trying to implement the network structure in this paper https://arxiv.org/abs/1605.09673 with PyTorch and have run into some problems.
How can I replace the weights of the backbone network with the output of another subnetwork? More importantly, how can I propagate the gradient with respect to those weights back to the output of the subnetwork?
This is what I have so far:
class MyNet(nn.Module):
    def __init__(self):
        # some other stuff
        self.backbone = nn.Conv2d(
            # some parameters
        )
        self.weight_generator = nn.Sequential(
            nn.Linear(4096,576),
            nn.Tanh()
        )

    def forward(self, x):
        # some other stuff
        weight = self.weight_generator(x).view(50,64,3,3)
        self.backbone.weight.data = nn.Parameter(weight, requires_grad=True)
        out = self.backbone(x)
        return out
The current problem is that print(list(myNetwork.weight_generator.children())[0].weight.grad) always outputs None. Can anyone help me with this?
Thanks in advance for any help!
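
A note on one possible direction, offered as a sketch rather than a definitive fix: assigning to .data, and wrapping the tensor in a new nn.Parameter, detaches the generated weight from the autograd graph, so no gradient can reach weight_generator. Passing the generated tensor directly to torch.nn.functional.conv2d keeps it in the graph. In the sketch below, the class name HyperConv, the layer sizes, the separate conditioning input z, and the input shapes are all illustrative assumptions and are not taken from the paper or from the code above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class HyperConv(nn.Module):
    """Hypothetical module: a conv layer whose kernel is produced by a small generator."""

    def __init__(self, embed_dim=16, out_channels=4, in_channels=3, kernel_size=3):
        super().__init__()
        # Shape of the convolution kernel the generator has to produce (assumed sizes).
        self.weight_shape = (out_channels, in_channels, kernel_size, kernel_size)
        n_weights = out_channels * in_channels * kernel_size * kernel_size
        self.weight_generator = nn.Sequential(
            nn.Linear(embed_dim, n_weights),
            nn.Tanh(),
        )

    def forward(self, x, z):
        # z: (embed_dim,) conditioning vector; x: (N, in_channels, H, W).
        weight = self.weight_generator(z).view(self.weight_shape)
        # The functional call uses `weight` as an ordinary tensor, so it stays
        # in the autograd graph and gradients flow back into weight_generator.
        return F.conv2d(x, weight, padding=1)


torch.manual_seed(0)
net = HyperConv()
x = torch.randn(2, 3, 8, 8)   # dummy image batch
z = torch.randn(16)           # dummy conditioning vector
out = net(x, z)
out.sum().backward()

# The generator's first Linear layer now receives a gradient.
grad = net.weight_generator[0].weight.grad
print(grad is not None)
```

With this layout there is no nn.Conv2d module holding its own weight; the only trainable parameters are those of weight_generator, which seems to match the idea of a subnetwork producing the backbone's weights.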