Hi, I’m trying to implement a network in which the weights of a layer are computed as the result of a tensor operation. This is the code I have:
```python
import torch
import torch.nn as nn

class NOWANet(nn.Module):
    def __init__(self, V, Wstacked):
        super(NOWANet, self).__init__()
        # first conv layer
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        # define tensors that will be used to calculate weights and biases
        self.conv1_W_V = nn.Parameter(V, requires_grad=True)
        self.conv1_B_V = nn.Parameter(V, requires_grad=True)
        self.conv1_W_stack = nn.Parameter(Wstacked, requires_grad=False)
        self.conv1_B_stack = nn.Parameter(Wstacked, requires_grad=False)
        # set layer weights and bias from the tensordot results
        self.conv1.weight = nn.Parameter(
            torch.tensordot(self.conv1_W_V, self.conv1_W_stack, dims=1),
            requires_grad=True)
        self.conv1.bias = nn.Parameter(
            torch.tensordot(self.conv1_B_V, self.conv1_B_stack, dims=1),
            requires_grad=True)
```
The idea is to perform a tensordot between the V and W_stack tensors and use the results as the layer’s weights and bias (one tensordot each). The catch is that I only want to optimize V; W_stack should remain intact. As currently written, the code initializes the weights correctly, but backprop optimizes the layer’s actual weight tensor directly, and V never changes.
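To make the disconnect concrete, here is a minimal check (the shapes are made up for illustration: V as a length-k vector and Wstacked as a stack of k tensors matching conv1’s weight shape). Wrapping the tensordot result in a fresh nn.Parameter produces a new leaf tensor with no grad_fn, so autograd has no path from the layer’s weight back to V:

```python
import torch

k = 3
V = torch.randn(k)
Wstacked = torch.randn(k, 6, 1, 5, 5)  # k stacked (6, 1, 5, 5) weight tensors

net = NOWANet(V, Wstacked)

# conv1.weight is a brand-new leaf Parameter: it has no grad_fn,
# so autograd cannot trace it back to conv1_W_V.
print(net.conv1.weight.grad_fn)  # None
print(net.conv1.weight.is_leaf)  # True

# The optimizer sees conv1.weight and conv1.bias as independent
# parameters alongside conv1_W_V, which is why training moves the
# layer weights but leaves V untouched.
print([name for name, p in net.named_parameters() if p.requires_grad])
```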
Any ideas or suggestions on how to do it?
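In case it helps clarify what I’m after, here is a rough sketch of the behavior I want: V stays the only trainable tensor, and the tensordot is re-evaluated inside forward so it remains part of the autograd graph. The sketch makes some assumptions not in my code above: a separate bias stack Bstacked of shape (k, 6), since the bias needs shape (6,); register_buffer for the fixed stacks instead of non-trainable Parameters; and torch.nn.functional.conv2d to apply the derived weights. I’m not sure this is the idiomatic pattern.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NOWANet(nn.Module):
    def __init__(self, V, Wstacked, Bstacked):
        super(NOWANet, self).__init__()
        # only the V vectors are trainable (cloned so weight and bias
        # coefficients do not share storage)
        self.conv1_W_V = nn.Parameter(V.clone(), requires_grad=True)
        self.conv1_B_V = nn.Parameter(V.clone(), requires_grad=True)
        # the fixed stacks are buffers: saved with the model, never optimized
        self.register_buffer('conv1_W_stack', Wstacked)
        self.register_buffer('conv1_B_stack', Bstacked)

    def forward(self, x):
        # recompute weights and bias from V on every forward pass,
        # so the tensordot stays in the autograd graph and gradients
        # flow back into conv1_W_V / conv1_B_V
        weight = torch.tensordot(self.conv1_W_V, self.conv1_W_stack, dims=1)
        bias = torch.tensordot(self.conv1_B_V, self.conv1_B_stack, dims=1)
        return F.conv2d(x, weight, bias)
```

With this setup the optimizer would only ever see conv1_W_V and conv1_B_V, which is the behavior I’m looking for.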