Hi, I’m trying to implement a network in which the weights of a layer are computed as the result of a tensor operation. This is the code I have:
```python
import torch
import torch.nn as nn

class NOWANet(nn.Module):
    def __init__(self, V, Wstacked):
        super(NOWANet, self).__init__()
        # first conv layer
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5)
        # define tensors that will be used to calculate weights and biases
        self.conv1_W_V = nn.Parameter(V, requires_grad=True)
        self.conv1_B_V = nn.Parameter(V, requires_grad=True)
        self.conv1_W_stack = nn.Parameter(Wstacked, requires_grad=False)
        self.conv1_B_stack = nn.Parameter(Wstacked, requires_grad=False)
        # set layer weights and bias from the tensordot results
        self.conv1.weight = nn.Parameter(
            torch.tensordot(self.conv1_W_V, self.conv1_W_stack, dims=1),
            requires_grad=True)
        self.conv1.bias = nn.Parameter(
            torch.tensordot(self.conv1_B_V, self.conv1_B_stack, dims=1),
            requires_grad=True)
```
The idea is to perform a tensordot between the V and W_stack tensors and use the results as the layer’s weights and bias (one tensordot each). The catch is that I only want to optimize V; W_stack should remain intact. As currently written, the code initializes the weights correctly, but backprop optimizes the layer’s actual weight tensor directly, and V never changes.
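To make the disconnect concrete, here is a minimal check (the shapes are made up for illustration: V as a length-k vector and Wstacked as a stack of k tensors matching conv1’s weight shape). Wrapping the tensordot result in a fresh nn.Parameter produces a new leaf tensor with no grad_fn, so autograd has no path from the layer’s weight back to V:

```python
import torch

k = 3
V = torch.randn(k)
Wstacked = torch.randn(k, 6, 1, 5, 5)  # k stacked (6, 1, 5, 5) weight tensors

net = NOWANet(V, Wstacked)

# conv1.weight is a brand-new leaf Parameter: it has no grad_fn,
# so autograd cannot trace it back to conv1_W_V.
print(net.conv1.weight.grad_fn)  # None
print(net.conv1.weight.is_leaf)  # True

# The optimizer sees conv1.weight and conv1.bias as independent
# parameters alongside conv1_W_V, which is why training moves the
# layer weights but leaves V untouched.
print([name for name, p in net.named_parameters() if p.requires_grad])
```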
Any ideas or suggestions on how to do it?
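In case it helps clarify what I’m after, here is a rough sketch of the behavior I want: V stays the only trainable tensor, and the tensordot is re-evaluated inside forward so it remains part of the autograd graph. The sketch makes some assumptions not in my code above: a separate bias stack Bstacked of shape (k, 6), since the bias needs shape (6,); register_buffer for the fixed stacks instead of non-trainable Parameters; and torch.nn.functional.conv2d to apply the derived weights. I’m not sure this is the idiomatic pattern.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NOWANet(nn.Module):
    def __init__(self, V, Wstacked, Bstacked):
        super(NOWANet, self).__init__()
        # only the V vectors are trainable (cloned so weight and bias
        # coefficients do not share storage)
        self.conv1_W_V = nn.Parameter(V.clone(), requires_grad=True)
        self.conv1_B_V = nn.Parameter(V.clone(), requires_grad=True)
        # the fixed stacks are buffers: saved with the model, never optimized
        self.register_buffer('conv1_W_stack', Wstacked)
        self.register_buffer('conv1_B_stack', Bstacked)

    def forward(self, x):
        # recompute weights and bias from V on every forward pass,
        # so the tensordot stays in the autograd graph and gradients
        # flow back into conv1_W_V / conv1_B_V
        weight = torch.tensordot(self.conv1_W_V, self.conv1_W_stack, dims=1)
        bias = torch.tensordot(self.conv1_B_V, self.conv1_B_stack, dims=1)
        return F.conv2d(x, weight, bias)
```

With this setup the optimizer would only ever see conv1_W_V and conv1_B_V, which is the behavior I’m looking for.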