We’re playing around with some custom neural nets whose forward pass includes operations such as building a matrix and multiplying it with the input, e.g.
```python
def forward(self, x):
    output = self.connection_1(x)
    output = self.activation_1(output)
    output = self.connection_2(output)
    output = self.activation_2(output)
    output = self.connection_3(output)
    output = self.activation_3(output)
    # build a lower-triangular (Cholesky) factor M from the predicted values
    M = self.create_cholesky(output)
    # create the symmetric matrix A = M M^T out of the prediction
    A = M.bmm(M.transpose(1, 2))
    return A
```
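Roughly, `create_cholesky` packs the network output into a lower-triangular factor `M`; something along these lines (a simplified stand-in, not our exact code, and the flat-prediction layout is assumed):

```python
import torch

def create_cholesky(flat, n):
    # Illustrative stand-in: pack a flat prediction of length n*(n+1)/2
    # into a batch of lower-triangular factors M of shape (batch, n, n).
    M = flat.new_zeros(flat.shape[0], n, n)
    rows, cols = torch.tril_indices(n, n)  # positions of the lower triangle
    M[:, rows, cols] = flat                # fill the lower triangle with predictions
    return M
```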
So far, the module outputs the matrix, and we then use it outside the module to compute the actual value of interest, `y = x.T A x`. It seems sensible to me that this final step should also be in `forward()`, so that the final output is already `y`, which would also make comparing the model to a vanilla MLP much more straightforward.
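For reference, the per-sample quadratic form can be computed in a batched way like this (a standalone sketch, assuming `x` has shape `(batch, n)`; `einsum` is just one option):

```python
import torch

batch, n = 4, 3
x = torch.randn(batch, n)
A = torch.randn(batch, n, n)

# batched quadratic form: y[b] = x[b]^T @ A[b] @ x[b]
y = torch.einsum('bi,bij,bj->b', x, A, x)  # shape (batch,)

# equivalent bmm formulation, as a cross-check
y_bmm = x.unsqueeze(1).bmm(A).bmm(x.unsqueeze(2)).squeeze(-1).squeeze(-1)
assert torch.allclose(y, y_bmm, atol=1e-6)
```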
The catch is that we also want access to the matrix `A` (among other things, for custom losses), and it’s not clear to me how best to approach this.
- We could simply assign `self.A = A` so that it’s accessible, but I wonder if that might invite problems with autograd.
- We could have a separate method that outputs only `A` and wrap it in `torch.no_grad()`, but that seems wasteful.
- We could have `forward()` return both `y` and `A`, but that changes how we interface with the model, which is a bit gross.
- We could use a forward hook. We haven’t used these before, and it seems like they might be meant only for debugging (?), but this seems like the correct choice? (A rough sketch of this option follows the list.)
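For concreteness, here’s a rough sketch of the forward-hook option. The names (`MatrixHead`, `QuadraticNet`) are invented for the example, and the matrix construction is factored into a small submodule purely so the hook has an output to attach to:

```python
import torch
import torch.nn as nn

class MatrixHead(nn.Module):
    """Invented submodule: builds A from the flat prediction, so that A is a
    submodule output that a forward hook can observe."""
    def __init__(self, n):
        super().__init__()
        self.n = n

    def forward(self, flat):
        # keep only the lower triangle of the (batch, n, n) prediction as the factor M
        M = flat.view(-1, self.n, self.n).tril()
        return M.bmm(M.transpose(1, 2))  # A = M M^T (symmetric)

class QuadraticNet(nn.Module):
    def __init__(self, n):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(n, 32), nn.Tanh(),
                                 nn.Linear(32, n * n))
        self.matrix_head = MatrixHead(n)

    def forward(self, x):
        A = self.matrix_head(self.mlp(x))
        return torch.einsum('bi,bij,bj->b', x, A, x)  # y = x^T A x

model = QuadraticNet(n=3)
captured = {}
handle = model.matrix_head.register_forward_hook(
    lambda module, inputs, output: captured.update(A=output))

y = model(torch.randn(8, 3))   # forward() returns y; the hook stashes A
A = captured['A']              # same tensor, still attached to the graph
handle.remove()                # detach the hook once it's no longer needed
```

The hook fires on every forward pass of `matrix_head`, and the captured `A` is the same tensor the submodule returned (still attached to the autograd graph), so in principle it could feed a custom loss.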