I am very new to PyTorch and am working on my first major model construction project… As such, I am a bit lost and could use some guidance.
The model is as follows: we will have one LSTM network per feature, and each of these networks will predict two things: the next value of its feature at time t+1 and the target at time t+1. However, the target at time t+1 will be a single prediction shared between all networks, whereas each feature prediction will be specific to its own network. I have included a drawing to illustrate this: https://imgur.com/a/yMSp7 . It follows that the loss will be a function of both kinds of predicted values. I am hoping autograd can take care of this: if I just sum all the appropriate loss terms, backpropagation through the computation graph should update only the parameters relevant to each term.
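To check the autograd part of this idea, here is a minimal sketch of what I mean by summing the losses. It is only an illustration under assumptions: plain `nn.Linear` layers stand in for the two per-feature LSTMs to keep it short, the data is random, and all names (`trunk_a`, `head_a`, `target_head`, and so on) are made up for the example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Linear layers stand in for two per-feature LSTMs (assumption, for brevity).
trunk_a, trunk_b = nn.Linear(3, 8), nn.Linear(3, 8)
head_a, head_b = nn.Linear(8, 1), nn.Linear(8, 1)  # feature-specific outputs
target_head = nn.Linear(16, 1)                     # shared target output

xa, xb = torch.randn(5, 3), torch.randn(5, 3)      # random placeholder inputs
h_a, h_b = trunk_a(xa), trunk_b(xb)

mse = nn.functional.mse_loss
loss_a = mse(head_a(h_a), torch.randn(5, 1))       # feature loss, network A
loss_b = mse(head_b(h_b), torch.randn(5, 1))       # feature loss, network B
loss_y = mse(target_head(torch.cat([h_a, h_b], dim=1)), torch.randn(5, 1))

# Network A's feature loss has no path to network B's weights in the graph,
# so autograd reports no gradient for them (None with allow_unused=True).
(grad_b_from_a,) = torch.autograd.grad(
    loss_a, trunk_b.weight, retain_graph=True, allow_unused=True
)

# Summing the terms and calling backward once populates .grad on every
# parameter that contributed to at least one term.
total = loss_a + loss_b + loss_y
total.backward()
```

If this is right, each feature loss only reaches its own network, while the shared target loss reaches both, which is the behaviour I am hoping for.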
This may be a really simple problem and a trivial question, but I haven’t seen an example of what I’m trying to do nor have I seen an easy way to connect multiple modules together. My intuition tells me that the x_n,t+1 predictions should be linear layers tacked on to the end of each specific module, and that the target prediction node should also be a linear layer but somehow connected to each module.
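To make that intuition concrete, here is a rough sketch of how I imagine wiring it up. Everything in it is an assumption on my part rather than a known-good recipe: the class name `MultiFeatureLSTM`, the hidden size, the choice to feed each LSTM a single scalar feature, the choice to concatenate the last hidden states for the shared target layer, and the unweighted sum of MSE losses.

```python
import torch
import torch.nn as nn


class MultiFeatureLSTM(nn.Module):
    """One LSTM per feature, one linear head per feature, one shared target head."""

    def __init__(self, num_features: int, hidden_size: int = 16):
        super().__init__()
        # ModuleList registers every sub-module so its parameters are trained.
        self.lstms = nn.ModuleList(
            [nn.LSTM(input_size=1, hidden_size=hidden_size, batch_first=True)
             for _ in range(num_features)]
        )
        # Feature-specific heads: predict x_{n,t+1} from LSTM n only.
        self.feature_heads = nn.ModuleList(
            [nn.Linear(hidden_size, 1) for _ in range(num_features)]
        )
        # Shared head: predicts the target from all LSTMs' hidden states.
        self.target_head = nn.Linear(num_features * hidden_size, 1)

    def forward(self, x):
        # x has shape (batch, seq_len, num_features)
        last_hidden = []
        for n, lstm in enumerate(self.lstms):
            out, _ = lstm(x[:, :, n:n + 1])    # (batch, seq_len, hidden_size)
            last_hidden.append(out[:, -1, :])  # keep the final time step
        next_features = torch.cat(
            [head(h) for head, h in zip(self.feature_heads, last_hidden)], dim=1
        )                                      # (batch, num_features)
        next_target = self.target_head(torch.cat(last_hidden, dim=1))  # (batch, 1)
        return next_features, next_target


torch.manual_seed(0)
model = MultiFeatureLSTM(num_features=3)
x = torch.randn(4, 10, 3)   # random placeholder batch: 4 sequences of length 10
x_next = torch.randn(4, 3)  # placeholder next-step features
y_next = torch.randn(4, 1)  # placeholder next-step target

pred_x, pred_y = model(x)
loss = nn.functional.mse_loss(pred_x, x_next) + nn.functional.mse_loss(pred_y, y_next)
loss.backward()
```

Concatenating the hidden states is just one way to connect the target layer to every module; summing or averaging them would be alternatives, and I do not know which is preferable here.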
Thanks for any and all help, and apologies if this question is too simple or has been asked too often.