And torch/autograd seems to know how to build the backprop graph in order to train this network.
However, if I define my operations in a for loop, rather than linearly, such as:
def forward(self, input):
    embedded_input = None
    for i, network in embedding_networks:
        out = embedding_networks(input[i])
        if embedded_input is None:
            embedded_input = out
        else:
            embedded_input = torch.cat((embedded_input, out), 1)
    output = net(input)
The forward/backward passes work, but when I look at the parameters of my network (using .parameters() and iterating through their .shape), it seems that the parameters only include the final net object and not any of the objects in the embedding_networks list through which I first pass my input.
Is this to be expected? Is there something obviously wrong with the second snippet compared to the first? How would I best achieve something like what's shown in the second snippet?
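If embedding_networks is a plain Python list, that would explain it: nn.Module only registers the parameters of sub-modules stored in its own containers, so anything kept in a plain list is invisible to .parameters(). Wrapping the list in nn.ModuleList is the usual fix. A minimal sketch, with made-up names and layer sizes, assuming the sub-networks are simple linear embeddings:

import torch
import torch.nn as nn

class MultiEmbedNet(nn.Module):  # hypothetical module, sizes for illustration only
    def __init__(self, n_inputs=3, in_dim=16, emb_dim=8):
        super().__init__()
        # nn.ModuleList (not a plain Python list) registers each sub-network,
        # so their parameters show up in .parameters() and get optimized.
        self.embedding_networks = nn.ModuleList(
            nn.Linear(in_dim, emb_dim) for _ in range(n_inputs)
        )
        self.net = nn.Linear(n_inputs * emb_dim, 1)

    def forward(self, inputs):
        embedded = []
        for i, network in enumerate(self.embedding_networks):
            embedded.append(network(inputs[i]))  # call each sub-network individually
        return self.net(torch.cat(embedded, dim=1))

model = MultiEmbedNet()
print(sum(p.numel() for p in model.parameters()))  # counts the embedding networks too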
It seems to work once I made that change. I still need to validate that it's actually training (as in, that the backward-prop graph has the correct links in it), but barring that, just by printing the parameters it looks fine.
Python loops are slow; u can write it in C++ instead. It is better to use PyTorch methods to replace loops in your code. Another way to speed things up is to use TorchScript (jit).
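For reference, a rough sketch of the TorchScript route (the module below is a stand-in with invented sizes, not the code from the post above); torch.jit.script compiles the forward method, Python loop included:

import torch
import torch.nn as nn

class LoopyModule(nn.Module):  # stand-in module, not the original network
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList(nn.Linear(8, 8) for _ in range(4))

    def forward(self, x):
        out = x
        for layer in self.layers:  # TorchScript supports iterating over a ModuleList
            out = torch.relu(layer(out))
        return out

scripted = torch.jit.script(LoopyModule())  # compiles forward, including the loop
print(scripted(torch.randn(2, 8)).shape)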
@G.M The problem with writing in C++ is that I have to distribute this as a library, so binary blobs might cause their own issues, especially since I distribute it as a setup on the machine rather than as a wheel.
Would you mind explaining how to replace this kind of loop with a PyTorch method?
Didn’t u already have a non-loop version in your first post?
Sadly, not all loops can be replaced by PyTorch methods. In this case, the loop is fine because u r only iterating through a list of models, not using iteration inside the algorithm itself.
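To illustrate the distinction with a toy example (not the original code): a per-element Python loop over a tensor is the kind of loop worth replacing with a vectorized call, whereas looping over a handful of sub-modules costs almost nothing by comparison.

import torch

x = torch.randn(10_000)

# Loop "inside the algorithm": per-element Python iteration, slow
y_loop = torch.stack([xi * 2 + 1 for xi in x])

# The equivalent single vectorized PyTorch call
y_vec = x * 2 + 1

assert torch.allclose(y_loop, y_vec)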
Also, if ur embedding_networks is a sequence of models, I guess u should change the loop line to for i, network in enumerate(embedding_networks): and call the individual model with out = network(input[i]) instead of embedding_networks(input[i]).
That was a mistake when rewriting the code; it did originally use enumerate.
Sadly enough, the number of networks is indeterminate at code-writing time. I was assuming PyTorch had some alternative to for loops. There do seem to be some functionalities for distributing a list of processes between GPUs, so I might end up using that.
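If it helps, a simpler (hypothetical) alternative to full process-level distribution is just placing each sub-network on its own device and gathering the outputs afterwards; a sketch, assuming the networks are held in an nn.ModuleList:

import torch
import torch.nn as nn

# Hypothetical sketch: spread the sub-networks across available GPUs
# (falling back to CPU if none are present) and gather outputs on one device.
embedding_networks = nn.ModuleList(nn.Linear(16, 8) for _ in range(4))
n_gpus = torch.cuda.device_count()

for i, net in enumerate(embedding_networks):
    net.to(f"cuda:{i % n_gpus}" if n_gpus > 0 else "cpu")

inputs = [torch.randn(2, 16) for _ in embedding_networks]
outputs = []
for net, x in zip(embedding_networks, inputs):
    device = next(net.parameters()).device  # wherever this sub-network lives
    outputs.append(net(x.to(device)))

embedded = torch.cat([o.to(outputs[0].device) for o in outputs], dim=1)
print(embedded.shape)  # torch.Size([2, 32])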