I have a somewhat unusual network architecture in which I use two networks: the output of the first is used as the weights of the second. I can run it and call
optimizer.step() without any errors, but when I compare the first network’s parameters between batches, they don’t change.
I have no idea what I’m doing wrong, and I would like to inspect the backward computation graph to see where it stops. I’ve tried
pytorchviz, but as far as I understand it is no longer maintained.
Does anyone know how I can debug this problem?
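
For context, here is a minimal sketch of the kind of check that could narrow this down without a visualizer. It is not my actual model: `hyper`, `z`, `x`, `target`, `grad_report` and `reachable_leaves` are hypothetical names, and the shapes are made up. It assumes the first network produces the weights of a single linear layer. The sketch walks `loss.grad_fn` through `next_functions` to list the leaf tensors that backward can reach, then prints per-parameter gradient norms. For comparison it also builds a variant where the generated weights are detached (which, as far as I know, is also what happens when they are copied through `.data` or wrapped in a new `nn.Parameter`), so the first network drops out of the graph.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical stand-in for the first network: maps a conditioning vector to
# the flattened weights of a 4 -> 2 linear layer (the "second network").
hyper = nn.Linear(3, 2 * 4)

z = torch.randn(3)          # input to the first network
x = torch.randn(5, 4)       # batch for the second network
target = torch.randn(5, 2)


def reachable_leaves(root):
    """Return the leaf tensors that the backward graph starting at `root` ends in."""
    leaves, stack, seen = [], [root], set()
    while stack:
        fn = stack.pop()
        if fn is None or fn in seen:
            continue
        seen.add(fn)
        if hasattr(fn, "variable"):  # AccumulateGrad node: wraps a leaf tensor
            leaves.append(fn.variable)
        stack.extend(next_fn for next_fn, _ in fn.next_functions)
    return leaves


def grad_report(module):
    """Map each parameter name to its gradient norm, or None if no gradient arrived."""
    return {
        name: (None if p.grad is None else p.grad.norm().item())
        for name, p in module.named_parameters()
    }


def contains(leaves, tensor):
    """Identity-based membership test (avoids elementwise tensor comparison)."""
    return any(leaf is tensor for leaf in leaves)


# Connected version: the generated weights stay an ordinary non-leaf tensor.
w = hyper(z).view(2, 4)
loss = F.mse_loss(F.linear(x, w), target)
good_leaves = reachable_leaves(loss.grad_fn)
loss.backward()
report = grad_report(hyper)

# Disconnected version: detaching creates a new leaf, so the graph stops there.
w_broken = hyper(z).view(2, 4).detach().requires_grad_()
loss_broken = F.mse_loss(F.linear(x, w_broken), target)
broken_leaves = reachable_leaves(loss_broken.grad_fn)

print("gradient norms:", report)
print("hyper.weight reachable (connected):", contains(good_leaves, hyper.weight))
print("hyper.weight reachable (detached):", contains(broken_leaves, hyper.weight))
```

If a parameter of the first network shows up with a gradient of None, or is missing from the reachable leaves, the graph is being cut somewhere between the first network's output and the loss.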