I have a special model where, during the first iteration of a video, one of the components doesn't get used. For example:
    if self.iter != 0:
        stuff = self.modelA(input)
        stuff2 = self.modelB(stuff)
        return stuff2
    else:
        stuff = self.modelA(input)
        return stuff
One GPU/process might hit self.iter == 0 while another is still running with self.iter == N. When this happens, the model hangs at any batch norm layer.
How can I get around this?
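One direction that might be worth trying, sketched here under assumptions rather than as a confirmed fix: if the hang comes from a collective operation (for example a synchronized batch norm, or gradient synchronization) that only some ranks reach, then making all ranks agree on the branch before taking it should keep them in step. The names `all_ranks_past_first_iter`, `TwoStage`, and `dim` are hypothetical stand-ins, and the layers inside `modelA` and `modelB` are placeholders for the real components.

```python
import torch
import torch.distributed as dist
import torch.nn as nn


def all_ranks_past_first_iter(local_iter: int, device=None) -> bool:
    """Return True only if every rank reports local_iter != 0.

    Hypothetical helper. Falls back to the local value when
    torch.distributed is not initialized (e.g. a single-process run).
    """
    flag = 1 if local_iter != 0 else 0
    if not (dist.is_available() and dist.is_initialized()):
        return bool(flag)
    # MIN over all ranks: the result is 1 only if no rank is at iteration 0.
    t = torch.tensor([flag], dtype=torch.int32, device=device)
    dist.all_reduce(t, op=dist.ReduceOp.MIN)
    return bool(t.item())


class TwoStage(nn.Module):
    """Hypothetical stand-in for the model described in the question."""

    def __init__(self, dim: int = 8):
        super().__init__()
        self.modelA = nn.Linear(dim, dim)
        self.modelB = nn.Sequential(nn.BatchNorm1d(dim), nn.Linear(dim, dim))
        self.iter = 0

    def forward(self, x):
        stuff = self.modelA(x)
        # Every rank evaluates the same agreed-upon condition, so either
        # all ranks run modelB or none of them do.
        if all_ranks_past_first_iter(self.iter, device=x.device):
            return self.modelB(stuff)
        return stuff
```

The trade-off is that a rank with self.iter == N skips modelB whenever any other rank is at iteration 0, which may or may not be acceptable for the model. If the model is wrapped in DistributedDataParallel, the steps that skip modelB leave its parameters without gradients, so `find_unused_parameters=True` may also be needed.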