Hello,
I have a traced model whose second inference run is very slow on torch 1.7 and later. From these posts and others, this seems to be the expected behavior.
I have one question about the exact speed of the jit/graph optimisation run (the second run). Suppose my model can take in tensors of different sizes during the forward pass, e.g. both [batch_size, c, h, w] and [batch_size, c * 10, h, w] are valid inputs. Would two forward passes (including the second, longer one) be expected to be faster with an input sized [batch_size, c, h, w] than with one sized [batch_size, c * 10, h, w]? In my experiments, the second run (the one that optimises the graph) is equally slow with both inputs. I just want to confirm that this is the expected behavior.
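For reference, here is roughly how I measure the per-run timings. This is a minimal sketch, not my actual model: `Tiny` is a hypothetical shape-agnostic module, and the input sizes are placeholders for the [batch_size, c, h, w] vs. [batch_size, c * 10, h, w] cases described above.

```python
import time
import torch

class Tiny(torch.nn.Module):
    # Hypothetical stand-in model: elementwise ops, so any channel count works.
    def forward(self, x):
        return torch.relu(x * 2 + 1).sum(dim=1)

def time_first_runs(traced, x, n_runs=3):
    # Time the first few forward passes separately; on torch 1.7+ the
    # second run includes the profiling/optimisation pass and is the slow one.
    times = []
    with torch.no_grad():
        for _ in range(n_runs):
            start = time.perf_counter()
            traced(x)
            times.append(time.perf_counter() - start)
    return times

small = torch.randn(2, 3, 32, 32)    # [batch_size, c, h, w]
large = torch.randn(2, 30, 32, 32)   # [batch_size, c * 10, h, w]

traced = torch.jit.trace(Tiny(), small)
print("small input:", time_first_runs(traced, small))

traced = torch.jit.trace(Tiny(), large)
print("large input:", time_first_runs(traced, large))
```

In my runs, the second timing in each list is the slow one, and it is similar for both input sizes.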
Thank you!