Hi,
I’m trying to add some in-place operations inside the forward function, but they seem to slow down inference.
My forward function is like this:
def forward(self, x):
    x = x.permute(0, 3, 1, 2)
    x = F.interpolate(x, [224, 224], mode="bilinear")
    x[:, 0, :, :] -= 149.89  # in-place operation
    x[:, 0, :, :] /= 37.35   # in-place operation
    x[:, 1, :, :] -= 113.11  # in-place operation
    x[:, 1, :, :] /= 37.16   # in-place operation
    x[:, 2, :, :] -= 130.63  # in-place operation
    x[:, 2, :, :] /= 37.48   # in-place operation
    x = self.model(x)
    return nn.Softmax(dim=1)(x)
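(As an aside, the six slice-and-assign lines can be collapsed into one broadcasted subtract and one broadcasted divide by shaping the per-channel means and stds as (1, C, 1, 1). A minimal sketch of the equivalence, using NumPy only to illustrate the broadcasting — torch tensors broadcast the same way, and the constants are the ones from my forward():)

```python
import numpy as np

# Per-channel means and stds from forward(), reshaped to (1, 3, 1, 1)
# so they broadcast over an (N, C, H, W) batch in a single operation each.
mean = np.array([149.89, 113.11, 130.63], dtype=np.float32).reshape(1, 3, 1, 1)
std = np.array([37.35, 37.16, 37.48], dtype=np.float32).reshape(1, 3, 1, 1)

x = (np.random.rand(2, 3, 4, 4).astype(np.float32) * 255.0)

# Channel-by-channel version, as in the original forward()
y_loop = x.copy()
for c in range(3):
    y_loop[:, c, :, :] -= mean[0, c, 0, 0]
    y_loop[:, c, :, :] /= std[0, c, 0, 0]

# Single broadcasted version
y_vec = (x - mean) / std

assert np.allclose(y_loop, y_vec, atol=1e-5)
```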
As you can see, I apply some in-place operations to the BGR channels. In this case, inference on a batch of 512 takes 0.83s on the GPU.
However, if I remove these in-place operations, the code becomes:
def forward(self, x):
    x = x.permute(0, 3, 1, 2)
    x = F.interpolate(x, [224, 224], mode="bilinear")
    x = self.model(x)
    return nn.Softmax(dim=1)(x)
And it only takes 0.23s (almost a quarter of the former) to run inference on the same batch of 512.
Is this normal? I can’t believe the in-place operations take so much time. Any ideas to optimize the code? Thanks a lot.