Has anybody observed that interpolate is slow in certain cases?

I find that with amp.autocast(enabled=True) and a large scale_factor (for example scale_factor=16, align_corners=False, mode='bilinear'), training takes much longer. The following operator seems to slow training down considerably. Has anyone observed this and tried to fix it?

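import torch.nn as nn
from torch.cuda import amp  # assuming `amp` here is torch.cuda.amp

model = nn.Upsample(scale_factor=16, mode='bilinear', align_corners=False)
with amp.autocast(enabled=True):
    out = model(inp)  # inp: the input tensor, defined elsewhere
    ....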

Which PyTorch release are you using and could you post the profiling code as well as the results, please?
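
Something along these lines would be enough to compare numbers. It is only a minimal sketch, not your actual training code: the input shape and the helper name time_upsample are made up for illustration, and it assumes that inside an autocast region the upsample ends up receiving a float16 tensor from earlier layers such as convolutions, so it times the same nn.Upsample on float32 and float16 inputs directly. The synchronize calls are there because CUDA kernels run asynchronously and would otherwise not be included in the measured time.

```python
import time

import torch
import torch.nn as nn


def time_upsample(dtype, device, iters=10):
    """Return (mean seconds per forward+backward pass, output shape)."""
    model = nn.Upsample(scale_factor=16, mode='bilinear', align_corners=False)
    # Hypothetical input shape; replace it with the real feature-map size.
    inp = torch.randn(2, 8, 16, 16, dtype=dtype, device=device,
                      requires_grad=True)
    on_cuda = device.startswith('cuda')

    def step():
        out = model(inp)
        # Reduce in float32 so the loss itself does not overflow in float16.
        out.float().sum().backward()
        return out

    step()  # warm-up pass, excluded from the timing
    if on_cuda:
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        out = step()
    if on_cuda:
        torch.cuda.synchronize()  # wait for queued kernels before stopping
    return (time.perf_counter() - start) / iters, tuple(out.shape)


if __name__ == '__main__':
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    print('PyTorch', torch.__version__)
    print('float32', time_upsample(torch.float32, device))
    if device == 'cuda':
        # float16 is what the upsample would see under CUDA autocast
        # (assumption stated above).
        print('float16', time_upsample(torch.float16, device))
```

Posting the printed version and both timings (ideally also with your real input size) would make it much easier to tell whether the slowdown comes from this operator.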