nn.functional.interpolate is really slow

m732367606 · May 16, 2019, 6:52am

It seems nn.functional.interpolate is running on the CPU instead of the GPU.
CPU utilization is 100%, while GPU utilization is only 11%.
Here is the relevant part of my model's forward code, which calls nn.functional.interpolate.
Can nn.functional.interpolate run on CUDA?

    def forward(self, x):
        x = nn.functional.interpolate(x, scale_factor=self.scale, mode='bicubic', align_corners=True)
        return x

m732367606 · May 16, 2019, 9:10am

m732367606 · May 17, 2019, 2:25am

Has anyone else encountered this problem?

ptrblck · May 17, 2019, 11:01am

I cannot reproduce this issue.
Are you sure F.interpolate is running on the CPU?
This code snippet uses 100% of the GPU and a single CPU core, which seems fine:

    import time

    import torch
    import torch.nn as nn
    import torch.nn.functional as F


    class MyModel(nn.Module):
        def __init__(self, scale_factor=2):
            super(MyModel, self).__init__()
            self.scale_factor = scale_factor

        def forward(self, x):
            x = F.interpolate(x, scale_factor=self.scale_factor, mode='bicubic', align_corners=True)
            return x


    model = MyModel()

    # Create the input directly on the GPU so interpolate runs there
    b, c, h, w = 64, 3, 128, 128
    x = torch.randn(b, c, h, w, device='cuda')

    nb_iters = 100000

    # CUDA kernels launch asynchronously, so synchronize before starting the timer
    torch.cuda.synchronize()
    t0 = time.perf_counter()

    for _ in range(nb_iters):
        output = model(x)

    # Wait for all queued GPU work to finish before stopping the timer
    torch.cuda.synchronize()
    t1 = time.perf_counter()

    print('Took {}s, {}s per iter'.format((t1 - t0), (t1 - t0) / nb_iters))

Could you try this code?
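
A quicker check of where F.interpolate actually runs is to look at the device of its input and output, since the op executes on whatever device the input tensor lives on. This is only a minimal sketch: the tensor shape is an arbitrary assumption, and it falls back to the CPU when no GPU is visible.

```python
import torch
import torch.nn.functional as F

# Use the GPU if one is visible, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Small dummy input (shape chosen arbitrarily for illustration)
x = torch.randn(1, 3, 8, 8, device=device)

# Same call as in the model's forward method
out = F.interpolate(x, scale_factor=2, mode="bicubic", align_corners=True)

# The output lives on the same device as the input
print(x.device, out.device, tuple(out.shape))
```

If both devices print as cuda, the interpolation itself is running on the GPU and the CPU load has to come from somewhere else in the pipeline.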

m732367606 · May 17, 2019, 12:32pm

Dear ptrblck, it turned out that the high CPU usage was caused by a slow DataLoader during training data preparation, not by nn.functional.interpolate.
It was entirely my fault.
Thank you sincerely for your help!
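
For anyone landing here with the same symptom (high CPU usage, low GPU usage), one way to confirm a data-loading bottleneck is to time how long the loop waits for each batch versus how long the step itself takes. The sketch below is a rough illustration, not the code from this thread: `profile_loop` and `slow_batches` are hypothetical names, and `slow_batches` merely stands in for a slow DataLoader.

```python
import time


def profile_loop(batches, step):
    """Return (seconds spent waiting for data, seconds spent in step)."""
    data_time = 0.0
    step_time = 0.0
    it = iter(batches)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(it)  # time spent here is data loading
        except StopIteration:
            break
        t1 = time.perf_counter()
        # On a GPU, step should end with torch.cuda.synchronize(),
        # otherwise asynchronous kernels make the step look too cheap
        step(batch)
        t2 = time.perf_counter()
        data_time += t1 - t0
        step_time += t2 - t1
    return data_time, step_time


def slow_batches(n, delay=0.01):
    # Hypothetical stand-in for a slow DataLoader
    for i in range(n):
        time.sleep(delay)
        yield i


data_time, step_time = profile_loop(slow_batches(20), lambda batch: batch * 2)
print(f"data: {data_time:.3f}s, step: {step_time:.3f}s")
```

Any iterable can be passed as `batches`, so the same helper should work with a real DataLoader and a function that runs one forward pass. If the data time dominates, the GPU is mostly idle waiting for input.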