nn.functional.interpolate does not seem to work as intended? Or does it?

I’m training some convolution-based networks on several datasets I’ve built (the project is open to the public, by the way).

These few lines in my script keep raising RuntimeError: dimension does not match.

The code can be simplified as below (to save you time browsing):

# forward net
output = net(input_data) # tensor size : (4,4,256,256)

# Build target heatmap from pose labels
target = nn.functional.interpolate(target, (output.size()[2], output.size()[3]), mode="nearest") # tensor size has to be (4,4,256,256), yet it turns out to be (4,3,256,256)

loss = fn_pose(output, target)

The loss function flattens each tensor to the shape (N, C, H*W),
N: batch size, C: channels, H: height, W: width.

Then the last dimension of each tensor becomes output -> 65536, target -> 49152.
65536 is 256 squared, and 49152 is 256 times 192 (3/4 of 256).
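To illustrate, here is a minimal reproduction with random tensors standing in for my data (the shapes match the ones above). Note that interpolate's size argument only resizes the trailing spatial dimensions (H, W), never the channel dimension:

```python
import torch
import torch.nn.functional as F

output = torch.randn(4, 4, 256, 256)  # network output: 4 channels
target = torch.randn(4, 3, 256, 192)  # 3-channel target, random stand-in

# size=(H, W) only resizes the last two (spatial) dimensions;
# the channel dimension passes through untouched
resized = F.interpolate(target, size=(output.size(2), output.size(3)), mode="nearest")
print(resized.shape)  # torch.Size([4, 3, 256, 256]) -- still 3 channels
```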

How did this happen? I have been working on this problem for a few weeks.

For anybody who wants to check out the repo: bigbreadguy/pose-estimator-test-loop: pytorch implementation of pose estimators for testing along datasets in the project (github.com)

I tried the following code in torch 1.9 and I see no error:

import torch
import torch.nn as nn

output = torch.randn(4, 4, 256, 256) # tensor size : (4, 4, 256, 256)

target = torch.randn(4, 4, 256, 192)

target = nn.functional.interpolate(target, (output.size()[2], output.size()[3]), mode="nearest")


Output>>> torch.Size([4, 4, 256, 256])

And I don’t know why you expect the target to be a tensor with 4 channels.
You can see in the dataset definition that hm_shape has only 3 channels.


First of all, thanks for helping me.

I forgot to update the thread after I solved the problem. Please forgive me. The problem was not caused by nn.functional.interpolate; it was just a typical torchvision.transforms mistake.
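For anyone hitting the same numbers: one pitfall of this kind (a hypothetical sketch, not necessarily my exact bug) is resizing with a single scale factor, which preserves the aspect ratio the way torchvision.transforms.Resize(int) does, instead of forcing an explicit square size. A 4:3 input then lands at 256x192, whose product is exactly the 49152 above:

```python
import torch
import torch.nn.functional as F

heatmap = torch.randn(4, 4, 1024, 768)  # hypothetical 4:3 source heatmap, NCHW

# Aspect-preserving resize: 1024x768 scaled by 0.25 becomes 256x192,
# and 256 * 192 = 49152 -- the mismatched flattened size from the thread
kept_aspect = F.interpolate(heatmap, scale_factor=0.25, mode="nearest")
print(kept_aspect.shape)  # torch.Size([4, 4, 256, 192])

# Forcing an explicit (256, 256) matches the network output instead
squared = F.interpolate(heatmap, size=(256, 256), mode="nearest")
print(squared.shape)      # torch.Size([4, 4, 256, 256])
```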

I really appreciate your help!