When I used a numpy-based upsampling function in the network, the training didn't proceed properly

I want to use a numpy-based upsampling function while training a deep learning network.


The network preprocesses the image before upsampling, and I want to train it with a loss computed between the upsampled result and the ground truth (GT).

def forward(self, x):
    x = self.head(x)

    res = self.body(x)
    res += x

    x = self.tail(res)
    # convert each image in the batch to numpy, upsample, then convert back to a tensor
    images = []
    for batch in range(x.shape[0]):
        single_img = x[batch, :, :, :].squeeze(0)
        np_single = tensor_to_numpy(single_img.cpu().detach())
        upsampled = numpy_based_upsample(np_single)
        up_tensor = numpy_to_tensor(upsampled).unsqueeze(0)
        images.append(up_tensor)
    out = torch.cat(images, dim=0).cuda()
    return x, out

I wrote the code as described above and encountered the error message

“RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn”

So I changed the last line to

out = torch.cat(images, dim=0).cuda().requires_grad_(True)

but upon inspecting the results, it seems that the network hasn’t learned properly. How should I resolve this problem?

(The reason I wrote the code this way is that the numpy-based function expects images of shape (H, W, C), while the network’s tensors have shape (B, C, H, W).)
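
For context, the conversion helpers tensor_to_numpy and numpy_to_tensor aren’t shown here; something like the following permute-based sketch is assumed:

import torch

def tensor_to_numpy(img):
    # (C, H, W) torch tensor -> (H, W, C) numpy array
    return img.permute(1, 2, 0).numpy()

def numpy_to_tensor(img):
    # (H, W, C) numpy array -> (C, H, W) torch tensor
    return torch.from_numpy(img).permute(2, 0, 1)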

Hi @ymin2570,

When you detach a tensor and pass it through a numpy function, you throw away the gradient history of everything that produced it (detaching sets its requires_grad flag to False). When you re-cast the result to a PyTorch tensor, it only has history from the re-casting up to the computation of the loss.

This is why you get

“RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn”

When you manually set requires_grad_(True), you’re not solving the problem but hiding it: autograd will track gradients from

out = torch.cat(images, dim=0).cuda().requires_grad_(True)

down to the loss, but it can’t track the gradient back through the numpy function into the network, so the network’s parameters never receive any gradient and nothing is learned.
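
You can see this with a tiny standalone sketch (np.repeat is just a stand-in for the numpy upsampling):

import numpy as np
import torch

x = torch.randn(4, requires_grad=True)           # pretend this is the network output
np_x = x.detach().cpu().numpy()                  # the autograd graph is cut here
up = np.repeat(np_x, 2)                          # stand-in for the numpy-based upsampling
out = torch.from_numpy(up).requires_grad_(True)  # a brand-new leaf tensor, unrelated to x

out.sum().backward()
print(out.grad)  # filled in
print(x.grad)    # None: no gradient ever reaches the network output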

To solve this, you’ll need to either use a PyTorch function for the upsampling (so that autograd can track the gradient through it), or write your own torch.autograd.Function with the forward and backward computations defined manually (assuming you can define the gradient of your up-sampling operation).
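
If your numpy function does a standard interpolation (nearest, bilinear, bicubic, …), the simplest fix is to do the upsampling directly in PyTorch. A rough sketch, assuming bicubic interpolation with a scale factor of 2 matches what numpy_based_upsample does:

import torch.nn.functional as F

def forward(self, x):
    x = self.head(x)

    res = self.body(x)
    res += x

    x = self.tail(res)

    # differentiable upsampling: stays on the GPU and keeps the autograd graph intact
    out = F.interpolate(x, scale_factor=2, mode='bicubic', align_corners=False)
    return x, out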

An example torch.autograd.Function can be found in the PyTorch tutorial “PyTorch: Defining New autograd Functions”.
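
If you really do need the numpy implementation, a custom Function would look roughly like the sketch below. The forward reuses the names from your code (tensor_to_numpy, numpy_based_upsample, numpy_to_tensor), and the backward is only a placeholder that assumes nearest-neighbor upsampling by an integer factor; you would have to replace it with the true derivative of your upsampling.

import torch
import torch.nn.functional as F

class NumpyUpsample(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        # x: (B, C, H, W) -> numpy (H, W, C) per image, upsample, stack back into a tensor
        images = []
        for b in range(x.shape[0]):
            np_img = tensor_to_numpy(x[b].cpu().detach())
            images.append(numpy_to_tensor(numpy_based_upsample(np_img)).unsqueeze(0))
        out = torch.cat(images, dim=0).to(x.device)
        ctx.scale = out.shape[-1] // x.shape[-1]   # integer upsampling factor
        return out

    @staticmethod
    def backward(ctx, grad_output):
        # placeholder: sum each output pixel's gradient back onto the input pixel it
        # came from (correct only for nearest-neighbor upsampling by ctx.scale)
        return F.avg_pool2d(grad_output, kernel_size=ctx.scale) * ctx.scale ** 2

# in the model's forward:
# out = NumpyUpsample.apply(x)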