Warping images using disparity maps for stereo matching

I am trying to warp an image using a disparity map. The image has size (375, 1242), and the disparity map has the same size. Both the image and the disparity map are grayscale. Below is the code I tried to run:

import numpy as np
import torch
import torch.nn.functional as F

# Load the image (img) to be warped using PIL library
# Load the disparity map image (disp) using PIL library

disp = np.array(disp, dtype=np.float32) / 256.

x = np.array(img)
img = torch.Tensor(x)
img.resize_(1,3,375,1242)
batch_size,_,height, width = img.size()

# Original coordinates of pixels
x_base = torch.linspace(0, 1, width).repeat(batch_size, height, 1).type_as(img)
y_base = torch.linspace(0, 1, height).repeat(batch_size, width, 1).transpose(1, 2).type_as(img)

disp = torch.Tensor(disp)
disp.resize_(1,375,1242)

flow_field = torch.stack((x_base + disp, y_base), dim=3)

# In grid_sample coordinates are assumed to be between -1 and 1
output = F.grid_sample(img, 2*flow_field - 1, mode='bilinear', padding_mode='zeros')

The image is resized to (1, 3, 375, 1242) where 1 indicates the batch_size (since only one image is to be passed) and 3 indicates the number of channels.
The code is adapted from loss.py in the OniroAI/MonoDepth-PyTorch repository on GitHub (commit 0b7d60bd1dab0e8b6a7a1bab9c0eb68ebda51c5c).

However, I don't quite understand how the flow_field variable and the grid_sample() function work.
After converting the ‘output’ tensor to a PIL image, the result is completely black, i.e. every value in the output tensor is 0.

Can someone help me fix the code above?

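The flow field (the grid argument of grid_sample) tells each output element where to read from in the source tensor. It has shape (N, H_out, W_out, 2), and flow_field[n, h, w] = (x, y) means that output[n, :, h, w] is sampled from img[n] at location (x, y). Note that these coordinates are normalized rather than pixel indices: x = -1 is the left edge of the source and x = 1 the right edge, and likewise y runs from -1 at the top to 1 at the bottom. If a location falls between pixels, the function interpolates according to the mode parameter. If it falls outside the source boundary, the value is filled in according to the padding_mode parameter.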
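
To make this concrete for the code in the question, here is a minimal sketch of a horizontal warp. It rests on assumptions the original post does not confirm. First, I assume the disparity is in pixels (the division by 256 suggests this), so it must be converted to normalized units before it is added to the grid. Adding pixel-valued disparities to coordinates in [0, 1] pushes the samples outside the image, which with padding_mode='zeros' is a likely reason for the black output. Second, a grayscale image has one channel, so I assume the input is shaped (N, 1, H, W) rather than resized to 3 channels. The function name warp_with_disparity and the sign argument are my own and purely illustrative; which direction to shift depends on which view you are warping.

```python
import torch
import torch.nn.functional as F


def warp_with_disparity(img, disp, sign=-1.0):
    """Shift img (N, C, H, W) horizontally by disp (N, H, W), given in pixels.

    Hypothetical helper: sign=-1 makes each output pixel read from x - disp;
    use sign=+1 for the opposite direction.
    """
    n, _, h, w = img.shape
    # Base sampling grid in normalized [-1, 1] coordinates.
    xs = torch.linspace(-1.0, 1.0, w, dtype=img.dtype, device=img.device)
    ys = torch.linspace(-1.0, 1.0, h, dtype=img.dtype, device=img.device)
    x_base = xs.view(1, 1, w).expand(n, h, w)
    y_base = ys.view(1, h, 1).expand(n, h, w)
    # With align_corners=True one pixel spans 2 / (w - 1) normalized units.
    shift = sign * disp.to(img.dtype) * 2.0 / (w - 1)
    # Last dimension is (x, y), as grid_sample expects.
    grid = torch.stack((x_base + shift, y_base), dim=3)
    return F.grid_sample(img, grid, mode='bilinear',
                         padding_mode='zeros', align_corners=True)


# Tiny check: zero disparity should reproduce the input.
img = torch.arange(12, dtype=torch.float32).view(1, 1, 3, 4)
out_identity = warp_with_disparity(img, torch.zeros(1, 3, 4))
```

For the data in the question, the inputs could be built with something like torch.from_numpy(x).float()[None, None] for the image and torch.from_numpy(disp)[None] for the disparity. I would avoid resize_() here: growing a tensor with it leaves the new elements uninitialized.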