Hello, I implemented depth warping using `inverse_warp.py` from DPSNet on a 480×640 image and its depth map.
I want to create a binary map in which the target coordinates of the depth warping are 1 and all other coordinates are 0. In other words, the background exposed by occlusion is marked 0, and everything else is marked 1. (What I want is not an occlusion mask; rather, it represents occlusion by marking the regions that were occluded as 0.)
```python
src_pixel_coords = cam2pixel(cam_coords, proj_cam_to_src_pixel[:, :, :3],
                             proj_cam_to_src_pixel[:, :, -1:], padding_mode)  # [B,H,W,2]
projected_feat = torch.nn.functional.grid_sample(feat, src_pixel_coords,
                                                 padding_mode=padding_mode)
return projected_feat, src_pixel_coords
```
To achieve this, I modified `inverse_warp.py` to also return `src_pixel_coords` (i.e., the target coordinates), which is the grid passed to the `torch.nn.functional.grid_sample()` function.
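For reference, `grid_sample` expects a `[B, H_out, W_out, 2]` grid whose last dimension holds `(x, y)` coordinates normalized to `[-1, 1]`. A minimal example of that convention (with `align_corners=True` so that -1 and 1 map exactly to the corner pixels):

```python
import torch
import torch.nn.functional as F

# A 1x1x2x2 feature map with values 1..4.
feat = torch.tensor([[[[1., 2.],
                       [3., 4.]]]])

# One sampling location at the top-left corner: (x, y) = (-1, -1).
grid = torch.tensor([[[[-1., -1.]]]])  # shape [B, H_out, W_out, 2]

out = F.grid_sample(feat, grid, align_corners=True)
# out[0, 0, 0, 0] is the top-left value, 1.0
```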
From this `src_pixel_coords` value, I can produce the binary map I want with the following NumPy code:
```python
import numpy as np
import torch

warped_img, src_pixel_coords = inverse_warp( ... )

img_height = 480
img_width = 640

coords = (src_pixel_coords + 1) / 2  # normalize from [-1, 1] to [0, 1]
x_coords = coords[0, :, :, 0] * img_width   # batch_size = 1
y_coords = coords[0, :, :, 1] * img_height
x_coords = x_coords.floor().clamp(0, img_width - 1).cpu().numpy()
y_coords = y_coords.floor().clamp(0, img_height - 1).cpu().numpy()

mask = torch.zeros(img_height, img_width)
for y in range(img_height):
    for x in range(img_width):
        y_ = np.clip(2 * y - int(y_coords[y, x]), 0, img_height - 1)
        x_ = np.clip(2 * x - int(x_coords[y, x]), 0, img_width - 1)
        mask[y_, x_] = 1
```
And the output:
It works and produces the binary map correctly. However, this code has several problems:
- The per-pixel loop runs in Python/NumPy, so it cannot be computed on the GPU.
- It does not work when the batch size is > 1.
- The nested loop over every pixel has bad time complexity.
I would like to reimplement this NumPy code as PyTorch tensor operations and solve the three problems above, but I am not sure how. Can you help?
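For concreteness, here is a sketch of the kind of vectorized version I have in mind: it reproduces the loop above (including the `2*y - y_src` reflection and the clamping) with pure tensor operations, so it is batched and GPU-compatible. The function name is my own; I have not verified it against my real data:

```python
import torch

def forward_visibility_mask(src_pixel_coords):
    """Vectorized, batched version of the per-pixel loop above.

    src_pixel_coords: [B, H, W, 2] grid_sample coordinates in [-1, 1]
    returns:          [B, H, W] binary mask (1 where a target pixel is hit)
    """
    B, H, W, _ = src_pixel_coords.shape
    device = src_pixel_coords.device

    # Un-normalize from [-1, 1] to pixel coordinates, as in the loop version.
    coords = (src_pixel_coords + 1) / 2
    x_src = (coords[..., 0] * W).floor().clamp(0, W - 1).long()  # [B, H, W]
    y_src = (coords[..., 1] * H).floor().clamp(0, H - 1).long()

    # Target-pixel index grids (the loop variables y and x).
    ys = torch.arange(H, device=device).view(1, H, 1).expand(B, H, W)
    xs = torch.arange(W, device=device).view(1, 1, W).expand(B, H, W)

    # Same reflection as the loop: 2*y - y_src, clamped to the image.
    y_tgt = (2 * ys - y_src).clamp(0, H - 1)
    x_tgt = (2 * xs - x_src).clamp(0, W - 1)

    # Scatter 1s into a flat [B, H*W] mask, then reshape.
    flat_idx = (y_tgt * W + x_tgt).view(B, -1)  # [B, H*W]
    mask = torch.zeros(B, H * W, device=device)
    mask.scatter_(1, flat_idx, 1.0)
    return mask.view(B, H, W)
```

With an identity warp (every target pixel sampling from itself), this should return an all-ones mask.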