Performance discrepancy in masking operation on GPU-initialized vs. CPU-to-GPU moved tensors

Hello, PyTorch Community,

I’ve encountered a perplexing performance issue when applying a boolean mask to a tensor, and I’m hoping to get some insights or suggestions on how to address it. The core of the problem is a significant difference in the time a masked assignment takes, depending on how the target tensor was initialized: created on the CPU and then moved to the GPU, versus created directly on the GPU.

Here’s a simplified version of the scenario (the values below are illustrative placeholders so the snippet runs standalone; in my actual code, num_rays, original_num_light_directions, mask, and visibility come from the rendering pipeline):

import torch

# Illustrative placeholder values
num_rays = 4096
original_num_light_directions = 512
device = torch.device("cuda")

# mask is boolean with the same shape as total_vis; visibility is already on
# the GPU and holds one value per True entry of mask
mask = torch.rand(num_rays, original_num_light_directions, 1, device=device) > 0.5
visibility = torch.rand(int(mask.sum().item()), device=device)

# Scenario 1: initialize the tensor on the CPU, then move it to the GPU
# (.type_as matches both the dtype and the device of visibility)
total_vis_cpu_to_gpu = torch.ones(num_rays, original_num_light_directions, 1).type_as(visibility)
total_vis_cpu_to_gpu[mask] = visibility

# Scenario 2: initialize the tensor directly on the GPU
total_vis_gpu = torch.ones(num_rays, original_num_light_directions, 1, device=visibility.device, dtype=visibility.dtype)
total_vis_gpu[mask] = visibility
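
For reference, this is the kind of timing harness I’m using (a minimal sketch; the torch.cuda.synchronize() calls are there so that CUDA’s asynchronous kernel launches don’t get attributed to the wrong statement):

import time

def time_masked_assignment(total_vis, mask, visibility, iters=100):
    total_vis[mask] = visibility      # warm-up run so one-time costs aren't measured
    torch.cuda.synchronize()          # wait for all previously queued GPU work
    start = time.perf_counter()
    for _ in range(iters):
        total_vis[mask] = visibility
    torch.cuda.synchronize()          # wait for the timed kernels to finish
    return (time.perf_counter() - start) / iters

print("Scenario 1:", time_masked_assignment(total_vis_cpu_to_gpu, mask, visibility))
print("Scenario 2:", time_masked_assignment(total_vis_gpu, mask, visibility))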

In both scenarios, the masked assignment total_vis[mask] = visibility is the operation of interest, yet its performance differs drastically:

  • In Scenario 1, where total_vis is initialized on the CPU and then moved to the GPU, the assignment executes relatively quickly.
  • In Scenario 2, where total_vis is initialized directly on the GPU, the same assignment takes significantly longer.

This discrepancy is puzzling because the masked assignment is identical in both scenarios, yet its cost is very different. My initial guess was that the slowdown in Scenario 2 has to do with how memory is allocated or managed when a tensor is created directly on the GPU, but I’m not entirely sure.
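
In case it matters, one reformulation I’ve been considering (a sketch only; I haven’t verified that it changes the timing) is masked_scatter_, which expresses the same update without advanced indexing, assuming mask is boolean and visibility supplies one value per True entry in row-major order:

# Equivalent to total_vis_gpu[mask] = visibility for a boolean mask
total_vis_gpu.masked_scatter_(mask, visibility)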

Has anyone experienced a similar issue or can provide insight into why this might be happening? Any suggestions on how to mitigate this performance difference would be greatly appreciated.

Thank you in advance for your time and help.