torch.nn.functional.grid_sample hangs in a multi-GPU environment

This is strange: I call grid_sample inside my model's forward function. With a single GPU there is no problem, but with two GPUs training gets stuck right before the grid_sample call. I have no idea what causes this. Does anyone know why? Thanks! (PyTorch 0.4)
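For context, here is a minimal sketch of the setup being described: a toy module (hypothetical, not the poster's actual model) that calls `F.grid_sample` inside `forward()`. On one GPU or CPU this runs fine; the reported hang appears when the same module is wrapped in `nn.DataParallel` across two GPUs (assuming that is how the two GPUs are being used, which the post does not state explicitly).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GridSampleNet(nn.Module):
    """Toy module that calls grid_sample inside forward()."""
    def forward(self, x):
        n, c, h, w = x.shape
        # Identity affine transform, batched to match the input.
        theta = torch.tensor([[1.0, 0.0, 0.0],
                              [0.0, 1.0, 0.0]], device=x.device)
        theta = theta.unsqueeze(0).expand(n, -1, -1)
        # Build a sampling grid in normalized [-1, 1] coordinates.
        # (align_corners was added after 0.4; drop it on that version.)
        grid = F.affine_grid(theta, list(x.shape), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)

model = GridSampleNet()
# Single device works; the hang was reported with two GPUs, e.g.:
# model = nn.DataParallel(model).cuda()
x = torch.randn(4, 3, 8, 8)
out = model(x)
print(out.shape)  # torch.Size([4, 3, 8, 8])
```

With the identity grid the output has the same shape as the input, so any difference in behavior between one and two GPUs comes from the parallel wrapper rather than the sampling math itself.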