I am broadcasting variables to different GPUs using the `Broadcast` function from https://github.com/pytorch/pytorch/blob/master/torch/nn/parallel/_functions.py#L6. Will the gradients be automatically aggregated during the backward pass? I am getting errors when calling backward, and I am not sure whether the broadcasting is being handled incorrectly. Thanks in advance!
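For what it's worth, `Broadcast`'s backward reduces (sums) the gradients from each replica back onto the source device, which is the same accumulation ordinary autograd performs whenever one tensor feeds several branches. A minimal CPU sketch of that accumulation behavior (it does not call `Broadcast` itself, since that requires multiple CUDA devices):

```python
import torch

# When one tensor is used in several branches of the graph, autograd
# sums the gradient contributions from each branch during backward --
# the same reduction Broadcast's backward applies across GPU replicas.
w = torch.ones(3, requires_grad=True)
y1 = (w * 2).sum()   # branch 1: d(y1)/dw = 2
y2 = (w * 3).sum()   # branch 2: d(y2)/dw = 3
(y1 + y2).backward()
print(w.grad)        # gradients summed: tensor([5., 5., 5.])
```

So if the backward itself errors out, the cause is more likely a device mismatch somewhere in the graph than missing gradient aggregation.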
It works
I made sure all inputs to the forward function are CUDA tensors. However, I still get this error …