Lua Torch has an nn.GPU() module for multi-GPU processing; is torch.nn.DataParallel an abstraction of that?

The original neural_style code is here:

The strategy for distributing computation across multiple GPUs is shown here:

  -- neural_style.lua
  local DEFAULT_STRATEGIES = {
    [2] = {3},
  }
  local gpu_splits = nil
  if params.multigpu_strategy == '' then
    -- Use a default strategy
    gpu_splits = DEFAULT_STRATEGIES[#params.gpu]
    -- Offset the default strategy by one if we are using TV
    if params.tv_weight > 0 then
      for i = 1, #gpu_splits do gpu_splits[i] = gpu_splits[i] + 1 end
    end
  else
    -- Use the user-specified multigpu strategy
    gpu_splits = params.multigpu_strategy:split(',')
    for i = 1, #gpu_splits do
      gpu_splits[i] = tonumber(gpu_splits[i])
    end
  end
  assert(gpu_splits ~= nil, 'Must specify -multigpu_strategy')
  local gpus = params.gpu

  local cur_chunk = nn.Sequential()
  local chunks = {}
  for i = 1, #net do
    cur_chunk:add(net:get(i))
    if i == gpu_splits[1] then
      table.remove(gpu_splits, 1)
      table.insert(chunks, cur_chunk)
      cur_chunk = nn.Sequential()
    end
  end
  table.insert(chunks, cur_chunk)
  assert(#chunks == #gpus)

  local new_net = nn.Sequential()
  for i = 1, #chunks do
    local out_device = nil
    if i == #chunks then
      out_device = gpus[1]
    end
    new_net:add(nn.GPU(chunks[i], gpus[i], out_device))
  end

  return new_net

I don’t believe PyTorch has an nn.GPU analog. Should I be looking at torch.nn.DataParallel to achieve the same result? Is cuDNN a supported backend for torch.nn.DataParallel? Has anybody seen multi-GPU neural style implemented in PyTorch? Any tips would be greatly appreciated; I’m new to PyTorch.

Thanks!

Hi,

nn.GPU is just a convenient way to run a module on a given GPU; it does not do any multi-GPU parallelism by itself.
Roughly, it does the following (the real implementation is more involved so it can handle non-Tensor inputs):

def forward(self, input):
    # copy the input onto this module's GPU and run the wrapped module there
    gpu_input = input.cuda(self.gpuid)
    with torch.cuda.device(self.gpuid):
        out = self.mod(gpu_input)
    # copy the result onto the requested output device
    output = out.cuda(self.out_device)
    return output
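
For concreteness, here is a self-contained sketch of wrapping and chaining chunks this way in PyTorch. It is my own illustration, not code from the thread: the GPUWrapper class name, the toy two-layer chunks, and the assumption of two visible GPUs are all invented for the example.

import torch
import torch.nn as nn

class GPUWrapper(nn.Module):
    """Run a submodule on a fixed GPU and copy its output to out_device,
    mimicking what Lua's nn.GPU does for Tensor inputs."""
    def __init__(self, mod, gpuid, out_device=None):
        super().__init__()
        self.gpuid = gpuid
        self.out_device = gpuid if out_device is None else out_device
        self.mod = mod.cuda(gpuid)

    def forward(self, input):
        gpu_input = input.cuda(self.gpuid)
        with torch.cuda.device(self.gpuid):
            out = self.mod(gpu_input)
        return out.cuda(self.out_device)

# Chain two chunks across GPUs 0 and 1 and bring the result back to GPU 0,
# mirroring the chunk loop in the Lua code above (assumes two GPUs).
chunk0 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
chunk1 = nn.Sequential(nn.Conv2d(16, 16, 3, padding=1), nn.ReLU())
net = nn.Sequential(GPUWrapper(chunk0, 0, out_device=1),
                    GPUWrapper(chunk1, 1, out_device=0))
out = net(torch.randn(1, 3, 64, 64).cuda(0))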

I think what the Lua code is doing is essentially building a DataParallel by hand. So yes, you should be using that.
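
If it helps, a minimal DataParallel sketch might look like the following; the VGG-19 feature extractor, device ids, and input sizes are just illustrative assumptions, not something from the thread. As for the cuDNN question: cuDNN is used automatically for convolutions on the GPU, with or without DataParallel.

import torch
import torch.nn as nn
from torchvision import models

# Illustrative only: replicate a VGG-19 feature extractor on GPUs 0 and 1.
# DataParallel copies the module to every listed device, splits each input
# batch across them, and gathers the outputs back on device_ids[0].
vgg = models.vgg19(pretrained=True).features.eval().cuda(0)
net = nn.DataParallel(vgg, device_ids=[0, 1])

batch = torch.randn(2, 3, 256, 256).cuda(0)  # batch of 2, one image per GPU
with torch.no_grad():
    features = net(batch)
print(features.shape)  # torch.Size([2, 512, 8, 8])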

Okay, awesome, I’m glad to hear it’s possible. I’ll familiarize myself further with the items you mentioned and report back on how it goes. Thanks so much!