DataParallel model with custom functions

The DataParallel tutorial states that if we want to invoke custom functions we made in our model, we have to wrap our model in a subclass of DataParallel, where the subclass is supposed to look something like this:

class MyDataParallel(nn.DataParallel):
    def __getattr__(self, name):
        return getattr(self.module, name)

However, when I do this, I get the following error:

 File "/.autofs/tools/spack/var/spack/environments/ganvoice/.spack-env/view/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/.autofs/tools/spack/var/spack/environments/ganvoice/.spack-env/view/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 142, in forward
    for t in chain(self.module.parameters(), self.module.buffers()):
  File "acai.py", line 27, in __getattr__
    return getattr(self.module, name)
  File "acai.py", line 27, in __getattr__
    return getattr(self.module, name)
  File "acai.py", line 27, in __getattr__
    return getattr(self.module, name)
  [Previous line repeated 327 more times]

DataParallel inherits from nn.Module; you cannot override __setattr__ and __getattr__, as they are used to make the whole machinery work.

In general, nn.Module is a complex class. Don’t try to modify it.

I’m training autoencoder models, so there will be two networks within a single class, and the only way to avoid having to subclass DataParallel would be to include the loss functions within the model, which I don’t like at all, since I may change the loss functions later and they are independent of the model.
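Roughly, the structure is something like this (a heavily simplified sketch, not my real code):

import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(128, 16)    # stand-ins for the two real networks
        self.decoder = nn.Linear(16, 128)

    def encode(self, x):                     # custom forward-pass functions,
        return self.encoder(x)               # deliberately not named "forward"

    def decode(self, z):
        return self.decoder(z)

recon_loss = nn.MSELoss()                    # loss kept outside the model so it can be swapped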

What are your suggestions in such a case?

Can you explain what you are trying to do?
If you want to access submodules, you can just call instance.module.whatever, so you can still access the objects.
This:

return getattr(self.module, name)

feels like a bypass so you don’t have to call instance.module.whatever and can call instance.whatever directly. I understand it’s more comfortable, but it’s problematic.
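For instance, something along these lines (with a toy stand-in for your model):

import torch
import torch.nn as nn

net = nn.Linear(8, 8)                      # placeholder for your real model
model = nn.DataParallel(net).cuda()

out = model(torch.randn(16, 8).cuda())     # goes through __call__/forward, batch is split
w = model.module.weight                    # direct access to the wrapped model's attributes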

I’m trying to do what is suggested by the PyTorch tutorial. I have multiple forward-pass functions that are not named “forward”, and I need to invoke all of them to train my model. This is not possible once I wrap the model in DataParallel. The PyTorch suggestion didn’t work for me, and I was wondering if there are any alternatives that would allow multi-GPU usage while keeping the code single-GPU compatible.

You can properly call them by using self during the forward pass.
What you cannot do is call them externally, since that way you are not splitting the inputs among the available GPUs.

You also aren’t supposed to call model.forward(inputs) but rather model(inputs), as internally it executes model.__call__(), and forward is run inside __call__.

So, in short, your scheme should be:

import torch.nn as nn

class Model(nn.Module):
    def __init__(self, *args, **kwargs):
        super().__init__()
        # build your submodules here

    def function1(self, x):
        ...

    def function2(self, x):
        ...

    def forward(self, inputs):
        x = self.function1(inputs)
        x = self.function2(x)
        return x


model = nn.DataParallel(Model(args, kwargs)).cuda()
output = model(input)

That way, whatever you code will work. But if you do

model = nn.DataParallel(Model(args, kwargs)).cuda()
a = model.module.function1(input)
b = model.module.function2(input)

or even

out = model.forward(input)

it won’t work as intended, because you are skipping the code that runs inside model.__call__().

I already know all the information you are giving me, but you are still not addressing the PyTorch workaround for this situation. The workaround is not working for me (I shared the code for it in the original post). In case you don’t know which PyTorch tutorial I was referring to, have a look at this.

Sorry, it’s not that I’m not addressing it; I just don’t recommend it.
The main issue here is that module is an nn.Module, so it’s stored in the private _modules dict. The way nn.Modules are designed to be gathered is through __getattr__. Since you are overriding __getattr__, it gets into infinite recursion.
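You can see the mechanism in isolation with a plain nn.Module wrapper (a toy example, no DataParallel or GPU needed):

import torch.nn as nn

class Wrapper(nn.Module):
    def __init__(self, module):
        super().__init__()
        self.module = module              # nn.Module stores this in self._modules

    def __getattr__(self, name):
        # __getattr__ runs whenever normal lookup fails; 'module' is not in
        # self.__dict__, so reading self.module here calls __getattr__('module')
        # again, and again, until RecursionError
        return getattr(self.module, name)

w = Wrapper(nn.Linear(4, 4))
w.weight   # RecursionError: maximum recursion depth exceeded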

I strongly suspect that workaround will lead to further bugs, but you can fix it by doing the following. Basically, I am invoking the parent __getattr__ method (the mechanism originally designed to gather nn.Modules) whenever you try to get the ‘module’ object; otherwise, the lookup goes directly to the wrapped module’s attributes.

class MyDataParallel(nn.DataParallel):
    def __getattr__(self, name):
        if name == 'module':
            return super().__getattr__('module')
        else:
            return getattr(self.module, name)

You can also do:

class MyDataParallel(nn.DataParallel):
    def __getattr__(self, name):
        if name == 'module':
            return self._modules['module']
        else:
            return getattr(self.module, name)

But anyway, if you want safer code, I would do the following:

class MyDataParallel(nn.DataParallel):
    def __init__(self, my_methods, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._mymethods = my_methods

    def __getattr__(self, name):
        if name in self._mymethods:
            return getattr(self.module, name)
        else:
            return super().__getattr__(name)

You just need to pass a list of the methods you defined. This way you ensure that, if it’s one of your methods, it comes from the proper object. The same error you found for an nn.Module will be raised if you store tensors in the DataParallel object, as those are also hidden in _buffers and _parameters and properly gathered through __getattr__.
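Usage would then look roughly like this, with MyDataParallel as defined above (the Autoencoder and its encode/decode methods are placeholders for your own model):

import torch
import torch.nn as nn

class Autoencoder(nn.Module):                 # hypothetical stand-in for your model
    def __init__(self):
        super().__init__()
        self.enc = nn.Linear(8, 2)
        self.dec = nn.Linear(2, 8)

    def encode(self, x):
        return self.enc(x)

    def decode(self, z):
        return self.dec(z)

    def forward(self, x):
        return self.decode(self.encode(x))

model = MyDataParallel(['encode', 'decode'], Autoencoder()).cuda()
x = torch.randn(16, 8).cuda()

out = model(x)          # normal DataParallel path: the batch is split across GPUs
z = model.encode(x)     # resolved on the wrapped module via __getattr__ (no batch splitting)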

Btw @albanD could you have a look at it to fix the tutorial?


I’m having trouble using nn.DataParallel with a custom model. I followed the guidance in this tutorial, but when I run the model, it is not actually using both GPUs, and my first GPU fills up. I’m not sure how to incorporate the guidance from this forum into my code. Here is my model and what I’m using to make it parallel, taken from the tutorial:

import torch
import torch.nn as nn
import torch.nn.functional as F

# specify NN
class SegNet(nn.Module):
    def __init__(self, params):
        super(SegNet, self).__init__()

        C_in, H_in, W_in = params["input_shape"]
        init_f = params["initial_filters"]
        num_outputs = params["num_outputs"]

        self.conv1 = nn.Conv2d(C_in, init_f, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(init_f, 2 * init_f, kernel_size=3, stride=1, padding=1)
        self.conv3 = nn.Conv2d(2 * init_f, 4 * init_f, kernel_size=3, padding=1)
        self.conv4 = nn.Conv2d(4 * init_f, 8 * init_f, kernel_size=3, padding=1)
        self.conv5 = nn.Conv2d(8 * init_f, 16 * init_f, kernel_size=3, padding=1)

        self.upsample = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)

        self.conv_up1 = nn.Conv2d(16 * init_f, 8 * init_f, kernel_size=3, padding=1)
        self.conv_up2 = nn.Conv2d(8 * init_f, 4 * init_f, kernel_size=3, padding=1)
        self.conv_up3 = nn.Conv2d(4 * init_f, 2 * init_f, kernel_size=3, padding=1)
        self.conv_up4 = nn.Conv2d(2 * init_f, init_f, kernel_size=3, padding=1)

        self.conv_out = nn.Conv2d(init_f, num_outputs, kernel_size=3, padding=1)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)

        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)

        x = F.relu(self.conv3(x))
        x = F.max_pool2d(x, 2, 2)

        x = F.relu(self.conv4(x))
        x = F.max_pool2d(x, 2, 2)

        x = F.relu(self.conv5(x))

        x = self.upsample(x)
        x = F.relu(self.conv_up1(x))

        x = self.upsample(x)
        x = F.relu(self.conv_up2(x))

        x = self.upsample(x)
        x = F.relu(self.conv_up3(x))

        x = self.upsample(x)
        x = F.relu(self.conv_up4(x))

        output = self.conv_out(x)

        return output


# specify model parameters
params_model = {
    "input_shape": (3, h, w),
    "initial_filters": 16,
    "num_outputs": 1,
}

model = SegNet(params_model)

# tell pytorch to use both gpus
if torch.cuda.device_count() > 1:
  print("Let's use", torch.cuda.device_count(), "GPUs!")
  model = nn.DataParallel(model)

model.to(device)

But then when I try training with my dataset, I get this error because it is only using one GPU. When I use a similar strategy with one of the torchvision models, it runs perfectly.

RuntimeError: CUDA out of memory. Tried to allocate 28.00 MiB (GPU 0; 11.00 GiB total capacity; 9.90 GiB already allocated; 23.01 MiB free; 9.98 GiB reserved in total by PyTorch)

With this solution too, it’s not possible to call custom (non-“forward”) methods while ensuring data parallelism: the whole batch ends up being passed to the custom function call. One solution is to call the custom functions from inside forward with a simple if/else that figures out which function needs to be called; you can pass the function name to forward and compare it against a set of hardcoded names, as in the sketch below. One other thing: each of the tensors passed to forward should have the batch size in the 0th dimension (this is sometimes not the case when using RNNs; this FAQ (https://pytorch.org/docs/stable/notes/faq.html#pack-rnn-unpack-with-data-parallelism) also talks about it).
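A rough sketch of that dispatch pattern (the model and the mode names are placeholders, not code from this thread):

import torch
import torch.nn as nn

class Autoencoder(nn.Module):                    # placeholder model
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(8, 2)
        self.decoder = nn.Linear(2, 8)

    def encode(self, x):
        return self.encoder(x)

    def decode(self, z):
        return self.decoder(z)

    def forward(self, x, mode='encode'):
        # dispatch on a hardcoded set of names so every call goes through
        # __call__/forward and DataParallel can split the batch
        if mode == 'encode':
            return self.encode(x)
        elif mode == 'decode':
            return self.decode(x)
        else:
            raise ValueError(f"unknown mode: {mode}")

model = nn.DataParallel(Autoencoder()).cuda()
x = torch.randn(16, 8).cuda()                    # batch dimension first
z = model(x, mode='encode')                      # the batch of x is split across GPUs
recon = model(z, mode='decode')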