Issues using Data Parallelism: DataParallel object has no attribute 'xxxxxxxx'

I have an NLP model and need data parallelism because of the large batch size, so I wrapped my model in nn.DataParallel. I also need the model's attributes elsewhere during training, validation, logging, etc. Here's how I'm doing it:

import torch
import torch.nn as nn

class Trainer():
    def __init__(self, model, config):
        self.model = model
        if torch.cuda.device_count() > 1:
            print("Let's use", torch.cuda.device_count(), "GPUs!")
            self.model = nn.DataParallel(model)
        # Fails when the model is wrapped: DataParallel does not forward
        # arbitrary attribute lookups to the inner module.
        self.txt_property = self.model.txt_property
        # many more properties

    def do_something(self):
        param = self.model.param

This fails with:

AttributeError: 'DataParallel' object has no attribute 'txt_property'

What is the best practice for encapsulating a model with nn.DataParallel?

Since you wrapped it inside DataParallel, those attributes are no longer directly available on self.model. DataParallel stores the original model in its .module attribute, so you can access them with self.model.module.txt_property.
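A common pattern is a small helper that returns the underlying model whether or not it was wrapped, so the rest of the Trainer doesn't care. This is a minimal sketch: the helper name `unwrap` and the stand-in classes are hypothetical (they let the example run without GPUs); with real PyTorch you would typically check `isinstance(model, nn.DataParallel)` instead of `hasattr`:

```python
def unwrap(model):
    """Return the underlying model if wrapped, else the model itself."""
    # nn.DataParallel keeps the original model under .module;
    # in real code prefer: isinstance(model, nn.DataParallel)
    return model.module if hasattr(model, "module") else model

# Hypothetical stand-ins so the sketch runs anywhere:
class FakeModel:
    txt_property = "tokens.txt"

class FakeDataParallel:
    def __init__(self, module):
        self.module = module  # mirrors nn.DataParallel's .module attribute

model = FakeModel()
wrapped = FakeDataParallel(model)

print(unwrap(model).txt_property)    # tokens.txt
print(unwrap(wrapped).txt_property)  # tokens.txt
```

With this, the Trainer can do `unwrap(self.model).txt_property` and never break when multi-GPU is unavailable.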

Be careful about mutating these values, though. From the DataParallel documentation:

In each forward, module is replicated on each device, so any updates to the running module in forward will be lost. For example, if module has a counter attribute that is incremented in each forward, it will always stay at the initial value because the update is done on the replicas, which are destroyed after forward.
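One way around this is to keep such mutable state on the Trainer and update it outside forward(), where no replication happens. A minimal sketch (the `Trainer` shape here is hypothetical, and a lambda stands in for the model so it runs without GPUs):

```python
class Trainer:
    def __init__(self, model):
        self.model = model
        self.forward_calls = 0  # counter lives on the Trainer, not the module

    def step(self, batch):
        # Under DataParallel, forward() runs on per-device replicas whose
        # attribute updates are thrown away after each call...
        out = self.model(batch)
        # ...so increment here, in the training loop, where it persists.
        self.forward_calls += 1
        return out

trainer = Trainer(lambda batch: batch * 2)  # stand-in model
trainer.step(3)
trainer.step(4)
print(trainer.forward_calls)  # 2
```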


Thank you for the reply. Understood about accessing the values. Thanks for the link!