Get class attributes for model wrapped in DataParallel

Rafael_Valle · October 26, 2017, 4:03pm

Hi all.

How can I access self.znoise in the class below, including when the model is wrapped in DataParallel?
I have tried looking inside of dir(model.module) but haven’t been able to find any container that might have that attribute!

import torch
import torch.nn as nn
from sacred import Ingredient
model_ingredient = Ingredient('model')
class Model(nn.Module):
    @model_ingredient.capture
    def __init__(self, size):
        super(Model, self).__init__()
        self.znoise = torch.FloatTensor(1, size, 1)
        if torch.cuda.is_available():
            self.znoise = self.znoise.cuda()
        self.encoder = nn.ModuleList([neuralnet])

SimonW · October 26, 2017, 4:09pm

data_paralleled_module.module.variable definitely works for me. Could you double check?

>>> class Model(nn.Module):
...     def __init__(self, variable):
...         super(Model, self).__init__()
...         self.variable = variable
...
>>> m = Model(Variable(torch.randn(3,4)))
>>> dpm = nn.DataParallel(m)
>>> dpm.module
Model (
)
>>> dpm.module.variable
Variable containing:
 0.8198  1.3578 -1.9757 -0.4722
 0.0324  0.5272 -0.3619 -2.0327
-0.2448 -0.3940 -1.0660 -0.4111
[torch.FloatTensor of size 3x4]

>>> dir(dpm.module)
['__call__', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattr__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_all_buffers', '_apply', '_backend', '_backward_hooks', '_buffers', '_forward_hooks', '_forward_pre_hooks', '_modules', '_parameters', 'add_module', 'apply', 'children', 'cpu', 'cuda', 'double', 'dump_patches', 'eval', 'float', 'forward', 'half', 'load_state_dict', 'modules', 'named_children', 'named_modules', 'named_parameters', 'parameters', 'register_backward_hook', 'register_buffer', 'register_forward_hook', 'register_forward_pre_hook', 'register_parameter', 'share_memory', 'state_dict', 'train', 'training', 'type', 'variable', 'zero_grad']
>>>

Rafael_Valle · October 26, 2017, 4:19pm

Interestingly, that doesn’t work for me. Please look at the updated question; I added more info to the code. In that same Model class I also have a self.encoder attribute that is a ModuleList. Neither model.module.encoder nor model.module.znoise works.
I get AttributeError: 'DataParallel' object has no attribute 'znoise'.
I’m running with 2 GPUs, if that matters!

SimonW · October 26, 2017, 4:31pm

Interesting, almost identical code worked for me:

>>> class Model(nn.Module):
...     def __init__(self, size):
...         super(Model, self).__init__()
...         self.znoise = torch.FloatTensor(1, size, 1)
...         if torch.cuda.is_available():
...             self.znoise = self.znoise.cuda()
...         self.encoder = nn.ModuleList([nn.Linear(3,4)])
...
>>> m = Model(2)
>>> m.encoder
ModuleList (
  (0): Linear (3 -> 4)
)
>>> nn.DataParallel(m).module.encoder
ModuleList (
  (0): Linear (3 -> 4)
)
>>>

What is @model_ingredient.capture? Is it doing some magic? Could you remove it and try again?

Rafael_Valle · October 26, 2017, 4:54pm

That ingredient.capture is from Sacred, a tool for managing experiments. I removed it from the network and still get the same error. dir(model.module) gives the output below. I’m storing the model on the GPU with model = torch.nn.DataParallel(model).cuda().

['__call__', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattr__', '__getattribute__', '__gt__', '__hash__', '__init__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_all_buffers', '_apply', '_backend', '_backward_hooks', '_buffers', '_forward_hooks', '_forward_pre_hooks', '_modules', '_parameters', 'add_module', 'apply', 'children', 'cpu', 'cuda', 'device_ids', 'dim', 'double', 'dump_patches', 'eval', 'float', 'forward', 'gather', 'half', 'load_state_dict', 'module', 'modules', 'named_children', 'named_modules', 'named_parameters', 'output_device', 'parallel_apply', 'parameters', 'register_backward_hook', 'register_buffer', 'register_forward_hook', 'register_forward_pre_hook', 'register_parameter', 'replicate', 'scatter', 'share_memory', 'state_dict', 'train', 'training', 'type', 'zero_grad']

SimonW · October 26, 2017, 5:44pm

Try moving cuda() inside the DataParallel call, i.e. torch.nn.DataParallel(model.cuda()).
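
A rough sketch of that ordering, in case it helps. The nn.Linear layer is only a stand-in for the real network, and the CUDA guard is there so the snippet also runs on a machine without a GPU:

```python
import torch
import torch.nn as nn

# Stand-in network; substitute your own nn.Module here.
model = nn.Linear(3, 4)

# Move the plain module to the GPU first (skipped when no GPU is present)...
if torch.cuda.is_available():
    model = model.cuda()

# ...and only then wrap it, exactly once.
dpm = nn.DataParallel(model)

# The original module is reachable through .module.
print(type(dpm.module).__name__)
```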

Rafael_Valle · October 26, 2017, 5:57pm

Same error. Let me know if there’s anything else I can provide!
I just confirmed that without doing DataParallel the command does work.

Rafael_Valle · October 27, 2017, 1:59am

This error was happening because I was calling DataParallel twice on the same model, so model.module was itself a DataParallel wrapper rather than my Model!
Thanks for your help!
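
For anyone who lands here with the same symptom, below is a minimal sketch of the failure mode and one possible way to guard against it. The Model class is a cut-down version of the one in my question (no Sacred, and torch.zeros instead of an uninitialized FloatTensor), and unwrap is a hypothetical helper of my own, not something PyTorch provides:

```python
import torch
import torch.nn as nn

class Model(nn.Module):
    def __init__(self, size):
        super().__init__()
        self.znoise = torch.zeros(1, size, 1)
        self.encoder = nn.ModuleList([nn.Linear(3, 4)])

def unwrap(model):
    """Hypothetical helper: peel off every DataParallel layer."""
    while isinstance(model, nn.DataParallel):
        model = model.module
    return model

once = nn.DataParallel(Model(2))
twice = nn.DataParallel(once)  # the accidental second wrap

# With a single wrap, .module is the original Model.
print(hasattr(once.module, "znoise"))

# With a double wrap, .module is the inner DataParallel, which is why
# dir(model.module) listed device_ids, gather, scatter and so on.
print(type(twice.module).__name__)
print(hasattr(twice.module, "znoise"))

# Unwrapping until no DataParallel remains reaches the attribute again.
print(unwrap(twice).znoise.shape)
```

A quick isinstance(model.module, nn.DataParallel) check would have pointed at the problem right away.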

SimonW · October 27, 2017, 2:35pm

Glad that you sorted it out!!