Maximum recursion bug?

I was trying to build a tree-like neural net where nested modules are stored in ModuleList/ModuleDict objects. However, I ran into a maximum recursion error in the top node, the ‘root’ of the tree. To make it simple, I created a minimal example that reproduces the error (PyTorch 1.2):


import torch.nn as nn

class TreeNode_Test(nn.Module):
    def __init__(self):
        super(TreeNode_Test, self).__init__()
        self.nodesInLevels = nn.ModuleList([self])  # registers the module as its own submodule

myModel = TreeNode_Test()
myModel  # calling this (or myModel.nodesInLevels) gives the max recursion error:


  File "C:\Users\mk23\AppData\Local\Continuum\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1042, in __repr__
    mod_str = repr(module)

  File "C:\Users\mk23\AppData\Local\Continuum\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1042, in __repr__
    mod_str = repr(module)

  File "C:\Users\mk23\AppData\Local\Continuum\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1036, in __repr__
    extra_repr = self.extra_repr()

RecursionError: maximum recursion depth exceeded
Any ideas?

Hi,

Your model references itself when you use self inside the ModuleList. So the print function, which tries to print all the modules contained in your model, will recurse forever.


That is an explanation, but not a solution :)

I don’t think this behaviour is correct. If I change the parent class to object instead of nn.Module, or the nn.ModuleList to a plain Python list, it works as expected — but then it won’t work with DataParallel: the model won’t be replicated properly across multiple GPUs and I end up with the dreaded tensors/parameters-on-different-GPUs error…

It isn’t just print (which I could avoid by simply not calling it); pretty much everything ends up in an infinite loop, e.g. module.apply(fn), which I can’t avoid using.

What would your use case be?
If you use self as a submodule inside an nn.Module, even the __call__ function will try to call itself recursively.
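To avoid the cycle, the usual structure is to register the children (not the node itself) as submodules. Here is a rough, untested sketch of that idea — TreeNode, dim, and child_nodes are made-up names, not anything from your code:

import torch.nn as nn

class TreeNode(nn.Module):
    def __init__(self, dim, children=None):
        super(TreeNode, self).__init__()
        self.transform = nn.Linear(dim, dim)
        # Register the children (not self), so module traversal terminates.
        self.child_nodes = nn.ModuleList(children if children is not None else [])

    def forward(self, x):
        out = self.transform(x)
        for child in self.child_nodes:
            out = out + child(x)
        return out

root = TreeNode(4, children=[TreeNode(4), TreeNode(4)])
print(root)                     # prints the whole tree, no recursion error
_ = root.apply(lambda m: None)  # apply() also terminates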

Hi, I have encountered the same issue when following this parallelism tutorial: Multi-GPU Examples — PyTorch Tutorials 1.7.1 documentation, in the “Attributes of the wrapped module” section.

Simple code snippet to reproduce:

>>> import torch
>>> class TorchDataParallel(torch.nn.DataParallel):
...     def __getattr__(self, name):
...         return getattr(self.module, name)
... 
>>> block = torch.nn.Module()
>>> parallel_block = TorchDataParallel(block)
>>> parallel_block.stream_names
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in __getattr__
  File "<stdin>", line 3, in __getattr__
  File "<stdin>", line 3, in __getattr__
  [Previous line repeated 995 more times]
RecursionError: maximum recursion depth exceeded

Wondering what would be the right way to access custom attributes?

Which PyTorch version are you using?
I’ve just tried it out on a ~1 week old source build and don’t get an error.

I get the same on ‘1.7.1+cu101’.
I can get it to work with:

import torch

class DataParallel(torch.nn.parallel.DataParallel):
    def __getattr__(self, name):
        # Fetch the wrapped module directly from _modules; this bypasses
        # nn.Module's attribute lookup, so this override is not re-entered.
        module = object.__getattribute__(self, "_modules")["module"]
        if name == "module":
            return module
        # Forward everything else to the wrapped module.
        return getattr(module, name)
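A quick usage sketch, assuming the DataParallel subclass above — MyBlock and its stream_names attribute are made up for illustration (reusing the attribute name from the traceback):

class MyBlock(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.stream_names = ["rgb", "flow"]  # plain (non-module) custom attribute

block = MyBlock()
parallel_block = DataParallel(block)   # the subclass defined above
print(parallel_block.stream_names)     # ['rgb', 'flow'], fetched from the wrapped module
print(parallel_block.module is block)  # True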