The chunks input is a list of nn.Sequential networks from a model I have divided up to run on multiple GPUs/CPUs. The device_list input is a list of devices, for example ['cuda:0', 'cuda:1', 'cuda:2', 'cuda:3'] or ['cpu', 'cuda:0', 'cuda:1']. Both lists always have the same number of entries.
Unfortunately, I don’t have multiple GPUs, and renting several GPUs from an online service can be rather expensive, so I want to make sure I have things right before I try to run the code. I created the following class to run a single model with a batch size of 1 across multiple devices. Each device is supposed to run its part of the model and then pass the output to the next device.
I put the chunks onto their devices:
for i, chunk in enumerate(chunks):
    chunk.to(device_list[i])
And then I pass them to my class:
class ModelParallelModel(nn.Sequential):
    def __init__(self, chunks, device_list):
        super(ModelParallelModel, self)
        self.chunks = chunks
        self.device_list = device_list
        print(str(len(self.chunks)))
        print(self.device_list)
    def forward(self, input):
        for i, chunk in enumerate(chunks):
            if i < len(chunks) -1:
                input = chunk(input.to(device_list[i]) ).to(device_list[i+1])
            else:
                input = chunk(input.to(device_list[i]))
        return input
This is where I create the model and try to run it:
net = ModelParallelModel(chunks, device_list)
net(test_image)
However, I get this error:
4
['cuda:0', 'cuda:1', 'cuda:2', 'cuda:3']
Traceback (most recent call last):
  File "test.py", line 465, in <module>
    main()
  File "test.py", line 164, in main
    net(test_image)
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 538, in __call__
    for hook in self._forward_pre_hooks.values():
  File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 591, in __getattr__
    type(self).__name__, name))
AttributeError: 'ModelParallelModel' object has no attribute '_forward_pre_hooks'
How do I fix this error? I can’t find anything anywhere about what it means or how to fix it.
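
One likely cause, offered as a sketch rather than a confirmed fix: `super(ModelParallelModel, self)` only builds a proxy object and never calls `__init__()`, so `nn.Module` never creates internal attributes such as `_forward_pre_hooks`. The version below calls the parent constructor and reads `self.device_list` inside `forward` instead of the bare names `chunks` and `device_list`. The toy layers, the `['cpu', 'cpu']` device list, and the input shape are assumptions chosen so that it runs without any GPU.

```python
import torch
import torch.nn as nn


class ModelParallelModel(nn.Sequential):
    def __init__(self, chunks, device_list):
        # Actually call the parent constructor: this registers the chunks as
        # submodules and creates the hook dictionaries that __call__ relies on.
        super().__init__(*chunks)
        self.device_list = device_list

    def forward(self, input):
        # Move the activation to each chunk's device just before running it.
        for chunk, device in zip(self.children(), self.device_list):
            input = chunk(input.to(device))
        return input


# Assumed stand-ins for the real chunks and devices (CPU only).
device_list = ['cpu', 'cpu']
chunks = [
    nn.Sequential(nn.Linear(4, 8), nn.ReLU()),
    nn.Sequential(nn.Linear(8, 2)),
]
for chunk, device in zip(chunks, device_list):
    chunk.to(device)

net = ModelParallelModel(chunks, device_list)
output = net(torch.randn(1, 4))  # batch size of 1, as in the question
print(output.shape)
```

Repeating `'cpu'` in the device list exercises the same control flow as a multi-GPU list, so the hand-off logic can be checked locally before paying for real hardware. Moving the input at the start of each iteration also makes the separate last-chunk branch unnecessary.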