Hi,

I have a problem when using multiple models in a loop inside an `nn.Module`. The example script below shows my problem:

```
from torch import nn
import torch


class test_module(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(*[nn.Linear(10, 10), nn.ReLU()])

    def forward(self, x):
        return self.model(x)


class test_model(nn.Module):
    def __init__(self, num=3):
        super().__init__()
        self.layers = list()
        for i in range(num):
            self.layers.append(test_module())

    def forward(self, x):
        for l in self.layers:
            out = l(x)
        return out


model = test_model()
print(f'Model architecture: {model}')
input = torch.rand(3, 10)
out = model(input)
print(f'Output shape: {out.shape}')
```

The output is:

```
Model architecture: test_model()
Output shape: torch.Size([3, 10])
```

I expected the printed architecture to list the three `test_module` instances. The same problem shows up when I try to move the model to CUDA:

```
model.cuda()
input = input.cuda()
out = model(input)
```

Then this error is raised:

```
RuntimeError: Tensor for 'out' is on CPU, Tensor for argument #1 'self' is on CPU, but expected them to be on GPU (while checking arguments for addmm)
```

Can someone show me where I went wrong and how to fix it? Thank you.
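
Edit: my current guess, in case it helps others — a plain Python list is invisible to `nn.Module`, so the submodules are never registered and don't appear in `print(model)`, in `model.parameters()`, or follow `model.cuda()`. Below is a sketch using `nn.ModuleList` instead (I also chain the output through the blocks here, assuming that was the intent of the loop):

```python
from torch import nn
import torch


class test_module(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(nn.Linear(10, 10), nn.ReLU())

    def forward(self, x):
        return self.model(x)


class test_model(nn.Module):
    def __init__(self, num=3):
        super().__init__()
        # nn.ModuleList registers each submodule, so they show up in
        # print(model) and model.parameters(), and move with .cuda()/.to()
        self.layers = nn.ModuleList(test_module() for _ in range(num))

    def forward(self, x):
        for l in self.layers:
            x = l(x)  # feed each block's output into the next one
        return x


model = test_model()
print(f'Model architecture: {model}')  # now lists all three test_module blocks
out = model(torch.rand(3, 10))
print(f'Output shape: {out.shape}')
```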