Hi,

I have a problem when using multiple models in a loop inside an `nn.Module`. The example script below shows my problem:

```
from torch import nn
import torch


class test_module(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(*[nn.Linear(10, 10), nn.ReLU()])

    def forward(self, x):
        return self.model(x)


class test_model(nn.Module):
    def __init__(self, num=3):
        super().__init__()
        self.layers = list()
        for i in range(num):
            self.layers.append(test_module())

    def forward(self, x):
        for l in self.layers:
            out = l(x)
        return out


model = test_model()
print(f'Model architecture: {model}')
input = torch.rand(3, 10)
out = model(input)
print(f'Output shape: {out.shape}')
```

The output is:

```
Model architecture: test_model()
Output shape: torch.Size([3, 10])
```

I expected the printed architecture to list the three `test_module` instances. The same problem shows up when I try to move the model to CUDA:

```
model.cuda()
input = input.cuda()
out = model(input)
```

Then this error is raised:

```
RuntimeError: Tensor for 'out' is on CPU, Tensor for argument #1 'self' is on CPU, but expected them to be on GPU (while checking arguments for addmm)
```

Can someone show me where I went wrong and how to fix it? Thank you.
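
Edit: my current guess, in case it helps others — a plain Python list is invisible to `nn.Module`, so the submodules are never registered and don't appear in `print(model)`, in `model.parameters()`, or follow `model.cuda()`. Below is a sketch using `nn.ModuleList` instead (I also chain the output through the blocks here, assuming that was the intent of the loop):

```python
from torch import nn
import torch


class test_module(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(nn.Linear(10, 10), nn.ReLU())

    def forward(self, x):
        return self.model(x)


class test_model(nn.Module):
    def __init__(self, num=3):
        super().__init__()
        # nn.ModuleList registers each submodule, so they show up in
        # print(model) and model.parameters(), and move with .cuda()/.to()
        self.layers = nn.ModuleList(test_module() for _ in range(num))

    def forward(self, x):
        for l in self.layers:
            x = l(x)  # feed each block's output into the next one
        return x


model = test_model()
print(f'Model architecture: {model}')  # now lists all three test_module blocks
out = model(torch.rand(3, 10))
print(f'Output shape: {out.shape}')
```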