I'm trying to implement "dynamic layering", so that I can pass parameters when creating a model to easily vary the layer/neuron counts and try many different settings to find what's optimal through brute force (because math is hard and takes brainpower). This works well on the CPU! But it doesn't work when I try to load the model onto CUDA, for some reason. I suspect the list holding the different layers isn't being moved to CUDA.
Here's example code showing the working and non-working versions:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Working(nn.Module):
    def __init__(self):
        super(Working, self).__init__()
        self.inp = nn.Linear(784, 30)
        self.l0 = nn.Linear(30, 30)
        self.l1 = nn.Linear(30, 30)
        self.l2 = nn.Linear(30, 30)
        self.l3 = nn.Linear(30, 30)
        self.l4 = nn.Linear(30, 10)

    def forward(self, x):
        x = self.inp(x)
        x = self.l0(x)
        x = self.l1(x)
        x = self.l2(x)
        x = self.l3(x)
        x = self.l4(x)
        return F.softmax(x, dim=-1)


class Not_Working(nn.Module):
    def __init__(self, am_l, am_n):
        super(Not_Working, self).__init__()
        self.inp = nn.Linear(784, am_n)
        self.h_layers = []  # plain Python list holding the hidden layers
        for i in range(am_l):
            o_lay = 10 if i == am_l - 1 else am_n
            self.h_layers.append(nn.Linear(am_n, o_lay))

    def forward(self, x):
        x = self.inp(x)
        for layer in self.h_layers:
            x = layer(x)
        return F.softmax(x, dim=-1)


use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

# This works
w = Working().to(device)
x2 = torch.randn(784).to(device)
y2 = w(x2)

# This doesn't work
n_w = Not_Working(5, 30).to(device)
x = torch.randn(784).to(device)
y = n_w(x)
# RuntimeError: Expected object of backend CUDA but got backend CPU
# for argument #2 'mat2'
```
Am I doing something wrong? By my own logic, the list of layers should be transferred to CUDA by `Not_Working(5, 30).to(device)`, but that doesn't seem to happen. Should I try to modify the `.to()` function to handle lists somehow?
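To check my suspicion that the plain list is the problem, here's a minimal sketch (the `PlainList` class is just a made-up example for testing, not from my real model). It looks like layers stored in a plain Python list never show up in the module's `parameters()` at all, which would explain why `.to(device)` skips them:

```python
import torch
import torch.nn as nn

class PlainList(nn.Module):
    def __init__(self):
        super(PlainList, self).__init__()
        # Stored in a plain Python list, not as a direct attribute,
        # so nn.Module never registers the Linear as a submodule.
        self.layers = [nn.Linear(4, 4)]

m = PlainList()
print(len(list(m.parameters())))  # 0 -- the Linear's weights are invisible
print(len(list(m.children())))    # 0 -- no registered submodules either
```

Since `.to(device)` works by walking the registered parameters and submodules, anything invisible to `parameters()` would stay on the CPU, right?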