Newbie question: how do I turn this forward pass into a model, so that the model parameters get registered and saved?

x = random_hot
x = x.unsqueeze(1)
x = torch.nn.Embedding(vocab_size,100)(x)
x = torch.nn.Flatten()(x)
print(x.shape)
x = x.unsqueeze(0)

for i in range(5):
    x = torch.nn.Conv1d(x.shape[1], 50, 100, stride=1)(x)
    x = torch.nn.ReLU()(x)
    x = torch.nn.Conv1d(x.shape[1], 50, 100, stride=2)(x)
    x = torch.nn.ReLU()(x)

x = torch.nn.Flatten()(x)

x = torch.nn.Linear(x.shape[1],512)(x)
x = torch.nn.ReLU()(x)
x = torch.nn.Linear(512,128)(x)
x = torch.nn.Linear(128,1)(x)

This sends my input through the network without errors. I put it in the forward() call of a new nn.Module object and left __init__ empty. When I tried using an optimizer, it gave an error saying model.parameters() is empty.

I thought I might need to put all my layers explicitly in the __init__ of the module, for instance:

self.conv1 = nn.Conv1d( …)
self.conv2 = nn.Conv1d( …)

self.conv10 = nn.Conv1d( …)

Is this the case, and if so, how would I encode the dynamic shape of x.shape[1] that my loop relies on? Is this even how I get model.parameters() registered so I can use an optimizer?
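Here is a minimal sketch (the module names are made up) of what is going on: layers constructed inside forward() are temporary locals and never get registered on the module, while layers assigned to self in __init__ are, which is why model.parameters() comes back empty.

import torch
import torch.nn as nn

class NotRegistered(nn.Module):
    def forward(self, x):
        # A fresh Linear is created on every call and thrown away;
        # it is never registered as a submodule of this model.
        return nn.Linear(10, 1)(x)

class Registered(nn.Module):
    def __init__(self):
        super().__init__()
        # Assigning the layer to self in __init__ registers its
        # weight and bias with the module.
        self.fc = nn.Linear(10, 1)

    def forward(self, x):
        return self.fc(x)

print(len(list(NotRegistered().parameters())))  # 0 -> the optimizer has nothing to update
print(len(list(Registered().parameters())))     # 2 (weight and bias)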

You need to create instances of all your layers in the constructor, then call those instances in the forward pass.

def __init__(self):
    super().__init__()
    self.conv1 = nn.Conv2d(1, 32, 5)

def forward(self, x):
    x = self.conv1(x)
    return x
As for the dynamic shape, you can use an input with the largest possible shape and pad it with zeros.
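A small sketch of that padding idea; the max_len value and pad_to_max helper are assumed examples, not something from the thread:

import torch
import torch.nn.functional as F

max_len = 512  # assumed maximum sequence length; pick one that covers your data

def pad_to_max(tokens: torch.Tensor) -> torch.Tensor:
    # Right-pad a 1-D tensor of token ids with zeros up to max_len,
    # so every sample fed to the convolutions has the same length.
    pad_amount = max_len - tokens.shape[0]
    return F.pad(tokens, (0, pad_amount), value=0)

padded = pad_to_max(torch.tensor([5, 8, 2]))
print(padded.shape)  # torch.Size([512])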

Ah… I rewrote it and can train my model now…

class CNNtext(nn.Module):
    def __init__(self, vocab_size):
        super(CNNtext, self).__init__()

        self.embed = torch.nn.Embedding(vocab_size, 100)
        self.flat1 = torch.nn.Flatten()

        self.conv1 = torch.nn.Conv1d(1, 50, 100, stride=1)
        self.relu1 = torch.nn.ReLU()

        self.conv2 = torch.nn.Conv1d(50, 50, 100, stride=2)
        self.relu2 = torch.nn.ReLU()

        self.conv3 = torch.nn.Conv1d(50, 50, 100, stride=1)
        self.relu3 = torch.nn.ReLU()

        self.conv4 = torch.nn.Conv1d(50, 50, 100, stride=2)
        self.relu4 = torch.nn.ReLU()

        self.flatten2 = torch.nn.Flatten()

        self.linear1 = torch.nn.Linear(383850, 512)
        self.relu5 = torch.nn.ReLU()
        self.output = torch.nn.Linear(512, 1)

    def forward(self, x):
        x = self.embed(x)
        x = self.flat1(x)
        x = x.unsqueeze(0)
        x = self.conv1(x)
        x = self.relu1(x)
        x = self.conv2(x)
        x = self.relu2(x)
        x = self.conv3(x)
        x = self.relu3(x)
        x = self.conv4(x)
        x = self.relu4(x)
        x = self.flatten2(x)
        x = self.linear1(x)
        x = self.relu5(x)
        x = self.output(x)

        return x

I had to build it up step by step to find out what size to use for self.linear1. Now I have just two questions:

  1. How can I build self.linear1() without having to run the network up to that point to see what size it expects?

  2. How can I put these conv blocks in a loop? Am I supposed to build a second model and call it within the main model? (But doing so, I would again need the answer to question 1.) Or can I add a method in my main model to build the layer blocks?

  1. Math :slight_smile: For convolutional layers, kernel=1, stride=1, padding=0 will keep the same width and height; kernel=3, stride=2, padding=1 will halve them (see the output-size sketch after this list).
  2. You can use loops, or create inner models and call them. Take a look at the source code of ResNet, for example.
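To make the math concrete, here is a small sketch of the standard Conv1d output-length formula from the PyTorch docs; the starting length below is just an assumed example, not a value from the thread:

def conv1d_out_len(l_in, kernel_size, stride=1, padding=0, dilation=1):
    # L_out = floor((L_in + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)
    return (l_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

# Example: chaining the four conv layers of the model above,
# starting from a hypothetical flattened length of 4000.
l = 4000
for kernel, stride in [(100, 1), (100, 2), (100, 1), (100, 2)]:
    l = conv1d_out_len(l, kernel, stride)
print(l)  # length of the final feature map per channel; times 50 channels gives linear1's input size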

TIL nn.LazyLinear does the trick without the math :slight_smile:
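Putting both answers together, here is a rough sketch (not the original model; the class name, block count, and dummy input are assumptions) of how the conv blocks could be built in a loop inside an nn.ModuleList, so they are still registered as parameters, and how nn.LazyLinear can infer its in_features on the first forward pass:

import torch
import torch.nn as nn

class CNNtextLoop(nn.Module):  # hypothetical rework for illustration
    def __init__(self, vocab_size, n_blocks=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, 100)

        blocks = []
        in_channels = 1
        for _ in range(n_blocks):
            # Each block: stride-1 conv, ReLU, stride-2 conv, ReLU.
            blocks.append(nn.Sequential(
                nn.Conv1d(in_channels, 50, 100, stride=1),
                nn.ReLU(),
                nn.Conv1d(50, 50, 100, stride=2),
                nn.ReLU(),
            ))
            in_channels = 50
        # ModuleList registers every layer built in the loop,
        # so model.parameters() picks them all up.
        self.blocks = nn.ModuleList(blocks)

        self.flatten = nn.Flatten()
        # LazyLinear infers in_features from the first batch it sees,
        # so the flattened size never has to be computed by hand.
        self.linear1 = nn.LazyLinear(512)
        self.relu = nn.ReLU()
        self.output = nn.Linear(512, 1)

    def forward(self, x):
        x = self.embed(x)              # (batch, seq_len, 100)
        x = x.flatten(1).unsqueeze(1)  # (batch, 1, seq_len * 100), as in the model above
        for block in self.blocks:
            x = block(x)
        x = self.flatten(x)
        x = self.relu(self.linear1(x))
        return self.output(x)

model = CNNtextLoop(vocab_size=5000)
out = model(torch.randint(0, 5000, (1, 40)))  # dummy batch of one 40-token sample
print(out.shape)  # torch.Size([1, 1])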