Changing Linear layers to RNN

Hi :),

I'm using AlexNet to recognize the Euler angles of persons in front of a camera. This is my model:
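import torch.nn as nn


class AlexNet(nn.Module):

    def __init__(self, num_classes=1000):
        super(AlexNet, self).__init__()
        # Convolutional feature extractor (ReLU layers as in torchvision's AlexNet).
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(64, 192, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(192, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        # Same layout as torchvision's AlexNet, so the Linear layers
        # sit at indices 1, 4 and 6 of the Sequential.
        self.classifier = nn.Sequential(
            nn.Dropout(),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        # Flatten the feature maps to (batch, 256 * 6 * 6).
        x = x.view(x.size(0), 256 * 6 * 6)
        x = self.classifier(x)
        return x


def alexnet(pretrained=True, **kwargs):
    model = AlexNet(**kwargs)
    if pretrained:
        # Replace the final 1000-way layer with a 3-output layer (the angles).
        model.classifier._modules['6'] = nn.Linear(4096, 3)
    return model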

I modified the last layer to have only 3 outputs (the angles). My goal now is to replace the 3 Linear layers (the 2 in the classifier and my changed one) with RNN layers, preferably still pretrained. (Of course only the ih weights can be pretrained; the hh weights need to be trained. And of course this only applies to the first 2 layers; my changed third layer needs to be trained completely.) To test this, I tried to change my modified Linear layer (4096, 3) to an RNN layer (4096, 3), just like this:

model.classifier._modules['6'] = nn.RNN(4096, 3)

But when I try to do so, I get this error:

File “/home/jan/anaconda3/envs/TensorboardX/lib/python3.6/site-packages/torch/nn/modules/”, line 178, in forward
self.check_forward_args(input, hx, batch_sizes)
File “/home/jan/anaconda3/envs/TensorboardX/lib/python3.6/site-packages/torch/nn/modules/”, line 126, in check_forward_args
expected_input_dim, input.dim()))
RuntimeError: input must have 3 dimensions, got 2

My input is a [32, 3, 244, 244] tensor.

What am I doing wrong?

I don't know exactly what you're trying to do, but nn.RNN needs its input to have 3 dimensions.

Inputs: input, h_0
input of shape (seq_len, batch, input_size): tensor containing the features of the input sequence. The input can also be a packed variable length sequence. See torch.nn.utils.rnn.pack_padded_sequence() or torch.nn.utils.rnn.pack_sequence() for details.
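The quoted shape requirement can be illustrated with a minimal sketch. The small sizes here (16 features, a batch of 4) are placeholders chosen for illustration, not the 4096-dimensional AlexNet features. A 2-D tensor of shape (batch, features), like the output of a Linear layer, gets an explicit sequence dimension before it is passed to nn.RNN. Note that newer PyTorch releases may interpret a 2-D tensor as a single unbatched sequence instead of raising the error above, which is usually not what is intended here either.

```python
import torch
import torch.nn as nn

# Default layout: input is (seq_len, batch, input_size).
rnn = nn.RNN(input_size=16, hidden_size=3)

# 2-D features, shaped like the output of a Linear layer: (batch, input_size).
features = torch.randn(4, 16)

# Add a sequence dimension of length 1 in front: (seq_len=1, batch=4, input_size=16).
seq = features.unsqueeze(0)

# nn.RNN returns a tuple: the per-step outputs and the final hidden state.
output, h_n = rnn(seq)
print(output.shape)  # torch.Size([1, 4, 3])
print(h_n.shape)     # torch.Size([1, 4, 3])
```

The fact that nn.RNN returns a tuple rather than a single tensor matters for anything that consumes its output directly.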

I use AlexNet to estimate the angle of a person's head in 3D in real time (e.g. a person in front of a camera). I want to improve this estimation by using an RNN. This should help because the frames are sequential. (I hope this is the right word. My English is not so good, as you probably noticed :smiley: )

First I implemented AlexNet normally. But now I'm trying to replace all 3 Linear layers in "classifier" (see above) with RNN layers.

Because I'm using a pretrained AlexNet, I need to change the layers after loading the model. I already did this to reduce the output to only the 3 angles. ( model.classifier._modules['6'] = nn.Linear(4096, 3) )

I have now changed all 3 layers to RNNs:
model.classifier._modules['1'] = nn.RNN(256 * 6 * 6, 4096, batch_first=True)
model.classifier._modules['4'] = nn.RNN(4096, 4096, batch_first=True)
model.classifier._modules['6'] = nn.RNN(4096, 3, batch_first=True)

(I know these layers are no longer pretrained. I still have to figure out how to copy the pretrained weights to the right spots.)
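One possible way to carry the pretrained weights over is sketched below, under the assumption that reusing a Linear layer's parameters as the input-to-hidden (ih) part of an RNN is a sensible initialization for this task, which the thread does not confirm. The layer sizes (32 in, 8 out) are small stand-ins for the real 4096-wide layers. It relies on nn.Linear(in, out).weight and nn.RNN's weight_ih_l0 both having shape (out, in).

```python
import torch
import torch.nn as nn

# Stand-in for one pretrained Linear layer (sizes are placeholders).
linear = nn.Linear(32, 8)

# Single-layer RNN with matching input and hidden sizes.
rnn = nn.RNN(32, 8, batch_first=True)

with torch.no_grad():
    # Input-to-hidden parameters take over the Linear layer's values.
    rnn.weight_ih_l0.copy_(linear.weight)  # both have shape (8, 32)
    rnn.bias_ih_l0.copy_(linear.bias)      # both have shape (8,)

# weight_hh_l0 and bias_hh_l0 keep their random initialization
# and still have to be learned during training.
```

nn.RNN uses tanh by default, while the Linear layers in AlexNet are followed by ReLU, so passing nonlinearity='relu' to nn.RNN may be closer to the pretrained behaviour.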

If I understand it right, I need batch_first because of my [32, 3, 244, 244] input tensor: 32 is the batch size, and 3, 244, 244 is the frame.
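As a rough illustration of what batch_first changes: it only swaps the expected layout to (batch, seq_len, input_size); the input still needs a sequence dimension. The sketch below uses a placeholder feature size of 16 and shows two ways the flattened features of 32 frames could be read. Which reading is right depends on whether the 32 frames are consecutive frames of one video, which is an assumption here.

```python
import torch
import torch.nn as nn

rnn = nn.RNN(input_size=16, hidden_size=3, batch_first=True)

# Flattened features of 32 frames: (32, input_size).
frames = torch.randn(32, 16)

# Reading A: 32 independent sequences, each of length 1.
# Shape (batch=32, seq_len=1, input_size=16); no information flows between frames.
out_a, h_a = rnn(frames.unsqueeze(1))

# Reading B: one sequence made of 32 consecutive frames.
# Shape (batch=1, seq_len=32, input_size=16); the hidden state links the frames.
out_b, h_b = rnn(frames.unsqueeze(0))

print(out_a.shape, h_a.shape)  # torch.Size([32, 1, 3]) torch.Size([1, 32, 3])
print(out_b.shape, h_b.shape)  # torch.Size([1, 32, 3]) torch.Size([1, 1, 3])
```

The hidden state keeps the layout (num_layers, batch, hidden_size) regardless of batch_first.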

I also modified out = model(input) to out, hidden = model(input, hidden). I set hidden = None for the first batch.

And now I get this error:
Traceback (most recent call last):
File “/home/jan/anaconda3/envs/TensorboardX/lib/python3.6/”, line 916, in _bootstrap_inner
File “/home/jan/anaconda3/envs/TensorboardX/lib/python3.6/”, line 864, in run
self._target(*self._args, **self._kwargs)
File “/home/jan/PycharmProjects/Bachelor-Kopfposenschaetzung/”, line 221, in train
out, hidden = model(input, hidden)
File “/home/jan/anaconda3/envs/TensorboardX/lib/python3.6/site-packages/torch/nn/modules/”, line 491, in __call__
result = self.forward(*input, **kwargs)
TypeError: forward() takes 2 positional arguments but 3 were given

Is my input in the wrong shape? @klory wrote that I need an input of shape (seq_len, batch, input_size), but I don't actually know what this means. Is this my input in the line out, hidden = model(input, hidden), or is a different input meant?

I hope it is understandable what my problem is. Thanks

Ah, I found one major problem. The classifier of AlexNet is an nn.Sequential model. If I understand that right, this means that it does not support RNNs.
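Two things are likely at play here. nn.Sequential passes each module's return value straight to the next module, while nn.RNN returns a tuple (output, h_n). And the TypeError above comes from calling the model with two arguments while forward(self, x) accepts only one. A hedged sketch of one way around both is a small custom head whose forward takes and returns the hidden state. The names RecurrentHead and TinyPoseNet, the tiny convolutional stack, and the choice of one RNN followed by a Linear layer (instead of three RNN layers) are all illustrative assumptions, not the original poster's code.

```python
import torch
import torch.nn as nn


class RecurrentHead(nn.Module):
    """Hypothetical replacement for the Sequential classifier."""

    def __init__(self, in_features, hidden_size, num_outputs=3):
        super().__init__()
        self.rnn = nn.RNN(in_features, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, num_outputs)

    def forward(self, x, hidden=None):
        # x: (batch, seq_len, in_features)
        y, hidden = self.rnn(x, hidden)   # unpack the tuple explicitly
        return self.out(y), hidden        # (batch, seq_len, num_outputs)


class TinyPoseNet(nn.Module):
    """Small stand-in for AlexNet's features plus a recurrent head."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(2),
        )
        self.head = RecurrentHead(8 * 2 * 2, hidden_size=16)

    def forward(self, x, hidden=None):
        # x: (seq_len, 3, H, W), consecutive frames of one video.
        f = self.features(x).flatten(1)    # (seq_len, 32)
        f = f.unsqueeze(0)                 # (batch=1, seq_len, 32)
        out, hidden = self.head(f, hidden)
        return out.squeeze(0), hidden      # (seq_len, 3), (1, 1, 16)


model = TinyPoseNet()
frames = torch.randn(5, 3, 32, 32)

# First chunk of frames: no previous hidden state.
out, hidden = model(frames, None)
# Next chunk: reuse the hidden state, detached from the old graph.
out2, hidden = model(frames, hidden.detach())
print(out.shape, hidden.shape)  # torch.Size([5, 3]) torch.Size([1, 1, 16])
```

Detaching the hidden state between chunks keeps backpropagation from reaching into earlier batches; whether that truncation is appropriate depends on the training setup.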

Could you show the code for how you modified AlexNet with the RNN?