# How to predict the matrix before the last layer?

I have trained a classification model whose layers are stacked in `nn.Sequential` blocks.
The ConvNet architecture looks like this:

```python
import torch.nn as nn
import torch.nn.functional as F


class ConvNet(nn.Module):
    def __init__(self, num_classes=8):
        super(ConvNet, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=7),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.AvgPool2d(kernel_size=2, stride=2)
        )
        self.layer2 = nn.Sequential(
            nn.Conv2d(64, 128, kernel_size=7, stride=2),
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.AvgPool2d(kernel_size=2, stride=2)
        )
        self.layer3 = nn.Sequential(
            nn.Conv2d(128, 256, kernel_size=3),
            nn.BatchNorm2d(256),
            nn.ReLU(),
            nn.AvgPool2d(kernel_size=2, stride=2)
        )
        self.layer4 = nn.Sequential(
            nn.Conv2d(256, 512, kernel_size=3),
            nn.BatchNorm2d(512),
            nn.ReLU(),
            nn.AvgPool2d(kernel_size=2, stride=2),
            nn.BatchNorm2d(512)
        )
        self.hidden = nn.Linear(2*2*512, 1024)
        self.drop = nn.Dropout(0.6)
        self.dense1 = nn.Sequential(
            nn.Linear(1024, 256),
            nn.ReLU(),
            nn.Dropout(0.25)
        )
        self.dense2 = nn.Sequential(
            nn.Linear(256, 64),
            nn.ReLU()
        )
        self.fc1 = nn.Linear(64, 32)
        self.fc = nn.Linear(32, num_classes)

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.layer4(out)
        # Flatten [N, 512, 2, 2] to [N, 2048] before the linear layers
        out = out.reshape(out.size(0), -1)
        out = F.relu(self.hidden(out))
        out = self.drop(out)
        out = self.dense1(out)
        out = self.dense2(out)
        out = F.relu(self.fc1(out))
        out = self.fc(out)
        return out
```

ConvNet(
(layer1): Sequential(
(0): Conv2d(1, 64, kernel_size=(7, 7), stride=(1, 1))
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU()
)
(layer2): Sequential(
(0): Conv2d(64, 128, kernel_size=(7, 7), stride=(2, 2))
(1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU()
)
(layer3): Sequential(
(0): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1))
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU()
)
(layer4): Sequential(
(0): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1))
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU()
(4): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(hidden): Linear(in_features=2048, out_features=1024, bias=True)
(drop): Dropout(p=0.6, inplace=False)
(dense1): Sequential(
(0): Linear(in_features=1024, out_features=256, bias=True)
(1): ReLU()
(2): Dropout(p=0.25, inplace=False)
)
(dense2): Sequential(
(0): Linear(in_features=256, out_features=64, bias=True)
(1): ReLU()
)
(fc1): Linear(in_features=64, out_features=32, bias=True)
(fc): Linear(in_features=32, out_features=8, bias=True)
)

#### Then I used the following method to delete the last layer in order to obtain the 32-element output of `fc1`:

```python
model = ConvNet(8).to(device)
# Keep every child module except the final classifier `fc`
removed = list(model.children())[:-1]
new_model = torch.nn.Sequential(*removed)
print(new_model)
```

Output:
Sequential(
(0): Sequential(
(0): Conv2d(1, 64, kernel_size=(7, 7), stride=(1, 1))
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU()
)
(1): Sequential(
(0): Conv2d(64, 128, kernel_size=(7, 7), stride=(2, 2))
(1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU()
)
(2): Sequential(
(0): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1))
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU()
)
(3): Sequential(
(0): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1))
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU()
(4): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(4): Linear(in_features=2048, out_features=1024, bias=True)
(5): Dropout(p=0.6, inplace=False)
(6): Sequential(
(0): Linear(in_features=1024, out_features=256, bias=True)
(1): ReLU()
(2): Dropout(p=0.25, inplace=False)
)
(7): Sequential(
(0): Linear(in_features=256, out_features=64, bias=True)
(1): ReLU()
)
(8): Linear(in_features=64, out_features=32, bias=True)
)

However, when I pass a new image through this model to get the 32-element output, I get an error:

#### RuntimeError: size mismatch, m1: [1024 x 2], m2: [2048 x 1024] at C:/w/1/s/tmp_conda_3.8_075429/conda/conda-bld/pytorch_1579852542185/work/aten/src\THC/generic/THCTensorMathBlas.cu:290

Is something wrong here?
How can I get the 32-element output just before the last layer classifies it?

I assume you've tried to create the new model by wrapping the child modules into an `nn.Sequential` container.
If that's the case, note that you will lose all functional API calls from the `forward` method in your original model, e.g. `out = out.reshape(out.size(0),-1)` as well as the `F.relu` calls.
Thus you should add them back as modules, e.g. via `nn.Flatten` and `nn.ReLU`.

Alternatively, you could manipulate the `forward` method or use forward hooks to get the desired activation.
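As a rough sketch of the forward-hook route: the snippet below uses a small stand-in model (`SmallNet` is hypothetical, not the ConvNet from the question) that ends in `fc1` followed by `F.relu` and `fc`, like the original. The hook is registered on the last layer and stores that layer's input, which is the 32-element activation after the ReLU. The 16x16 input size is likewise just an assumption for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SmallNet(nn.Module):
    """Hypothetical stand-in with the same tail as the ConvNet: fc1 -> relu -> fc."""

    def __init__(self, num_classes=8):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(1, 4, 3), nn.ReLU(), nn.AdaptiveAvgPool2d(2))
        self.fc1 = nn.Linear(4 * 2 * 2, 32)
        self.fc = nn.Linear(32, num_classes)

    def forward(self, x):
        out = self.conv(x)
        out = out.reshape(out.size(0), -1)  # functional flatten
        out = F.relu(self.fc1(out))         # functional ReLU
        return self.fc(out)


model = SmallNet().eval()
captured = {}


def save_fc_input(module, inputs, output):
    # `inputs` is a tuple of the positional arguments passed to the module
    captured["features"] = inputs[0].detach()


handle = model.fc.register_forward_hook(save_fc_input)
with torch.no_grad():
    logits = model(torch.randn(1, 1, 16, 16))
handle.remove()  # remove the hook once it is no longer needed

print(captured["features"].shape)  # torch.Size([1, 32])
```

Hooking the input of `fc` rather than the output of `fc1` means the functional `F.relu` in between is already applied, so nothing from `forward` is lost.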


Can I use this module to replace `out = out.reshape(out.size(0),-1)`?

```python
class Flatten(nn.Module):
    def forward(self, input):
        return input.view(input.size(0), -1)
```

Then the `forward` function looks like this:

```python
def forward(self, x):
    out = self.layer1(x)
    out = self.layer2(out)
    out = self.layer3(out)
    out = self.layer4(out)
    out = Flatten(out)
    out = nn.ReLU(self.hidden(out))
    out = self.drop(out)
    out = self.dense1(out)
    out = self.dense2(out)
    out = nn.ReLU(self.fc1(out))
    out = self.fc(out)
    return out
```

Is this right?

Generally yes, but you would have to create instances of these layers or use the functional API in your `forward` method:

```python
def forward(self, x):
    out = self.layer1(x)
    out = self.layer2(out)
    out = self.layer3(out)
    out = self.layer4(out)
    out = Flatten()(out)                # instantiate the module, then call it
    out = nn.ReLU()(self.hidden(out))
    out = self.drop(out)
    out = self.dense1(out)
    out = self.dense2(out)
    out = nn.ReLU()(self.fc1(out))
    out = self.fc(out)
    return out
```

I followed your `forward` function, but I still get the same error.
I put everything together in test.py like this:

```python
import numpy as np
import torch
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')


class Flatten(nn.Module):
    def forward(self, input):
        return input.view(input.size(0), -1)


class ConvNet(nn.Module):
    def __init__(self, num_classes=8):
        super(ConvNet, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=7),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.AvgPool2d(kernel_size=2, stride=2)
        )
        self.layer2 = nn.Sequential(
            nn.Conv2d(64, 128, kernel_size=7, stride=2),
            nn.BatchNorm2d(128),
            nn.ReLU(),
            nn.AvgPool2d(kernel_size=2, stride=2)
        )
        self.layer3 = nn.Sequential(
            nn.Conv2d(128, 256, kernel_size=3),
            nn.BatchNorm2d(256),
            nn.ReLU(),
            nn.AvgPool2d(kernel_size=2, stride=2)
        )
        self.layer4 = nn.Sequential(
            nn.Conv2d(256, 512, kernel_size=3),
            nn.BatchNorm2d(512),
            nn.ReLU(),
            nn.AvgPool2d(kernel_size=2, stride=2),
            nn.BatchNorm2d(512)
        )
        self.hidden = nn.Linear(2*2*512, 1024)
        self.drop = nn.Dropout(0.6)
        self.dense1 = nn.Sequential(
            nn.Linear(1024, 256),
            nn.ReLU(),
            nn.Dropout(0.25)
        )
        self.dense2 = nn.Sequential(
            nn.Linear(256, 64),
            nn.ReLU()
        )
        self.fc1 = nn.Linear(64, 32)
        self.fc = nn.Linear(32, num_classes)

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.layer4(out)
        out = Flatten()(out)
        out = nn.ReLU()(self.hidden(out))
        out = self.drop(out)
        out = self.dense1(out)
        out = self.dense2(out)
        out = nn.ReLU()(self.fc1(out))
        out = self.fc(out)
        return out


model = ConvNet(8).to(device)
removed = list(model.children())[:-1]
new_model = torch.nn.Sequential(*removed)
# test_image is a NumPy array loaded elsewhere in test.py (not shown here)
image_trans = torch.from_numpy(test_image.astype(np.float32)).to(device)
prediction = new_model(image_trans)
```

Error:

```
Traceback (most recent call last):
  File "test.py", line 121, in <module>
    prediction = new_model(image_trans)
  File "C:\ProgramData\Anaconda3\envs\deep\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\deep\lib\site-packages\torch\nn\modules\container.py", line 100, in forward
    input = module(input)
  File "C:\ProgramData\Anaconda3\envs\deep\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\deep\lib\site-packages\torch\nn\modules\linear.py", line 87, in forward
    return F.linear(input, self.weight, self.bias)
  File "C:\ProgramData\Anaconda3\envs\deep\lib\site-packages\torch\nn\functional.py", line 1372, in linear
    output = input.matmul(weight.t())
RuntimeError: size mismatch, m1: [1024 x 2], m2: [2048 x 1024] at C:/w/1/s/tmp_conda_3.8_075429/conda/conda-bld/pytorch_1579852542185/work/aten/src\THC/generic/THCTensorMathBlas.cu:290
```

Sorry for being not clear enough.

You won't be able to copy the child modules directly to an `nn.Sequential` container, as all functional calls as well as locally defined modules in the original `forward` function will be missing.
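To make the numbers in the error message concrete, here is a small sketch that assumes a single image reaches the `hidden` layer as an unflattened `[1, 512, 2, 2]` tensor. `nn.Linear` only treats the last dimension as features, so the tensor is seen as `1 * 512 * 2 = 1024` rows of 2 features each, which is where `m1: [1024 x 2]` comes from. Newer PyTorch versions word the error message differently, but the cause is the same.

```python
import torch
import torch.nn as nn

hidden = nn.Linear(2 * 2 * 512, 1024)   # same shape as `hidden` in the question
conv_out = torch.randn(1, 512, 2, 2)    # assumed layer4 output for one image

# Linear uses the last dim as features; all leading dims become rows
rows = conv_out.numel() // conv_out.size(-1)
print(rows, conv_out.size(-1))           # 1024 2

try:
    hidden(conv_out)                     # 2 features given, 2048 expected
    failed = False
except RuntimeError as err:
    failed = True
    print("RuntimeError:", err)

flat = nn.Flatten()(conv_out)            # [1, 2048]
out = hidden(flat)
print(flat.shape, out.shape)             # torch.Size([1, 2048]) torch.Size([1, 1024])
```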

Also, `removed` contains the modules in the order in which they were created in the `__init__`.
If that order differs from the order in which they are called in `forward`, the order of the modules in the sequential container will be wrong as well.

To add the `Flatten` and `ReLU` modules into a sequential container, you could use an approach similar to this one:

```python
new_model = torch.nn.Sequential(
    *(list(removed[:4]) + [nn.Flatten(), nn.ReLU()] + list(removed[5:7]) + [nn.ReLU()] + list(removed[-2:-1])))
```

Note that the order is currently wrong and the model won't work.
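For reference, a possible corrected ordering is sketched below. It follows the `forward` from the question step by step: `nn.Flatten()` goes before `hidden`, and an `nn.ReLU()` goes after `hidden` and after `fc1`. The class definition is condensed from the question, and the 1x128x128 input is an assumption (it is one size for which `layer4` yields 512x2x2); your real input size may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvNet(nn.Module):  # condensed copy of the model from the question
    def __init__(self, num_classes=8):
        super().__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=7), nn.BatchNorm2d(64),
            nn.ReLU(), nn.AvgPool2d(kernel_size=2, stride=2))
        self.layer2 = nn.Sequential(
            nn.Conv2d(64, 128, kernel_size=7, stride=2), nn.BatchNorm2d(128),
            nn.ReLU(), nn.AvgPool2d(kernel_size=2, stride=2))
        self.layer3 = nn.Sequential(
            nn.Conv2d(128, 256, kernel_size=3), nn.BatchNorm2d(256),
            nn.ReLU(), nn.AvgPool2d(kernel_size=2, stride=2))
        self.layer4 = nn.Sequential(
            nn.Conv2d(256, 512, kernel_size=3), nn.BatchNorm2d(512),
            nn.ReLU(), nn.AvgPool2d(kernel_size=2, stride=2),
            nn.BatchNorm2d(512))
        self.hidden = nn.Linear(2 * 2 * 512, 1024)
        self.drop = nn.Dropout(0.6)
        self.dense1 = nn.Sequential(nn.Linear(1024, 256), nn.ReLU(), nn.Dropout(0.25))
        self.dense2 = nn.Sequential(nn.Linear(256, 64), nn.ReLU())
        self.fc1 = nn.Linear(64, 32)
        self.fc = nn.Linear(32, num_classes)

    def forward(self, x):
        out = self.layer4(self.layer3(self.layer2(self.layer1(x))))
        out = out.reshape(out.size(0), -1)
        out = self.drop(F.relu(self.hidden(out)))
        out = self.dense2(self.dense1(out))
        out = F.relu(self.fc1(out))
        return self.fc(out)


model = ConvNet(8).eval()
# Creation order: layer1..layer4, hidden, drop, dense1, dense2, fc1, fc
children = list(model.children())

new_model = nn.Sequential(
    *children[:4],    # layer1 .. layer4
    nn.Flatten(),     # replaces out.reshape(out.size(0), -1)
    children[4],      # hidden
    nn.ReLU(),        # replaces F.relu after hidden
    *children[5:9],   # drop, dense1, dense2, fc1
    nn.ReLU(),        # replaces F.relu after fc1
).eval()

x = torch.randn(1, 1, 128, 128)  # assumed input size, see note above
with torch.no_grad():
    features = new_model(x)      # 32-element activation fed into `fc`
    logits = model(x)

print(features.shape)  # torch.Size([1, 32])
```

The modules in `new_model` are the same objects as in `model`, so the trained weights are shared rather than copied.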

`nn.Sequential` is meant for simple models. As you can see, it is easier to manipulate the forward pass by writing a custom module and defining the `forward` manually.
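One way to do that, sketched on a small stand-in model (`TinyNet` and its `features` method are hypothetical names, not part of the original code), is to split the forward pass so that everything up to the last layer lives in its own method. The same idea should carry over to the ConvNet by moving all lines before `self.fc(out)` into such a method.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyNet(nn.Module):
    """Hypothetical stand-in; only the structure of forward matters here."""

    def __init__(self, num_classes=8):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(1, 4, 3), nn.ReLU(), nn.AdaptiveAvgPool2d(2))
        self.fc1 = nn.Linear(4 * 2 * 2, 32)
        self.fc = nn.Linear(32, num_classes)

    def features(self, x):
        # Everything before the final classifier, functional calls included
        out = self.conv(x)
        out = out.reshape(out.size(0), -1)
        return F.relu(self.fc1(out))

    def forward(self, x):
        return self.fc(self.features(x))


model = TinyNet().eval()
x = torch.randn(2, 1, 16, 16)
with torch.no_grad():
    feats = model.features(x)   # activations before the last layer
    logits = model(x)           # usual classification output

print(feats.shape, logits.shape)  # torch.Size([2, 32]) torch.Size([2, 8])
```

Since `features` adds no parameters, a `state_dict` saved from the original model loads into the modified class unchanged.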


Thanks, I understand.
One more question: if the `forward` function looks like this:

```python
def forward(self, x):
    out = self.layer1(x)
    out = nn.Flatten()(out)
```

and the sequential container looks like this:

```python
nn.Sequential(layer[0] + [nn.Flatten()])
```

or like this:

```python
nn.Sequential([nn.Flatten()] + layer[0])
```

Which one is right?
I mean, I am not sure where to place the replacements for the functional API calls in `nn.Sequential`.

For a single layer, this approach would work.
However, note that `model.children()` returns the child modules in the order they were initialized in the `__init__`, which might not be the same order they are called in the `forward`.
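A tiny illustration of that pitfall, using a made-up module (`OutOfOrder` is hypothetical) whose layers are registered in the opposite order to the one in which `forward` calls them:

```python
import torch
import torch.nn as nn


class OutOfOrder(nn.Module):
    def __init__(self):
        super().__init__()
        self.second = nn.Linear(4, 2)  # registered first, called second
        self.first = nn.Linear(8, 4)   # registered second, called first

    def forward(self, x):
        return self.second(self.first(x))


model = OutOfOrder()
names = [name for name, _ in model.named_children()]
print(names)  # ['second', 'first'] -> registration order, not call order

x = torch.randn(3, 8)
print(model(x).shape)  # torch.Size([3, 2])

# Wrapping the children as-is applies Linear(4, 2) to 8 features and fails
naive = nn.Sequential(*model.children())
try:
    naive(x)
    failed = False
except RuntimeError:
    failed = True
print(failed)  # True
```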

I have a different question.
I trained the model with batch_size = 8.
However, I want to test the trained model on a single image.
How do I achieve this without having to retrain the model with batch_size = 1?

You don't have to retrain the model, but just set it to evaluation mode via `model.eval()`.
This will make sure to e.g. disable dropout and use the estimated stats in batchnorm layers instead of the batch stats.
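A minimal sketch of single-image inference, using a small stand-in model with batchnorm and dropout (not your trained ConvNet) and an assumed 16x16 grayscale image. The model still expects a batch dimension, so the image is given a batch size of 1 via `unsqueeze(0)`:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in containing the two layer types affected by eval()
model = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=3),
    nn.BatchNorm2d(4),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Dropout(0.6),
    nn.Linear(4, 8),
)

image = torch.randn(1, 16, 16)   # one image: [channels, height, width]
batch = image.unsqueeze(0)       # add the batch dimension: [1, 1, 16, 16]

model.eval()                     # dropout off, batchnorm uses running stats
with torch.no_grad():            # no gradients needed for inference
    first = model(batch)
    second = model(batch)

print(first.shape)                     # torch.Size([1, 8])
print(torch.allclose(first, second))   # True: no randomness from dropout
```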
