RuntimeError: matrices expected, got 4D, 2D tensors at /b/wheel/pytorch-src/torch/lib/TH/generic/THTensorMath.c:1232

Whereas if I do not convert the model to a sequential model, it works perfectly. Now, I have seen some existing forum posts where this kind of error is solved with .view; however, I do not understand what exactly is going on: why do we need to call .view, and what should its arguments be?
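(The error itself comes from the shape mismatch: a conv layer outputs a 4D tensor of shape (batch, channels, height, width), while a fully-connected layer expects a 2D (batch, features) matrix. A quick sketch of what .view does here, with made-up shapes:)

```python
import torch

# A conv layer's output is 4D: (batch, channels, height, width).
x = torch.randn(8, 16, 5, 5)

# A Linear layer expects 2D input of shape (batch, features),
# so flatten everything except the batch dimension.
# -1 tells view to infer the size: 16 * 5 * 5 = 400.
flat = x.view(x.size(0), -1)
print(flat.shape)  # torch.Size([8, 400])
```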

Hmmm, before thinking about conceptual things, are we sure that .children() will return the children in the same order that you added them as object attributes? (I'm not saying it doesn't, just posing the question; I'm not sure how such a guarantee would be implemented, in fact.)
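(For what it's worth, nn.Module keeps submodules in an internal ordered dict that is filled as attributes are assigned, so .children() should yield them in assignment order. A quick check with a toy module:)

```python
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super().__init__()
        # Submodules are registered in an internal ordered dict
        # (self._modules) as they are assigned, so .children()
        # yields them in attribute-assignment order.
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.fc1 = nn.Linear(400, 120)
        self.conv2 = nn.Conv2d(6, 16, 5)

names = [type(m).__name__ for m in Net().children()]
print(names)  # ['Conv2d', 'Linear', 'Conv2d']
```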

Why not just leave the model as-is and monkey-patch in a new forward method?

import types

import torch.nn.functional as F

def truncated_forward(self, x):
    out = F.relu(self.conv1(x))
    out = F.max_pool2d(out, 2)
    out = F.relu(self.conv2(out))
    out = F.max_pool2d(out, 2)
    out = out.view(out.size(0), -1)  # flatten conv output for the fc layers
    out = F.relu(self.fc1(out))
    out = F.relu(self.fc2(out))
    return out

# Bind the function to the instance so `self` is passed automatically;
# a bare `model.forward = truncated_forward` would leave it unbound.
model.forward = types.MethodType(truncated_forward, model)

I haven't tried this, but I see no obvious reason why it wouldn't work?

(For that matter, are you loading this from pickle? In that case, the pickle only stores the instance data plus a reference to the class; it doesn't store the actual method implementations. So if you modify the LeNet class to remove the fc3 bit, only from forward, I think all should work OK?)
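(That behaviour is easy to demonstrate with a hypothetical stub class: pickle the instance, change the class's method afterwards, and the unpickled object picks up the new implementation, because methods are looked up on the class at load time.)

```python
import pickle

class LeNetStub:  # hypothetical stand-in for the real model class
    def forward(self):
        return "original"

obj = LeNetStub()
data = pickle.dumps(obj)

# Pickle stored only the instance data plus a *reference* to the class.
# Changing the class after pickling changes what the loaded object does.
LeNetStub.forward = lambda self: "patched"

restored = pickle.loads(data)
print(restored.forward())  # "patched"
```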

It's heavily used/usable in Python. It won't really work in C++. In Python, pretty much anything can be modified/monkey-patched at runtime, which is kind of nice (though dangerous…).
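(One gotcha worth knowing, shown here with a toy class I made up: assigning a bare function to an *instance* attribute does not bind `self`, which is why the snippet above uses types.MethodType. Patching the *class* binds automatically.)

```python
import types

class Greeter:
    def greet(self):
        return "hello"

def shout(self):
    return "HELLO"

g = Greeter()

# A plain `g.greet = shout` would fail when called, because `self`
# is not passed; types.MethodType creates a properly bound method.
g.greet = types.MethodType(shout, g)
print(g.greet())  # HELLO

# Patching the class instead binds `self` automatically for all instances.
Greeter.greet = shout
print(Greeter().greet())  # HELLO
```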

Just wanted to add a tip on your model for efficiency. In the particular case of ReLU followed by max-pooling, you can reorder them: apply max_pool2d to the conv2d output first, then apply the ReLU activation. The order doesn't matter in this particular case because ReLU is monotonically non-decreasing, so the two orderings compute mathematically equal results. So in forward, put the operations in this order:

out = F.relu(F.max_pool2d(self.conv1(x), 2))

Done in this order, it requires substantially fewer computations (ReLU then runs on the smaller, pooled tensor), which speeds up training while the result stays exactly the same.
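(A quick check of that equivalence on random data, assuming torch is available: max of ReLU equals ReLU of max, since ReLU preserves ordering.)

```python
import torch
import torch.nn.functional as F

x = torch.randn(4, 3, 8, 8)

a = F.max_pool2d(F.relu(x), 2)  # relu first: relu runs on 8x8 maps
b = F.relu(F.max_pool2d(x, 2))  # pool first: relu runs on 4x4 maps

# ReLU is monotone, so the results are identical element-for-element.
print(torch.equal(a, b))  # True
```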