The shape mismatch error seems to be raised in your linear layer, and based on the error message the `in_features` value does not match the number of features of its input activation. Assuming you are using 2048 samples, each with a feature dimension of 2048, using `in_features=2048` in the linear layer should work (and you should compare it to the Keras implementation to make sure the actual model architecture is the same).
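As a quick sanity check, here is a minimal sketch of that shape rule, assuming an input of shape `[2048, 2048]` as described above. The output size `num_outputs = 10` and the mismatched value `1024` are made up for illustration and are not taken from your model.

```python
import torch
import torch.nn as nn

# Shapes assumed from the post: 2048 samples, 2048 features each
batch_size, num_features = 2048, 2048
num_outputs = 10  # hypothetical; the real output size is not given in the thread

x = torch.randn(batch_size, num_features)

# in_features must equal the size of the input's last dimension
layer = nn.Linear(in_features=num_features, out_features=num_outputs)
out = layer(x)
print(out.shape)  # torch.Size([2048, 10])

# A mismatched in_features (1024 here, chosen arbitrarily) triggers a shape mismatch
bad_layer = nn.Linear(in_features=1024, out_features=num_outputs)
try:
    bad_layer(x)
    mismatch_raised = False
except RuntimeError as err:
    mismatch_raised = True
    print(err)
```

Printing `x.shape` right before the failing layer in your own `forward` should tell you which value `in_features` needs to be.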