I don’t know if EfficientNet implementations use normalized/standardized inputs and if so, what the reason would be.
Did you check some reference implementations (maybe it’s mentioned in the code) or the paper?
Thank you.
Excuse me, I have another question.
Is it possible to add a convolution layer in transfer learning?
When using transfer learning, I added a torch.nn.Conv2d layer as the last layer, but I got this error:
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [512, 1280, 1, 1], but got 2-dimensional input of size [8, 1280] instead
How can I fix this error when I do not have access to self.forward?
Thank you so much
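The error is raised because the activation has already been flattened to [batch_size, 1280] before it reaches the new layer, while nn.Conv2d expects a 4-dimensional input in the shape [batch_size, channels, height, width].
Usually you would create a model object (e.g. via model = MyModel()) and would thus need access to the source code of the model.
However, if you cannot access it for some reason, you could add an nn.Unflatten layer in front of the new conv layer so that the inputs are 4-dimensional again.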
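A minimal sketch of the nn.Unflatten approach, assuming the backbone hands over flattened features in the shape [8, 1280] as in the error message. The name head and the trailing nn.Flatten are illustrative choices, not part of the original model:

```python
import torch
import torch.nn as nn

# Hypothetical replacement head; the backbone is assumed to emit [batch, 1280].
head = nn.Sequential(
    nn.Unflatten(1, (1280, 1, 1)),        # [8, 1280] -> [8, 1280, 1, 1]
    nn.Conv2d(1280, 512, kernel_size=1),  # weight shape [512, 1280, 1, 1]
    nn.Flatten(),                         # [8, 512, 1, 1] -> [8, 512]
)

features = torch.randn(8, 1280)  # stand-in for the flattened backbone output
out = head(features)
print(out.shape)
```

With a 1x1 kernel on a 1x1 spatial input, the conv layer computes the same kind of mapping as nn.Linear(1280, 512), so a linear layer might be the simpler choice here.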
Thank you so much
Hello sir, excuse me.
Does using nn.AdaptiveAvgPool2d((1,1)) make sense?
In this case, is the filter size equal to the input size?
Why is this used?
What happens in this case?
Adaptive pooling layers can be used to create a defined output shape, which allows your model to work with variable input shapes. E.g. torchvision models use adaptive pooling layers after the feature extractor and before feeding the activation to the first linear layer to allow different input shapes.
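As a rough illustration of that pattern, here is a toy model (not a torchvision model; the layer sizes are made up) where the adaptive pooling layer sits between a small feature extractor and the linear layer, so inputs of different resolutions all reach the classifier with the same shape:

```python
import torch
import torch.nn as nn

# Toy model with illustrative sizes: conv features -> adaptive pool -> linear.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d((1, 1)),  # always yields [batch, 16, 1, 1]
    nn.Flatten(),                  # -> [batch, 16]
    nn.Linear(16, 6),
)

shapes = []
for size in (32, 64, 100):
    x = torch.randn(2, 3, size, size)
    out = model(x)
    shapes.append(tuple(out.shape))
    print(size, out.shape)
```

Without the pooling layer, the number of features entering nn.Linear would depend on the input resolution and the model would only accept one input size.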
Thank you very much
What does output_size=(1,1) mean?
And is it possible to use the pooling layer between fully connected layers?
I used the classifier layer as below.
Linear(1280,512), Unflatten(), AdaptiveAvgPool2d((1,1)), Flatten(), Dropout(), Linear(512,256), Unflatten(), AdaptiveAvgPool2d((1,1)), Flatten(), Dropout(), Linear(256,6)
Does having a pooling layer here make any difference compared to not having it?
The output_size argument defines the spatial size of the output activation of this layer, as seen here:
import torch
import torch.nn as nn

pool = nn.AdaptiveAvgPool2d(output_size=(1, 1))
x = torch.randn(2, 3, 24, 24)
out = pool(x)
print(out.shape)
> torch.Size([2, 3, 1, 1])
x = torch.randn(2, 6, 2, 2)
out = pool(x)
print(out.shape)
> torch.Size([2, 6, 1, 1])
You can thus pass tensors with different input shapes to this layer and will get the defined spatial output size.
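Regarding the question about the filter size: with output_size=(1,1) the pooling window effectively covers the whole spatial input, so the layer acts as a global average pool. A small sketch to check this, comparing against a plain mean over the spatial dimensions and against nn.AvgPool2d with a kernel as large as the (assumed 24x24) input:

```python
import torch
import torch.nn as nn

pool = nn.AdaptiveAvgPool2d((1, 1))
x = torch.randn(2, 3, 24, 24)

pooled = pool(x)
# Mean over height and width, keeping the two spatial dims as size 1.
manual = x.mean(dim=(2, 3), keepdim=True)
# Fixed-size average pooling whose kernel equals the input size.
fixed = nn.AvgPool2d(kernel_size=24)(x)

print(pooled.shape)
print(torch.allclose(pooled, manual, atol=1e-6))
print(torch.allclose(pooled, fixed, atol=1e-6))
```

The difference from nn.AvgPool2d is that the adaptive layer picks the window from the input it receives, so the same layer keeps working when the input size changes.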
Yes, that’s possible.
Assuming the first linear layer creates a 2D activation in the shape [batch_size, 512], the Unflatten and AdaptiveAvgPool2d layers won’t do anything, since the spatial shape would already be 1x1, as seen here:
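import torch
import torch.nn as nn

model = nn.Sequential(
    nn.AdaptiveAvgPool2d((1, 1)),
    nn.Flatten(),
    nn.Linear(512, 256)
)
x = torch.randn(2, 512, 1, 1)
out1 = model(x)  # full path: pool -> flatten -> linear
out2 = model[2](x.view(x.size(0), -1))  # linear layer only, pooling skipped
print((out1 - out2).abs().max())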
>> tensor(0., grad_fn=<MaxBackward1>)
Thank you very, very much. God reward you.
In the EfficientNet model, the final convolution layer is as follows.
(_conv_head): Conv2dStaticSamePadding(
  320, 1280, kernel_size=(1, 1), stride=(1, 1), bias=False
  (static_padding): Identity()
)
Is there anything like that in torch.nn? Can torch.nn.Conv2d be used instead?
I just want to change 1280!
I don’t know exactly what Conv2dStaticSamePadding does, but based on this comment it seems to be used to export the model, so I guess you should be able to replace it with an nn.Conv2d layer.
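A hedged sketch of what such a replacement could look like. ToyHead and its attribute names (conv_head, bn, pool, fc) are hypothetical stand-ins for the end of the network, not the real EfficientNet implementation. Since the printed layer has a 1x1 kernel, stride 1, and static_padding of Identity(), a plain nn.Conv2d with kernel_size=1 and bias=False should compute the same kind of operation:

```python
import torch
import torch.nn as nn

class ToyHead(nn.Module):
    """Hypothetical stand-in for the last layers of an EfficientNet-style model."""
    def __init__(self, out_channels=1280, num_classes=6):
        super().__init__()
        self.conv_head = nn.Conv2d(320, out_channels, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(out_channels)
        self.pool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(out_channels, num_classes)

    def forward(self, x):
        x = self.pool(self.bn(self.conv_head(x)))
        return self.fc(torch.flatten(x, 1))

model = ToyHead()

# Change 1280 to a new width; every layer that consumes these channels
# has to be replaced as well, otherwise the shapes no longer match.
new_channels = 512
model.conv_head = nn.Conv2d(320, new_channels, kernel_size=1, bias=False)
model.bn = nn.BatchNorm2d(new_channels)
model.fc = nn.Linear(new_channels, 6)

x = torch.randn(2, 320, 7, 7)  # assumed shape of the activation entering the head
out = model(x)
print(out.shape)
```

Note that the replaced layers start from random initialization, so the pretrained weights of the original 1280-channel layers are not reused and these layers would need to be trained.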
Thank you so much