How to specify filters in conv2d?

In TensorFlow, the API is described as:

tf.keras.layers.Conv2D | TensorFlow Core v2.6.0

2D convolution layer (e.g. spatial convolution over images).

tf.keras.layers.Conv2D(
    filters, kernel_size, strides=(1, 1), padding='valid',
    data_format=None, dilation_rate=(1, 1), groups=1, activation=None,
    use_bias=True, kernel_initializer='glorot_uniform',
    bias_initializer='zeros', kernel_regularizer=None,
    bias_regularizer=None, activity_regularizer=None, kernel_constraint=None,
    bias_constraint=None, **kwargs)

followed by an example:

# The inputs are 28x28 RGB images with `channels_last` and the batch
# size is 4.
input_shape = (4, 28, 28, 3)
x = tf.random.normal(input_shape)
y = tf.keras.layers.Conv2D(
    2, 3, activation='relu', input_shape=input_shape[1:])(x)
print(y.shape)
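# -> (4, 26, 26, 2): 2 filters, and the default 'valid' padding shrinks 28x28 to 26x26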


Secondly, I am also "porting" this to a PyTorch equivalent, but PyTorch's Conv2d API makes no mention of filters; the only significant parameters are the in/out channels and the kernel size:

https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html

So how does one specify the filters in PyTorch?

out_channels is the number of filters. The in_channels should match the previous layer's out_channels, but if you are on the first Conv2d layer, in_channels is 3 for RGB or 1 for grayscale.

https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html
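For example, the TensorFlow snippet above (filters=2, 3x3 kernel, RGB input) would look roughly like this in PyTorch (a minimal sketch; note that PyTorch defaults to channels_first, i.e. NCHW, so the input tensor is shaped accordingly):

import torch
import torch.nn as nn

# 4 images, 3 channels (RGB), 28x28 -- channels come first in PyTorch
x = torch.randn(4, 3, 28, 28)

# Keras "filters=2" becomes out_channels=2; in_channels must match the input (3 for RGB)
conv = nn.Conv2d(in_channels=3, out_channels=2, kernel_size=3)
y = torch.relu(conv(x))
print(y.shape)  # torch.Size([4, 2, 26, 26])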

Thank you, I will try this out.

I changed the code as follows and it seems to work better, getting further before failing:

    def __init__(self):
        super(MLP, self).__init__()
        #self.flatten = nn.Flatten(1, 3)
        self.conv1 = Conv2d(1, 64, 7)
        self.act1 = ReLU()
        self.maxpool1 = MaxPool2d(2)

        self.conv2a = Conv2d(64, 128, 3)
        self.act2a = ReLU()
        self.conv2b = Conv2d(128, 128, 3)
        self.act2b = ReLU()
        self.maxpool2 = MaxPool2d(2)

        self.conv3a = Conv2d(128, 256, 3)
        self.act3a = ReLU()
        self.conv3b = Conv2d(256, 256, 3)
        self.act3b = ReLU()
        self.maxpool3 = MaxPool2d(2)

        self.flatten = nn.Flatten(1, 3)
        # not sure on 784
        self.hidden1 = Linear(784, 128)
        self.drop1 = Dropout()
        self.hidden2 = Linear(128, 64)
        self.drop2 = Dropout()
        self.hidden3 = Linear(64, 10)
        self.act7 = Softmax()

Now I am getting this:

Traceback (most recent call last):
  File "p461.py", line 258, in <module>
    train_model(train_dl, model)
  File "p461.py", line 200, in train_model
    yhat = model(inputs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "p461.py", line 146, in forward
    X = self.conv3b(X)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py", line 443, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py", line 440, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Calculated padded input size per channel: (1 x 1). Kernel size: (3 x 3). Kernel size can't be greater than actual input size

Offending line:
self.conv3b = Conv2d(256, 256, 3)
from snippet:

        self.conv3a = Conv2d(128, 256, 3)
        self.act3a = ReLU()
        self.conv3b = Conv2d(256, 256, 3)
        self.act3b = ReLU()
        self.maxpool3 = MaxPool2d(2)

Is this a sequential model? Please show your forward definition.

When building a CNN, you have to make sure you know what size you're getting out of each part. You can calculate the sizes using the formula at the bottom of the documentation page for each type of module (e.g. Conv2d, MaxPool2d, etc.).
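For example, Conv2d and MaxPool2d use the same output-size formula; here is a rough sketch of it as a small helper (conv_out is just an illustrative name, assuming square inputs and the defaults padding=0, dilation=1 unless passed):

def conv_out(size, kernel_size, stride=1, padding=0, dilation=1):
    # output size formula from the Conv2d / MaxPool2d documentation
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

print(conv_out(28, 7))            # Conv2d(1, 64, 7) on a 28-pixel side -> 22
print(conv_out(22, 2, stride=2))  # MaxPool2d(2) -> 11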

Or you can just add a bunch of print statements at each step in your forward pass in order to see the size of the tensor moving through:

x = self.relu(self.conv1(x))
print(x.size())
x = pool(x)
print(x.size())
...

Once you know what the size is, you might decide to remove some layers or add some padding to accommodate. If the height and width are already (1, 1), any further size-reducing layers will give errors, as shown in the trace below.
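For instance, reusing the conv_out helper above and assuming 28x28 grayscale inputs (which is what Conv2d(1, 64, 7) suggests), your layers shrink the spatial size like this with the default stride=1 and padding=0:

size = 28
size = conv_out(size, 7)            # conv1:    28 -> 22
size = conv_out(size, 2, stride=2)  # maxpool1: 22 -> 11
size = conv_out(size, 3)            # conv2a:   11 -> 9
size = conv_out(size, 3)            # conv2b:    9 -> 7
size = conv_out(size, 2, stride=2)  # maxpool2:  7 -> 3
size = conv_out(size, 3)            # conv3a:    3 -> 1
# conv3b needs at least a 3x3 input but only gets 1x1, hence the RuntimeError above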