Let’s suppose the input image size in a CNN is 1x10 (WxH).
We compare two kernel sizes: 1x3 and 3x3.
With zero padding, both produce an output of the same size as the input, 1x10.
In that case, would the two perform the same, i.e. give the same results?

You’ll see that the result does not depend on the 0th and 2nd slices of the 3x3 kernel, so effectively the result is the same as if you used the 1x3 kernel given by w[:, :, 1, :].
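A quick sketch to check this numerically (assuming a standard PyTorch NCHW layout with the 1x10 image stored as H=1, W=10, and `padding=1` to keep the output size):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(1, 1, 1, 10)     # N, C, H=1, W=10
w3x3 = torch.randn(1, 1, 3, 3)   # a full 3x3 kernel

# 3x3 conv with zero padding; rows 0 and 2 of the kernel only ever
# touch the zero padding, since the input is one pixel tall
out_3x3 = F.conv2d(x, w3x3, padding=1)          # shape (1, 1, 1, 10)

# the middle row of the same kernel, used as a 1x3 kernel,
# padding only along the width
w1x3 = w3x3[:, :, 1:2, :]                       # shape (1, 1, 1, 3)
out_1x3 = F.conv2d(x, w1x3, padding=(0, 1))     # shape (1, 1, 1, 10)

print(torch.allclose(out_3x3, out_1x3))  # True
```

So the two convolutions compute identical outputs here; the 3x3 version just carries six weights that never see real data.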

@tom
I understood.
Then what if the input size is (32x32)x1 (W x H x N) with a 1x3 kernel,
compared to an input size of (1x32)x32 with the same 1x3 kernel?

Would the results be similar in this case too?

Sigh. Is that even a PyTorch question? It has a TF-style WHN layout (and no C).
I’d recommend checking out one of the excellent tutorials on how convolution works and thinking about which values in the kernel and in your tensor interact.
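To see why the layout matters, here is a small sketch (not the exact WHN setup from the question; it uses PyTorch's NCHW layout and compares a 32x32 image against the same data flattened to 1x1024). A 1x3 kernel slides along the last dimension, so flattening makes the end of one row adjacent to the start of the next:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
img = torch.randn(1, 1, 32, 32)  # N, C, H, W
w = torch.randn(1, 1, 1, 3)      # 1x3 kernel

# conv over the 32x32 image: each output mixes 3 horizontal
# neighbors within a single row (rows are independent)
out_2d = F.conv2d(img, w, padding=(0, 1))       # shape (1, 1, 32, 32)

# flattening to 1x1024 makes the last pixel of each row a neighbor
# of the first pixel of the next row, so the kernel now mixes
# values across former row boundaries
flat = img.reshape(1, 1, 1, 1024)
out_flat = F.conv2d(flat, w, padding=(0, 1))    # shape (1, 1, 1, 1024)

# identical away from row boundaries, different at them
print(torch.allclose(out_2d.reshape(1, 1, 1, 1024), out_flat))
```

Interior positions of each row get the same result either way, but the values at the row edges differ, so the two shapes are not equivalent in general.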