The value for padding is up to you. If you use a normal convolution without padding and with stride=1, the output size is output = (input - kernel) + 1, so the output gets smaller after the convolution (unless kernel = 1). This is exactly what happens in your first example:
To counteract this, padding can be applied. Padding adds a border of some value (e.g. 0) around your input and increases its size by two times the padding value (input = input + 2*padding).
If you want the output to have the same shape as the input (with stride=1 and an odd kernel size), the formula is padding = (kernel - 1) // 2.
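As a rough sanity check, the size arithmetic above can be written as a small helper. This is only a sketch: `conv_output_size` is a hypothetical name, the input size of 28 is an arbitrary assumption, and dilation is assumed to be 1.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output size along one spatial dimension, assuming dilation = 1."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

# No padding, stride 1: the output shrinks by kernel_size - 1.
print(conv_output_size(28, kernel_size=3))  # 26

# padding = (kernel_size - 1) // 2 keeps the size unchanged for odd kernels.
for k in (1, 3, 5, 7):
    p = (k - 1) // 2
    assert conv_output_size(28, k, padding=p) == 28
```

For even kernel sizes this padding rule falls one short of the input size with stride=1, which is why odd kernels are the usual choice when the shape should be preserved.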
Your second example (self.conv1 = nn.Conv2d(1, 16, 2, stride=2)) has kernel_size = 2,
stride = 2 and padding = 0. In this case your kernel ‘moves’ 2 positions per step, which is exactly the kernel_size, so the windows do not overlap and the input size is halved: output = input / 2 (rounded down if the input size is odd).
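The halving can be checked with the same kind of arithmetic. Again a sketch only: `conv_output_size` is a hypothetical helper, the input sizes are made-up examples, and dilation is assumed to be 1.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Output size along one spatial dimension, assuming dilation = 1."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

# kernel_size = 2, stride = 2, padding = 0, as in the second example.
halved = {n: conv_output_size(n, kernel_size=2, stride=2) for n in (28, 14, 7)}
print(halved)  # {28: 14, 14: 7, 7: 3}
```

Note that the odd input size 7 gives 3, not 3.5: the last row and column are simply dropped because the kernel no longer fits.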