First of all, there is a problem with your input shape. Conv2d expects its input as BATCH_SIZE × CHANNEL × HEIGHT × WIDTH. So let's correct the shape first, moving the channel axis from last to second. I assume your BATCH_SIZE = 36, CHANNEL = 3, HEIGHT = 200, WIDTH = 150.
images = image.permute(0,3,1,2)
Next, let's change your first Conv2d layer. It should be
torch.nn.Conv2d(3, 64, kernel_size=(3, 3))
So after the first convolution, using your formula, we will have
[36, 64, 198, 148]
After the second Conv2d operation, we will have
[36, 128, 196, 146]
After the max pooling, which halves the height and width of the activations, we will have
[36, 128, 98, 73]
And finally, the number of input features per sample for the fully connected layer will be 128 × 98 × 73 = 915712
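
If you want to double-check the arithmetic, here is a minimal sketch in plain Python that reproduces the shapes. The helper names `conv2d_out` and `maxpool2d_out` are just illustrative, and I am assuming both convolutions use a 3×3 kernel with stride 1 and no padding, and that the pooling is 2×2 with stride 2. Adjust those values if your layers differ.

```python
def conv2d_out(size, kernel, stride=1, padding=0):
    # Output length along one spatial axis of a convolution (dilation of 1 assumed)
    return (size + 2 * padding - kernel) // stride + 1


def maxpool2d_out(size, kernel=2, stride=2):
    # Output length along one spatial axis of max pooling (no padding, floor rounding)
    return (size - kernel) // stride + 1


# Assumed input: BATCH_SIZE = 36, CHANNEL = 3, HEIGHT = 200, WIDTH = 150
batch, height, width = 36, 200, 150

# First convolution: 3 -> 64 channels, 3x3 kernel
h, w = conv2d_out(height, 3), conv2d_out(width, 3)
shape_conv1 = (batch, 64, h, w)

# Second convolution: 64 -> 128 channels, 3x3 kernel (assumed)
h, w = conv2d_out(h, 3), conv2d_out(w, 3)
shape_conv2 = (batch, 128, h, w)

# 2x2 max pooling with stride 2 (assumed)
h, w = maxpool2d_out(h), maxpool2d_out(w)
shape_pool = (batch, 128, h, w)

# Features per sample after flattening, i.e. in_features of the first Linear layer
fc_in_features = 128 * h * w

print(shape_conv1, shape_conv2, shape_pool, fc_in_features)
```

Note that the batch dimension stays at 36 throughout. Only the channel count and the spatial size change from layer to layer.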