How is Conv2d interpreting batch_size>1?

I’m not sure I understand your use case completely, but you could most likely use groups=2, which might avoid the loop.
From the docs:

At groups=2, the operation becomes equivalent to having two conv layers side by side, each seeing half the input channels and producing half the output channels, and both subsequently concatenated.
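To illustrate the quoted equivalence, here is a small sketch (the layer sizes, 4 input and 8 output channels, are arbitrary choices for the demo): a single conv with groups=2 produces the same output as two separate convs applied to the two halves of the input channels, once their weights are matched up.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Grouped conv: 4 input channels, 8 output channels, split into 2 groups.
conv = nn.Conv2d(4, 8, kernel_size=3, groups=2, bias=False)

# Two independent convs, each seeing half the input channels
# and producing half the output channels.
conv_a = nn.Conv2d(2, 4, kernel_size=3, bias=False)
conv_b = nn.Conv2d(2, 4, kernel_size=3, bias=False)

# Copy the grouped conv's weights into the two separate convs.
# conv.weight has shape (8, 2, 3, 3): the first 4 filters belong
# to group 0, the last 4 to group 1.
with torch.no_grad():
    conv_a.weight.copy_(conv.weight[:4])
    conv_b.weight.copy_(conv.weight[4:])

x = torch.randn(1, 4, 8, 8)
out_grouped = conv(x)
out_split = torch.cat([conv_a(x[:, :2]), conv_b(x[:, 2:])], dim=1)
print(torch.allclose(out_grouped, out_split, atol=1e-6))  # True
```

Since the grouped version is a single kernel launch, it is usually faster than looping over separate conv layers.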

Also, this post gives you a visualization of how grouped convolutions are applied.