Yes, a matmul approach is used for the native implementations given here. Different backends (e.g. cuDNN) may call different algorithms internally, depending on the workload shape, data type, etc.
I found this article helpful in detailing the inner workings of the Conv2d operation:
So each kernel slice is first applied (with the configured stride, dilation, etc.) to its corresponding input channel, producing one intermediate result per channel. Those per-channel results are then summed to form a single output channel. That process is repeated for each kernel, yielding one output channel per kernel.
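The "matmul approach" mentioned above is commonly realized via im2col: each receptive field (across all input channels) is unrolled into a column, and a single matrix multiply with the flattened kernels then produces every output channel at once. A minimal NumPy sketch (function and variable names are illustrative, not PyTorch API; no padding or dilation, for brevity):

```python
import numpy as np

def im2col_conv2d(x, w, stride=1):
    """Illustrative sketch: x has shape (in_c, H, W),
    w has shape (out_c, in_c, kh, kw)."""
    in_c, H, W = x.shape
    out_c, _, kh, kw = w.shape
    oh = (H - kh) // stride + 1
    ow = (W - kw) // stride + 1
    # Each column holds one flattened receptive field across all input channels.
    cols = np.empty((in_c * kh * kw, oh * ow))
    idx = 0
    for i in range(0, H - kh + 1, stride):
        for j in range(0, W - kw + 1, stride):
            cols[:, idx] = x[:, i:i + kh, j:j + kw].ravel()
            idx += 1
    # (out_c, in_c*kh*kw) @ (in_c*kh*kw, oh*ow) -> (out_c, oh*ow)
    out = w.reshape(out_c, -1) @ cols
    return out.reshape(out_c, oh, ow)
```

The channel summation happens implicitly inside the matmul, since each column already concatenates all input channels of a patch.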
In PyTorch, the weight tensor has shape:
`(out_channels, in_channels, kernel_dim0, kernel_dim1)`
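The per-channel description above can be translated directly into a naive loop over that weight layout (a hypothetical reference implementation, not PyTorch's actual code; stride 1, no padding):

```python
import numpy as np

def naive_conv2d(x, w):
    """Direct loop version: x has shape (in_channels, H, W),
    w has shape (out_channels, in_channels, kh, kw)."""
    out_c, in_c, kh, kw = w.shape
    _, H, W = x.shape
    oh, ow = H - kh + 1, W - kw + 1
    out = np.zeros((out_c, oh, ow))
    for o in range(out_c):        # one pass per kernel -> one output channel
        for c in range(in_c):     # apply kernel slice to each input channel,
            for i in range(oh):   # accumulating (summing) across channels
                for j in range(ow):
                    out[o, i, j] += np.sum(x[c, i:i + kh, j:j + kw] * w[o, c])
    return out
```

The `+=` inside the channel loop is the "summed across channels" step, and the outer loop is the "repeated for each kernel" step.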