Is there any way to convolve a function channel-wise over a tensor?
I have a tensor u of size torch.Size([8, 16, 32, 32]) = (N, C, H, W)
and trainable parameters:

mu = torch.Size([16])
sigma = torch.Size([16])

Batch-wise, to every channel in the tensor I want to apply the function:

I think you are asking how to apply your function element-wise to your
tensor u, taking into account that each of the 16 channels has its own
value of mu and sigma.

(To me, “convolve” implies that you have a sliding window that mixes
neighboring values in the tensor u together.)

If I understand correctly what you want to do, you do not need to
expand() any dimensions of mu and sigma. Instead, you only need
to add singleton dimensions (“trivial” dimensions of length one that
don’t require any additional storage) to mu and sigma and let PyTorch
broadcast them over the non-channel dimensions of u.
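Here is a minimal sketch of that broadcasting. The function from the question is not shown in the post, so the Gaussian-style expression below is only a placeholder that I am assuming for illustration; `channelwise_fn`, `mu_b`, and `sigma_b` are hypothetical names. Swap in your own function and the indexing trick stays the same.

```python
import torch

N, C, H, W = 8, 16, 32, 32
u = torch.randn(N, C, H, W)

# Trainable per-channel parameters, each of shape [C].
mu = torch.nn.Parameter(torch.randn(C))
sigma = torch.nn.Parameter(torch.rand(C) + 0.5)  # kept away from zero

def channelwise_fn(u, mu, sigma):
    # Add singleton dimensions so the parameters have shape [1, C, 1, 1].
    # PyTorch then broadcasts them over the N, H, and W dimensions of u.
    mu_b = mu[None, :, None, None]
    sigma_b = sigma[None, :, None, None]
    # Placeholder function (an assumption, not the one from the question).
    return torch.exp(-(u - mu_b) ** 2 / (2 * sigma_b ** 2))

out = channelwise_fn(u, mu, sigma)
print(out.shape)  # same shape as u
```

Indexing with `None` is equivalent to `mu.view(1, C, 1, 1)` or chained `unsqueeze()` calls. Because the result is built from ordinary tensor operations, gradients flow back to mu and sigma as usual.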