Hello,

I’m trying to implement a convolution using matrix multiplication, or any other suitable approach.

I have a spatially dependent kernel (a separate kernel for each pixel) and an input:

K dim=(H,W,S*S), e.g., S=5 (5x5 convolution)

T dim=(H,W,C)

After the convolution, I want to get:

R dim=(H,W,C)

Currently, as a test, I do a matrix multiplication at each point in NumPy, like this:

    for y in range(h):
        for x in range(w):
            patch = get_patch(x, y)  # S*S patch of T centered at (x, y), shape (S*S, C)
            R[y, x] = np.matmul(K[y, x], patch)  # (S*S,) @ (S*S, C) -> (C,)
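For reference, here is a self-contained sketch of the loop above. The original get_patch is not shown, so the version here is hypothetical: it assumes an odd S, zero padding at the borders, and a patch flattened row by row to shape (S*S, C). The name conv_loop is also just for illustration.

```python
import numpy as np

def get_patch(T_padded, x, y, S):
    # Hypothetical helper: S x S window centered at (x, y) of the unpadded
    # image, flattened row by row to shape (S*S, C).
    return T_padded[y:y + S, x:x + S].reshape(S * S, -1)

def conv_loop(T, K, S):
    # T: (H, W, C) input, K: (H, W, S*S) per-pixel kernels -> R: (H, W, C)
    h, w, c = T.shape
    p = S // 2
    # Zero padding at the borders is an assumption, not stated in the post.
    T_padded = np.pad(T, ((p, p), (p, p), (0, 0)))
    R = np.zeros((h, w, c), dtype=T.dtype)
    for y in range(h):
        for x in range(w):
            patch = get_patch(T_padded, x, y, S)
            R[y, x] = K[y, x] @ patch  # (S*S,) @ (S*S, C) -> (C,)
    return R
```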

However, this approach runs on the CPU.

I want to run it on the GPU using PyTorch.

Is there any way to implement this?
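
One direction that might work is torch.nn.functional.unfold, which extracts every S x S patch at once so the per-pixel weighting becomes a single broadcasted multiply and sum. The following is only a minimal sketch under my own assumptions: odd S, zero padding at the borders, and the S*S kernel entries stored row by row. The function name spatially_varying_conv is made up for illustration.

```python
import torch
import torch.nn.functional as F

def spatially_varying_conv(T, K, S):
    # T: (H, W, C) input, K: (H, W, S*S) per-pixel kernels -> (H, W, C)
    # Assumes odd S and zero padding (neither is stated in the post).
    H, W, C = T.shape
    x = T.permute(2, 0, 1).unsqueeze(0).contiguous()       # (1, C, H, W)
    # unfold returns (1, C*S*S, H*W): channel-major, then kernel position.
    patches = F.unfold(x, kernel_size=S, padding=S // 2)
    patches = patches.reshape(C, S * S, H, W)               # (C, S*S, H, W)
    k = K.permute(2, 0, 1).unsqueeze(0)                     # (1, S*S, H, W)
    # Weight each patch entry by its pixel's kernel and sum over the window.
    R = (patches * k).sum(dim=1)                            # (C, H, W)
    return R.permute(1, 2, 0)                               # (H, W, C)
```

If T and K are moved to the GPU first (for example with .cuda()), the same code runs there. The unfolded tensor is S*S times larger than the input, so memory use could become an issue for large images.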