"Local" Conv2dTranspose with "depthwise"

I am trying to implement a layer that is somewhat complicated.

The idea is that, unlike ConvTranspose2d, this layer has a different filter for each spatial element (h, w) of the input. In addition, the layer applies the same filter to every channel and returns the same number of channels: the channels are not mixed, and the transposed-convolution filter for a given (h, w) position is applied to each channel independently.

That is, given an NxM input and a LocalConv2dTransposeDepthWise with kernel_size K, there are NxM filters of size KxKx1, and each one performs the same transposed convolution on every channel of the input.
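One possible way to express this with existing PyTorch operations is sketched below. It assumes the layer can be read as "multiply each input element by its own KxK kernel, then sum the overlapping patches", which is what `torch.nn.functional.fold` does with the patches. The function name `local_conv_transpose2d_depthwise`, the weight layout `(N, M, K, K)` and the `stride` argument are illustrative assumptions, not part of any library, and padding is not handled.

```python
import torch
import torch.nn.functional as F


def local_conv_transpose2d_depthwise(x, weight, stride=1):
    """Hypothetical helper: per-position, channel-shared transposed convolution.

    x:      (B, C, N, M) input
    weight: (N, M, K, K), one KxK kernel per spatial position (assumed layout)
    Returns (B, C, (N - 1) * stride + K, (M - 1) * stride + K).
    """
    B, C, N, M = x.shape
    K = weight.shape[-1]

    # Scale each position's kernel by the input value at that position.
    # (B, C, N, M, 1, 1) * (N, M, K, K) -> (B, C, N, M, K, K)
    patches = x[..., None, None] * weight

    # fold expects (B, C * K * K, L) with L = N * M blocks in row-major order.
    cols = patches.permute(0, 1, 4, 5, 2, 3).reshape(B, C * K * K, N * M)

    out_size = ((N - 1) * stride + K, (M - 1) * stride + K)
    # fold places every KxK patch at its strided location and sums overlaps.
    return F.fold(cols, output_size=out_size, kernel_size=K, stride=stride)


# Usage: a 2x2 input with depth 1 and 3x3 kernels, assuming stride 1.
x = torch.randn(1, 1, 2, 2)
w = torch.randn(2, 2, 3, 3)
out = local_conv_transpose2d_depthwise(x, w)
print(out.shape)  # torch.Size([1, 1, 4, 4])
```

To make this a trainable layer, `weight` could be wrapped in an `nn.Parameter` inside an `nn.Module`. If every position shares the same kernel, the result should match `F.conv_transpose2d` with `groups=C`, which gives a simple sanity check.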

I honestly do not know where to start, or whether this can be implemented using existing PyTorch building blocks.

The following gif shows the layer operating on a 2x2 input with depth = 1 and 3x3 kernels:

If anyone has any ideas, that would be very helpful.