Custom Convolution Dot Product

lucamocerino · March 15, 2018, 9:59pm

Hi,
My question is fairly simple: is it possible to change the conv2d functional?
I need a small but low-level change to the dot product inside the convolution. Is that possible?
Thanks, everyone.

jpeg729 · March 15, 2018, 10:05pm

Well you could write a custom autograd.Function as explained here. That way you could code your custom conv2d using numpy operations, but you would have to code the backward pass explicitly.
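A minimal sketch of what such a custom autograd.Function could look like, with the backward pass written by hand. It uses torch operations rather than numpy, assumes stride 1 and no bias, and relies on F.unfold / F.fold (available from PyTorch 0.4). The class name UnfoldConv2d is hypothetical, not something from the thread or the library.

```python
import torch
import torch.nn.functional as F


class UnfoldConv2d(torch.autograd.Function):
    """Hypothetical custom conv2d (stride 1, no bias) with an explicit backward."""

    @staticmethod
    def forward(ctx, x, weight, padding):
        n, _, h, w = x.shape
        out_ch, _, kh, kw = weight.shape
        cols = F.unfold(x, (kh, kw), padding=padding)   # (N, C*kh*kw, L)
        w_flat = weight.reshape(out_ch, -1)             # (O, C*kh*kw)
        ctx.save_for_backward(cols, w_flat)
        ctx.meta = (tuple(x.shape), tuple(weight.shape), padding)
        out = w_flat @ cols                             # (N, O, L)
        h_out = h + 2 * padding - kh + 1
        w_out = w + 2 * padding - kw + 1
        return out.view(n, out_ch, h_out, w_out)

    @staticmethod
    def backward(ctx, grad_out):
        cols, w_flat = ctx.saved_tensors
        x_shape, w_shape, padding = ctx.meta
        # Flatten the spatial dimensions of the incoming gradient: (N, O, L)
        g = grad_out.reshape(grad_out.shape[0], grad_out.shape[1], -1)
        # Gradient w.r.t. the weights: sum over the batch of g @ cols^T
        grad_w = (g @ cols.transpose(1, 2)).sum(0).view(w_shape)
        # Gradient w.r.t. the unfolded input, then fold back (sums overlaps)
        grad_cols = w_flat.t() @ g                      # (N, C*kh*kw, L)
        grad_x = F.fold(grad_cols, x_shape[2:], w_shape[2:], padding=padding)
        # One gradient per forward argument; padding is not differentiable
        return grad_x, grad_w, None
```

It would be called as UnfoldConv2d.apply(x, weight, 1). To plug in a different inner operation, both forward and backward would need to change accordingly.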

Or you could explain what you want in a little more detail and maybe someone can show you how to accomplish it with standard tensor operations and a little ingenuity.

lucamocerino · March 15, 2018, 10:08pm

Thanks for the suggestion.
But I need to replace the inner matrix multiplication in conv (and also in fc) with a different, non-arithmetic operation. That is, I need to replace the elementary multiplication itself (e.g. as in XNOR-Net, where a multiplication is substituted with a bitwise XNOR; a silly example, I know).
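To make the XNOR example concrete, here is a small pure-Python sketch (no PyTorch) of the identity such a replacement relies on: for vectors whose entries are all +1 or -1, the dot product equals 2 * popcount(XNOR(a, b)) - n, where each +1 is stored as bit 1 and each -1 as bit 0. The helper names to_bits and xnor_dot are made up for illustration.

```python
def to_bits(vec):
    """Pack a list of +1/-1 values into an int: +1 -> bit 1, -1 -> bit 0."""
    bits = 0
    for i, v in enumerate(vec):
        if v == 1:
            bits |= 1 << i
    return bits


def xnor_dot(a, b):
    """Dot product of two +1/-1 vectors using XNOR and a bit count."""
    n = len(a)
    mask = (1 << n) - 1
    # XNOR sets a bit wherever the two vectors agree (product is +1 there)
    agree = ~(to_bits(a) ^ to_bits(b)) & mask
    matches = bin(agree).count("1")
    # matches positions contribute +1, the remaining n - matches contribute -1
    return 2 * matches - n


a = [1, -1, 1, 1]
b = [1, 1, -1, 1]
print(xnor_dot(a, b), sum(x * y for x, y in zip(a, b)))
```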

Shen · July 14, 2018, 10:01pm

Hi, did you solve this problem? I’m facing a similar problem right now…

lucamocerino · July 15, 2018, 11:37am

The only solution is, as previously suggested, to build a custom functional and then a module. The real issue is speed: a pure-Python implementation of a custom convolution is much slower. To get reasonable speed you have to modify the C backend used in PyTorch, and that kind of hack is not trivial at all!
Good luck

SimonW · July 15, 2018, 1:08pm

@Shen @lucamocerino If we are talking about images, another approach is to unfold the tensor into columns (with nn.Unfold), do whatever your operation is, then view it as the output shape.

Unfold (i.e., im2col) + gemm (i.e., batched matrix multiplication) + view is actually a common and pretty efficient implementation of convolution.

hughperkins · July 15, 2018, 2:04pm

Example of how to use unfold here:

"""
test using torch unfold to do a convolution

we'll do a convolution both using standard conv, and unfolding it and matrix mul,
and try to get the same answer
"""
import torch
from torch import nn
import torch.nn.functional as F

def run():
    in_channels = 2
    out_channels = 5
    size = 4
    torch.manual_seed(123)
    X = torch.rand(1, in_channels, size, size)
    conv = nn.Conv2d(in_channels=in_channels, out_channels=out_channels, kernel_size=3, padding=1, bias=False)
    out = conv(X)
    print('out', out)
    print('out.size()', out.size())
    print('')

    Xunfold = F.unfold(X, kernel_size=3, padding=1)
    print('X.size()', X.size())
    print('Xunfold.size()', Xunfold.size())

    # The kernels only need a reshape: one row of C*k*k weights per output channel
    kernels_flat = conv.weight.view(out_channels, -1)
    print('kernels_flat.size()', kernels_flat.size())

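    # Convolution as a matrix product: (out_channels, C*k*k) @ (1, C*k*k, L) -> (1, out_channels, L)
    res = kernels_flat @ Xunfold
    res = res.view(1, out_channels, size, size)
    print('res', res)
    print('res.size()', res.size())
    # Both paths should agree up to floating-point error
    print('allclose', torch.allclose(out, res, atol=1e-6))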


run()

hughperkins · July 15, 2018, 2:05pm

(also a blog post on how this works: https://petewarden.com/2015/04/20/why-gemm-is-at-the-heart-of-deep-learning/ )

lucamocerino · July 15, 2018, 3:45pm

@hughperkins Is it possible to do the same thing with PyTorch 0.3? (I cannot upgrade the version.)
If so, can you show me an example? Thank you.

hughperkins · July 15, 2018, 3:55pm

I think unfold is new in 0.4.

The actual unfold function itself has been present in torch for years, so you could use FFI to call it. It’s not that hard to use FFI from pytorch AFAIK, but it’s not something I can describe in a few lines, partly because I’d have to go away and google and search the pytorch forums for posts on it. I’ve done FFI from python before though, and it’s fairly painless, if you have a few days to kill.

On the whole, if upgrading to 0.4 would take less than a few hours of effort, I’d probably go for the 0.4 solution.

(note that ‘unfold’ used to be called ‘im2col’ as Simon alludes to)

Shen · July 17, 2018, 4:03am

Your example and link helped me a lot!! Thanks! I’m quite new to this field, and still have one question about the bias part. From what I saw, the bias term is a vector (i.e. a 1D tensor) whose size is the number of output channels. How do I add this bias term to the unfold version of the convolution?

hughperkins · July 17, 2018, 10:28am

Create a 1d tensor, with requires_grad=True. Use broadcasting to add it to the result of the matrix multiplication.

Shen · July 17, 2018, 3:00pm

I tried it this way before, but I cannot get the broadcasting to work.

Say my multiplication result has the size [32, 6, 28, 28], and the bias term is a 1D tensor with the size 6. Broadcasting these two together will result in an error saying
"The expanded size of the tensor (28) must match the existing size (6) at non-singleton dimension 3".
Is there anything wrong with my implementation?

hughperkins · July 17, 2018, 11:00pm

You need to unsqueeze the last dimension of your bias tensor twice. Concretely, your bias tensor needs to have the dimensions [6, 1, 1], so that it lines up with the channel dimension of the [32, 6, 28, 28] result.
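A short sketch of the shapes discussed above, using a zero tensor as a stand-in for the matrix-multiplication result. Broadcasting aligns dimensions from the right, so a bias of shape [6] is matched against the last dimension (28) and fails, while [6, 1, 1] is matched against the channel dimension.

```python
import torch

res = torch.zeros(32, 6, 28, 28)             # stand-in for the matmul result
bias = torch.randn(6, requires_grad=True)    # one value per output channel

# Shape [6] is aligned with the last dimension (28), so this fails
try:
    res + bias
    plain_add_failed = False
except RuntimeError:
    plain_add_failed = True

# Unsqueezing twice gives shape [6, 1, 1], which aligns with the channel dim
bias_3d = bias.unsqueeze(-1).unsqueeze(-1)   # same as bias.view(-1, 1, 1)
out = res + bias_3d

print(plain_add_failed, bias_3d.shape, out.shape)
```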

Shen · July 18, 2018, 3:23pm

Got it! Thanks for your help!

Tycho_van_der_Oudera · August 29, 2018, 11:14am

I am interested in the other way around: unfolding the kernel weight matrix (instead of X) and then multiplying it with a flattened X to get the same result. But, I have quite some difficulties getting the dimensions right.

Would this be possible?

(I understand that the matrices will become a lot larger this way)

hughperkins · August 29, 2018, 11:31am

@Tycho_van_der_Oudera Unclear to me what you are asking. The GEMM way of doing convolution involves ‘flattening out’ both the spatial input tensor and the spatial kernels. However, for the kernels, there’s no need to do any ‘unrolling’ as such, a pytorch .view is sufficient. (You can check my code above for an example).
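If the question is read as "build one large matrix from the kernel so that it multiplies a flattened X", one way to sketch it is to exploit the linearity of convolution: column i of that matrix is simply the convolution of the i-th one-hot input image. The function name conv_as_matrix is hypothetical, and this shortcut uses F.conv2d itself to fill in the matrix rather than computing the indices by hand; it assumes stride 1 and no bias.

```python
import torch
import torch.nn.functional as F


def conv_as_matrix(weight, in_shape, padding=0):
    """Return M such that M @ x.flatten() == conv2d(x).flatten() for one image.

    weight: (O, C, kh, kw); in_shape: (C, H, W).
    M has shape (O * H_out * W_out, C * H * W).
    """
    c, h, w = in_shape
    n_in = c * h * w
    # A batch holding every one-hot input image, one per input element
    basis = torch.eye(n_in, dtype=weight.dtype).view(n_in, c, h, w)
    responses = F.conv2d(basis, weight, padding=padding)  # (n_in, O, H_out, W_out)
    # Row i of the reshaped responses is column i of the matrix
    return responses.reshape(n_in, -1).t()


torch.manual_seed(0)
weight = torch.rand(5, 2, 3, 3)
x = torch.rand(2, 4, 4)
M = conv_as_matrix(weight, (2, 4, 4), padding=1)
flat_out = M @ x.flatten()
print(M.shape, flat_out.view(5, 4, 4).shape)
```

As noted in the question, the matrix is much larger than the kernel: here it has 80 x 32 entries for a kernel with only 90 weights.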

himat · March 28, 2019, 3:18pm

Why do you have to unfold the input, but can just .view the kernel?

Shangyin_Gao · May 14, 2019, 8:27am

If you want to use matrix multiplication to compute a convolution, the input matrix needs to be ‘duplicated’, because overlapping patches share input elements. The kernel does not.

unfold does the duplication.
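A tiny example may make the duplication visible. With a 3x3 input and a 2x2 kernel there are four overlapping patches, and F.unfold writes each patch out as its own column, so the nine input values become sixteen entries. The kernel, by contrast, is used as-is for every patch, so a plain .view is enough.

```python
import torch
import torch.nn.functional as F

X = torch.arange(9.0).view(1, 1, 3, 3)     # values 0..8 laid out as a 3x3 image
cols = F.unfold(X, kernel_size=2)          # (1, 4, 4): 4 values per patch, 4 patches

first_patch = cols[0, :, 0].tolist()       # top-left 2x2 patch, flattened
centre_copies = int((cols == 4).sum())     # the centre pixel lies in every patch

kernel = torch.ones(1, 1, 2, 2)
kernel_flat = kernel.view(1, -1)           # (1, 4): reshaped, nothing duplicated

print(X.numel(), cols.numel(), first_patch, centre_copies, kernel_flat.shape)
```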

Sami_Hassan · December 3, 2019, 3:53am

Thanks for your code.

What does the @ do in res = kernels_flat @ Xunfold?

Can I replace the multiplication and addition in this operation with my own mymult(num1, num2) and myadd(num1, num2)?
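For reference, @ is Python's matrix-multiplication operator; on tensors it does the same thing as torch.matmul. Since a matrix product is just pairwise multiplications followed by a sum, one possible sketch of swapping in custom operations is to form all pairwise terms by broadcasting and then reduce them yourself. The bodies of mymult and myadd below are placeholders (ordinary multiply and add), custom_conv2d is a hypothetical name, and stride 1 with no bias is assumed. Both functions must work elementwise on tensors, and autograd only works if they are built from differentiable torch operations.

```python
from functools import reduce

import torch
import torch.nn.functional as F


def mymult(a, b):
    # Placeholder: swap in any elementwise tensor operation
    return a * b


def myadd(a, b):
    # Placeholder: swap in any elementwise tensor operation
    return a + b


def custom_conv2d(x, weight, padding=0, mult=mymult, add=myadd):
    n, _, h, w = x.shape
    out_ch, _, kh, kw = weight.shape
    cols = F.unfold(x, (kh, kw), padding=padding)        # (N, K, L)
    w_flat = weight.reshape(out_ch, -1)                  # (O, K)
    # All pairwise "products": (1, O, K, 1) with (N, 1, K, L) -> (N, O, K, L)
    terms = mult(w_flat[None, :, :, None], cols[:, None, :, :])
    # Combine the K terms of each dot product with the custom addition
    out = reduce(add, terms.unbind(dim=2))               # (N, O, L)
    return out.view(n, out_ch, h + 2 * padding - kh + 1, w + 2 * padding - kw + 1)


torch.manual_seed(0)
x = torch.rand(2, 3, 6, 6)
weight = torch.rand(4, 3, 3, 3)
out = custom_conv2d(x, weight, padding=1)
# Example of a non-standard "multiplication": product of signs
out_sign = custom_conv2d(x - 0.5, weight - 0.5, padding=1,
                         mult=lambda a, b: torch.sign(a) * torch.sign(b))
print(out.shape, out_sign.shape)
```

The intermediate tensor of shape (N, O, K, L) can be large, so this trades memory and speed for flexibility compared with a single matrix multiplication.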