What is the matrix used to represent the kernel for ConvTranspose2d?

What matrix represents the kernel when ConvTranspose2d is written as a matrix multiplication?

I understand how the matrix form works for Conv2d. Given the initialization below:


import numpy as np
import torch
import torch.nn as nn
from torch.autograd import Variable

conv = nn.Conv2d(1, 1, 2, stride=1)
conv.bias = nn.Parameter(torch.from_numpy(np.array([0]).astype(np.float32)))
conv.weight = nn.Parameter(torch.from_numpy(np.array([[[[4, 2],
                                                     [0, 1]]]]).astype(np.float32)))

inp = Variable(torch.from_numpy(np.array([[[[1, 1, 3],
                                            [2, 3, 1],
                                            [4, 0, 4]]]]).astype(np.float32)))

The flattened input and the kernel matrix would take this form:

inp_flat = np.array([1, 1, 3, 2, 3, 1, 4, 0, 4])
kernel = np.array([[4, 0, 0, 0],
                   [2, 4, 0, 0],
                   [0, 2, 0, 0],
                   [0, 0, 4, 0],
                   [1, 0, 2, 4],
                   [0, 1, 0, 2],
                   [0, 0, 0, 0],
                   [0, 0, 1, 0],
                   [0, 0, 0, 1]])

np.reshape(inp_flat.dot(kernel), (2, 2))

array([[ 9, 11],
       [14, 18]])

Which gives the same answer as:

conv(inp)

Variable containing:
(0 ,0 ,.,.) = 
   9  11
  14  18
[torch.FloatTensor of size 1x1x2x2]
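
The hand-written matrix above can also be generated programmatically. This is a minimal NumPy sketch, assuming stride 1 and no padding; `conv_matrix` is a hypothetical helper name, not a PyTorch function:

```python
import numpy as np

def conv_matrix(weight, in_h, in_w):
    """Build M so that inp.flatten() @ M is the flattened stride-1,
    no-padding cross-correlation of the input with `weight`."""
    k_h, k_w = weight.shape
    out_h, out_w = in_h - k_h + 1, in_w - k_w + 1
    M = np.zeros((in_h * in_w, out_h * out_w), dtype=weight.dtype)
    for oi in range(out_h):
        for oj in range(out_w):
            col = oi * out_w + oj              # flattened output position
            for a in range(k_h):
                for b in range(k_w):
                    row = (oi + a) * in_w + (oj + b)  # flattened input position
                    M[row, col] = weight[a, b]
    return M

weight = np.array([[4, 2], [0, 1]])
inp = np.array([[1, 1, 3], [2, 3, 1], [4, 0, 4]])

kernel = conv_matrix(weight, 3, 3)   # shape (9, 4), same as the matrix above
result = (inp.flatten() @ kernel).reshape(2, 2)
print(result)
```

Each column of the matrix corresponds to one output position, and each row to one input position, which is why its shape is (9, 4) for a 3x3 input and a 2x2 output.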

The part that I don’t understand is when I use ConvTranspose2d:

conv_t = nn.ConvTranspose2d(1, 1, 2, stride=1)
conv_t.bias = nn.Parameter(torch.from_numpy(np.array([0]).astype(np.float32)))
conv_t.weight = nn.Parameter(torch.from_numpy(np.array([[[[4, 2],
                                                     [0, 1]]]]).astype(np.float32)))

With the same input:

inp = Variable(torch.from_numpy(np.array([[[[1, 1, 3],
                                            [2, 3, 1],
                                            [4, 0, 4]]]]).astype(np.float32)))

I get the result:

conv_t(inp)

Variable containing:
(0 ,0 ,.,.) = 
   4   6  14   6
   8  17  11   5
  16  10  19   9
   0   4   0   4
[torch.FloatTensor of size 1x1x4x4]
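
For reference, the 4x4 result above can be reproduced in plain NumPy by treating the operation as a scatter-add: each input element places a scaled copy of the 2x2 weight into the output, and overlapping copies are summed. This is a sketch under the assumptions of stride 1, no padding, and zero bias, and is not a description of how PyTorch implements it internally:

```python
import numpy as np

weight = np.array([[4, 2], [0, 1]])
inp = np.array([[1, 1, 3], [2, 3, 1], [4, 0, 4]])

k_h, k_w = weight.shape
in_h, in_w = inp.shape

# Output size for stride 1, no padding: in + kernel - 1 along each axis.
out = np.zeros((in_h + k_h - 1, in_w + k_w - 1), dtype=inp.dtype)
for i in range(in_h):
    for j in range(in_w):
        # Stamp a copy of the weight, scaled by inp[i, j], at offset (i, j).
        out[i:i + k_h, j:j + k_w] += inp[i, j] * weight

print(out)
```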

Can someone enlighten me as to what the kernel matrix would look like? The closest thing I have found is section 4.2 in this paper, which says that the transposed convolution is simply the transpose of the kernel matrix defined above. That doesn’t make sense to me: the output here has 16 elements, so the matrix should have one dimension of size 16, but the kernel matrix defined above has shape (9, 4).
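
One possible way to reconcile the shapes, sketched here under assumptions rather than confirmed against PyTorch's internals: the transpose may refer to the matrix of a Conv2d that maps a 4x4 input to a 3x3 output, which has shape (16, 9), rather than to the (9, 4) matrix above. Its transpose has shape (9, 16) and maps the 9 flattened input values to 16 outputs. `conv_matrix` below is a hypothetical helper, and stride 1, no padding, and zero bias are assumed:

```python
import numpy as np

def conv_matrix(weight, in_h, in_w):
    """Matrix M with inp.flatten() @ M equal to a flattened stride-1,
    no-padding cross-correlation of an (in_h, in_w) input with `weight`."""
    k_h, k_w = weight.shape
    out_h, out_w = in_h - k_h + 1, in_w - k_w + 1
    M = np.zeros((in_h * in_w, out_h * out_w), dtype=weight.dtype)
    for oi in range(out_h):
        for oj in range(out_w):
            for a in range(k_h):
                for b in range(k_w):
                    M[(oi + a) * in_w + (oj + b), oi * out_w + oj] = weight[a, b]
    return M

weight = np.array([[4, 2], [0, 1]])
inp_flat = np.array([1, 1, 3, 2, 3, 1, 4, 0, 4])

# Matrix of the forward convolution 4x4 -> 3x3: shape (16, 9).
forward = conv_matrix(weight, 4, 4)

# Its transpose, shape (9, 16), maps the 3x3 input to a 4x4 output.
kernel_t = forward.T
out = (inp_flat @ kernel_t).reshape(4, 4)
print(out)
```

With these assumptions the product reproduces the 4x4 values shown for conv_t(inp) above, so the dimension of size 16 comes from the larger 4x4 grid on the forward side.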