RuntimeError: Function AddmmBackward returned an invalid gradient at index 1 (mismatching shapes)

akshaybharadwaj · August 4, 2021, 5:51am

Hey All,

I am trying to implement a VAE, and I am having trouble computing the gradients for the model. I believe the problem is in the decoder. The exact error message is: Function AddmmBackward returned an invalid gradient at index 1 - got [10, 32] but expected shape compatible with [10, 1024]. Here is the decoder model:

class decoderNW(nn.Module):
    def __init__(self):
        super(decoderNW,self).__init__()

        channels = 32
        kernelSize = 4
        padding = (2,0)
        stride = (2,2)
        outputpadding = (1,0)

        self.FC1 = nn.Linear(channels, 1024)

        self.FC2 = nn.Linear(channels, 10656)

        self.deConv3x301 = nn.ConvTranspose2d(channels, 64, kernel_size=kernelSize, stride=stride, output_padding=outputpadding)
        nn.init.xavier_uniform_(self.deConv3x301.weight)

        self.deConv3x302 = nn.ConvTranspose2d(64, 128, kernel_size=kernelSize, stride=stride, output_padding=outputpadding)
        nn.init.xavier_uniform_(self.deConv3x302.weight)

        self.deConv3x303 = nn.ConvTranspose2d(128, 64, kernel_size=kernelSize, stride=stride, output_padding=outputpadding)
        nn.init.xavier_uniform_(self.deConv3x303.weight)

        self.deConv3x304 = nn.ConvTranspose2d(64, 3, kernel_size=kernelSize, stride=stride)
        nn.init.xavier_uniform_(self.deConv3x304.weight)

        self.bn1 = nn.BatchNorm1d(1024)
        self.bn2 = nn.BatchNorm2d(64)
        self.bn3 = nn.BatchNorm2d(128)
        self.bn4 = nn.BatchNorm2d(64)
 


        self.ReLU = nn.ReLU(inplace=True)

        self.sigmoid = nn.Sigmoid()

    def forward(self,x):

        x = self.FC1(x)
        x = self.bn1(x)
        x = self.ReLU(x)

        x = self.FC2(x)

        # Reshape x to (batch, 32, 9, 37); 32 * 9 * 37 = 10656
        x = x.view(x.size(0),32,9,37)

        x = self.deConv3x301(x)
        x = self.bn2(x)
        x = self.ReLU(x)

        x = self.deConv3x302(x)
        x = self.bn3(x)
        x = self.ReLU(x)

        x = self.deConv3x303(x)
        x = self.bn4(x)
        x = self.ReLU(x)

        x = self.deConv3x304(x)
        x = self.sigmoid(x)

        return x

I believe it's happening when I reshape the tensor into a 2D (image-like) tensor between the FC and deconv layers.

I have tried using the reshape function instead, but the same problem persists. I'm not sure where I am going wrong. Any help is greatly appreciated.

Thanks.

PS: I get this error when I run backward(). Here is the code snippet for that!

            optimizerVAE.zero_grad()
            variationalAE.train()

            vaeT = vaeT.to('cuda')

            mu, sigma, xHat, z = variationalAE(srcClrT)

            loss = vaeLoss(srcClrT, mu, sigma, xHat, z)

            loss.backward()

ptrblck · August 10, 2021, 4:30am

I cannot reproduce the issue using:

model = decoderNW()
x = torch.randn(2, 32)
out = model(x)
out.mean().backward()

but I also needed to modify your model to:

        self.FC1 = nn.Linear(channels, 1024)

        self.FC2 = nn.Linear(1024, 10656)

as I didn’t know the input shapes for the desired reshaping.
Could you update the code with random inputs so that we could execute it and reproduce the issue?
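
The change above makes the in_features of FC2 equal to the out_features of FC1 (1024). The remaining shapes in the decoder can be traced by hand without running the model. The following is a rough sketch in plain Python, not part of the posted code: the helper name `conv_transpose_out` is made up for illustration, it uses the output-size formula from the `nn.ConvTranspose2d` documentation, and it assumes padding 0 and dilation 1, since the `padding` variable in the posted `__init__` is never passed to the layers.

```python
def conv_transpose_out(size, kernel, stride, padding=0, output_padding=0, dilation=1):
    """Output length along one dimension of nn.ConvTranspose2d (formula from the PyTorch docs)."""
    return (size - 1) * stride - 2 * padding + dilation * (kernel - 1) + output_padding + 1


# FC2 must emit exactly as many features as the view() call consumes.
channels, height, width = 32, 9, 37
assert channels * height * width == 10656

# Trace (H, W) through the four transposed convolutions: kernel 4, stride 2.
# The first three layers use output_padding=(1, 0); the last one uses none.
shapes = [(height, width)]
for out_pad_h, out_pad_w in [(1, 0), (1, 0), (1, 0), (0, 0)]:
    h, w = shapes[-1]
    shapes.append((
        conv_transpose_out(h, kernel=4, stride=2, output_padding=out_pad_h),
        conv_transpose_out(w, kernel=4, stride=2, output_padding=out_pad_w),
    ))

print(shapes)
```

If the final (H, W) pair does not match the image size used in the loss, the mismatch is in the layer hyperparameters rather than in the reshape itself.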

whoab · June 1, 2022, 4:07am

I got this error because the noise vector I was passing into my encoder had the wrong latent dimension size. Check that.
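
A small guard along these lines can surface a wrong latent size before backward() is ever reached. This is only a sketch: `check_latent` and `LATENT_DIM` are hypothetical names, and the value 32 is taken from `channels` in the decoder posted at the top of the thread.

```python
import torch
import torch.nn as nn

LATENT_DIM = 32  # assumption: same as `channels` in the posted decoder


def check_latent(z: torch.Tensor, first_layer: nn.Linear) -> None:
    """Raise a readable error if z's feature size does not match first_layer.in_features."""
    if z.dim() != 2 or z.size(1) != first_layer.in_features:
        raise ValueError(
            f"expected latent of shape (batch, {first_layer.in_features}), "
            f"got {tuple(z.shape)}"
        )


fc1 = nn.Linear(LATENT_DIM, 1024)

check_latent(torch.randn(10, LATENT_DIM), fc1)  # matching size: passes silently

try:
    check_latent(torch.randn(10, 64), fc1)  # wrong latent size
except ValueError as err:
    print(err)
```

Calling such a check at the top of forward() gives an error that names the offending shape directly.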