Hi all,
I made a slight change to a network by adding a max_pool1d, and now I get the following error traceback:

Traceback (most recent call last):
  File "../../learn/", line 212, in train
  File "C:\Users\Veid\Anaconda3\envs\caml\lib\site-packages\torch\", line 195, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "C:\Users\Veid\Anaconda3\envs\caml\lib\site-packages\torch\autograd\", line 99, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: fractional_max_pool2d_backward_out_cuda failed with error code 0

How can I fix this?
I’m new to torch, so I apologize in advance if this is a stupid question.

The original code snippet, which works:

        #x shape is torch.Size([8, k, 400]) where k is an unfixed number
        #U.weight shape is torch.Size([50, 400])
        alpha = F.softmax(self.U.weight.matmul(x.transpose(1,2)), dim=2)
        #alpha shape is torch.Size([8, 50, k])
        m = alpha.matmul(x)
        #m shape is torch.Size([8, 50, 400])
        #final.weight shape is torch.Size([50, 400])
        y =
        #y shape is torch.Size([8, 50])

The modified code snippet, which fails during the backward pass:

        #x shape is torch.Size([8, k, 400]) where k is an unfixed number, 8 is the batch size
        #U.weight shape is torch.Size([50, 400])
        x = F.max_pool1d(x.transpose(1,2), kernel_size=x.size()[1])
        #after max pooling, x shape is torch.Size([8, 400, 1])
        alpha = self.U.weight.mul(x.transpose(1,2)) 
        #alpha shape is torch.Size([8, 50, 400])
        #final.weight shape is torch.Size([50, 400])
        y =
        #y shape is torch.Size([8, 50])
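
For reference, the shapes in the modified snippet can be checked on CPU with a small sketch like the one below. The value `k = 123`, the random input, and the standalone `U` (standing in for `self.U`) are assumptions made only for illustration; this does not reproduce the CUDA error.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed sizes: batch 8, k = 123, feature size 400, 50 labels
U = nn.Linear(400, 50)           # U.weight has shape [50, 400]
x = torch.randn(8, 123, 400)

# Pool over the whole k dimension: [8, 400, 123] -> [8, 400, 1]
pooled = F.max_pool1d(x.transpose(1, 2), kernel_size=x.size(1))

# [50, 400] * [8, 1, 400] broadcasts to [8, 50, 400]
alpha = U.weight.mul(pooled.transpose(1, 2))

print(pooled.shape)
print(alpha.shape)
```

Note that `mul` here is an element-wise product with broadcasting, not a matrix product as in the original snippet.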

Here is runnable code that I extracted, covering the Variable definitions and the loss computation. However, the bug cannot be reproduced in this setting. :frowning:

import torch.nn as nn
import torch
import torch.nn.functional as F
from torch.nn.init import xavier_uniform_
from torch.autograd import Variable
U = nn.Linear(400, 50)
final = nn.Linear(400, 50)
x = Variable(torch.randn(8,123,400))
#The code snippet that works successfully
alpha = F.softmax(U.weight.matmul(x.transpose(1,2)), dim=2)
m = alpha.matmul(x)
y = final.weight.mul(m).sum(dim=2).add(final.bias)
#The code snippet that fails to autograd
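x = F.max_pool1d(x.transpose(1,2), kernel_size=x.size()[1])
alpha = U.weight.mul(x.transpose(1,2))
y = final.weight.mul(alpha).sum(dim=2).add(final.bias)
yhat = y
#target was not defined in the extracted code; a random 0/1 tensor of shape [8, 50] is assumed here
target = torch.randint(0, 2, (8, 50)).float()
loss = F.binary_cross_entropy_with_logits(yhat, target)
loss.backward()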

Thanks for the code snippet.
Since it’s not reproducible with the last code, could you try to use:

x= F.max_pool1d(x.transpose(1,2).contiguous(), kernel_size=x.size()[1])

and rerun the script, please?

Also, which PyTorch version are you using?
Make sure to update to the latest stable release, as Variables have been deprecated since 0.4.
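
A quick way to check the installed version, and to create the input without the `Variable` wrapper, might look like this (a minimal sketch; the tensor shape is taken from the post above with k = 123 assumed):

```python
import torch

# Report the installed PyTorch version
print(torch.__version__)

# Since 0.4, plain tensors can be used wherever Variable was used before
x = torch.randn(8, 123, 400)
print(x.requires_grad)  # False, same default as Variable(torch.randn(...))
```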

I think the error comes from passing a non-contiguous tensor to the pooling layer. This was recently fixed, so it should also work with the nightly binaries.
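
To illustrate the contiguity point, here is a minimal CPU sketch, assuming k = 123 and a random input. It shows that `transpose` returns a non-contiguous view and that `.contiguous()` produces a dense copy that can be passed to the pooling layer. The last lines show a possible alternative that was not discussed above: since the kernel covers the whole k dimension, a plain `max` over that dimension gives the same values without going through the pooling kernel.

```python
import torch
import torch.nn.functional as F

# Assumed shape: batch 8, k = 123, feature size 400
x = torch.randn(8, 123, 400, requires_grad=True)

xt = x.transpose(1, 2)       # a view with swapped strides, no data copied
print(xt.is_contiguous())    # False

xc = xt.contiguous()         # copies the data into a dense layout
print(xc.is_contiguous())    # True

# Pool over the whole k dimension: [8, 400, 123] -> [8, 400, 1]
pooled = F.max_pool1d(xc, kernel_size=x.size(1))
pooled.sum().backward()      # gradients flow back to x through the copy

# Alternative (an assumption, not from the thread): a max over dim 1
alt = x.max(dim=1).values    # shape [8, 400]
print(torch.equal(pooled.squeeze(2), alt))
```

If the alternative is used, `alt.unsqueeze(1)` has shape [8, 1, 400] and can take the place of `x.transpose(1,2)` in the multiplication with `U.weight`.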

