RuntimeError: diff_view_meta->output_nr_ == 0 only happens on GPU

Hello, the following code snippet gives the error in the title. The snippet works on CPU but fails in a 2-GPU environment; I haven't checked with a single GPU. It fails specifically on the W_theta_sin line.

def forward(self, x, a):
    # in-place parameter updates, excluded from autograd
    with torch.no_grad():
        self.W_hat.div_(torch.norm(self.W_hat, dim=5, keepdim=True))
        self.W_theta.fmod_(math.pi)
    W_theta_sin = torch.sin(self.W_theta) + eps  # eps: small constant defined elsewhere

RuntimeError: diff_view_meta->output_nr_ == 0 ASSERT FAILED at /opt/conda/conda-bld/pytorch_1556653215914/work/torch/csrc/autograd/variable.cpp:209, please report a bug to PyTorch.

I searched for this error and know there is an open issue, but to be honest I did not understand much of it. :) Is there a workaround for this?

Thanks in advance.

Are you using nn.DataParallel?
If so, could you post the shapes you’ve used (parameters and inputs) to raise this error?

Yes, I am using nn.DataParallel. Sorry for sharing only small snippets; it is a fairly large script and part of my research.

# W_hat holds a 3-vector on dim 5, the axis that gets normalized in forward
self.W_theta = nn.Parameter(torch.rand(1, 1, 1, in_channels, out_channels, 1, 1, 1))
self.W_hat = nn.Parameter(torch.rand(1, 1, 1, in_channels, out_channels, 3, 1, 1))

The input is a custom dataset with 2 channels. Everything is fine except this section; I ran many experiments before adding this part.

Were you able to isolate this issue further? Do you have a reproducible code snippet, or is the issue reproducible using just the provided information regarding W_theta and W_hat?

Also, could you share the shapes of x and a?

I had to drop the reassignment in my further experiments to avoid the error. I am sure the error happens at the following lines:

with torch.no_grad():
    self.W_hat.div_(torch.norm(self.W_hat, dim=5, keepdim=True))
    self.W_theta.fmod_(math.pi)
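
Since nn.DataParallel replicates the parameters onto each device every forward pass, my guess is that mutating those replicas in-place is what trips the assertion. If the constraint itself is still needed, an out-of-place variant along these lines might avoid it (just a sketch, not verified against my full script; eps is defined elsewhere):

def forward(self, x, a):
    # out-of-place: build new tensors instead of mutating the parameters,
    # so the DataParallel replicas are never modified in-place
    W_hat = self.W_hat / torch.norm(self.W_hat, dim=5, keepdim=True)
    W_theta = torch.fmod(self.W_theta, math.pi)
    W_theta_sin = torch.sin(W_theta) + eps
    ...

Note that this changes the semantics slightly: the stored parameters stay unconstrained, and the normalization becomes part of the autograd graph.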

I will try to write a small snippet for reproduction in the coming weeks; I only have one computer with a GPU and it is in use at the moment. :) Sharing a and x would not help you; there are several steps before W_theta_sin is used, and I need to simplify them to make it easier for you to trace.
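
As a starting point, something along these lines should be close (an untested sketch: the Repro name and eps value are placeholders, a is dropped, and the end of forward is collapsed so backward() can run):

import math
import torch
import torch.nn as nn

eps = 1e-7  # placeholder; the real constant lives elsewhere in my script

class Repro(nn.Module):  # hypothetical, stripped-down version of my module
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.W_theta = nn.Parameter(torch.rand(1, 1, 1, in_channels, out_channels, 1, 1, 1))
        self.W_hat = nn.Parameter(torch.rand(1, 1, 1, in_channels, out_channels, 3, 1, 1))

    def forward(self, x):
        # the same in-place updates that fail in the full script
        with torch.no_grad():
            self.W_hat.div_(torch.norm(self.W_hat, dim=5, keepdim=True))
            self.W_theta.fmod_(math.pi)
        W_theta_sin = torch.sin(self.W_theta) + eps
        # the real model applies several more steps; collapse to a
        # per-sample value here just so forward/backward run
        return x.view(x.size(0), -1).mean(dim=1) * W_theta_sin.sum()

model = nn.DataParallel(Repro(in_channels=2, out_channels=4)).cuda()
out = model(torch.rand(8, 2, 16, 16, device='cuda'))
out.sum().backward()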

I think this issue is also related, but I am using PyTorch 1.1.0:

Thanks for asking.