Hi,

I’m trying to do a conv2d on non-negative values, but I’m getting negative values in the output.

`z1 = F.conv2d((input * input), sigma_square, None, self.stride, self.padding, self.dilation, self.groups)`

The input pixels are `x*x`, and the weights are the variance, which is non-negative by definition (I also made sure it’s non-negative here).

My guess is that it’s related to numerical error, because the values are very close to zero (around 10e-5), but I’m not sure.

Any idea how I can solve it? (I take sqrt(z1) after the conv2d, so it can’t be negative.)

Could you create random tensors that reproduce the issue, so that we could take a look at it, please?


I reproduced the case:

GitHub - guyber9/repo

You can find two files at the link:

`my_tensors.pt` - contains the `x` and `w` tensors for the convolution.
`main.py` - reads the tensor file and runs the conv2d.

Just run `main.py` and you’ll see that the input is all positive but the conv2d output includes negative values.
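For reference, a minimal sketch of what `main.py` presumably does; random non-negative tensors stand in here for the contents of `my_tensors.pt`, and the shapes are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

# Stand-ins for the saved 'x' and 'w' tensors (shapes are assumptions;
# torch.rand yields values in [0, 1), so both are non-negative).
x = torch.rand(8, 1, 12, 12)
w = torch.rand(1, 1, 4, 4) * 1e-3

# Run the convolution and check for negative outputs, as main.py does.
z = F.conv2d(x, w, None)
print("x is negative:", bool((x < 0).any()))
print("w is negative:", bool((w < 0).any()))
print("z (= Wx) is negative:", bool((z < 0).any()))
```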

Thanks for the code snippet and the inputs. Since I cannot reproduce the issue, we would need more information about your setup (PyTorch version, GPU, CUDA, cudnn etc.).
Output:

```
x is negative: tensor(False, device='cuda:0')
w is negative: tensor(False, device='cuda:0')
z (= Wx) is negative: tensor(False, device='cuda:0')
v is negative: tensor(False, device='cuda:0')
v isnan: tensor(False, device='cuda:0')
```


PyTorch version: 1.9.0+cu102
NVIDIA GeForce RTX 2080 Ti
CUDA Version 10.1.105

I’ve found something interesting. If I add:

torch.backends.cudnn.deterministic = True

It solves the issue.

But when running with:

torch.backends.cudnn.benchmark = True

it brings the problem back (even with torch.backends.cudnn.deterministic = True).
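For clarity, the two configurations described above can be sketched as follows; the observed effect is specific to this cudnn/GPU setup and may not reproduce elsewhere:

```python
import torch

# Configuration that made the negative outputs disappear here:
# restrict cudnn to deterministic kernels and leave autotuning off.
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

# Configuration that brought the negatives back on this setup:
# benchmark=True lets cudnn autotune and pick faster algorithms
# (possibly Winograd/FFT), even with deterministic = True as well.
# torch.backends.cudnn.benchmark = True
```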


Hi, I’m encountering the exact same problem.

My environment is:
PyTorch 1.9.1+cu102
GeForce 2080 Ti
CUDA 10.2
Win7

Your advice seems to work; the issue is possibly related to the FFT and Winograd implementations in cudnn.

The linked GitHub issue (opened 07 Dec 2019; labels: module: docs, module: convolution, triaged):

The gist of the issue is that at one point in the network in my project I feed positive float32 inputs `X` and `W` into a 1d (or 2d) convolution, but unexpectedly obtain explicitly negative outputs.
## Details
The X input is nonnegative, around 1e-2, with many explicit 0s (after an elementwise square, or abs), and the W input is positive but with a quite large range of [1e-8, 1e+8] (after an exp). I take `F.conv2d(X, W, None)` and **expect nonnegative** outputs (finite or infinite), but actually get **significantly negative** values. The downstream computations rely on the result of this operation being at least nonnegative.
I tried `torch.backends.cudnn.deterministic = True`, or fp64, or clamping the `W` values, and each time the problem seemed to go away. In my project I ended up clamping the result of the convolution to the nonnegative range, and that sufficed.
Having observed this oddity got me worried, since in my understanding of floating point arithmetic the loss of precision and round off could yield imprecise results, but not **flip the sign** of the overall result. I read in the docs, that non-deterministic cudnn algorithms might induce this behaviour and yield inconsistent results, but I did not expect that atomic additions (according to cudnn reference) could spuriously flip the sign bit of a 32bit float.
I understand that GitHub issues are not a Q&A support forum, but I think that this inconsistency might be of interest, and I would very much appreciate any discussion of the guarantees of single-precision operations.
## Minimally reproducing example
I came up with a minimal example of this odd behaviour. My observations are as follows (based on this snippet, and on the real data within the network in my project):
* negative outputs of various magnitude occur **only if** deterministic is *False*, and data is *fp32*, and the input `X` is a moderately large-dim tensor.
* `X` and `W` of dims `...x12x12` and `...x4x4`, respectively, give a lot of very small fp32 numbers, some of which are negative
* `X` and `W` of dims `...x11x11` and `...x3x3`, respectively, produce very rare negative tiny fp32, but mostly expected zeros
These suggest that some optimizations within cudnn might be responsible.
The following code constructs explicitly nonnegative `X` and `W`: `X` all zeros except for a single **1** at the *centre*, `W` is a 4x4 kernel with constant **1e-3**. I attach the pickled [result.gz](https://github.com/pytorch/pytorch/files/3935238/result.gz)
```python
import gzip
import torch
import torch.nn.functional as F
# change these
torch.backends.cudnn.deterministic = False
devtype = dict(device=torch.device("cuda:0"), dtype=torch.float32)
# x -- zeros with a spike
x = torch.zeros(1024, 1, 12, 12).to(**devtype)
x[..., 5, 5] = 1.
# small nonnegative values
w = (torch.ones(1, 1, 4, 4) * 1e-3).to(**devtype)
assert x.ge(0).all() and w.ge(0).all()
# conv1d/2d
result = F.conv2d(x, w, None)
with gzip.open("result.gz", "wb") as fout:
    torch.save(dict(x=x, w=w, result=result), fout)
assert result.ge(0).all()
```
To unpack the attached `.gz` pickled result:
```python
import gzip
import torch
with gzip.open("result.gz", "rb") as fin:
    pack = torch.load(fin)
pack["result"]
```
Here is a link to a [colab notebook](https://colab.research.google.com/drive/136n8oKDitchxEkXMhV7AJeYfFUayiS4X), where it can be seen that the output is spurious even when `X` is `25 x 1 x 12 x 12`. Especially striking is that `24 x 1 x 12 x 12` appears to be correct.
## Specs and versions
Below are very abridged specs:
`Intel(R) Xeon(R) CPU E5-2698 v4 @ 2.20GHz, 256GB RAM`, `4x GeForce GTX 1080 Ti` with `Driver Version: 390.77` on `Linux Mint 18.3 Sylvia (GNU/Linux 4.10.0-38-generic x86_64)` and conda `python 3.7.4` packages:
* `pytorch 1.1.0 py3.7_cuda9.0.176_cudnn7.5.1_0 pytorch`
* `cudatoolkit 9.0 h13b8566_0`
cc @jlin27 @mruberry
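The workaround the issue author settled on, clamping the convolution output to the nonnegative range before any downstream sqrt, can be sketched like this (tensor shapes here are illustrative assumptions):

```python
import torch
import torch.nn.functional as F

# Illustrative non-negative inputs (shapes are assumptions).
x = torch.rand(4, 1, 12, 12)   # e.g. input * input
w = torch.rand(1, 1, 4, 4)     # e.g. a variance, non-negative

# Clamp away any tiny spurious negatives from float accumulation
# before taking the square root, so sqrt never sees a negative value.
z = F.conv2d(x, w, None).clamp(min=0)
out = torch.sqrt(z)
assert not torch.isnan(out).any()
```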

Thank you!