I got a Titan V and have been experimenting with half precision.

In half-precision mode I can’t backpropagate through a matmul of two all-zeros matrices, apparently because the *number of elements* in the resulting matrix is outside the half-precision range: the backward pass of `mean()` tries to represent that element count as a Half scalar, and it overflows.

I get the same error with Conv1d, Conv2d, and bmm.

This minimal computation graph replicates the problem:

```
import torch, torch.autograd, torch.nn, numpy

with torch.cuda.device(0):
    test_input = torch.autograd.Variable(torch.zeros(257, 509)).cuda().half()
    test_w = torch.nn.Parameter(torch.zeros(509, 263)).cuda().half()
    matmul_result = torch.matmul(test_input, test_w)
    print(matmul_result.size())
    print(numpy.prod(matmul_result.size()))
    test_output = matmul_result.abs().mean()
    test_output.backward()
```

And the result is:

```
torch.Size([257, 263])
67591
Traceback (most recent call last):
File "<stdin>", line 11, in <module>
File "/home/dzmitry/miniconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 167, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
File "/home/dzmitry/miniconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 99, in backward
variables, grad_variables, retain_graph)
RuntimeError: value cannot be converted to type Half without overflow: 67591
```
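The largest finite half-precision value is 65504, so the element count 67591 simply cannot be represented as a Half. A quick NumPy check (NumPy is used here only to illustrate the fp16 format, independently of PyTorch) shows the overflow:

```
import numpy

# Largest finite value representable in IEEE 754 half precision
print(numpy.finfo(numpy.float16).max)  # 65504.0

# The element count from the example above does not fit
print(numpy.float16(67591))            # inf
```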

I am using PyTorch 0.3 and could reproduce the issue with both CUDA 8 and CUDA 9.
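A workaround that seems to avoid the overflow (my assumption, not an official fix) is to cast to float32 before the reduction, e.g. `matmul_result.abs().float().mean()`, so the element count is handled in single precision. The numerical effect can be sketched with NumPy:

```
import numpy

n = 67591  # element count from the example above

# Dividing in half precision: the count itself overflows to inf,
# so the gradient scale 1/n collapses to 0.
grad_fp16 = numpy.float16(1.0) / numpy.float16(n)

# Doing the division in float32 and casting back stays finite.
grad_fp32 = numpy.float16(numpy.float32(1.0) / numpy.float32(n))

print(grad_fp16)  # 0.0
print(grad_fp32)  # small but nonzero
```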