I noticed that HalfTensor methods are only partially implemented.

Is there a plan to complete this implementation?

`torch.__version__` == '1.0.1.post2'

I can create a float16 numpy array and convert it to a torch tensor, but I cannot call .max() on the result unless I move it to the GPU.

I can create a float16 CUDA tensor, but I cannot create the same tensor on the CPU.

I understand that half-tensor methods are mainly useful for GPU training, but I would have expected to be able to do CPU operations on them as well, no?

Examples

```
import numpy as np
import torch

f16 = np.random.random((1, 3, 24, 24)).astype(np.float16)
tf16 = torch.from_numpy(f16)
print(tf16.dtype, tf16.shape)  # OK
print(tf16.max())
# FAILS: RuntimeError: _th_max is not implemented for type torch.HalfTensor
```
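As a workaround (my own sketch, not something I found in the docs), upcasting to float32 on the CPU before the reduction seems to get around this, since the dtype conversion itself is implemented even where the reduction op is not:

```python
import numpy as np
import torch

f16 = np.random.random((1, 3, 24, 24)).astype(np.float16)
tf16 = torch.from_numpy(f16)

# Workaround: upcast to float32 on CPU, then reduce.
# The original half tensor is left untouched.
m = tf16.float().max()
print(m.item())
```

The extra copy costs memory, of course, which somewhat defeats the point of using float16 in the first place.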

```
import torch

ft16 = torch.zeros((1, 3, 16, 16), dtype=torch.float16, device="cuda")
print(ft16.dtype)  # OK
```

```
import torch

ft16cpu = torch.zeros((1, 3, 16, 16), dtype=torch.float16)
print(ft16cpu.dtype)
# FAILS: RuntimeError: _th_zero_ is not implemented for type torch.HalfTensor
```