Running

`torch.round(torch.tensor([-0.6, -0.5, -0.4, 0.4, 0.5, 0.6]))`

gives

`tensor([-1., -0., -0., 0., 0., 1.])`

but I expected

`tensor([-1., -1., -0., 0., 1., 1.])`
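For context, `torch.round` is documented to round halfway cases to the nearest even integer, which would explain why `0.5` and `-0.5` both end up at zero. A minimal check, using a few extra halfway values that I picked only for illustration:

```python
import torch

# Halfway values chosen for illustration; each is exactly representable in float32.
halves = torch.tensor([0.5, 1.5, 2.5, -0.5, -1.5])

# torch.round sends each tie to the nearest even integer.
rounded = torch.round(halves)
print(rounded)
```

Under round-half-to-even, `0.5` and `2.5` go down to `0` and `2`, while `1.5` goes up to `2`.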

I tried:

```
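import torch

x = torch.tensor([-0.6, -0.5, -0.4, 0.4, 0.5, 0.6])
# Round half away from zero: shift positives up and negatives down by 0.5, then floor/ceil.
x[x > 0] = torch.floor(x[x > 0] + 0.5)
x[x < 0] = torch.ceil(x[x < 0] - 0.5)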
```

But it is too slow.

I ran this benchmark on a 2080 Ti:

```
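import torch
import time

x = torch.rand(3, 64, 128, 128).float() * 10 - 5
x = x.cuda()
torch.cuda.synchronize()  # wait for pending GPU work before starting the timer
tic = time.time()
x[x > 0] = torch.floor(x[x > 0] + 0.5)
x[x < 0] = torch.ceil(x[x < 0] - 0.5)
torch.cuda.synchronize()  # CUDA kernels run asynchronously; wait before stopping the timer
toc = time.time()
print(toc - tic)  # 0.008876 s (with torch.round instead, it is 0.0006 s)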
```

So, is there a faster way to round a tensor the way I want (`0.5 -> 1` and `-0.5 -> -1`)?
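One direction that might be worth timing is a mask-free version built only from elementwise ops. This is a rough sketch, not something I have benchmarked; `round_half_away_from_zero` is a hypothetical helper name, not a PyTorch function:

```python
import torch

def round_half_away_from_zero(x: torch.Tensor) -> torch.Tensor:
    # Hypothetical helper, not part of PyTorch.
    # floor(|x| + 0.5) rounds the magnitude with ties going up;
    # multiplying by sign(x) restores the sign (sign(0) is 0, so zeros stay zero).
    return torch.sign(x) * torch.floor(torch.abs(x) + 0.5)

x = torch.tensor([-0.6, -0.5, -0.4, 0.4, 0.5, 0.6])
y = round_half_away_from_zero(x)
print(y)
```

On the sample input this gives the values I expect (`-0.5 -> -1`, `0.5 -> 1`). Since it uses no boolean indexing, it should avoid the gather/scatter work that `x[x > 0] = ...` does, but whether it gets close to `torch.round` on the GPU is an assumption that still needs measuring.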