As we all know, a GPU can accelerate arithmetic operations on tensors, such as `+ - * /`.

Given that, do tensors on the GPU also perform non-arithmetic operations faster than tensors on the CPU?

For example, I want to set all elements of the tensor that are smaller than 10000 to 123:

```
import torch

a = torch.tensor([i for i in range(100000)])
a[(a < 10000).nonzero(as_tuple=True)] = 123  # operation 1 (CPU)
```

```
a = torch.tensor([i for i in range(100000)]).cuda()
a[(a < 10000).nonzero(as_tuple=True)] = 123  # operation 2 (GPU)
```
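(As an aside, I believe the same update can also be written with a plain boolean mask, though I don't know whether that changes the timing:)

```python
import torch

a = torch.tensor([i for i in range(100000)])
a[a < 10000] = 123  # same effect as the nonzero() indexing above, I believe
```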

Is operation 2 necessarily faster than operation 1?
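In case it helps, here is a sketch of how I would try to time the two variants myself. I'm assuming `torch.cuda.synchronize()` is needed around the GPU timing, since CUDA kernels launch asynchronously and `perf_counter` would otherwise only measure the launch:

```python
import time
import torch

def assign_masked(t):
    # change all elements smaller than 10000 to 123, as in the question
    t[(t < 10000).nonzero(as_tuple=True)] = 123
    return t

# CPU timing
a_cpu = torch.arange(100000)
start = time.perf_counter()
assign_masked(a_cpu)
cpu_time = time.perf_counter() - start

# GPU timing (guarded, since CUDA may not be available)
if torch.cuda.is_available():
    a_gpu = torch.arange(100000).cuda()
    torch.cuda.synchronize()  # make sure the transfer to the GPU is done
    start = time.perf_counter()
    assign_masked(a_gpu)
    torch.cuda.synchronize()  # wait for the kernel to actually finish
    gpu_time = time.perf_counter() - start
    print(f"CPU: {cpu_time:.6f}s  GPU: {gpu_time:.6f}s")
```

I haven't run this on a GPU myself, so I can't say which side wins.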