Hi,

I am performing a series of element-wise operations on large 1D torch tensors of the same shape to get several feature vectors. The feature vectors are then multiplied element-wise to get the desired result. Everything runs on the GPU. For example:

```
import torch

# large 1D torch tensors (short stand-ins shown here)
a = torch.tensor([0.000001, 0.1, 1, 100, 1000]).to("cuda")
b = torch.tensor([0.000002, 0.2, 2, 200, 2000]).to("cuda")
# 1st feature computation
feature_a = a**2
# 2nd feature computation
feature_b = 1/(b**4)
# 3rd feature computation
feature_c = ...
# multiplying feature vectors element-wise to get the final result
# (torch.prod expects a tensor, so the features are stacked first)
result = torch.prod(torch.stack([feature_a, feature_b, feature_c]), dim=0)
```

The thing is, after computing `feature_a`, I can already see that the first element of `feature_a` will be very close to 0, which in turn will cause the first element of `result` to be very close to 0.

Is there some way to look for the small values after the calculation of each feature and then to tell PyTorch that it can avoid the computation of these obvious elements in subsequent features and only spend time on the other elements?
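To make the idea concrete, here is a minimal sketch of what such a skip could look like with a boolean mask. The cutoff `eps` is a hypothetical value chosen only for illustration, only two features are used, and the device falls back to the CPU when no GPU is available. One caveat: dropping an element is only safe if the remaining features are bounded. With the example values above, `feature_b[0]` is roughly 6.25e22, so the true product for the first element is not small, and the sketch would wrongly return 0 there.

```python
import torch

# Assumption: fall back to CPU so the sketch also runs without a GPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

a = torch.tensor([0.000001, 0.1, 1, 100, 1000], device=device)
b = torch.tensor([0.000002, 0.2, 2, 200, 2000], device=device)

eps = 1e-9  # hypothetical cutoff below which an element counts as "obviously ~0"

# 1st feature is computed for every element
feature_a = a**2

# Boolean mask of the elements that are still worth computing
keep = feature_a.abs() > eps

# 2nd feature is computed only for the kept elements
feature_b_kept = 1 / (b[keep] ** 4)

# Skipped elements are left at exactly 0 in the result
result = torch.zeros_like(a)
result[keep] = feature_a[keep] * feature_b_kept

print(keep)
print(result)
```

Note that indexing with a boolean mask such as `b[keep]` produces a tensor whose size depends on the data, so on CUDA it forces a synchronization with the host and an extra gather/scatter, which is exactly the kind of overhead the question is worried about.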

The question is not just how to do it but whether it is worth doing. The aim is to speed up the calculations, so if the indexing operations would stall the otherwise smooth GPU operations, I will probably stick to the current procedure.
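Whether it pays off probably has to be measured on the real data. Below is a rough timing sketch under stated assumptions: the helper names (`dense`, `masked`, `time_fn`), the cutoff `eps`, the tensor size, and the synthetic inputs (half of `a` scaled down so that it falls below the cutoff, `b` kept in [1, 2) so that `1/b**4` stays bounded) are all made up for illustration. On CUDA, `torch.cuda.synchronize()` is called before reading the clock because kernels are launched asynchronously.

```python
import time
import torch

# Assumption: fall back to CPU so the sketch also runs without a GPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

def sync():
    # CUDA kernels run asynchronously; wait for them before reading the clock.
    if device == "cuda":
        torch.cuda.synchronize()

def dense(a, b):
    # Current procedure: compute every feature for every element.
    return a**2 * (1 / b**4)

def masked(a, b, eps=1e-9):
    # Candidate procedure: skip elements whose first feature is below eps.
    feature_a = a**2
    keep = feature_a.abs() > eps
    out = torch.zeros_like(a)
    out[keep] = feature_a[keep] / b[keep] ** 4
    return out

def time_fn(fn, *args, repeats=10):
    fn(*args)  # warm-up run, not timed
    sync()
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    sync()
    return (time.perf_counter() - start) / repeats

# Synthetic data (hypothetical): every second element of a is tiny.
n = 1_000_000
a = torch.rand(n, device=device)
a[::2] *= 1e-6
b = torch.rand(n, device=device) + 1.0

t_dense = time_fn(dense, a, b)
t_masked = time_fn(masked, a, b)
print(f"dense:  {t_dense * 1e3:.3f} ms per call")
print(f"masked: {t_masked * 1e3:.3f} ms per call")
```

The outcome will depend on how expensive the later features are and on what fraction of elements can be skipped, so the numbers from this toy setup should not be taken as representative.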

Thanks!

NOTE: I am not interested in the gradients, only the forward pass.