Cumsum_cuda_kernel determinism


This code works fine in PyTorch 1.11.0 and 1.12.0:

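import torch

# The error below only appears when deterministic mode is enabled.
torch.use_deterministic_algorithms(True)

t = torch.tensor(range(10), dtype=float)
t = t.cuda()
x = torch.cumsum(t, 0)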

On PyTorch 1.13 it fails with this error:

RuntimeError: cumsum_cuda_kernel does not have a deterministic implementation, but you set 'torch.use_deterministic_algorithms(True)'. You can turn off determinism just for this operation, or you can use the 'warn_only=True' option, if that's acceptable for your application. You can also file an issue at to help us prioritize adding deterministic support for this operation.

I found an open issue about it: "Feature Request: deterministic CUDA cumsum" (pytorch/pytorch issue #89492).

We run our models both with and without determinism; both modes are used widely on the same models.

I tried to write a simple workaround like this:

from typing import Any

import torch


def cumsum(*args: Any, **kwargs: Any) -> Any:
    # `cumsum_cuda_kernel` does not have a deterministic implementation, so we
    # turn off determinism enforcement just for this one operation; otherwise
    # it raises an error in deterministic mode.
    is_deterministic = torch.are_deterministic_algorithms_enabled()
    is_warn_only = False
    if is_deterministic:
        # Remember the warn-only setting, then disable enforcement.
        is_warn_only = torch.is_deterministic_algorithms_warn_only_enabled()
        torch.use_deterministic_algorithms(False)
    result = torch.cumsum(*args, **kwargs)
    if is_deterministic:
        # Restore the previous deterministic mode.
        torch.use_deterministic_algorithms(True, warn_only=is_warn_only)
    return result

But when we try to TorchScript the model, we run into:

E       RuntimeError: 
E       Unknown builtin op: aten::are_deterministic_algorithms_enabled.
E       Could not find any similar ops to aten::are_deterministic_algorithms_enabled. This op may not exist or may not be currently supported in TorchScript.

Is there any recommended way to go about it?

Also tracked here.
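
One direction I am considering, as a rough sketch rather than a recommended fix: mark the wrapper with `torch.jit.ignore` so the TorchScript compiler leaves it as a plain Python call and never tries to resolve `are_deterministic_algorithms_enabled`. The names `cumsum_allow_nondeterministic` and `CumsumModule` are hypothetical, chosen only for illustration. As far as I understand, a scripted module that calls an ignored function cannot be exported with `torch.jit.save`, so this would only help when the scripted model runs in the same Python process.

```python
import torch


@torch.jit.ignore
def cumsum_allow_nondeterministic(x: torch.Tensor, dim: int) -> torch.Tensor:
    # Hypothetical helper: runs as ordinary Python, so the determinism
    # flags can be queried and toggled here even under torch.jit.script.
    was_enabled = torch.are_deterministic_algorithms_enabled()
    warn_only = torch.is_deterministic_algorithms_warn_only_enabled()
    torch.use_deterministic_algorithms(False)
    try:
        return torch.cumsum(x, dim)
    finally:
        # Restore whatever mode was active before the call.
        torch.use_deterministic_algorithms(was_enabled, warn_only=warn_only)


class CumsumModule(torch.nn.Module):
    # Hypothetical module used only to show the wrapper being scripted.
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return cumsum_allow_nondeterministic(x, 0)


scripted = torch.jit.script(CumsumModule())

# Use the GPU when one is present; the sketch also runs on CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
t = torch.arange(10, dtype=torch.float64, device=device)
x = scripted(t)
```

The trade-off is that the cumsum call itself is still nondeterministic on CUDA; the wrapper only stops the error from being raised and then restores the previous mode.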