Issue with AutoAugment when using half-precision floats (torch.float16)

Hi,
I’ve run into an issue when trying to use the torchvision.transforms.autoaugment.AutoAugment() module after setting the default dtype to torch.float16.
The minimal code to reproduce the error is the following:

```python
import torch
import torchvision.transforms as transforms

# Every newly created floating-point tensor now defaults to float16
torch.set_default_dtype(torch.float16)

image = torch.randn(10, 3, 32, 32).cuda()

transform = transforms.autoaugment.AutoAugment()
transform = transform.cuda()

result = transform(image)
```

Running this gives the error:

```
"round_vml_cpu" not implemented for 'Half'
```

My understanding is that this is a bug: AutoAugment() does not cast the tensors created in the _augmentation_space function to the input tensor's device.

I’ve currently fixed it by replacing this function with one that creates the tensors directly on the GPU, as sketched below.
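For reference, my workaround looks roughly like the following. This is only a sketch: just two entries of the augmentation space are shown for brevity, and the exact _augmentation_space signature may differ between torchvision versions, so treat it as an illustration of the idea rather than a drop-in replacement.

```python
import torch
from torchvision.transforms import autoaugment

class CudaAutoAugment(autoaugment.AutoAugment):
    # Rebuild the augmentation space with tensors created directly on the
    # GPU, where round() is implemented for float16. A real override must
    # recreate every entry the policies use; only two are shown here.
    def _augmentation_space(self, num_bins, image_size):
        device = torch.device("cuda")
        return {
            "ShearX": (torch.linspace(0.0, 0.3, num_bins, device=device), True),
            # The entry that fails on the CPU: creating it on the GPU avoids
            # the missing "round_vml_cpu" kernel for half precision.
            "Posterize": (
                8 - (torch.arange(num_bins, device=device) / ((num_bins - 1) / 4)).round().int(),
                False,
            ),
        }
```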

I’m posting this here to check whether it is a bug, something not yet implemented, or whether I’ve misunderstood how to use the module.

Thanks in advance.

The failing operation is defined in this line of code; it does not use the input tensor, but creates a new one based on the passed num_bins argument:

```python
(8 - (torch.arange(num_bins) / ((num_bins - 1) / 4)).round().int(), False)
```

Since you have set the default dtype to float16, this newly created tensor will be float16 as well, and the round() operation will fail because it is not implemented for half precision on the CPU.
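You can reproduce the failure without torchvision at all. This minimal sketch isolates the mechanism (the num_bins value is just an example, not necessarily what AutoAugment passes):

```python
import torch

torch.set_default_dtype(torch.float16)

num_bins = 10  # example value
# An integer arange divided by a Python float is promoted to the default
# dtype, which is now float16.
t = torch.arange(num_bins) / ((num_bins - 1) / 4)
print(t.dtype)  # torch.float16
t.round()       # raises: "round_vml_cpu" not implemented for 'Half'
```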

In my opinion, using torch.set_default_dtype is dangerous, as it has many side effects, this issue being one of them.
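A less invasive pattern (a sketch of one alternative, not official torchvision guidance) is to leave the default dtype alone and cast only your own tensors after augmentation:

```python
import torch
import torchvision.transforms as transforms

# Leave the global default dtype untouched so library-internal tensors
# keep the dtypes they were written for.
transform = transforms.autoaugment.AutoAugment()

# AutoAugment expects uint8 image tensors; cast to float16 only after
# the augmentation step.
image = torch.randint(0, 256, (3, 32, 32), dtype=torch.uint8)
augmented = transform(image)
result = augmented.half().cuda()  # cast just this tensor
```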