Is QAT inference not supported on GPU?

I'm using PyTorch 1.7.1 and trying to use QAT (quantization-aware training) for a project. Everything works well during training, but when I try to run inference on the GPU, I get errors.

Is QAT inference not supported on GPU at the moment? Will it be supported in a later release?

Currently, int8 kernels are not supported on GPU. The workaround is either to eval on GPU without converting to int8 (with fake_quant emulating int8 numerics in fp32), or to convert the model and eval the int8 version on CPU, as sketched below. Adding int8 support on CUDA is being considered, but there is no timeline at the moment.
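
Here is a minimal sketch of both workarounds using the eager-mode QAT API available in 1.7. The `MyModel` module is a hypothetical toy model just for illustration, and the `fbgemm` backend is an assumption (it's the default CPU backend on x86):

```python
import torch
import torch.nn as nn
import torch.quantization as tq

class MyModel(nn.Module):  # hypothetical toy model for illustration
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()      # marks where fp32 -> int8 happens after convert
        self.conv = nn.Conv2d(3, 8, 3)
        self.relu = nn.ReLU()
        self.dequant = tq.DeQuantStub()  # marks where int8 -> fp32 happens after convert

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.conv(x))
        return self.dequant(x)

model = MyModel()
model.qconfig = tq.get_default_qat_qconfig("fbgemm")  # assumed backend
tq.prepare_qat(model, inplace=True)  # inserts fake-quant modules
# ... run your QAT training loop here ...

# Workaround 1: eval on GPU *without* converting to int8.
# The fake-quant modules emulate int8 numerics in fp32, so CUDA works.
model.eval()
gpu_model = model.cuda()
x = torch.randn(1, 3, 32, 32, device="cuda")
with torch.no_grad():
    out_gpu = gpu_model(x)

# Workaround 2: convert to a true int8 model and eval on CPU.
# The converted int8 (fbgemm) kernels only run on CPU.
cpu_model = tq.convert(model.cpu().eval())
with torch.no_grad():
    out_cpu = cpu_model(x.cpu())
```

Workaround 1 trades speed for convenience: you keep GPU throughput and int8-faithful numerics, but no actual int8 speedup. Workaround 2 gives you the real int8 kernels, just on CPU.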
