Does QAT inference not support GPU?

I’m using PyTorch 1.7.1 and trying to use QAT (quantization-aware training) for a project. Training works fine, but when I try to run inference on the GPU, I get errors.

Is QAT inference not supported on GPU at the moment? Will it be supported in a later release?

Currently int8 kernels are not supported on GPU. The workaround is either to evaluate on GPU without converting to int8 (the fake_quant modules left in by QAT emulate int8 numerics in fp32, which runs fine on CUDA), or to convert the model and evaluate the real int8 version on CPU. Adding int8 support on CUDA is being considered, but there is no timeline at the moment.
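A minimal sketch of both workarounds, using the eager-mode quantization API (`prepare_qat` / `convert`). `TinyNet`, its layer sizes, and the single forward pass standing in for the training loop are made up for illustration:

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    # Hypothetical toy model: quant/dequant stubs mark the int8 region.
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()
        self.fc = nn.Linear(4, 2)
        self.dequant = torch.quantization.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)
        x = self.fc(x)
        return self.dequant(x)


model = TinyNet()
model.qconfig = torch.quantization.get_default_qat_qconfig("fbgemm")
model.train()
torch.quantization.prepare_qat(model, inplace=True)

# One forward pass stands in for the QAT training loop, so the
# observers record some activation statistics.
model(torch.randn(8, 4))

# Workaround 1: evaluate on GPU *without* converting. The model still
# contains fake_quant modules, which emulate int8 in fp32 and run on CUDA.
if torch.cuda.is_available():
    model.eval().cuda()
    out_gpu = model(torch.randn(1, 4, device="cuda"))
    model.cpu()

# Workaround 2: convert to a real int8 model and evaluate on CPU only.
model.cpu().eval()
int8_model = torch.quantization.convert(model)
out_cpu = int8_model(torch.randn(1, 4))
print(out_cpu.shape)
```

Calling `int8_model(...)` on a CUDA tensor is what raises the error you saw: the quantized kernels behind the converted modules only have CPU implementations.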