I recently read about ONNX Runtime; it might be an alternative.
PyTorch can export models to the ONNX format, and the exported model can then be run with ONNX Runtime.
https://pytorch.org/docs/stable/onnx.html
According to the onnxruntime docs, NNAPI is supported, covering both CPU and GPU inference.
I haven't tried it yet, but it looks like it is the only GPU-enabled accelerator available for Android at the moment.