Hi,
I have a Linear/GRU NN implementation that will eventually run on an embedded system, so inference needs to be done in float16. I tried running the test with model.half(), but I got the following error:
*** RuntimeError: "compute_indices_weights_linear" not implemented for 'Half'
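For reference, here is a minimal snippet that I believe reproduces the error on CPU. I'm guessing the failing op is a linear interpolation/upsampling step, since "compute_indices_weights_linear" looks like it comes from PyTorch's upsample kernels (my guess, not verified):

```python
import torch
import torch.nn.functional as F

# Minimal CPU repro: linear interpolation on a half tensor.
# On my setup this raises the RuntimeError quoted above.
x = torch.randn(1, 8, 50).half()
y = F.interpolate(x, scale_factor=2.0, mode='linear', align_corners=False)
```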
I read some discussions on this topic pointing out that float16 is not fully supported on CPU because it wouldn't provide much speedup. In my case, however, I want to test with float16 to make sure there is no degradation in the model's accuracy when I port the network to an embedded environment where everything runs in float16. Is there any way for me to do this other than moving my inference test to CUDA?
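One workaround I'm considering (I'm not sure it's numerically sound) is to emulate float16 precision on CPU by round-tripping the weights and inputs through half while keeping the actual computation in float32. A rough sketch, with a placeholder model standing in for my real network:

```python
import torch
import torch.nn as nn

def fp16_roundtrip(t):
    # Round values through float16 so they carry fp16 precision,
    # while the actual CPU math still uses float32 kernels.
    return t.half().float()

# Stand-in model with placeholder sizes -- not my real Linear/GRU net.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 8))

with torch.no_grad():
    for p in model.parameters():
        p.copy_(fp16_roundtrip(p))
    x = fp16_roundtrip(torch.randn(4, 32))
    y = model(x)

print(y.dtype)  # torch.float32, but weights and inputs were rounded to fp16
```

This only rounds the weights and inputs, not the intermediate activations, so it probably underestimates the true float16 error. Would something like this be a valid test, or is there a better approach?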
Thanks
Yohannes