The problem is that you're running it on the CPU. Here's a reply to a similar problem from another post on the PyTorch forums. It should work if you do a .cuda() and move everything to the GPU.
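A minimal sketch of that fix, using a stand-in nn.Linear in place of the actual model (the guard on torch.cuda.is_available() is only there so the snippet also loads on CPU-only machines):

```python
import torch
import torch.nn as nn

# Half-precision math is implemented on the GPU, so convert both the
# model and the input with .half() and move them there with .cuda().
model = nn.Linear(4, 2)          # stand-in for the real model
x = torch.randn(1, 4)

if torch.cuda.is_available():    # guard: skip on CPU-only machines
    model = model.half().cuda()
    x = x.half().cuda()
    out = model(x)               # fp16 computation runs on the GPU
```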
Not sure I understand your answer. You say "On the CPU it's just a dummy tensor for storage" — does that mean a half CPU tensor actually stores its data in float32, so it's a fully dummy class without any useful logic at all?
A HalfTensor on the CPU is equivalent to a HalfTensor on the GPU, but it does not implement any mathematical operations. On the CPU it only supports copy and serialization operations. (A half CPU tensor doesn't store data in float32; it stores it in float16.)
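A quick check of the storage claim (float16 is 2 bytes per element, and the copy/conversion path back to float32 works on the CPU):

```python
import torch

# A CPU half tensor genuinely stores 16-bit values.
t = torch.randn(4).half()
print(t.element_size())  # 2 bytes per element, i.e. real float16 storage
f = t.float()            # copy/conversion to float32 is supported on CPU
```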
OK, then my question still stands: assume we have a pretrained resnet18 and we want to run the model on the CPU for prediction, storing all weights and intermediate data in float16 (converting to float32 when necessary) and doing all computations in float32. Can this be done with PyTorch, and if so, how?
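One possible sketch of that scheme (not an official PyTorch recipe): keep the parameters in a float16 state dict to save memory, and materialize a float32 copy just before running inference, so all math happens in float32 on the CPU. A small nn.Linear stands in for resnet18 here:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2)  # stand-in for a pretrained resnet18

# Store all weights in float16 (halves the memory footprint of the weights).
half_state = {k: v.half() for k, v in model.state_dict().items()}

def predict(x):
    # Convert the stored fp16 weights back to float32 when needed,
    # then run the computation entirely in float32 on the CPU.
    model.load_state_dict({k: v.float() for k, v in half_state.items()})
    return model(x)

out = predict(torch.randn(1, 4))
```

Intermediate activations could be cast with .half() between layers in the same spirit, at the cost of an explicit .float() before each compute step.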