Using half precision

I encountered a similar problem and needed to save memory.

I tried automatic mixed-precision training with autocast, but simply wrapping the forward pass in autocast didn't significantly reduce memory usage. I think this is because autocast only casts certain operations to half precision, so the savings depend on which ops the model uses. Roughly what I did is sketched below.
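For reference, here is a minimal sketch of the autocast setup I mean (the toy model, sizes, and data here are just placeholders for illustration, not my actual model):

```python
import torch
import torch.nn as nn
from torch.cuda.amp import autocast, GradScaler

# Toy model and data purely for illustration
model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 10)).to('cuda')
optimizer = torch.optim.Adam(model.parameters())
criterion = nn.CrossEntropyLoss()
scaler = GradScaler()

inputs = torch.randn(64, 1024, device='cuda')
targets = torch.randint(0, 10, (64,), device='cuda')

optimizer.zero_grad()
# Ops inside autocast run in float16 where it helps (e.g. matmuls, convs);
# other ops stay in float32, so the memory saving depends on the model.
with autocast():
    outputs = model(inputs)
    loss = criterion(outputs, targets)
# GradScaler scales the loss so float16 gradients don't underflow
scaler.scale(loss).backward()
scaler.step(optimizer)
scaler.update()
```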

I also tried model.half().to('cuda'), which also didn't show a significant memory saving. Is it supposed to significantly reduce the memory used by the model? A sketch of that attempt is below.
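This is roughly the full half-precision attempt (again with a toy model just to illustrate; I'm assuming torch.cuda.memory_allocated is a reasonable way to check usage):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(), nn.Linear(1024, 10))
# Convert all parameters and buffers to float16, then move to the GPU
model = model.half().to('cuda')

# Inputs must also be float16, otherwise the forward pass raises a dtype mismatch
inputs = torch.randn(64, 1024, device='cuda', dtype=torch.float16)
outputs = model(inputs)

# Parameter memory alone is halved; activations and optimizer state
# can still dominate, which may explain the small overall saving.
print(torch.cuda.memory_allocated() / 1024**2, "MiB allocated")
```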