Hey guys,
EDIT: SEE BELOW
I’m working on using FP16 in DenseNets where concatenation is a key part of the architecture, but since HalfTensors don’t support stateless methods I can’t call cat().
I’ve worked around that by instantiating a temporary variable to hold the concatenated results and then copying into it by indexing, but then I get an error when calling backward() that looks like it comes from the final layer:
torch.mm(grad_output, weight)
RuntimeError: Type HalfTensor doesn’t implement stateless methods
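For reference, the indexing workaround I mean is roughly the following (a minimal sketch with made-up sizes, not my actual DenseNet code):

```python
import torch

# two half-precision feature maps to be concatenated along dim 1
a = torch.randn(8, 16).cuda().half()
b = torch.randn(8, 32).cuda().half()

# preallocate a HalfTensor of the combined size and copy slices into it,
# instead of calling torch.cat([a, b], 1) (which hits the stateless-method error)
out = a.new(a.size(0), a.size(1) + b.size(1))
out[:, :a.size(1)] = a
out[:, a.size(1):] = b
```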
Is there an easy way to work around this, or is training in half precision not supported yet?
I’m on CUDA 8.0, cuDNN 5105, Ubuntu 14.04, PyTorch 1.9 (the latest version you get through conda install), and a Maxwell Titan X. I’m in a situation where memory costs matter more than speed, so I can eat the slowdown of FP16 if it reduces memory usage.
As an aside, I get an error when calling torch.backends.cudnn.version() if I don’t call torch.backends.cudnn._loadlib() first (the “cuDNN not initialized” error). cuDNN still works, but this can be confusing when you’re trying to check whether a source build succeeded.
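For anyone hitting the same thing while sanity-checking a build, the call order I mean is just:

```python
import torch.backends.cudnn as cudnn

cudnn._loadlib()        # without this, version() raises "cuDNN not initialized"
print(cudnn.version())  # e.g. 5105
```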
EDIT: Never mind, it looks like nothing other than F.linear has this issue, so switching over to the state’d version (three edits) fixed it, and it’s now running smoothly and quickly in FP16. Dongers up for pytorch, woo!
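In case it helps anyone else, the kind of change I mean is just swapping the stateless call for the tensor-method form (illustrative only, with dummy tensors; not the actual F.linear source):

```python
import torch

grad_output = torch.randn(4, 8).cuda().half()
weight = torch.randn(8, 3).cuda().half()

# stateless form, which raised "Type HalfTensor doesn't implement stateless methods" for me:
# out = torch.mm(grad_output, weight)

# tensor-method ("state'd") form, which runs fine on HalfTensors:
out = grad_output.mm(weight)
```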
Probably would still make sense to add a stateless cat() method, though; I suspect the way I’m doing it is inefficient.
Thanks,
Andy