Hey,
I’m experiencing an unusual bug with PyTorch v0.4 on Windows.
I’m training the same model with .cuda().half() and with just .cuda(), and I’m getting different results.
With .cuda().half() the loss becomes NaN after the second step, but with just .cuda() the loss decreases normally; the downside is that I then can’t increase my batch size.
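For what it’s worth, a loss going NaN within a couple of steps is consistent with fp16’s narrow dynamic range (max ~65504, smallest subnormal ~6e-8), independent of any Windows-specific issue. A minimal illustration using NumPy’s float16 (just to show the numeric limits, not my actual model):

```python
import numpy as np

# fp16 overflows to inf above ~65504, e.g. a large loss or activation:
big = np.float16(60000.0) * np.float16(2.0)
print(big)  # inf -- one more multiply and inf - inf gives NaN

# fp16 underflows to zero below ~6e-8, e.g. a small gradient product:
tiny = np.float16(1e-4) * np.float16(1e-4)
print(tiny)  # 0.0 -- the gradient signal is silently lost
```

If that’s what’s happening here, loss scaling (multiplying the loss before backward and unscaling the gradients before the optimizer step) is the usual workaround, but I’d still like to know whether others see this on the Windows build specifically.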
I’m using the following versions:
__Python VERSION: 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 10:22:32) [MSC v.1900 64 bit (AMD64)]
__pyTorch VERSION: 0.4.0
__CUDA VERSION:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2017 NVIDIA Corporation
Built on Fri_Sep__1_21:08:32_Central_Daylight_Time_2017
Cuda compilation tools, release 9.0, V9.0.176
__CUDNN VERSION: 7005
Is anyone else experiencing this on the Windows version?
Tal