THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1532576128691/work/aten/src/THC/THCCachingAllocator.cpp line=507 error=3 : initialization error

Are you using multiprocessing (for example, multiple workers in your DataLoader) together with CUDA tensors?
If so, this post might explain the error above.
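
One commonly suggested workaround for this combination is to start worker processes with the "spawn" start method instead of the default "fork" on Linux, because a forked child inherits the parent's already initialized CUDA state and cannot re-initialize it. The sketch below uses only the standard library so it runs without PyTorch or a GPU. The names `worker` and `spawn_context` are hypothetical, and the body of `worker` is a placeholder for whatever CUDA work your own code does. `torch.multiprocessing` wraps the same standard-library module, so the same idea should carry over, but treat this as an assumption to verify against your PyTorch version.

```python
import multiprocessing as mp


def worker(index):
    # Placeholder for per-worker work. In real code, this is where CUDA
    # tensors would be created, inside the child process rather than
    # in the parent before the workers are started.
    return index * index


def spawn_context():
    # "spawn" launches each child in a fresh interpreter, so the child
    # does not inherit CUDA state that the parent may have initialized.
    return mp.get_context("spawn")


if __name__ == "__main__":
    # The __main__ guard is required with "spawn": children re-import
    # this module, and unguarded code would run again in each child.
    ctx = spawn_context()
    with ctx.Pool(2) as pool:
        print(pool.map(worker, range(4)))
```

If switching the start method is not an option, the other common approach is to keep the Dataset on CPU tensors and move batches to the GPU in the main process only.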