When running my Rainbow agent in Python 2, e.g. to make it fail quickly via python main.py --learn-start 1000 --evaluation-interval 1100 --disable-cuda, I get the error RuntimeError: invalid argument 2: out of range at /opt/conda/conda-bld/pytorch_1556653194318/work/aten/src/TH/generic/THTensor.cpp:639 on this line. In Python 2, the last two arguments contain extremely large negative numbers (e.g. -9223372036854775808) and -inf, respectively, whereas under Python 3 the same arguments hold the expected values: 0, 1, …, 1630, and 0.0049, 0.0053… I'm not sure why changing the Python version should cause this problem.
Using PyTorch 1.1.0 with CUDA 10 for both Python 2 and 3.
Thanks for the reproducible code snippet.
In Python 2.x, the / operator performs integer (floor) division when both operands are integers; in Python 3, / is always true division and integer division is written as //.
Therefore, self.delta_z will be zero:
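As a minimal sketch of the effect: the names and values of v_min, v_max and atoms below are assumptions chosen for illustration, not taken from the thread. The // operator is used to reproduce Python 2's integer / on any Python version.

```python
# Hypothetical values, assumed for illustration only.
v_min, v_max, atoms = -10, 10, 51

# Python 2 semantics: `/` on two ints floors the result.
# `//` behaves the same way on both Python 2 and Python 3.
delta_z_py2 = (v_max - v_min) // (atoms - 1)

# Python 3 semantics: `/` is true division.
# Converting one operand to float gives the same result on Python 2.
delta_z_py3 = float(v_max - v_min) / (atoms - 1)

print(delta_z_py2)  # 0
print(delta_z_py3)  # 0.4
```

Any later division by a zero delta_z would plausibly explain the -inf values and the out-of-range indices. Putting from __future__ import division at the top of the module makes / behave as true division in Python 2 as well.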
Ah, thanks for spotting that! I'm not used to writing completely cross-compatible Python code, but someone requested it and I thought I'd try. That is probably the only place where it is an issue, but I'm sprinkling that import (from __future__ import division) into any file with a division, just in case. With this I've hopefully tracked down any other cross-compatibility issues, as I've now got it running fine on Python 2.