I recently implemented a DCGAN in PyTorch. It works fine on my local machine, but when I run it on a cluster I get this error:
RuntimeError: Assertion `x >= 0. && x <= 1.’ failed. input value should be between 0~1, but got -0.020724 at /opt/conda/conda-bld/pytorch_1503970438496/work/torch/lib/THNN/generic/BCECriterion.c:34
The error clearly originates in the loss function (BCECriterion, i.e. binary cross-entropy, which expects its input to lie in [0, 1]), but that doesn’t make much sense to me, since the exact same code works on my local machine.
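To narrow this down, a check like the one below could be run on the discriminator output right before the loss is computed. It is only a sketch: `check_unit_range` is a hypothetical helper name, and it assumes the object passed in exposes `.min()` and `.max()` whose results convert to `float` (true for tensors in recent PyTorch; older versions may need `.data` first).

```python
def check_unit_range(t, name="input"):
    """Raise ValueError if any value of `t` lies outside [0, 1].

    `t` is assumed to expose .min() and .max(), e.g. a torch tensor.
    """
    lo, hi = float(t.min()), float(t.max())
    # Written as a chained comparison so that NaN values also fail the check.
    if not (0.0 <= lo <= hi <= 1.0):
        raise ValueError("%s outside [0, 1]: min=%g, max=%g" % (name, lo, hi))
    return lo, hi
```

Calling it on the value that goes into the BCE loss would show whether the out-of-range number comes from the network output itself or only appears inside the loss.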
Now, I haven’t worked on a cluster before, and I could only think of the following possible causes:
- A messed-up installation. But if that were the case, shouldn’t importing the corresponding libraries raise an error? Importing works fine. I also ran “conda install pytorch torchvision cuda80 -c soumith” again, and it reports that everything is in order.
- Python version. It’s 3.6 on both, I double checked.
- Seeding. Maybe the cluster seeds the random number generators differently, and these values only appear because of that? I tested this with manual seeding, but it didn’t change anything.
One last thought: maybe the paths are somehow messed up. The directory /opt/conda/conda-bld/pytorch_1503970438496 that appears in the error doesn’t exist on the cluster. If that is the problem, how can I fix it?
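To compare the two environments beyond the Python version, a small report like this could be run on both the local machine and the cluster and the outputs diffed. It is a sketch, not a definitive diagnostic: `describe_environment` is a made-up name, and the CUDA field is read defensively because older PyTorch builds may not expose `torch.version.cuda`.

```python
import importlib
import platform
import sys


def describe_environment():
    """Collect version and path details that can be compared between machines."""
    info = {
        "python": platform.python_version(),
        "executable": sys.executable,
    }
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # PyTorch is not importable from this interpreter.
        info["torch"] = None
        return info
    info["torch"] = torch.__version__
    info["torch_path"] = torch.__file__
    # CUDA version the binary was built against, if the build exposes it.
    info["cuda_build"] = getattr(getattr(torch, "version", None), "cuda", None)
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in sorted(describe_environment().items()):
        print("%s: %s" % (key, value))
```

If the PyTorch version or install path differs between the two machines, that would be the first thing to look at.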