I’ve compiled PyTorch from source for development purposes in a conda environment (dev); I also have a second environment (lisa) with PyTorch installed following the instructions on pytorch.org. When I run the VAE example from the PyTorch tutorials against my locally built (newest) version, I get the following error:
(dev) [SLURM] suhubdyd@kepler2:~/research/models/pytorch-models/vae$ python main.py
THCudaCheck FAIL file=/u/suhubdyd/research/dl-frameworks/pytorch/torch/lib/THC/generated/../generic/THCTensorMathPointwise.cu line=247 error=8 : invalid device function
Traceback (most recent call last):
File "main.py", line 138, in <module>
train(epoch)
File "main.py", line 108, in train
recon_batch, mu, logvar = model(data)
File "/u/suhubdyd/.conda/envs/dev/lib/python2.7/site-packages/torch/nn/modules/module.py", line 206, in __call__
result = self.forward(*input, **kwargs)
File "main.py", line 71, in forward
mu, logvar = self.encode(x.view(-1, 784))
File "main.py", line 54, in encode
h1 = self.relu(self.fc1(x))
File "/u/suhubdyd/.conda/envs/dev/lib/python2.7/site-packages/torch/nn/modules/module.py", line 206, in __call__
result = self.forward(*input, **kwargs)
File "/u/suhubdyd/.conda/envs/dev/lib/python2.7/site-packages/torch/nn/modules/linear.py", line 54, in forward
return self._backend.Linear.apply(input, self.weight, self.bias)
File "/u/suhubdyd/.conda/envs/dev/lib/python2.7/site-packages/torch/nn/_functions/linear.py", line 14, in forward
output.add_(bias.expand_as(output))
RuntimeError: cuda runtime error (8) : invalid device function at /u/suhubdyd/research/dl-frameworks/pytorch/torch/lib/THC/generated/../generic/THCTensorMathPointwise.cu:247
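A common cause of `cuda runtime error (8) : invalid device function` with a source build is that the extension kernels were compiled without support for the GPU's compute capability, so the binary has no device code for the card. A rebuild that names the architecture explicitly often resolves it. The sketch below is a guess at a fix, not a confirmed diagnosis; the `3.5` value and the source path are assumptions you should replace with your own.

```shell
# Query the card's compute capability (assumes a torch build where
# torch.cuda.get_device_capability is available; nvidia-smi also reports it).
python -c "import torch; print(torch.cuda.get_device_capability(0))"

# Rebuild with that capability listed explicitly.
# "3.5" here is an assumption (e.g. a Kepler-class card) -- substitute yours.
export TORCH_CUDA_ARCH_LIST="3.5"
cd ~/research/dl-frameworks/pytorch   # assumed source checkout path
python setup.py clean
python setup.py install
```

If the capability reported by the first command is not covered by the arch list used at build time, every custom kernel launch fails with exactly this error, which would also explain why the prebuilt stable package (built for a wide range of architectures) runs fine.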
The stable distribution, on the other hand, runs the same script without any problem:
(lisa) [SLURM] suhubdyd@kepler2:~/research/models/pytorch-models/vae$ python main.py
Train Epoch: 1 [0/60000 (0%)] Loss: 549.307800
Train Epoch: 1 [1280/60000 (2%)] Loss: 311.565613
Train Epoch: 1 [2560/60000 (4%)] Loss: 236.338989
Train Epoch: 1 [3840/60000 (6%)] Loss: 225.371078
Train Epoch: 1 [5120/60000 (9%)] Loss: 208.993668
Train Epoch: 1 [6400/60000 (11%)] Loss: 206.427368
Train Epoch: 1 [7680/60000 (13%)] Loss: 208.580597
Train Epoch: 1 [8960/60000 (15%)] Loss: 201.324646
Train Epoch: 1 [10240/60000 (17%)] Loss: 191.824326
Train Epoch: 1 [11520/60000 (19%)] Loss: 195.214188
Train Epoch: 1 [12800/60000 (21%)] Loss: 189.770447
Train Epoch: 1 [14080/60000 (23%)] Loss: 173.119644
Train Epoch: 1 [15360/60000 (26%)] Loss: 179.030197
Train Epoch: 1 [16640/60000 (28%)] Loss: 170.247345
Train Epoch: 1 [17920/60000 (30%)] Loss: 169.193451
Train Epoch: 1 [19200/60000 (32%)] Loss: 162.828690
Train Epoch: 1 [20480/60000 (34%)] Loss: 158.171326
Train Epoch: 1 [21760/60000 (36%)] Loss: 158.530518
Train Epoch: 1 [23040/60000 (38%)] Loss: 155.896255
Train Epoch: 1 [24320/60000 (41%)] Loss: 158.835968
Train Epoch: 1 [25600/60000 (43%)] Loss: 152.416977
Train Epoch: 1 [26880/60000 (45%)] Loss: 153.593964
Train Epoch: 1 [28160/60000 (47%)] Loss: 147.944260
Train Epoch: 1 [29440/60000 (49%)] Loss: 148.223892
Train Epoch: 1 [30720/60000 (51%)] Loss: 145.770905
Train Epoch: 1 [32000/60000 (53%)] Loss: 144.410706
Train Epoch: 1 [33280/60000 (55%)] Loss: 147.592163
Train Epoch: 1 [34560/60000 (58%)] Loss: 149.320328