A strange CUDA out-of-memory error

fabio_carbone · March 22, 2019, 2:39pm

I am testing pix2pixHD. It works on my local machine, but it raises an error on a cloud server. The strange thing is that the server is the more powerful machine.

Here are the details:

LOCAL MACHINE
Ubuntu 16.04
GPU: GeForce GTX 1050 - 4GB GPU Memory
PyTorch version: 0.4.0
CUDA 9.0

SERVER MACHINE
Ubuntu 16.04
GPU: GeForce GTX 1080 - 8GB GPU Memory
PyTorch version: 1.0
CUDA 10.0

On the server machine I get:

RuntimeError: CUDA out of memory. Tried to allocate 256.00 MiB (GPU 0; 7.93 GiB total capacity; 7.27 GiB already allocated; 92.19 MiB free; 33.05 MiB cached)
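
A note for anyone decoding this message: the figures are a snapshot taken at the moment the allocation failed, so they describe the state during the run, not before the script started. Below is a minimal sketch that pulls the sizes out of the quoted text and compares the request with the free memory. The helper name `parse_oom_message` is made up for illustration and is not part of PyTorch.

```python
import re

# The error text exactly as quoted in the post above.
MESSAGE = (
    "RuntimeError: CUDA out of memory. Tried to allocate 256.00 MiB "
    "(GPU 0; 7.93 GiB total capacity; 7.27 GiB already allocated; "
    "92.19 MiB free; 33.05 MiB cached)"
)

# Conversion factors to MiB for the units that appear in the message.
UNIT_TO_MIB = {"MiB": 1.0, "GiB": 1024.0}


def parse_oom_message(message):
    """Return the sizes found in the message as floats, in MiB.

    Hypothetical helper for illustration; not a PyTorch API.
    """
    pattern = re.compile(
        r"(\d+(?:\.\d+)?) (MiB|GiB) "
        r"(total capacity|already allocated|free|cached)"
    )
    sizes = {}
    for value, unit, label in pattern.findall(message):
        sizes[label] = float(value) * UNIT_TO_MIB[unit]
    # The requested size has no trailing label, so match it separately.
    requested = re.search(
        r"Tried to allocate (\d+(?:\.\d+)?) (MiB|GiB)", message
    )
    sizes["requested"] = (
        float(requested.group(1)) * UNIT_TO_MIB[requested.group(2)]
    )
    return sizes


sizes = parse_oom_message(MESSAGE)
# How much larger the request is than the memory reported as free.
shortfall = sizes["requested"] - sizes["free"]
print(sizes)
print("shortfall in MiB:", round(shortfall, 2))
```

Read this way, the process had already allocated almost the whole card when it asked for another 256 MiB, which is more than the roughly 92 MiB reported as free.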

I ran nvidia-smi before launching the script and got:

No running process found
Memory usage: 0MiB / 8119MiB

How is that possible?
Could it be the difference in PyTorch and CUDA versions?
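
To compare the two setups directly, one option is to print the same report on both machines. This is only a sketch: `describe_environment` is a hypothetical helper name, and it assumes the first GPU (index 0) is the one used for training.

```python
def describe_environment():
    """Collect version and GPU details into a dict.

    Hypothetical helper for illustration. It returns a short note
    instead of raising when PyTorch is not installed.
    """
    try:
        import torch
    except ImportError:
        return {"torch": None, "note": "PyTorch is not installed"}

    info = {
        "torch": torch.__version__,
        # CUDA version PyTorch was built against; None on CPU-only builds.
        "cuda_build": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        # Assumption: device 0 is the GPU of interest.
        info["device"] = torch.cuda.get_device_name(0)
        # Bytes held by live tensors in this process only.
        info["allocated_bytes"] = torch.cuda.memory_allocated(0)
    return info


report = describe_environment()
for key, value in report.items():
    print(key, "=", value)
```

Running it on both machines makes any version mismatch visible side by side.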

fabio_carbone · March 23, 2019, 10:53am

Downgrading PyTorch to 0.4.1 solved the issue.
It could be related to the new memory management in PyTorch.

zhuzhu181 · March 23, 2019, 2:18pm

Use the command “watch -n 1 -d nvidia-smi” to view your memory usage in real time.
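
If you would rather log the same numbers from inside a script, a rough sketch is to call nvidia-smi with its CSV query options and parse the result. The function names here are hypothetical, and the sample line is illustrative input, not output captured from the server in this thread.

```python
import subprocess

# nvidia-smi query options: CSV output, no header row, no unit suffix
# (memory values are then plain integers in MiB).
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_line(line):
    """Turn one CSV line such as '7444, 8119' into (used_mib, total_mib)."""
    used, total = (int(field.strip()) for field in line.split(","))
    return used, total


def read_gpu_memory():
    """Return [(used_mib, total_mib), ...], one tuple per GPU.

    Returns [] if nvidia-smi is missing, fails, or reports a
    non-numeric field.
    """
    try:
        output = subprocess.check_output(QUERY, universal_newlines=True)
        return [
            parse_memory_line(line)
            for line in output.splitlines()
            if line.strip()
        ]
    except (OSError, ValueError, subprocess.CalledProcessError):
        return []


# Illustrative input only; these numbers are made up for the example.
sample = parse_memory_line("7444, 8119")
print(sample)
print(read_gpu_memory())
```

Calling `read_gpu_memory()` every few iterations of the training loop gives a record of how usage grows up to the point of failure, which a single check before launch cannot show.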