How does PyTorch consume GPU memory when initializing the CUDA context? I could not find any documentation explaining this, but in my experiments it consumes about 700 MiB of device memory.