CUDA memory issue, probably caused by Python

Hello everyone, I need your kind help.

I have to run my U-Net model on a large 3D image, so I chop it into smaller portions, run the model on each portion individually, and then stitch the results back together.

code_1.py chops the large 3D image into smaller portions and calls code_2.py inside a for loop, passing one small piece of the (chopped) image to each call.

Everything works fine when I call code_2.py from inside code_1.py like this:

import os
import torch

for small_img in large_img_list:
    torch.cuda.empty_cache()
    os.system('%s %s %s' % ('python3', 'code_2.py', small_img))
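
(For completeness, the same loop written with subprocess.run instead of os.system; the behaviour should be identical, since each chunk still runs in its own Python process, I just avoid building the command as a string. This assumes small_img is a path or other string, as the os.system call above implies.)

import subprocess

for small_img in large_img_list:
    # each chunk still runs in a separate Python process, as with os.system
    subprocess.run(['python3', 'code_2.py', str(small_img)], check=True)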

However, when I wrap the code of code_2.py in a function inside the same file (code_2.py), import that file into code_1.py, and call the function, I run out of CUDA memory. The code is below.

# This is code_1.py
import torch
from code_2 import *

for small_img in large_img_list:
    torch.cuda.empty_cache()
    RunModels(small_img)

Why is this? I clear every tensor I allocate inside code_2.py:

torch.cuda.empty_cache()
del image
del mask

Maybe this is a Python problem rather than a PyTorch problem?
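
(One workaround I might try, in case it helps the discussion: keep the per-chunk process isolation of the os.system version, but still call RunModels() as a function, by spawning each chunk in its own process. This is only a sketch and assumes RunModels and large_img_list are as in the snippets above; with the "spawn" start method every child gets a fresh interpreter and its own CUDA context, which are torn down when the child exits.)

import multiprocessing as mp

def run_chunk(small_img):
    # imported inside the child so all CUDA state lives and dies with it
    from code_2 import RunModels
    RunModels(small_img)

if __name__ == '__main__':
    ctx = mp.get_context('spawn')   # fresh interpreter per chunk, like os.system
    for small_img in large_img_list:
        p = ctx.Process(target=run_chunk, args=(small_img,))
        p.start()
        p.join()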

Your import might already be executing some code and maybe even initializing a (separate) CUDA context. You could check this by adding print statements to the methods in code_2 and also by checking the memory usage right after the import.


Thank you for your answer. You are right: when I import code_2.py, it does execute the lines in the global scope of code_2.py, which is understandable, but it obviously doesn't execute the function RunModels(), which is inside code_2.py and does all the work.

However, when I print the memory statistics, no memory is allocated either right before or right after the import.

import torch

t = torch.cuda.get_device_properties(0).total_memory
r = torch.cuda.memory_reserved(0)
a = torch.cuda.memory_allocated(0)
f = r - a  # free inside reserved

print("\ttotal_memory\t", t)
print("\tmemory_reserved\t", r)
print("\tmemory_allocated\t", a)
print("\tfree inside reserved\t", f, "\n\n")

Output:

Before Importing code_2.py
total_memory 25396969472
memory_reserved 0
memory_allocated 0
free inside reserved 0

Code_2.py Global context is executed!

After importing code_2.py
total_memory 25396969472
memory_reserved 0
memory_allocated 0
free inside reserved 0

So the mystery still remains :thinking:
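
The next thing I plan to try is checking the same counters after every RunModels() call instead of around the import, to see where the memory actually accumulates, roughly like this:

import gc
import torch
from code_2 import RunModels

for i, small_img in enumerate(large_img_list):
    RunModels(small_img)
    gc.collect()
    torch.cuda.empty_cache()
    # per-chunk view of the caching allocator
    print(i, torch.cuda.memory_allocated(0), torch.cuda.memory_reserved(0))

If the counters alone don't explain it, torch.cuda.memory_summary(0) should also give the allocator's full breakdown.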