There's this weird thing happening to me: I have a custom Residual UNet with about 34M parameters (~133 MB), and my input is a batch of 512 tensors of shape (6, 192, 192). Everything should fit into memory, but it doesn't; it crashes after consuming the entire GPU memory.
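For context, here's the back-of-the-envelope math behind my "should fit" claim. These are assumptions, not measurements: float32 (4 bytes per value) everywhere, no mixed precision, and it ignores activations and autograd state.

```python
BYTES_PER_FLOAT32 = 4

# Model parameters: ~34M params * 4 bytes, which matches the ~133 MB I see.
params = 34_000_000
param_mib = params * BYTES_PER_FLOAT32 / 2**20
print(f"parameters: {param_mib:.0f} MiB")  # ~130 MiB

# One input batch: 512 x 6 x 192 x 192 float32 values.
batch_elems = 512 * 6 * 192 * 192
batch_mib = batch_elems * BYTES_PER_FLOAT32 / 2**20
print(f"one input batch: {batch_mib:.0f} MiB")  # ~432 MiB
```

So even by my own numbers the raw input batch is already several times the parameter size, before any intermediate activations are counted.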
here’s the model: https://gist.github.com/satyajitghana/b24bfb66040e7dbecc0a7d4cf7d5fa32
here’s the colab file running the model: https://colab.research.google.com/drive/1uOOciei9ZJ0aFyVaRlr8SYAylGPZS3Qt?usp=sharing
I have no clue what the problem is. At one point everything worked, on a P100, but that magic moment never came back, and I couldn't figure out why it worked then or why it isn't working anymore.
Are there any memory leaks somewhere, somehow, that I'm blind enough not to notice?
The even crazier thing is that sometimes, even after I delete the model, I can't free its GPU memory; it just stays allocated until I reset the runtime. So the one time the model accidentally ran successfully, I couldn't run it in eval mode afterwards: it just crashed while allocating memory.
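For reference, here's a minimal sketch of what I do around the crash. The tiny `Conv2d` is a hypothetical stand-in for the actual UNet from the gist, just to keep the snippet self-contained and runnable on CPU; the cleanup calls are the standard ones I know of.

```python
import gc
import torch

# Stand-in for the Residual UNet (same 6-channel input the real model takes).
device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Conv2d(6, 16, kernel_size=3).to(device)

# Eval pass under no_grad, so autograd doesn't keep activation buffers alive.
model.eval()
with torch.no_grad():
    out = model(torch.randn(2, 6, 192, 192, device=device))

# The cleanup I attempt: drop every Python reference, force a GC pass,
# then ask PyTorch to return its cached blocks.
del model, out
collected = gc.collect()
if torch.cuda.is_available():
    torch.cuda.empty_cache()
    # If anything still references a GPU tensor (an optimizer, a stored
    # output, an exception traceback), this will NOT drop back to zero.
    print(torch.cuda.memory_allocated())
```

Even after this sequence, `torch.cuda.memory_allocated()` sometimes stays high for me until a full runtime reset.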