PyTorch memory management

I have started reading the PyTorch source code on GitHub to understand the details of memory management during inference. However, I don't know where the relevant code starts, and I am very confused. Are there any tutorials or suggestions? Thanks so much!

Could you explain a bit more what exactly you are interested in?
Are you trying to understand how the GPU memory management works with the caching allocator or anything else?
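
If it is the caching allocator, one way to get a feel for its behavior before reading the source is to watch its counters from Python. Below is a minimal sketch, assuming a CUDA build of PyTorch; the `nn.Linear` toy model, the tensor sizes, and the helper names `report` and `run_inference_probe` are made up for illustration. It falls back to the CPU (and reports zeros) when no GPU is available.

```python
import torch
import torch.nn as nn


def report(tag, device):
    """Print and return (allocated, reserved) bytes for the current CUDA device."""
    if device.type != "cuda":
        # No CUDA device: the CUDA caching allocator is not involved at all.
        return 0, 0
    # Bytes currently occupied by live tensors.
    allocated = torch.cuda.memory_allocated()
    # Bytes the caching allocator currently holds from the CUDA driver.
    reserved = torch.cuda.memory_reserved()
    print(f"{tag}: allocated={allocated} reserved={reserved}")
    return allocated, reserved


def run_inference_probe(device, steps=3):
    # Hypothetical toy model and input, just to trigger some allocations.
    model = nn.Linear(1024, 1024).to(device).eval()
    x = torch.randn(64, 1024, device=device)

    stats = [report("before forward", device)]
    with torch.no_grad():
        for step in range(steps):
            y = model(x)
            stats.append(report(f"after forward {step}", device))
    return y, stats


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
output, stats = run_inference_probe(device)
```

If the reserved figure stays flat across repeated forward passes while the allocated figure moves, that would suggest blocks are being reused from the cache rather than requested from the driver again. `torch.cuda.memory_summary()` prints a more detailed breakdown of the same counters.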

Right. Specifically, during inference, does PyTorch allocate all the device memory it needs once, up front, or does it allocate memory as needed while inference runs?