Hey everybody,
I am currently trying to figure out how much memory different models need for a forward pass on the CPU (I know the GPU is much faster ;)). I came across the PyTorch Profiler, but I am having trouble interpreting the results.
With

from torch.profiler import profile, record_function

model.eval()
with profile(activities=activities, record_shapes=True, profile_memory=True) as prof:
    with record_function("model_inference"):
        prediction = model([inp])
print(prof.key_averages().table(top_level_events_only=True, row_limit=10))
I get the following output:
===============================================================================
This report only display top-level ops statistics
Name             Self CPU %  Self CPU   CPU total %  CPU total  CPU time avg  CPU Mem       Self CPU Mem    # of Calls
aten::zeros      0.00%       58.000us   0.31%        18.145ms   1.008ms       98039.65 Kb   -19.59 Kb       18
model_inference  3.15%       182.338ms  100.00%      5.788s     5.788s        592648.17 Kb  -1799765.56 Kb  1

Self CPU time total: 5.788s
I know that Self CPU refers only to the particular function without its child operations, while CPU total includes them, and I assume the memory columns are analogous. However, CPU Mem states 592648.17 Kb, while including the child operations 1799765.56 Kb were released. What does this mean exactly? Can I say that a total of 1799765.56 Kb was needed, or how do I get the total memory required by the whole model? Is this a reasonable approach at all, or should I rather use psutil (e.g. psutil.Process().memory_info().vms)? As far as I know, though, that measures the complete Python process, including imported libraries.
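To illustrate what confuses me about net versus peak memory, here is a minimal stdlib sketch using tracemalloc (no PyTorch involved; the buffer sizes are arbitrary stand-ins for activations and outputs): a function can have a small net allocation after it returns while its peak during execution was much larger.

```python
import tracemalloc

def fake_forward():
    # Allocate a large temporary (like intermediate activations)...
    temp = bytearray(50 * 1024 * 1024)   # ~50 MB scratch buffer
    # ...and return a much smaller result (like the final prediction).
    result = bytearray(1 * 1024 * 1024)  # ~1 MB output
    del temp                             # the temporary is released here
    return result

tracemalloc.start()
prediction = fake_forward()
# get_traced_memory() returns (current, peak) in bytes
net, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"net allocated after call: {net / 1024**2:.1f} MB")   # roughly the result
print(f"peak during call:         {peak / 1024**2:.1f} MB")  # roughly temp + result
```

If I understand it right, the profiler's CPU Mem column is closer to the net value and the negative Self CPU Mem reflects releases, whereas what I actually want for sizing the forward pass would be the peak. Is that interpretation correct?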
I would be happy if someone could help me. Thank you!