Hi there,
I’m using a Detectron model that I converted to .pb format for CPU inference with Caffe2, and I’m trying to reduce memory usage as much as possible (to under 3 GB). I’m trying to use this function, found here:
```python
def optimize_net(net):
    optimization = memonger.optimize_interference(
        net,
        [b for b in net.external_input] +
        [b for b in net.external_output])
    try:
        # This can fail if the blobs aren't in the workspace.
        stats = memonger.compute_statistics(optimization.assignments)
        print("Memory saving: {:.2f}%".format(
            float(stats.optimized_nbytes) / stats.baseline_nbytes * 100))
    except Exception as e:
        print(e)
    return pick_engines(share_conv_buffers(rename_blobs(optimization.net)))
```
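One small thing I noticed about the stats printout: `optimized_nbytes / baseline_nbytes * 100` is the optimized footprint as a share of the baseline, while the actual saving would be one minus that ratio. A tiny standalone check (the byte counts below are made up, not from a real net):

```python
def memory_saving_pct(baseline_nbytes, optimized_nbytes):
    """Percentage of memory saved relative to the unoptimized net."""
    return (1.0 - float(optimized_nbytes) / baseline_nbytes) * 100.0

# Toy example: 4 GiB before optimization, 3 GiB after.
baseline = 4 * 1024**3
optimized = 3 * 1024**3
print("Memory saving: {:.2f}%".format(memory_saving_pct(baseline, optimized)))
# -> Memory saving: 25.00%
```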
`memonger.optimize_interference` didn’t work for me, so I used `memonger.optimize_inference_fast` instead, which saved about 1 GB of memory. Now I’m looking for ways to optimize even further, and for guidance on how to implement the `share_conv_buffers` function.
Thanks for your help!