Export detectron2 pre-trained mask-rcnn to ONNX with BatchNorm.

Hi folks,

BLOT: Need help exporting detectron2’s maskrcnn to ONNX along with the frozen batch norm layers.

I’m fairly new to the detectron2 framework and have run into issues exporting detectron2’s mask-rcnn to ONNX while retaining the frozen batch norm layers from the torch model.

I have been successful in exporting the resnet-50 mask-rcnn network using the code snippet below. But in this case, the frozen batch norm layers get optimized out (constant-folded) in the exported ONNX network.

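import onnx
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.data import build_detection_test_loader
from detectron2.export import Caffe2Tracer

modelName = "R_50_C4_3x"
modelYaml = "COCO-InstanceSegmentation/mask_rcnn_" + modelName + ".yaml"
torch_model = model_zoo.get(modelYaml, trained=True)

# Load the model's config so that cfg.DATASETS.TEST is populated
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(modelYaml))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(modelYaml)

# COCO dataset loader
data_loader = build_detection_test_loader(cfg, cfg.DATASETS.TEST[0], num_workers=0)
first_batch = next(iter(data_loader))

# Trace the model and export it to ONNX
tracer = Caffe2Tracer(cfg, torch_model, first_batch)
onnx_model = tracer.export_onnx()
onnx.save(onnx_model, "mrcnn_onnx.onnx")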
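
One likely reason the frozen batch norm layers disappear from the exported graph: with fixed statistics and fixed affine parameters, batch norm in inference mode reduces to a constant per-channel scale and shift, which an exporter can fold into neighbouring ops. The sketch below checks that equivalence on random data. It uses plain PyTorch only; the tensor names and shapes are made up for illustration and it does not touch detectron2 itself.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
channels, eps = 4, 1e-5
x = torch.rand(2, channels, 8, 8)

# Fixed (non-trainable) statistics and affine parameters, as in a frozen batch norm layer.
running_mean = torch.rand(channels)
running_var = torch.rand(channels) + 0.5
weight = torch.rand(channels)
bias = torch.rand(channels)

# Batch norm evaluated in inference mode (uses the running statistics).
y_bn = F.batch_norm(x, running_mean, running_var, weight, bias, training=False, eps=eps)

# The same computation written as a constant per-channel scale and shift.
scale = weight / torch.sqrt(running_var + eps)
shift = bias - running_mean * scale
y_affine = x * scale.view(1, -1, 1, 1) + shift.view(1, -1, 1, 1)

print(torch.allclose(y_bn, y_affine, atol=1e-5))
```

Because scale and shift depend only on constants, nothing in the layer needs to survive as a separate BatchNormalization node unless the exporter is told to treat the model as being in training mode.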

Based on a conversation in this thread, my understanding is that torch.onnx.export() lets us turn off constant folding and other optimizations by exporting with the training=TrainingMode.TRAINING option, after which the frozen batch norm layers wouldn’t get optimized out. However, when I try this, I run into the error pasted below.

Updated export code:

torch.onnx.export(torch_model, first_batch, "mrcnn.onnx", verbose=True, do_constant_folding=False)  # , training=TrainingMode.TRAINING)

Error message:

File "/Users/ad/anaconda3/envs/torch/lib/python3.9/site-packages/torch/onnx/utils.py", line 632, in _export
    _model_to_graph(model, args, verbose, input_names,
  File "/Users/ad/anaconda3/envs/torch/lib/python3.9/site-packages/torch/onnx/utils.py", line 409, in _model_to_graph
    graph, params, torch_out = _create_jit_graph(model, args,
  File "/Users/ad/anaconda3/envs/torch/lib/python3.9/site-packages/torch/onnx/utils.py", line 379, in _create_jit_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args)
  File "/Users/ad/anaconda3/envs/torch/lib/python3.9/site-packages/torch/onnx/utils.py", line 342, in _trace_and_get_graph_from_model
    torch.jit._get_trace_graph(model, args, strict=False, _force_outplace=False, _return_inputs_states=True)
  File "/Users/ad/anaconda3/envs/torch/lib/python3.9/site-packages/torch/jit/_trace.py", line 1149, in _get_trace_graph
    outs = ONNXTracedModule(f, strict, _force_outplace, return_inputs, _return_inputs_states)(*args, **kwargs)
  File "/Users/ad/anaconda3/envs/torch/lib/python3.9/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/Users/ad/anaconda3/envs/torch/lib/python3.9/site-packages/torch/jit/_trace.py", line 126, in forward
    graph, out = torch._C._create_graph_by_tracing(
  File "/Users/ad/anaconda3/envs/torch/lib/python3.9/site-packages/torch/jit/_trace.py", line 120, in wrapper
    out_vars, _ = _flatten(outs)

RuntimeError: Only tuples, lists and Variables supported as JIT inputs/outputs. Dictionaries and strings are also accepted but their usage is not recommended. But got unsupported type Instances

I faced a similar issue with _flatten(in), which was resolved by converting all the scalars to tensors. But here, the output from detectron2 is an object of type ‘Instances’. A few questions:

  1. Why does the exporter care about the final output format after post-processing?
  2. Any suggestions on how I can resolve this error? Or,
  3. Any leads on how to export detectron2’s mask-rcnn to ONNX with the batch norm layers retained would be much appreciated.
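
On the first two questions: the traceback ends in _flatten(outs), which suggests the tracer has to flatten whatever forward() returns into plain tensors in order to record the graph outputs, so a custom object such as Instances is rejected. One possible workaround is a thin wrapper module that takes a tensor and returns a tuple of tensors. The sketch below shows the idea with stand-in classes: FakeInstances, ToyDetector and TensorIOWrapper are hypothetical names invented for this example, and the toy model only mimics detectron2's list-of-dicts input and output convention.

```python
import torch
from torch import nn


class FakeInstances:
    """Hypothetical stand-in for detectron2's Instances: a plain object holding tensors."""

    def __init__(self, boxes, scores):
        self.pred_boxes = boxes
        self.scores = scores


class ToyDetector(nn.Module):
    """Hypothetical model mimicking the detectron2 I/O convention:
    takes a list of dicts, returns a list of dicts with an "instances" entry."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 4, kernel_size=3, padding=1)
        self.bn = nn.BatchNorm2d(4)

    def forward(self, batched_inputs):
        image = batched_inputs[0]["image"].unsqueeze(0)  # (1, 3, H, W)
        feats = self.bn(self.conv(image))                # (1, 4, H, W)
        boxes = feats.mean(dim=(2, 3))                   # (1, 4): one fake box
        scores = boxes.sum(dim=1).sigmoid()              # (1,): one fake score
        return [{"instances": FakeInstances(boxes, scores)}]


class TensorIOWrapper(nn.Module):
    """Exposes a tensor-in, tuple-of-tensors-out interface that the tracer can flatten."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, image):
        outputs = self.model([{"image": image}])
        inst = outputs[0]["instances"]
        # Return only plain tensors; the custom object never reaches the tracer.
        return inst.pred_boxes, inst.scores


model = ToyDetector().eval()
wrapped = TensorIOWrapper(model).eval()
example = torch.rand(3, 32, 32)

# Tracing the wrapper works because its inputs and outputs are tensors.
traced = torch.jit.trace(wrapped, example)
boxes, scores = traced(example)
print(boxes.shape, scores.shape)
```

The same wrapped module could then be handed to torch.onnx.export in place of the raw model. Newer detectron2 releases appear to provide an adapter along these lines (TracingAdapter in detectron2.export); whether it is available depends on the installed version.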

Thanks in advance!

Note: this post is about exporting the frozen batch norm layers from the model, but currently I’m unable to export the network in any form using the torch.onnx.export(torch_model, first_batch, "mrcnn.onnx") syntax.