Exporting FasterRCNN (fasterrcnn_resnet50_fpn) to ONNX

I am trying to export a fine-tuned Faster R-CNN model to ONNX. For training I followed the torchvision object detection fine-tuning tutorial here.
My script for converting the trained model to ONNX is as follows:

import torch.onnx
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision import transforms
from PIL import Image

def construct_model(num_classes):
    # load a model pre-trained on COCO
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)

    # get number of input features for the classifier
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    # replace the pre-trained head with a new one
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

    return model

image = Image.open('1080p.jpg')

transformation = transforms.Compose([
    transforms.ToTensor(),
    ])

x = transformation(image)
x = x.to('cpu')

model = construct_model(2)
model.load_state_dict(torch.load('model.pt', map_location='cpu'))
model.eval()

# Export model to onnx format
torch.onnx.export(model, [x], "model.onnx", verbose=True, opset_version=10, strip_doc_string=True, do_constant_folding=True)

The output of the above script is as follows:

/home/train/anaconda3/lib/python3.7/site-packages/torch/tensor.py:389: RuntimeWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
  'incorrect results).', category=RuntimeWarning)
Traceback (most recent call last):
  File "alexnet2onnx.py", line 7, in <module>
    torch.onnx.export(model, dummy_input, "alexnet.onnx", verbose=True)
  File "/home/train/anaconda3/lib/python3.7/site-packages/torch/onnx/__init__.py", line 132, in export
    strip_doc_string, dynamic_axes)
  File "/home/train/anaconda3/lib/python3.7/site-packages/torch/onnx/utils.py", line 64, in export
    example_outputs=example_outputs, strip_doc_string=strip_doc_string, dynamic_axes=dynamic_axes)
  File "/home/train/anaconda3/lib/python3.7/site-packages/torch/onnx/utils.py", line 329, in _export
    _retain_param_name, do_constant_folding)
  File "/home/train/anaconda3/lib/python3.7/site-packages/torch/onnx/utils.py", line 213, in _model_to_graph
    graph, torch_out = _trace_and_get_graph_from_model(model, args, training)
  File "/home/train/anaconda3/lib/python3.7/site-packages/torch/onnx/utils.py", line 171, in _trace_and_get_graph_from_model
    trace, torch_out = torch.jit.get_trace_graph(model, args, _force_outplace=True)
  File "/home/train/anaconda3/lib/python3.7/site-packages/torch/jit/__init__.py", line 256, in get_trace_graph
    return LegacyTracedModule(f, _force_outplace, return_inputs)(*args, **kwargs)
  File "/home/train/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/train/anaconda3/lib/python3.7/site-packages/torch/jit/__init__.py", line 323, in forward
    out = self.inner(*trace_inputs)
  File "/home/train/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 545, in __call__
    result = self._slow_forward(*input, **kwargs)
  File "/home/train/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 531, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/home/train/anaconda3/lib/python3.7/site-packages/torchvision/models/detection/generalized_rcnn.py", line 48, in forward
    features = self.backbone(images.tensors)
  File "/home/train/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 545, in __call__
    result = self._slow_forward(*input, **kwargs)
  File "/home/train/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 531, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/home/train/anaconda3/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/home/train/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 545, in __call__
    result = self._slow_forward(*input, **kwargs)
  File "/home/train/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 531, in _slow_forward
    result = self.forward(*input, **kwargs)
  File "/home/train/anaconda3/lib/python3.7/site-packages/torchvision/models/_utils.py", line 58, in forward
    x = module(x)
  File "/home/train/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 545, in __call__
    result = self._slow_forward(*input, **kwargs)
  File "/home/train/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 531, in _slow_forward
    result = self.forward(*input, **kwargs)
RuntimeError: Tried to trace <__module__.FrozenBatchNorm2d object at 0x562d994c64f0> but it is not part of the active trace. Modules that are called during a trace must be registered as submodules of the thing being traced

To make sure the error is not caused by my training, I also tried to convert the pretrained model without any further training; this gives the same error message as before. A minimal sketch of that attempt is below.
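Roughly, the stripped-down attempt looked like this (the custom box predictor and the call to load_state_dict are dropped, and a random tensor stands in for the real image):

import torch
import torchvision

# Stock COCO-pretrained detector, no fine-tuned weights loaded
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()

# Random CHW image tensor in [0, 1) instead of a real photo
x = torch.rand(3, 1080, 1920)

torch.onnx.export(model, [x], "pretrained.onnx", verbose=True, opset_version=10,
                  strip_doc_string=True, do_constant_folding=True)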

It is clear from the message that the error comes from FrozenBatchNorm2d not being registered as a submodule of the traced module. Unfortunately, I do not know how to register it.
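The only workaround I have come up with so far is to swap the FrozenBatchNorm2d layers for ordinary nn.BatchNorm2d modules before tracing, so that the batch-norm layers are regular registered submodules. This is a rough, untested sketch of that idea (assuming FrozenBatchNorm2d stores its statistics in weight, bias, running_mean and running_var buffers, as in torchvision 0.4), and I have not verified that it resolves the tracing error:

import torch.nn as nn
from torchvision.ops.misc import FrozenBatchNorm2d

def unfreeze_batchnorm(module):
    # Recursively replace every FrozenBatchNorm2d with an eval-mode BatchNorm2d
    # carrying the same affine parameters and running statistics.
    for name, child in module.named_children():
        if isinstance(child, FrozenBatchNorm2d):
            num_features = child.weight.shape[0]
            bn = nn.BatchNorm2d(num_features)  # default eps=1e-5 may differ slightly from the frozen layer
            bn.weight.data.copy_(child.weight)
            bn.bias.data.copy_(child.bias)
            bn.running_mean.copy_(child.running_mean)
            bn.running_var.copy_(child.running_var)
            bn.eval()
            setattr(module, name, bn)
        else:
            unfreeze_batchnorm(child)

# unfreeze_batchnorm(model) would then be called before torch.onnx.export.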

Was anybody able to successfully convert the torchvision FasterRCNN model to ONNX, or can someone reproduce the error I am seeing?

Information about my environment:

PyTorch version: 1.2.0
Is debug build: No
CUDA used to build PyTorch: 10.0.130

OS: Ubuntu 16.04.6 LTS
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
CMake version: version 3.5.1

Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: 8.0.61
GPU models and configuration: 
GPU 0: GeForce GTX 1080 Ti
GPU 1: GeForce GTX 1080 Ti

Nvidia driver version: 418.56
cuDNN version: Could not collect

Versions of relevant libraries:
[pip3] numpy==1.16.4
[conda] blas                      1.0                         mkl  
[conda] mkl                       2019.4                      243  
[conda] mkl-service               2.0.2            py37h7b6447c_0  
[conda] mkl_fft                   1.0.12           py37ha843d7b_0  
[conda] mkl_random                1.0.2            py37hd81dba3_0  
[conda] torch                     1.2.0                    pypi_0    pypi
[conda] torchvision               0.4.0                    pypi_0    pypi

I am struggling with exactly the same problem; has anyone got any additional information? Or, @PixR2, have you made any headway?

Unfortunately I was not able to fix the problem, so I ended up going with another model for now. Maybe I will find some time to investigate the issue further in the coming weeks. If I find a solution, I will let you know.

Do you mind if I ask which model?

I tried this version of YOLOv3: https://github.com/eriklindernoren/PyTorch-YOLOv3, but exporting it has its own problems.

Hi, which model did you use? I am facing the same issue.

Anybody had any luck yet?

Same problem here. Has anyone made any progress regarding this issue?