torch.jit.trace is not working with Mask R-CNN

How can I get a TorchScript version of torchvision.models.detection.maskrcnn_resnet50_fpn?

Neither torch.jit.script nor torch.jit.trace works with this model.

With torch.jit.script

modelname = "maskrcnn"
model = torch.load(modelname + "-best.pth")
model = model.cuda()
model.eval()
print(img)
with torch.no_grad():
    print(model(img))
    traced_cell = torch.jit.script(model)
torch.jit.save(traced_cell, modelname + "-torchscript.pth")

loaded_trace = torch.jit.load(modelname+"-torchscript.pth")
loaded_trace.eval()
with torch.no_grad():
    print(loaded_trace(img))
    
TensorMask(torch.argmax(loaded_trace(img),1)).show()

Output:

TensorImage([[[[0.8961, 0.9132, 0.8789,  ..., 0.2453, 0.1939, 0.2282],
          [0.8276, 0.9132, 0.8618,  ..., 0.2282, 0.1939, 0.2282],
          [0.8961, 0.9132, 0.8789,  ..., 0.2282, 0.2282, 0.2453],
          ...,
          [0.8961, 0.8618, 0.9132,  ..., 0.4508, 0.4166, 0.3994],
          [0.9303, 0.9132, 0.9474,  ..., 0.4166, 0.4166, 0.4508],
          [0.9646, 0.8789, 0.9303,  ..., 0.3994, 0.3994, 0.3994]],

         [[1.0455, 1.0630, 1.0280,  ..., 0.3803, 0.3277, 0.3627],
          [0.9755, 1.0630, 1.0105,  ..., 0.3627, 0.3277, 0.3627],
          [1.0455, 1.0630, 1.0280,  ..., 0.3627, 0.3627, 0.3803],
          ...,
          [1.0455, 1.0105, 1.0630,  ..., 0.5903, 0.5553, 0.5378],
          [1.0805, 1.0630, 1.0980,  ..., 0.5553, 0.5553, 0.5903],
          [1.1155, 1.0280, 1.0805,  ..., 0.5378, 0.5378, 0.5378]],

         [[1.2631, 1.2805, 1.2457,  ..., 0.6008, 0.5485, 0.5834],
          [1.1934, 1.2805, 1.2282,  ..., 0.5834, 0.5485, 0.5834],
          [1.2631, 1.2805, 1.2457,  ..., 0.5834, 0.5834, 0.6008],
          ...,
          [1.2631, 1.2282, 1.2805,  ..., 0.8099, 0.7751, 0.7576],
          [1.2980, 1.2805, 1.3154,  ..., 0.7751, 0.7751, 0.8099],
          [1.3328, 1.2457, 1.2980,  ..., 0.7576, 0.7576, 0.7576]]]],
       device='cuda:0')
[{'boxes': tensor([[412.5222, 492.3208, 619.7662, 620.9233]], device='cuda:0'), 'labels': tensor([1], device='cuda:0'), 'scores': tensor([0.1527], device='cuda:0'), 'masks': tensor([[[[0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          ...,
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.]]]], device='cuda:0')}]
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-23-7216a0dac5a0> in <module>
     12 loaded_trace.eval()
     13 with torch.no_grad():
---> 14     print(loaded_trace(img))
     15 
     16 TensorMask(torch.argmax(loaded_trace(img),1)).show()

~/anaconda3/envs/pro1/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    556             result = self._slow_forward(*input, **kwargs)
    557         else:
--> 558             result = self.forward(*input, **kwargs)
    559         for hook in self._forward_hooks.values():
    560             hook_result = hook(self, input, result)

RuntimeError: forward() Expected a value of type 'List[Tensor]' for argument 'images' but instead found type 'TensorImage'.
Position: 1
Value: TensorImage([[[[0.8961, 0.9132, 0.8789,  ..., 0.2453, 0.1939, 0.2282],
          [0.8276, 0.9132, 0.8618,  ..., 0.2282, 0.1939, 0.2282],
          [0.8961, 0.9132, 0.8789,  ..., 0.2282, 0.2282, 0.2453],
          ...,
          [0.8961, 0.8618, 0.9132,  ..., 0.4508, 0.4166, 0.3994],
          [0.9303, 0.9132, 0.9474,  ..., 0.4166, 0.4166, 0.4508],
          [0.9646, 0.8789, 0.9303,  ..., 0.3994, 0.3994, 0.3994]],

         [[1.0455, 1.0630, 1.0280,  ..., 0.3803, 0.3277, 0.3627],
          [0.9755, 1.0630, 1.0105,  ..., 0.3627, 0.3277, 0.3627],
          [1.0455, 1.0630, 1.0280,  ..., 0.3627, 0.3627, 0.3803],
          ...,
          [1.0455, 1.0105, 1.0630,  ..., 0.5903, 0.5553, 0.5378],
          [1.0805, 1.0630, 1.0980,  ..., 0.5553, 0.5553, 0.5903],
          [1.1155, 1.0280, 1.0805,  ..., 0.5378, 0.5378, 0.5378]],

         [[1.2631, 1.2805, 1.2457,  ..., 0.6008, 0.5485, 0.5834],
          [1.1934, 1.2805, 1.2282,  ..., 0.5834, 0.5485, 0.5834],
          [1.2631, 1.2805, 1.2457,  ..., 0.5834, 0.5834, 0.6008],
          ...,
          [1.2631, 1.2282, 1.2805,  ..., 0.8099, 0.7751, 0.7576],
          [1.2980, 1.2805, 1.3154,  ..., 0.7751, 0.7751, 0.8099],
          [1.3328, 1.2457, 1.2980,  ..., 0.7576, 0.7576, 0.7576]]]],
       device='cuda:0')
Declaration: forward(__torch__.torchvision.models.detection.mask_rcnn.___torch_mangle_1723.MaskRCNN self, Tensor[] images, Dict(str, Tensor)[]? targets=None) -> ((Dict(str, Tensor), Dict(str, Tensor)[]))
Cast error details: Unable to cast Python instance to C++ type (compile in debug mode for details)
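
The declaration in the error shows that the scripted forward expects a List[Tensor] of 3-D images (plus an optional targets list), not a single batched TensorImage. A minimal sketch of a call matching that declaration, assuming img is the 4-D fastai batch printed above:

# Unbatch the TensorImage into a list of plain 3-D (C, H, W) tensors
images = [t.as_subclass(torch.Tensor) for t in img]
with torch.no_grad():
    print(loaded_trace(images))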

With torch.jit.trace

modelname="maskrcnn"
model = torch.load(modelname+"-best.pth")
model=model.cuda()
model.eval()
print(img)
with torch.no_grad():
    print(model(img))
    traced_cell = torch.jit.trace(model, (img))
torch.jit.save(traced_cell, modelname+"-torchscript.pth")

loaded_trace = torch.jit.load(modelname+"-torchscript.pth")
loaded_trace.eval()
with torch.no_grad():
    print(loaded_trace(img))
    
TensorMask(torch.argmax(loaded_trace(img),1)).show()

Output:

TensorImage([[[[0.8961, 0.9132, 0.8789,  ..., 0.2453, 0.1939, 0.2282],
          [0.8276, 0.9132, 0.8618,  ..., 0.2282, 0.1939, 0.2282],
          [0.8961, 0.9132, 0.8789,  ..., 0.2282, 0.2282, 0.2453],
          ...,
          [0.8961, 0.8618, 0.9132,  ..., 0.4508, 0.4166, 0.3994],
          [0.9303, 0.9132, 0.9474,  ..., 0.4166, 0.4166, 0.4508],
          [0.9646, 0.8789, 0.9303,  ..., 0.3994, 0.3994, 0.3994]],

         [[1.0455, 1.0630, 1.0280,  ..., 0.3803, 0.3277, 0.3627],
          [0.9755, 1.0630, 1.0105,  ..., 0.3627, 0.3277, 0.3627],
          [1.0455, 1.0630, 1.0280,  ..., 0.3627, 0.3627, 0.3803],
          ...,
          [1.0455, 1.0105, 1.0630,  ..., 0.5903, 0.5553, 0.5378],
          [1.0805, 1.0630, 1.0980,  ..., 0.5553, 0.5553, 0.5903],
          [1.1155, 1.0280, 1.0805,  ..., 0.5378, 0.5378, 0.5378]],

         [[1.2631, 1.2805, 1.2457,  ..., 0.6008, 0.5485, 0.5834],
          [1.1934, 1.2805, 1.2282,  ..., 0.5834, 0.5485, 0.5834],
          [1.2631, 1.2805, 1.2457,  ..., 0.5834, 0.5834, 0.6008],
          ...,
          [1.2631, 1.2282, 1.2805,  ..., 0.8099, 0.7751, 0.7576],
          [1.2980, 1.2805, 1.3154,  ..., 0.7751, 0.7751, 0.8099],
          [1.3328, 1.2457, 1.2980,  ..., 0.7576, 0.7576, 0.7576]]]],
       device='cuda:0')
[{'boxes': tensor([[412.5222, 492.3208, 619.7662, 620.9233]], device='cuda:0'), 'labels': tensor([1], device='cuda:0'), 'scores': tensor([0.1527], device='cuda:0'), 'masks': tensor([[[[0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          ...,
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.]]]], device='cuda:0')}]
/opt/conda/conda-bld/pytorch_1587452831668/work/torch/csrc/utils/python_arg_parser.cpp:760: UserWarning: This overload of nonzero is deprecated:
	nonzero(Tensor input, *, Tensor out)
Consider using one of the following signatures instead:
	nonzero(Tensor input, *, bool as_tuple)
/home/david/anaconda3/envs/proy/lib/python3.7/site-packages/torch/tensor.py:467: RuntimeWarning: Iterating over a tensor might cause the trace to be incorrect. Passing a tensor of different shape won't change the number of iterations executed (and might lead to errors or silently give incorrect results).
  'incorrect results).', category=RuntimeWarning)
/home/david/anaconda3/envs/proy/lib/python3.7/site-packages/fastai2/torch_core.py:272: TracerWarning: Converting a tensor to a Python index might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  res = getattr(super(TensorBase, self), fn)(*args, **kwargs)
/opt/conda/conda-bld/pytorch_1587452831668/work/aten/src/ATen/native/BinaryOps.cpp:81: UserWarning: Integer division of tensors using div or / is deprecated, and in a future release div will perform true division as in Python 3. Use true_divide or floor_divide (// in Python) instead.
/home/david/anaconda3/envs/proy/lib/python3.7/site-packages/torchvision/models/detection/rpn.py:164: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  torch.tensor(image_size[1] / g[1], dtype=torch.int64, device=device)] for g in grid_sizes]
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-15-44b7a9360e87> in <module>
      6 with torch.no_grad():
      7     print(model(img))
----> 8     traced_cell = torch.jit.trace(model, (img))
      9 torch.jit.save(traced_cell, modelname+"-torchscript.pth")
     10 

~/anaconda3/envs/proy/lib/python3.7/site-packages/torch/jit/__init__.py in trace(func, example_inputs, optimize, check_trace, check_inputs, check_tolerance, strict, _force_outplace, _module_class, _compilation_unit)
    881         return trace_module(func, {'forward': example_inputs}, None,
    882                             check_trace, wrap_check_inputs(check_inputs),
--> 883                             check_tolerance, strict, _force_outplace, _module_class)
    884 
    885     if (hasattr(func, '__self__') and isinstance(func.__self__, torch.nn.Module) and

~/anaconda3/envs/proy/lib/python3.7/site-packages/torch/jit/__init__.py in trace_module(mod, inputs, optimize, check_trace, check_inputs, check_tolerance, strict, _force_outplace, _module_class, _compilation_unit)
   1035             func = mod if method_name == "forward" else getattr(mod, method_name)
   1036             example_inputs = make_tuple(example_inputs)
-> 1037             module._c._create_method_from_trace(method_name, func, example_inputs, var_lookup_fn, strict, _force_outplace)
   1038             check_trace_method = module._c._get_method(method_name)
   1039 

~/anaconda3/envs/proy/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    554                 input = result
    555         if torch._C._get_tracing_state():
--> 556             result = self._slow_forward(*input, **kwargs)
    557         else:
    558             result = self.forward(*input, **kwargs)

~/anaconda3/envs/proy/lib/python3.7/site-packages/torch/nn/modules/module.py in _slow_forward(self, *input, **kwargs)
    540                 recording_scopes = False
    541         try:
--> 542             result = self.forward(*input, **kwargs)
    543         finally:
    544             if recording_scopes:

~/anaconda3/envs/proy/lib/python3.7/site-packages/torchvision/models/detection/generalized_rcnn.py in forward(self, images, targets)
     68         if isinstance(features, torch.Tensor):
     69             features = OrderedDict([('0', features)])
---> 70         proposals, proposal_losses = self.rpn(images, features, targets)
     71         detections, detector_losses = self.roi_heads(features, proposals, images.image_sizes, targets)
     72         detections = self.transform.postprocess(detections, images.image_sizes, original_image_sizes)

~/anaconda3/envs/proy/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    554                 input = result
    555         if torch._C._get_tracing_state():
--> 556             result = self._slow_forward(*input, **kwargs)
    557         else:
    558             result = self.forward(*input, **kwargs)

~/anaconda3/envs/proy/lib/python3.7/site-packages/torch/nn/modules/module.py in _slow_forward(self, *input, **kwargs)
    540                 recording_scopes = False
    541         try:
--> 542             result = self.forward(*input, **kwargs)
    543         finally:
    544             if recording_scopes:

~/anaconda3/envs/proy/lib/python3.7/site-packages/torchvision/models/detection/rpn.py in forward(self, images, features, targets)
    486         proposals = self.box_coder.decode(pred_bbox_deltas.detach(), anchors)
    487         proposals = proposals.view(num_images, -1, 4)
--> 488         boxes, scores = self.filter_proposals(proposals, objectness, images.image_sizes, num_anchors_per_level)
    489 
    490         losses = {}

~/anaconda3/envs/proy/lib/python3.7/site-packages/torchvision/models/detection/rpn.py in filter_proposals(self, proposals, objectness, image_shapes, num_anchors_per_level)
    392 
    393         # select top_n boxes independently per level before applying nms
--> 394         top_n_idx = self._get_top_n_idx(objectness, num_anchors_per_level)
    395 
    396         image_range = torch.arange(num_images, device=device)

~/anaconda3/envs/proy/lib/python3.7/site-packages/torchvision/models/detection/rpn.py in _get_top_n_idx(self, objectness, num_anchors_per_level)
    372                 pre_nms_top_n = min(self.pre_nms_top_n(), num_anchors)
    373             _, top_n_idx = ob.topk(pre_nms_top_n, dim=1)
--> 374             r.append(top_n_idx + offset)
    375             offset += num_anchors
    376         return torch.cat(r, dim=1)

RuntimeError: expected device cuda:0 but got device cpu

What version of PyTorch/torchvision are you using? AFAIK this should work on the latest release of both. (cc @fmassa)

print(torch.__version__)
print(torchvision.__version__)

1.6.0.dev20200421 
0.7.0a0+6e47842

Could you check if passing the inputs as a list solves the error, as suggested in the error message for scripting?
This code works for me:

import torch
import torchvision

model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=False)
model.eval()
scripted_model = torch.jit.script(model)
# The scripted model takes a list of 3-D (C, H, W) image tensors,
# which may have different spatial sizes.
out = scripted_model([torch.randn(3, 224, 224), torch.randn(3, 400, 400)])

I have tried the following:

modelname="maskrcnn"
model = torch.load(modelname+"-best.pth")
model=model.cuda()
model.eval()
with torch.no_grad():
    print(model(img.clone()))
    traced_cell = torch.jit.script(model)
traced_cell.save(modelname+"-torchscript.pth")

loaded_trace = torch.jit.load(modelname+"-torchscript.pth")
loaded_trace.eval()
with torch.no_grad():
    print(loaded_trace([img[0]]))

However, the output of the model looks different now!

[{'boxes': tensor([[412.5222, 492.3208, 619.7662, 620.9233]], device='cuda:0'), 'labels': tensor([1], device='cuda:0'), 'scores': tensor([0.1527], device='cuda:0'), 'masks': tensor([[[[0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          ...,
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.]]]], device='cuda:0')}]
({}, [{'scores': tensor([0.1527], device='cuda:0'), 'labels': tensor([1], device='cuda:0'), 'boxes': tensor([[412.5222, 492.3208, 619.7662, 620.9233]], device='cuda:0'), 'masks': tensor([[[[0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          ...,
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.]]]], device='cuda:0')}])
code/__torch__/torchvision/models/detection/mask_rcnn.py:42: UserWarning: RCNN always returns a (Losses, Detections) tuple in scripting

It is returning a tuple now! @ptrblck

I’m not sure why the tuple is returned, but as a workaround you could just drop the first element, as the second one yields the same result.
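
In code, a minimal sketch of that workaround (assuming loaded_trace is the scripted model loaded above):

# Scripted RCNNs return a (losses, detections) tuple; in eval mode the
# losses dict is empty, so keep only the detections.
losses, detections = loaded_trace([img[0]])
print(detections)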

Yes, that is what I am doing right now. However, it is quite strange!

For multiple images, the first dict will also be empty, so I might be missing something obvious, but I don’t know what it could be used for.

If anyone is still interested, PyTorch throws a UserWarning explaining what’s what:

code/__torch__/torchvision/models/detection/keypoint_rcnn.py:86: UserWarning: RCNN always returns a (Losses, Detections) tuple in scripting

Hi,
I am facing an issue with multiple forward passes using TorchScript on the GPU. The first pass works fine, but the next one throws an error. On the CPU, however, it works perfectly.

import cv2
import torch
import torchvision
from time import time

model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=False).cuda()
model.eval()
scripted_model = torch.jit.script(model)

torch.jit.save(scripted_model, "maskrcnn_torchscript.pth")

scripted_model = torch.jit.load("maskrcnn_torchscript.pth").eval()
x = cv2.imread("image.jpg")
# HWC uint8 image -> CHW float32 tensor on the GPU
x_tensor = torch.as_tensor(x.astype("float32").transpose(2, 0, 1)).cuda()

with torch.no_grad():
    count = 10
    while count > 0:
        t0 = time()
        out = scripted_model([x_tensor])
        t1 = time()
        print("Time = {}, FPS = {}".format(t1 - t0, 1 / (t1 - t0)))
        count -= 1

The first pass works fine, and on the next pass I get the following error:

code/__torch__/torchvision/models/detection/mask_rcnn.py:95: UserWarning: RCNN always returns a (Losses, Detections) tuple in scripting
Time = 1.6492173671722412, FPS = 0.6063482109181317
Traceback (most recent call last):
  File "simple_torchscript.py", line 24, in <module>
    out = scripted_model([x_tensor])
  File "/path/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
RuntimeError:
Arguments for call are not valid.
The following variants are available:

  aten::size.int(Tensor self, int dim) -> (int):
  Expected a value of type 'Tensor' for argument 'self' but instead found type 'List[Tensor]'.
  Empty lists default to List[Tensor]. Add a variable annotation to the assignment to create an empty list of another type (torch.jit.annotate(List[T, []]) where T is the type of elements in the list for Python 2)

  aten::size.Dimname(Tensor self, str dim) -> (int):
  Expected a value of type 'Tensor' for argument 'self' but instead found type 'List[Tensor]'.
  Empty lists default to List[Tensor]. Add a variable annotation to the assignment to create an empty list of another type (torch.jit.annotate(List[T, []]) where T is the type of elements in the list for Python 2)

  aten::size(Tensor self) -> (int[]):
  Expected a value of type 'Tensor' for argument 'self' but instead found type 'List[Tensor]'.
  Empty lists default to List[Tensor]. Add a variable annotation to the assignment to create an empty list of another type (torch.jit.annotate(List[T, []]) where T is the type of elements in the list for Python 2)

The original call is:

However, if I run it on CPU, it works perfectly.

code/__torch__/torchvision/models/detection/mask_rcnn.py:95: UserWarning: RCNN always returns a (Losses, Detections) tuple in scripting
Time = 2.616811990737915, FPS = 0.3821443816137551
Time = 3.022855758666992, FPS = 0.3308130059242312
Time = 2.174025297164917, FPS = 0.4599762483463605
Time = 2.174373149871826, FPS = 0.45990266208858743
Time = 2.098878860473633, FPS = 0.47644483863844345
Time = 2.3181285858154297, FPS = 0.43138245484697213
Time = 2.0808985233306885, FPS = 0.4805616366142636
Time = 2.05479097366333, FPS = 0.48666750672803294
Time = 1.95816969871521, FPS = 0.5106809694053165
Time = 2.1022541522979736, FPS = 0.47567987862309613

Here are the details of the versions used:

----------------------------------------------------------------------------------------------------------------
python               3.6.12 |Anaconda, Inc.| (default, Sep  8 2020, 23:10:56) [GCC 7.3.0]
Numpy                1.19.2
PyTorch              1.7.0
torchvision          0.8.1
GPU 0,1,2,3          Quadro RTX 6000
CUDA_HOME            /usr/local/cuda-10.2
NVCC                 Cuda compilation tools, release 10.2, V10.2.89
cv2                  4.4.0
-------------------  --------------------------------------------------------------------
PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - CUDA Runtime 10.2
  - CuDNN 7.6.5
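
One mitigation that is sometimes suggested for 1.7-era TorchScript failures that only appear after the first pass is to disable the JIT profiling executor before calling the scripted model. This is an assumption on my part rather than a confirmed fix for this exact error, and the flags are internal ones that may change between releases:

import torch

# Assumption: the second-pass failure comes from the profiling executor
# re-specializing the scripted graph; these internal flags exist in
# PyTorch 1.7 and fall back to the legacy executor.
torch._C._jit_set_profiling_executor(False)
torch._C._jit_set_profiling_mode(False)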

I have a similar problem on PyTorch 1.11.0 and torchvision 0.12.0, where I can’t combine a loop with a torch.no_grad() block:

for _ in range(2):  # Works
    model.predict(data)

with torch.no_grad():  # Works
    model.predict(data)

with torch.no_grad():  # RuntimeError: Global alloc not supported yet
    for _ in range(2):
        model.predict(data)

The model is a PyTorch Lightning module containing a mobilenet_v3_small with a feature pyramid network and a Faster R-CNN head, constructed from torchvision modules, along the lines of the sketch below.
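
For reference, a minimal sketch of that kind of model built from torchvision parts (a hypothetical reconstruction, not my exact module):

import torchvision
from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.backbone_utils import mobilenet_backbone

# Hypothetical reconstruction: a mobilenet_v3_small backbone with an FPN
# feeding a Faster R-CNN head (torchvision 0.12 API assumed).
backbone = mobilenet_backbone("mobilenet_v3_small", pretrained=True, fpn=True)
model = FasterRCNN(backbone, num_classes=2)
model.eval()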

Found a work-around for this in RuntimeError: Global alloc not supported yet in TorchScript · Issue #69078 · pytorch/pytorch · GitHub.
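
As I read that issue (hedged; see the issue itself for the authoritative discussion), the gist of the work-around is that the error originates in the NNC/TensorExpr fuser, so disabling that fuser avoids it:

import torch

# Assumption based on the linked issue: turn off the TensorExpr fuser so
# the 'Global alloc not supported yet' code path is never reached.
torch._C._jit_set_texpr_fuser_enabled(False)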