I have fine-tuned a fasterrcnn_resnet50_fpn_v2 model to deploy inside an R package. However, I am running into problems saving and loading the model architecture.
This is the code used to initiate and save the model:
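import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn_v2, FasterRCNN_ResNet50_FPN_V2_Weights
# Build the detector with the default pretrained weights and switch to inference mode
model = fasterrcnn_resnet50_fpn_v2(weights=FasterRCNN_ResNet50_FPN_V2_Weights.DEFAULT)
model.eval()
# Compile to TorchScript on the CPU and serialize the scripted module
s = torch.jit.script(model.to(device='cpu'))
torch.jit.save(s, "/fasterrcnnArch.pt")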
When I print the scripted model, the layer hyperparameters are not shown, unlike when I print the original model architecture (output reduced for brevity):
print(model)
FasterRCNN(
(transform): GeneralizedRCNNTransform(
Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
Resize(min_size=(800,), max_size=1333, mode='bilinear')
)
(backbone): BackboneWithFPN(
(body): IntermediateLayerGetter(
(conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
(layer1): Sequential(
(0): Bottleneck(
(conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(downsample): Sequential(
(0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): Bottleneck(
(conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
(2): Bottleneck(
(conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
)
...
...
...
)
)
)
(fpn): FeaturePyramidNetwork(
(inner_blocks): ModuleList(
(0): Conv2dNormActivation(
(0): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(1): Conv2dNormActivation(
(0): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): Conv2dNormActivation(
(0): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(3): Conv2dNormActivation(
(0): Conv2d(2048, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(layer_blocks): ModuleList(
(0): Conv2dNormActivation(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(1): Conv2dNormActivation(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(2): Conv2dNormActivation(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(3): Conv2dNormActivation(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(extra_blocks): LastLevelMaxPool()
)
)
(rpn): RegionProposalNetwork(
(anchor_generator): AnchorGenerator()
(head): RPNHead(
...
...
)
)
(roi_heads): RoIHeads(
(box_roi_pool): MultiScaleRoIAlign(featmap_names=['0', '1', '2', '3'], output_size=(7, 7), sampling_ratio=2)
(box_head): FastRCNNConvFCHead(
(0): Conv2dNormActivation(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
)
(1): Conv2dNormActivation(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
)
(2): Conv2dNormActivation(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
)
(3): Conv2dNormActivation(
(0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace=True)
)
(4): Flatten(start_dim=1, end_dim=-1)
(5): Linear(in_features=12544, out_features=1024, bias=True)
(6): ReLU(inplace=True)
)
(box_predictor): FastRCNNPredictor(
(cls_score): Linear(in_features=1024, out_features=91, bias=True)
(bbox_pred): Linear(in_features=1024, out_features=364, bias=True)
)
)
)
print(s)
RecursiveScriptModule(
original_name=FasterRCNN
(transform): RecursiveScriptModule(original_name=GeneralizedRCNNTransform)
(backbone): RecursiveScriptModule(
original_name=BackboneWithFPN
(body): RecursiveScriptModule(
original_name=IntermediateLayerGetter
(conv1): RecursiveScriptModule(original_name=Conv2d)
(bn1): RecursiveScriptModule(original_name=BatchNorm2d)
(relu): RecursiveScriptModule(original_name=ReLU)
(maxpool): RecursiveScriptModule(original_name=MaxPool2d)
(layer1): RecursiveScriptModule(
original_name=Sequential
(0): RecursiveScriptModule(
original_name=Bottleneck
(conv1): RecursiveScriptModule(original_name=Conv2d)
(bn1): RecursiveScriptModule(original_name=BatchNorm2d)
(conv2): RecursiveScriptModule(original_name=Conv2d)
(bn2): RecursiveScriptModule(original_name=BatchNorm2d)
(conv3): RecursiveScriptModule(original_name=Conv2d)
(bn3): RecursiveScriptModule(original_name=BatchNorm2d)
(relu): RecursiveScriptModule(original_name=ReLU)
(downsample): RecursiveScriptModule(
original_name=Sequential
(0): RecursiveScriptModule(original_name=Conv2d)
(1): RecursiveScriptModule(original_name=BatchNorm2d)
)
)
(1): RecursiveScriptModule(
original_name=Bottleneck
(conv1): RecursiveScriptModule(original_name=Conv2d)
(bn1): RecursiveScriptModule(original_name=BatchNorm2d)
(conv2): RecursiveScriptModule(original_name=Conv2d)
(bn2): RecursiveScriptModule(original_name=BatchNorm2d)
(conv3): RecursiveScriptModule(original_name=Conv2d)
(bn3): RecursiveScriptModule(original_name=BatchNorm2d)
(relu): RecursiveScriptModule(original_name=ReLU)
)
(2): RecursiveScriptModule(
original_name=Bottleneck
(conv1): RecursiveScriptModule(original_name=Conv2d)
(bn1): RecursiveScriptModule(original_name=BatchNorm2d)
(conv2): RecursiveScriptModule(original_name=Conv2d)
(bn2): RecursiveScriptModule(original_name=BatchNorm2d)
(conv3): RecursiveScriptModule(original_name=Conv2d)
(bn3): RecursiveScriptModule(original_name=BatchNorm2d)
(relu): RecursiveScriptModule(original_name=ReLU)
)
)
(layer2): RecursiveScriptModule(
original_name=Sequential
(0): RecursiveScriptModule(
original_name=Bottleneck
(conv1): RecursiveScriptModule(original_name=Conv2d)
(bn1): RecursiveScriptModule(original_name=BatchNorm2d)
(conv2): RecursiveScriptModule(original_name=Conv2d)
(bn2): RecursiveScriptModule(original_name=BatchNorm2d)
(conv3): RecursiveScriptModule(original_name=Conv2d)
(bn3): RecursiveScriptModule(original_name=BatchNorm2d)
(relu): RecursiveScriptModule(original_name=ReLU)
(downsample): RecursiveScriptModule(
original_name=Sequential
(0): RecursiveScriptModule(original_name=Conv2d)
(1): RecursiveScriptModule(original_name=BatchNorm2d)
)
)
(1): RecursiveScriptModule(
original_name=Bottleneck
(conv1): RecursiveScriptModule(original_name=Conv2d)
(bn1): RecursiveScriptModule(original_name=BatchNorm2d)
(conv2): RecursiveScriptModule(original_name=Conv2d)
(bn2): RecursiveScriptModule(original_name=BatchNorm2d)
(conv3): RecursiveScriptModule(original_name=Conv2d)
(bn3): RecursiveScriptModule(original_name=BatchNorm2d)
(relu): RecursiveScriptModule(original_name=ReLU)
)
(2): RecursiveScriptModule(
original_name=Bottleneck
(conv1): RecursiveScriptModule(original_name=Conv2d)
(bn1): RecursiveScriptModule(original_name=BatchNorm2d)
(conv2): RecursiveScriptModule(original_name=Conv2d)
(bn2): RecursiveScriptModule(original_name=BatchNorm2d)
(conv3): RecursiveScriptModule(original_name=Conv2d)
(bn3): RecursiveScriptModule(original_name=BatchNorm2d)
(relu): RecursiveScriptModule(original_name=ReLU)
)
(3): RecursiveScriptModule(
original_name=Bottleneck
(conv1): RecursiveScriptModule(original_name=Conv2d)
(bn1): RecursiveScriptModule(original_name=BatchNorm2d)
(conv2): RecursiveScriptModule(original_name=Conv2d)
(bn2): RecursiveScriptModule(original_name=BatchNorm2d)
(conv3): RecursiveScriptModule(original_name=Conv2d)
(bn3): RecursiveScriptModule(original_name=BatchNorm2d)
(relu): RecursiveScriptModule(original_name=ReLU)
)
)
...
...
...
)
)
(fpn): RecursiveScriptModule(
original_name=FeaturePyramidNetwork
(inner_blocks): RecursiveScriptModule(
original_name=ModuleList
(0): RecursiveScriptModule(
original_name=Conv2dNormActivation
(0): RecursiveScriptModule(original_name=Conv2d)
(1): RecursiveScriptModule(original_name=BatchNorm2d)
)
(1): RecursiveScriptModule(
original_name=Conv2dNormActivation
(0): RecursiveScriptModule(original_name=Conv2d)
(1): RecursiveScriptModule(original_name=BatchNorm2d)
)
(2): RecursiveScriptModule(
original_name=Conv2dNormActivation
(0): RecursiveScriptModule(original_name=Conv2d)
(1): RecursiveScriptModule(original_name=BatchNorm2d)
)
(3): RecursiveScriptModule(
original_name=Conv2dNormActivation
(0): RecursiveScriptModule(original_name=Conv2d)
(1): RecursiveScriptModule(original_name=BatchNorm2d)
)
)
(layer_blocks): RecursiveScriptModule(
original_name=ModuleList
(0): RecursiveScriptModule(
original_name=Conv2dNormActivation
(0): RecursiveScriptModule(original_name=Conv2d)
(1): RecursiveScriptModule(original_name=BatchNorm2d)
)
(1): RecursiveScriptModule(
original_name=Conv2dNormActivation
(0): RecursiveScriptModule(original_name=Conv2d)
(1): RecursiveScriptModule(original_name=BatchNorm2d)
)
(2): RecursiveScriptModule(
original_name=Conv2dNormActivation
(0): RecursiveScriptModule(original_name=Conv2d)
(1): RecursiveScriptModule(original_name=BatchNorm2d)
)
(3): RecursiveScriptModule(
original_name=Conv2dNormActivation
(0): RecursiveScriptModule(original_name=Conv2d)
(1): RecursiveScriptModule(original_name=BatchNorm2d)
)
)
(extra_blocks): RecursiveScriptModule(original_name=LastLevelMaxPool)
)
)
(rpn): RecursiveScriptModule(
original_name=RegionProposalNetwork
(anchor_generator): RecursiveScriptModule(original_name=AnchorGenerator)
...
...
...
)
)
(roi_heads): RecursiveScriptModule(
original_name=RoIHeads
(box_roi_pool): RecursiveScriptModule(original_name=MultiScaleRoIAlign)
(box_head): RecursiveScriptModule(
original_name=FastRCNNConvFCHead
(0): RecursiveScriptModule(
original_name=Conv2dNormActivation
(0): RecursiveScriptModule(original_name=Conv2d)
(1): RecursiveScriptModule(original_name=BatchNorm2d)
(2): RecursiveScriptModule(original_name=ReLU)
)
(1): RecursiveScriptModule(
original_name=Conv2dNormActivation
(0): RecursiveScriptModule(original_name=Conv2d)
(1): RecursiveScriptModule(original_name=BatchNorm2d)
(2): RecursiveScriptModule(original_name=ReLU)
)
(2): RecursiveScriptModule(
original_name=Conv2dNormActivation
(0): RecursiveScriptModule(original_name=Conv2d)
(1): RecursiveScriptModule(original_name=BatchNorm2d)
(2): RecursiveScriptModule(original_name=ReLU)
)
(3): RecursiveScriptModule(
original_name=Conv2dNormActivation
(0): RecursiveScriptModule(original_name=Conv2d)
(1): RecursiveScriptModule(original_name=BatchNorm2d)
(2): RecursiveScriptModule(original_name=ReLU)
)
(4): RecursiveScriptModule(original_name=Flatten)
(5): RecursiveScriptModule(original_name=Linear)
(6): RecursiveScriptModule(original_name=ReLU)
)
(box_predictor): RecursiveScriptModule(
original_name=FastRCNNPredictor
(cls_score): RecursiveScriptModule(original_name=Linear)
(bbox_pred): RecursiveScriptModule(original_name=Linear)
)
)
)
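As far as I can tell, the terse printout may only be a display difference: a RecursiveScriptModule prints just each submodule's original_name, while the parameters and buffers are still stored in the module. A minimal sketch of how this could be checked, using a small stand-in network rather than the full detector (the layer sizes and the input shape below are arbitrary assumptions chosen for illustration):

```python
import torch
from torch import nn

# Small stand-in network (hypothetical sizes), not the Faster R-CNN model itself
eager = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
    nn.ReLU(),
)
eager.eval()
scripted = torch.jit.script(eager)

# The scripted repr shows only original_name, but the tensors are still present
eager_shapes = {k: tuple(v.shape) for k, v in eager.state_dict().items()}
scripted_shapes = {k: tuple(v.shape) for k, v in scripted.state_dict().items()}
shapes_match = eager_shapes == scripted_shapes

# Both versions should also agree numerically on the same input
x = torch.rand(1, 3, 16, 16)
with torch.no_grad():
    outputs_match = torch.allclose(eager(x), scripted(x), atol=1e-6)

print(shapes_match, outputs_match)
```

If the same comparison holds for the detector's state_dict, the missing hyperparameters in print(s) are probably not what is causing the problem.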
When I examine the scripted model code, it looks like loops and hierarchy have been correctly preserved:
print(s.code)
def forward(self,
images: List[Tensor],
targets: Optional[List[Dict[str, Tensor]]]=None) -> Tuple[Dict[str, Tensor], List[Dict[str, Tensor]]]:
_0 = "AssertionError: targets should not be none when in training mode"
_1 = "Expected target boxes to be a tensor of shape [N, 4], got {}."
_2 = "expecting the last two dimensions of the Tensor to be H and W instead got {}"
_3 = "All bounding boxes should have positive height and width. Found invalid box {} for target at index {}."
_4 = "RCNN always returns a (Losses, Detections) tuple in scripting"
training = self.training
if training:
if torch.__is__(targets, None):
ops.prim.RaiseException(_0)
targets1 : Optional[List[Dict[str, Tensor]]] = targets
else:
targets2 = unchecked_cast(List[Dict[str, Tensor]], targets)
for _5 in range(torch.len(targets2)):
target = targets2[_5]
boxes = target["boxes"]
boxes0 = unchecked_cast(Tensor, boxes)
_6 = torch.eq(torch.len(torch.size(boxes0)), 2)
if _6:
_8 = torch.eq((torch.size(boxes0))[-1], 4)
_7 = _8
else:
_7 = False
_9 = torch.format(_1, torch.size(boxes0))
if _7:
pass
else:
_10 = torch.add("AssertionError: ", _9)
ops.prim.RaiseException(_10)
targets1 = targets2
targets0 : Optional[List[Dict[str, Tensor]]] = targets1
else:
targets0 = targets
original_image_sizes = annotate(List[Tuple[int, int]], [])
for _11 in range(torch.len(images)):
img = images[_11]
val = torch.slice(torch.size(img), -2)
_12 = torch.eq(torch.len(val), 2)
_13 = torch.format(_2, torch.slice(torch.size(img), -2))
if _12:
pass
else:
_14 = torch.add("AssertionError: ", _13)
ops.prim.RaiseException(_14)
_15 = torch.append(original_image_sizes, (val[0], val[1]))
transform = self.transform
_16 = (transform).forward(images, targets0, )
images0, targets3, = _16
if torch.__isnot__(targets3, None):
targets5 = unchecked_cast(List[Dict[str, Tensor]], targets3)
_17 = [9223372036854775807, torch.len(targets5)]
for target_idx in range(ops.prim.min(_17)):
target0 = targets5[target_idx]
boxes1 = target0["boxes"]
_18 = torch.slice(torch.slice(boxes1), 1, 2)
_19 = torch.slice(torch.slice(boxes1), 1, None, 2)
degenerate_boxes = torch.le(_18, _19)
if bool(torch.any(degenerate_boxes)):
_20 = torch.where(torch.any(degenerate_boxes, 1))
bb_idx = torch.select(_20[0], 0, 0)
_21 = annotate(List[Optional[Tensor]], [bb_idx])
degen_bb = annotate(List[float], torch.index(boxes1, _21).tolist())
_22 = torch.format(_3, degen_bb, target_idx)
_23 = torch.add("AssertionError: ", _22)
ops.prim.RaiseException(_23)
else:
pass
targets4 : Optional[List[Dict[str, Tensor]]] = targets5
else:
targets4 = targets3
backbone = self.backbone
tensors = images0.tensors
features = (backbone).forward(tensors, )
features0 = unchecked_cast(Dict[str, Tensor], features)
rpn = self.rpn
_24 = (rpn).forward(images0, features0, targets4, )
proposals, proposal_losses, = _24
roi_heads = self.roi_heads
image_sizes = images0.image_sizes
_25 = (roi_heads).forward(features0, proposals, image_sizes, targets4, )
detections, detector_losses, = _25
transform0 = self.transform
image_sizes0 = images0.image_sizes
detections0 = (transform0).postprocess(detections, image_sizes0, original_image_sizes, )
losses = annotate(Dict[str, Tensor], {})
torch.update(losses, detector_losses)
torch.update(losses, proposal_losses)
_has_warned = self._has_warned
if torch.__not__(_has_warned):
torch.warn(_4)
self._has_warned = True
else:
pass
return (losses, detections0)
Whenever I try to load this model architecture into R, my R session crashes without providing a traceback for troubleshooting. I am able to save and load the trained model weights successfully in both Python and R, so the issue seems to be the model architecture.
If I try to run the recursive script module for inference in Python for testing, it does not produce any output. I'm not sure where to go next in terms of debugging; any pointers or advice would be greatly appreciated!
PyTorch version: 1.12.0 (CPU)
torchvision version: 0.13.0
Platform: Windows 10 Ent x64 (no CUDA)