Kernel Dies When Testing a Quantized ResNet101 Model in PyTorch

Issue: The Jupyter kernel dies during inference with a quantized ResNet101 model in PyTorch. The model trains and quantizes successfully, but the kernel dies as soon as I run inference on a test image. Here are the key details:

Error Message:

The Kernel crashed while executing code in the current cell or a previous cell.
Please review the code in the cell(s) to identify a possible cause of the failure.
Click here for more info.
View Jupyter log for further details.

Jupyter Logs: [error] Disposing session as kernel process died ExitCode: undefined, Reason:


  • Model Training: The model was trained with Quantization-Aware Training (QAT) and saved successfully.
  • Model Loading: The quantized model is loaded without any issues.


import torch
import torch.nn as nn
from torchvision import models

class QuantizedResNet101Classifier(nn.Module):
    def __init__(self, num_classes, pretrained=True):
        super(QuantizedResNet101Classifier, self).__init__()
        # Quantization stubs
        self.quant = torch.quantization.QuantStub()
        self.dequant = torch.quantization.DeQuantStub()
        # Load pretrained ResNet101
        self.backbone = models.resnet101(pretrained=pretrained)
        # Freeze backbone layers
        for param in self.backbone.parameters():
            param.requires_grad = False
        # Replace final fully connected layer
        num_ftrs = self.backbone.fc.in_features
        self.backbone.fc = nn.Sequential(
            nn.Linear(num_ftrs, 512),
            nn.Linear(512, num_classes)
        )

    def forward(self, x):
        x = self.quant(x)
        x = self.backbone(x)
        x = self.dequant(x)
        return x

def prepare_model_for_qat(model):
    model.qconfig = torch.quantization.get_default_qat_qconfig('fbgemm')
    torch.quantization.prepare_qat(model, inplace=True)
    return model

# Load the model
model = QuantizedResNet101Classifier(num_classes=5)
model.qconfig = torch.quantization.get_default_qconfig('fbgemm')
prepared_model = torch.quantization.prepare(model)
quantized_model = torch.quantization.convert(prepared_model)
quantized_state_dict = torch.load('trained_quantized_model.pth')
quantized_model.load_state_dict(quantized_state_dict)

# Test the model
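from PIL import Image
from torchvision import transforms

image = Image.open("../image.png").convert('RGB')
preprocess = transform = transforms.Compose([
    transforms.Resize((320, 320)),
    transforms.ToTensor(),  # assumed: the rest of the original pipeline was cut off
])

input_tensor = preprocess(image)
input_batch = input_tensor.unsqueeze(0).to(device)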

with torch.no_grad():
    output = quantized_model(input_batch)

probabilities = torch.nn.functional.softmax(output, dim=1)[0]
predicted_index = probabilities.argmax().item()
index_to_label = {v: k for k, v in custom_dataset.class_labels.items()}
predicted_label = index_to_label[predicted_index]
print(f"Predicted label: {predicted_label}")


  • PyTorch version: 2.5.1
  • PyTorch CUDA version: 12.4
  • Python version: 3.12.7
  • Hardware: NVIDIA GeForce GTX 1650
  • Image Size: Images used are 320x320 RGB.

Debugging Attempts:

  • Verified the loaded state dictionary matches the model architecture.
  • Checked that the model can run in evaluation mode without quantization.


  1. What could cause the kernel to crash during inference with the quantized model?
  2. Are there any specific debugging steps or configurations I should check for quantized inference?

Any insights or suggestions would be greatly appreciated!

Run your workload in a terminal, which might provide better error messages than a Jupyter notebook.
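
A minimal sketch of that suggestion, in case it helps: the snippet below runs code in a fresh Python process with the standard-library faulthandler enabled, so a hard crash leaves a traceback on stderr instead of only a dead kernel. The helper name run_isolated and the stand-in snippet are hypothetical, not taken from the code above.

```python
import subprocess
import sys


def run_isolated(code: str, timeout: float = 600.0):
    """Run `code` in a fresh Python process with faulthandler enabled.

    Returns (returncode, stderr). On POSIX, a negative return code -N
    means the child was killed by signal N (for example -11 for SIGSEGV).
    """
    result = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode, result.stderr


if __name__ == "__main__":
    # Stand-in snippet (hypothetical); replace it with the inference code
    # that kills the kernel.
    returncode, stderr = run_isolated("print('inference would run here')")
    print("exit code:", returncode)
    print(stderr)
```

From a terminal, the equivalent is to save the inference code as a script and start it with python -X faulthandler followed by the script name.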

Sure. Let me try that as well. Thanks.