Kernel Dies When Testing a Quantized ResNet101 Model in PyTorch

Issue: I am encountering a kernel crash during inference with a quantized ResNet101 model in PyTorch. The model trains and quantizes successfully, but the kernel dies when I try to run inference on a test image. Here are the key details:

Error Message:

The Kernel crashed while executing code in the current cell or a previous cell.
Please review the code in the cell(s) to identify a possible cause of the failure.
Click here for more info.
View Jupyter log for further details.

Jupyter Logs: [error] Disposing session as kernel process died ExitCode: undefined, Reason:

Observations:

  • Model Training: The model was trained with Quantization-Aware Training (QAT) and saved successfully; a simplified sketch of that flow is shown after this list.
  • Model Loading: The quantized model is loaded without any issues.
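
For reference, the training-side flow looked roughly like this (a simplified sketch; the actual training loop is omitted and the helper names are the ones defined in the Code section below):

# Simplified sketch of the QAT + save flow (assumption: this mirrors the actual training script)
model = QuantizedResNet101Classifier(num_classes=5)    # class defined in the Code section below
model.train()
model = prepare_model_for_qat(model)                   # insert fake-quant observers
# ... fine-tune the classifier head for a few epochs ...
model.eval()
quantized = torch.quantization.convert(model)          # convert observers to int8 modules
torch.save(quantized.state_dict(), 'trained_quantized_model.pth')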

Code:

import torch
import torch.nn as nn
from torchvision import models, transforms
from PIL import Image

class QuantizedResNet101Classifier(nn.Module):
    def __init__(self, num_classes, pretrained=True):
        super(QuantizedResNet101Classifier, self).__init__()
        
        # Quantization stubs
        self.quant = torch.quantization.QuantStub()
        self.dequant = torch.quantization.DeQuantStub()
        
        # Load pretrained ResNet101
        self.backbone = models.resnet101(pretrained=pretrained)
        
        # Freeze backbone layers
        for param in self.backbone.parameters():
            param.requires_grad = False
        
        # Replace final fully connected layer
        num_ftrs = self.backbone.fc.in_features
        self.backbone.fc = nn.Sequential(
            nn.Linear(num_ftrs, 512),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(512, num_classes)
        )
    
    def forward(self, x):
        x = self.quant(x)
        x = self.backbone(x)
        x = self.dequant(x)
        return x

def prepare_model_for_qat(model):
    model.qconfig = torch.quantization.get_default_qat_qconfig('fbgemm')
    torch.quantization.prepare_qat(model, inplace=True)
    return model

# Rebuild the quantized module structure, then load the trained int8 state dict
model = QuantizedResNet101Classifier(num_classes=5)
model.eval()
model.qconfig = torch.quantization.get_default_qconfig('fbgemm')
prepared_model = torch.quantization.prepare(model)
quantized_model = torch.quantization.convert(prepared_model)
quantized_state_dict = torch.load('trained_quantized_model.pth')
quantized_model.load_state_dict(quantized_state_dict)

# Test the model
image = Image.open("../image.png").convert('RGB')
preprocess = transforms.Compose([
    transforms.Resize((320, 320)),
    transforms.ToTensor(),
])

input_tensor = preprocess(image)
input_batch = input_tensor.unsqueeze(0).to(device) 
quantized_model.eval()

with torch.no_grad():
    output = quantized_model(input_batch)

probabilities = torch.nn.functional.softmax(output, dim=1)[0]
predicted_index = probabilities.argmax().item()
index_to_label = {v: k for k, v in custom_dataset.class_labels.items()}
predicted_label = index_to_label[predicted_index]
print(f"Predicted label: {predicted_label}")

Environment:

  • PyTorch version: 2.5.1
  • PyTorch CUDA version: 12.4
  • Python version: 3.12.7
  • Hardware: NVIDIA GeForce GTX 1650
  • Image Size: Images used are 320x320 RGB.

Debugging Attempts:

  • Verified the loaded state dictionary matches the model architecture.
  • Checked that the model can run in evaluation mode without quantization (a CPU-only check of the quantized path is sketched below).
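
To make the failing step easier to reproduce, here is a minimal CPU-only version of the quantized inference (a sketch that assumes quantized_model, preprocess, and image are defined as in the code above; I have not yet confirmed whether it avoids the crash). Since the fbgemm int8 kernels only run on the CPU, both the model and the input are kept on the CPU:

# CPU-only check of the quantized path (sketch; assumes the objects defined above)
torch.backends.quantized.engine = 'fbgemm'    # make the quantized backend explicit
quantized_model = quantized_model.cpu()
quantized_model.eval()

cpu_input = preprocess(image).unsqueeze(0)    # no .to(device): keep the input on the CPU
with torch.no_grad():
    cpu_output = quantized_model(cpu_input)
print(cpu_output.shape)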

Questions:

  1. What could cause the kernel to crash during inference with the quantized model?
  2. Are there any specific debugging steps or configurations I should check for quantized inference?

Any insights or suggestions would be greatly appreciated!

Run your workload in a terminal, which might provide better error messages than a Jupyter notebook.
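
For example, running the failing code as a plain script with faulthandler enabled (a sketch; the file name is just a placeholder) will at least show which Python line the process dies on:

# test_quantized.py -- run with `python test_quantized.py` from a terminal
import faulthandler
faulthandler.enable()   # print the Python traceback if the process receives a fatal signal (e.g. a segfault)

# ... paste the model-loading and inference code from the question here ...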

Sure. Let me try that as well. Thanks.