Issue: I am running into a kernel crash during inference with a quantized ResNet101 model in PyTorch. The model trains and quantizes successfully, but the kernel dies when I attempt to run inference on a test image. Here are the key details:
Error Message:
The Kernel crashed while executing code in the current cell or a previous cell.
Please review the code in the cell(s) to identify a possible cause of the failure.
Click here for more info.
View Jupyter log for further details.
Jupyter Logs: [error] Disposing session as kernel process died ExitCode: undefined, Reason:
Observations:
- Model Training: The model was trained with Quantization-Aware Training (QAT) and saved successfully; a rough outline of the training flow is included below.
- Model Loading: The quantized model is loaded without any issues.
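For context, training followed the usual QAT flow using the class and helper shown in the Code section below. This is only a rough, simplified outline; the fine-tuning loop is omitted and the exact order of calls may differ slightly from my notebook:

# Rough outline of how the model was trained and saved (simplified)
model = QuantizedResNet101Classifier(num_classes=5)
model.train()
model = prepare_model_for_qat(model)          # sets the QAT qconfig and calls prepare_qat

# ... QAT fine-tuning loop on the training set (omitted) ...

model.eval()
quantized = torch.quantization.convert(model.cpu())   # swap fake-quant modules for real int8 ones
torch.save(quantized.state_dict(), 'trained_quantized_model.pth')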
Code:
import torch
import torch.nn as nn
from torchvision import models, transforms
from PIL import Image

class QuantizedResNet101Classifier(nn.Module):
    def __init__(self, num_classes, pretrained=True):
        super(QuantizedResNet101Classifier, self).__init__()
        # Quantization stubs
        self.quant = torch.quantization.QuantStub()
        self.dequant = torch.quantization.DeQuantStub()
        # Load pretrained ResNet101
        self.backbone = models.resnet101(pretrained=pretrained)
        # Freeze backbone layers
        for param in self.backbone.parameters():
            param.requires_grad = False
        # Replace final fully connected layer
        num_ftrs = self.backbone.fc.in_features
        self.backbone.fc = nn.Sequential(
            nn.Linear(num_ftrs, 512),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(512, num_classes)
        )

    def forward(self, x):
        x = self.quant(x)
        x = self.backbone(x)
        x = self.dequant(x)
        return x

def prepare_model_for_qat(model):
    model.qconfig = torch.quantization.get_default_qat_qconfig('fbgemm')
    torch.quantization.prepare_qat(model, inplace=True)
    return model

# Load the model
model = QuantizedResNet101Classifier(num_classes=5)
model.eval()
model.qconfig = torch.quantization.get_default_qconfig('fbgemm')
prepared_model = torch.quantization.prepare(model)
quantized_model = torch.quantization.convert(prepared_model)
quantized_state_dict = torch.load('trained_quantized_model.pth')
quantized_model.load_state_dict(quantized_state_dict)

# Test the model ("device" and "custom_dataset" are defined earlier in the notebook)
image = Image.open("../image.png").convert('RGB')
preprocess = transforms.Compose([
    transforms.Resize((320, 320)),
    transforms.ToTensor(),
])
input_tensor = preprocess(image)
input_batch = input_tensor.unsqueeze(0).to(device)

quantized_model.eval()
with torch.no_grad():
    output = quantized_model(input_batch)

probabilities = torch.nn.functional.softmax(output, dim=1)[0]
predicted_index = probabilities.argmax().item()
index_to_label = {v: k for k, v in custom_dataset.class_labels.items()}
predicted_label = index_to_label[predicted_index]
print(f"Predicted label: {predicted_label}")
Environment:
- PyTorch version: 2.5.1
- PyTorch CUDA version: 12.4
- Python version: 3.12.7
- Hardware: NVIDIA GeForce GTX 1650
- Image Size: Images used are 320x320 RGB.
Debugging Attempts:
- Verified the loaded state dictionary matches the model architecture.
- Checked that the model can run in evaluation mode without quantization (rough sketch below).
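The non-quantized check was roughly the following (a simplified sketch of the actual cell):

# Same architecture without the quantization prepare/convert steps -- this runs fine
float_model = QuantizedResNet101Classifier(num_classes=5)
float_model.eval()
with torch.no_grad():
    out = float_model(input_tensor.unsqueeze(0))   # same preprocessed image, kept on CPU
print(out.shape)   # torch.Size([1, 5])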
Questions:
- What could cause the kernel to crash during inference with the quantized model?
- Are there any specific debugging steps or configurations I should check for quantized inference?
Any insights or suggestions would be greatly appreciated!