Hi,
I seek your guidance in evaluating the accuracy of my models.
Background
I have pretrained two distinct models:
- PatchTST (trained from scratch)
- DenseNet-121 (using contrastive learning)
Subsequently, I fine-tuned both models using BitFit and then applied post-training quantization. However, I observed that the Mean Absolute Error (MAE) did not increase as expected. Since these techniques typically reduce accuracy, the MAE should have gone up.
The BitFit code I applied is as follows:
for name, param in model.named_parameters():
    if 'bias' not in name:
        param.requires_grad = False  # Freeze non-bias parameters
    else:
        print(f"Tuning: {name}")  # Confirm bias parameters stay trainable
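To double-check that the freeze actually worked, I would count trainable parameters after the loop. Here is a minimal, self-contained sketch using a small stand-in model (the `nn.Sequential` below is hypothetical, just to illustrate the check, not your PatchTST/DenseNet-121):

```python
import torch.nn as nn

# Hypothetical toy model standing in for the real network
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))

# BitFit: train only bias terms, freeze everything else
for name, param in model.named_parameters():
    param.requires_grad = 'bias' in name

# Sanity check: trainable count should equal the number of bias entries
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"Trainable: {trainable} / {total}")  # here: 16 + 1 = 17 biases out of 161
```

If `trainable` equals the full parameter count, the freezing step never took effect (e.g. it ran before the model was loaded, or the optimizer was built from all parameters anyway).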
and the quantization code is as follows:
learn.load(weight_path)
# Move model to CPU (important!)
learn.model.to('cpu')
# Set to eval mode before quantizing (required)
learn.model.eval()
original_size = get_model_size(learn.model)
print(f"Original model size: {original_size:.2f} MB")
print(f"Trainable parameters of original model: {count_parameters(learn.model)}")
# Apply quantization
learn.model = torch.quantization.quantize_dynamic(
    learn.model, {torch.nn.Linear}, dtype=torch.qint8
)
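One way I would test whether quantization is actually changing the model's outputs (and therefore could move the MAE at all) is to compare the float and quantized models on the same input. A minimal sketch, again using a hypothetical toy model rather than the real one:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for the fine-tuned model
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
model.eval()  # eval mode before quantizing, as in the snippet above

# Dynamic int8 quantization of the Linear layers only
qmodel = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(4, 8)
with torch.no_grad():
    fp_out = model(x)   # float32 reference output
    q_out = qmodel(x)   # int8 dynamic-quantized output

# Outputs should be close but usually not bit-identical
print("max abs diff:", (fp_out - q_out).abs().max().item())
```

If the two outputs are identical, the quantized model was likely never used at evaluation time (for example, if `learn.model` was reassigned but the MAE was computed with the old float model, or with layer types outside the `{torch.nn.Linear}` set). Also note that dynamic quantization only touches the listed module types, so convolutional layers in DenseNet-121 are left in float32, which can keep the MAE essentially unchanged.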
Could you help me identify potential reasons why the MAE did not increase? Is my implementation correct?