Nick95
January 26, 2023, 6:51pm
1
Hi, I’m trying to quantize a trained LSTM autoencoder.
After reloading the best model weights, I used this call to quantize the model:
torch.quantization.quantize_dynamic(model, {nn.LSTM}, dtype=torch.qint8)
However, I get this error:
RuntimeError: bias should be a vector (1D Tensor)
The model definition is quite simple:
class AE(nn.Module):
    def __init__(self, lstm1_h: int, batch_size: int):
        super().__init__()
        self.init_hidden(batch_size)
        self.encoder = nn.LSTM(input_size=1, hidden_size=lstm1_h, batch_first=True)
        self.decoder = nn.LSTM(input_size=1, hidden_size=lstm1_h, proj_size=1, batch_first=True)
        # initialize weights
        self.init_weights()

    def init_hidden(self, batch_size: int):
        self.h0 = torch.zeros(1, batch_size, 1)

    def init_weights(self):
        # set input weight and bias to force 1 as input to decoder
        self.decoder.bias_ih_l0.data.fill_(1).requires_grad_(False)
        self.decoder.weight_ih_l0.data.zero_().requires_grad_(False)
How can I solve this error?
jerryzh168
(Jerry Zhang)
January 26, 2023, 8:22pm
2
Can you narrow down the problem by commenting out the modifications? E.g., what happens if you don’t call init_weights?
Nick95
January 27, 2023, 8:03pm
3
Thanks, I tried a couple of things:
If I instantiate the model from scratch (without the saved weights), quantization works both with and without calling init_weights.
If I load the pre-trained model directly from the .pth file, it raises the same error both with and without calling init_weights.
It looks like something in the trained version is malformed. Any ideas? I have to use the trained weights.
Nick95
January 29, 2023, 2:04am
4
I managed to solve the issue by manually changing the size of the bias vector to make it a 1D Tensor.
I then exported the quantized model as a .pt file and imported it into LibTorch; however, I haven’t noticed any difference in inference time. Is there something I should watch out for?
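For reference, a rough way to compare the two models in Python before exporting. The numbers depend heavily on hardware, the quantization engine, sequence length, and hidden size, so treat this only as a measurement sketch:

```python
import time
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=1, hidden_size=512, batch_first=True)
lstm_q = torch.quantization.quantize_dynamic(lstm, {nn.LSTM}, dtype=torch.qint8)

x = torch.randn(1, 100, 1)  # (batch, seq_len, features)

def bench(module, iters=20):
    # average wall-clock time per forward pass, after one warm-up call
    with torch.no_grad():
        module(x)  # warm-up
        start = time.perf_counter()
        for _ in range(iters):
            module(x)
    return (time.perf_counter() - start) / iters

print(f"float32: {bench(lstm):.6f}s  int8: {bench(lstm_q):.6f}s")
```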
Nick95
February 2, 2023, 7:57pm
5
Looking at the quantized model, I noticed something strange. I quantized the model with:
model_q = torch.quantization.quantize_dynamic(model, {torch.nn.LSTM, torch.nn.Linear}, dtype=torch.qint8)
But if I print the quantized model, I get:
print("Post quantization: ", model_q)
# Output
Post quantization: LSTMPacketAE(
(encoder): DynamicQuantizedLSTM(1, 512, batch_first=True)
(decoder): DynamicQuantizedLSTM(1, 512, batch_first=True)
)
This seems quite different from the output shown in the tutorial.
If I then print the named parameters, I get an empty list, whereas before quantization there were many parameters:
print(list(model_q.named_parameters()))
# Output
[]
The quantized model now expects a 3-D input, while the original one expected 2-D inputs.
Does anyone have any idea?