Error quantizing LSTM autoencoder

Hi, I’m trying to quantize a trained LSTM autoencoder.
After reloading the best model weights, I used this call to quantize the model:

torch.quantization.quantize_dynamic(model, {nn.LSTM}, dtype=torch.qint8)

However, I get this error:

RuntimeError: bias should be a vector (1D Tensor)

The model definition is quite simple:

import torch
import torch.nn as nn

class AE(nn.Module):

    def __init__(self, lstm1_h: int, batch_size: int):
        super().__init__()

        self.init_hidden(batch_size)

        self.encoder = nn.LSTM(input_size=1, hidden_size=lstm1_h, batch_first=True)
        self.decoder = nn.LSTM(input_size=1, hidden_size=lstm1_h, proj_size=1, batch_first=True)

        # initialize weights
        self.init_weights()

    def init_hidden(self, batch_size: int):
        self.h0 = torch.zeros(1, batch_size, 1)

    def init_weights(self):
        # set input weight and bias to force 1 as input to decoder
        self.decoder.bias_ih_l0.data.fill_(1).requires_grad_(False)
        self.decoder.weight_ih_l0.data.zero_().requires_grad_(False)

How can I solve this error?

Can you narrow down the problem by commenting out the modifications? E.g., what happens if you don't call init_weights?

Thanks, I tried a couple of things:

  1. I tried to build the model from scratch (without using the saved version), and quantization works both with and without calling init_weights.
  2. I tried to load the pre-trained model directly from the pth file, and both with and without calling init_weights it gives back the old error again (see the sketch below for the two paths I compared).
    It looks like there is something in the trained version that does not work well. Any idea? I must use the trained weights.
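
Roughly, these are the two paths I compared (the file name, sizes, and loading style below are placeholders, not my exact code):

import torch
import torch.nn as nn

# Path 1: a freshly constructed model (never trained) quantizes without problems.
model_fresh = AE(lstm1_h=512, batch_size=1)                  # placeholder sizes
torch.quantization.quantize_dynamic(model_fresh, {nn.LSTM}, dtype=torch.qint8)    # works

# Path 2: restoring the trained weights from the checkpoint reproduces the bias error.
model_trained = AE(lstm1_h=512, batch_size=1)
model_trained.load_state_dict(torch.load("best_model.pth", map_location="cpu"))   # placeholder file
torch.quantization.quantize_dynamic(model_trained, {nn.LSTM}, dtype=torch.qint8)  # RuntimeError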

I managed to solve the issue by manually changing the size of the bias vectors to make them 1D tensors.
I then exported the quantized model as a .pt file and imported it into libtorch, but I have not noticed any difference in inference time. Is there something I should take care of?
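
For reference, a minimal sketch of the workaround (file names are placeholders, and the export step assumes the model is scriptable):

import torch
import torch.nn as nn

# Flatten any LSTM bias that was saved with an extra dimension: dynamic quantization
# expects 1D bias tensors (for an LSTM, shape (4 * hidden_size,)).
for name, param in model.named_parameters():
    if "bias" in name and param.dim() > 1:
        param.data = param.data.reshape(-1)

model_q = torch.quantization.quantize_dynamic(model, {nn.LSTM}, dtype=torch.qint8)

# Export via TorchScript so the quantized model can be loaded from libtorch.
torch.jit.script(model_q).save("model_quantized.pt")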

Looking at the quantized model, I noticed something strange. I quantized the model with:

model_q = torch.quantization.quantize_dynamic(model, {torch.nn.LSTM, torch.nn.Linear}, dtype=torch.qint8)

But:

  1. If I print the quantized model I get:
print("Post quantization: ", model_q)

// Output
Post quantization:  LSTMPacketAE(
  (encoder): DynamicQuantizedLSTM(1, 512, batch_first=True)
  (decoder): DynamicQuantizedLSTM(1, 512, batch_first=True)
)

This seems quite different from the output shown in the tutorial.

  2. If I then print the named parameters, I get an empty list, while before quantization there were lots of parameters (see the sketch after this list):
print(list(model_q.named_parameters()))

// Output
[]
  3. The quantized model now expects a 3D input, while the original one expected 2D inputs.
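
For reference, this is roughly how I am checking points 2 and 3 (key names and shapes are illustrative and depend on the PyTorch version):

import torch

# 2. After dynamic quantization the LSTM weights are stored as packed parameters, so
#    named_parameters() is empty, but the values still show up in the state_dict.
print(list(model_q.named_parameters()))    # -> []
print(list(model_q.state_dict().keys()))   # packed weights/biases of encoder and decoder

# 3. With batch_first=True the LSTM layers want a 3D input of shape (batch, seq_len, input_size).
x = torch.randn(1, 100, 1)                 # hypothetical sequence of length 100
out, _ = model_q.encoder(x)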

Does anyone have any idea?