Inference with a custom model

Hi there! I want to know if someone could help me:

I have a pretrained linear encoder that I would like to add before my real model, but I don't know how to do it.
Let me explain a little bit better:

  • I have trained an encoder NN with my dataset and saved its parameters with torch.save(model.state_dict(), 'guada_withvalid_DNN.pt').
  • Then, in another '.py' file, I have the new scenario with a model defined like this:
# neural network architecture definition
class WifiRNN(nn.Module):
    def __init__(self, i_size, h_size, n_layers, num_classes):
        super(WifiRNN, self).__init__()
        self.input_size = i_size
        self.hidden_size = h_size
        self.num_layers = n_layers
        self.num_classes = num_classes
 
        self.wifi_rnn = nn.RNN(input_size=self.input_size, hidden_size=self.hidden_size, num_layers=self.num_layers, batch_first=True)
        self.out = nn.Linear(in_features=self.hidden_size * sequence_length, out_features=self.num_classes)

    def forward(self, x_in, h_state):
        r_out, h_state = self.wifi_rnn(x_in, h_state)
        r_out = r_out.reshape(r_out.shape[0], -1)
        out = self.out(r_out)
        return out, h_state

    def init_hidden_state(self, b_size):
        h0 = torch.zeros(self.num_layers, b_size, self.hidden_size).to(device)
        return h0
  • I want to import the pretrained encoder into my new scenario and place it before the RNN: first I feed the data to the pretrained encoder with frozen parameters, then I feed its output to the RNN and apply backpropagation.

I’ve been trying to do:

model_encoder = nn.Module
# model_encoder = torch.load('guada_withvalid_DNN.pt')
model_encoder.load_state_dict(torch.load('guada_withvalid_DNN.pt'))

However, I get:

Traceback (most recent call last):
  File "/snap/pycharm-community/232/plugins/python-ce/helpers/pydev/pydevd.py", line 1477, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/snap/pycharm-community/232/plugins/python-ce/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/lauram/PycharmProjects/RNN/RNN_preEncoder.py", line 146, in <module>
    model_encoder.load_state_dict(torch.load('guada_withvalid_DNN'))
TypeError: load_state_dict() missing 1 required positional argument: 'state_dict'
python-BaseException

Can anybody help me? The only examples I have seen are with pretrained models from torchvision and that’s not what I am trying to do.
Thanks!

There are different ways to save the parameters, and each has its own advantages. I have not used all of them, so I cannot explain them all. However, I use

torch.save(model.state_dict(), 'model.pth.tar')

to save the parameters and

model.load_state_dict(torch.load("model.pth.tar"))

To load the parameters

Maybe .pt has its own way of loading, or it is an incorrect format (I am not sure).
If you can retrain the model, then use the code above to save and load it. I hope that solves your problem.

In your code snippet it seems you are creating model_encoder as the nn.Module class, not an object of your actual model definition.
To properly load it, you would need to create the object first and then call .load_state_dict() on it:

model_encoder = MyModelEncoder(my_arguments)
model_encoder.load_state_dict(torch.load(...))
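
As a minimal sketch of the save/load round trip described above: TinyEncoder and the temporary file name are hypothetical stand-ins, not the encoder from the question. The last part calls load_state_dict on the nn.Module class itself, which is what produces the TypeError in the traceback, because the loaded dict is taken as `self` and the `state_dict` argument is then missing.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for the pretrained encoder."""

    def __init__(self, in_features=8, out_features=4):
        super().__init__()
        self.fc = nn.Linear(in_features, out_features)

    def forward(self, x):
        return self.fc(x)


# Training script: save only the parameters (the state dict).
trained = TinyEncoder()
path = os.path.join(tempfile.mkdtemp(), "tiny_encoder.pt")
torch.save(trained.state_dict(), path)

# Second script: create an instance of the same class first,
# then load the saved parameters into that instance.
restored = TinyEncoder()
restored.load_state_dict(torch.load(path))

# Calling the method on the class (no instance) fails: the loaded dict
# is bound to `self`, so `state_dict` is reported as missing.
try:
    nn.Module.load_state_dict(torch.load(path))
    raised = False
except TypeError:
    raised = True
```

The class definition has to be available in the script that loads the file, since a state dict stores only tensors and their names, not the architecture.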

Thanks for the reply!! But it didn't solve my problem :pensive: As far as I know, '.pt' is the recommended extension for saving a model in PyTorch, which is why I use it instead of another one.

Thanks for answering! I've done what you told me, and I thought of combining my two models as follows:

class WifiDNN(nn.Module):
    input_size = input_size
    output_size = num_positions
    input_channels = 1
    fcl1_size = 200
    fcl2_size = 200
    fcl3_size = 200

    def __init__(self):
        super(WifiDNN, self).__init__()
        # Define the fully connected layers
        self.fcl1 = nn.Linear(self.input_size, self.fcl1_size)
        self.fcl2 = nn.Linear(self.fcl1_size, self.fcl2_size)
        self.fcl3 = nn.Linear(self.fcl2_size, self.fcl3_size)
        self.output = nn.Linear(self.fcl3_size, self.output_size)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x):
        # Fully connected layers
        x = self.fcl1(x)
        x = self.fcl2(x)
        x = self.dropout(x)
        x = self.fcl3(x)
        x = self.output(x)

        return x

encoder_net = WifiDNN()
encoder_net.load_state_dict(torch.load('guada_withvalid_DNN'))

class WifiRNN(nn.Module):
    def __init__(self, i_size, h_size, n_layers, num_classes):
        super(WifiRNN, self).__init__()
        self.input_size = i_size
        self.hidden_size = h_size
        self.num_layers = n_layers
        self.num_classes = num_classes

        self.wifi_rnn = nn.RNN(input_size=self.num_classes, hidden_size=self.hidden_size, num_layers=self.num_layers,
                               batch_first=True)
        self.out = nn.Linear(in_features=self.hidden_size * sequence_length, out_features=self.num_classes)

    def forward(self, x_in, h_state):
        r_out, h_state = self.wifi_rnn(x_in, h_state)
        r_out = r_out.reshape(r_out.shape[0], -1)
        out = self.out(r_out)
        return out, h_state

    def init_hidden_state(self, b_size):
        h0 = torch.zeros(self.num_layers, b_size, self.hidden_size).to(device)
        return h0


# neural network architecture definition
class WifiNNMix(nn.Module):
    def __init__(self, model_trained, model_rnn, seq_len, cols):
        super(WifiNNMix, self).__init__()
        self.encoder = model_trained
        self.rnnNet = model_rnn
        self.seq_len = seq_len
        self.col = cols

    def forward(self, x_in, state):
        out_1 = self.encoder(x_in)
        out_1 = out_1.reshape(-1, self.seq_len, self.col)
        out_2, state = self.rnnNet(out_1, state)
        return out_2, state

    def init_hidden_state(self, b_size):
        # WifiNNMix has no num_layers/hidden_size attributes of its own,
        # so delegate to the wrapped RNN, which defines them.
        return self.rnnNet.init_hidden_state(b_size)

rnn = WifiRNN(input_size, hidden_size, num_layers, num_positions).to(device)
model = WifiNNMix(encoder_net, rnn, sequence_length,num_positions).to(device)
print(model)

Here the output size of the encoder is the input size of the RNN. Is this idea correct? After that, I froze the encoder parameters so that backprop does not update them. Is that correct too?

for param in model.encoder.parameters():
    param.requires_grad = False

Thanks a lot!!

Yes, your code and idea look correct.

Yes, this also looks correct. You could verify it quickly by running a single update (e.g. using random inputs and targets) and checking the gradients of all parameters.
The .grad attributes of all parameters of model.encoder should be None:

for param in model.encoder.parameters():
    print(param.grad)
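
The check above could be sketched end to end roughly like this. The small encoder and head modules, the SGD optimizer, and the MSE loss are hypothetical stand-ins chosen for brevity, not the models from the question. The encoder.eval() call is an extra assumption on my side: it switches off dropout in the frozen part, which requires_grad = False does not do.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the pretrained encoder and the trainable part.
encoder = nn.Sequential(nn.Linear(6, 3), nn.Dropout(0.5))
head = nn.Linear(3, 2)

# Freeze the encoder and disable its dropout.
for param in encoder.parameters():
    param.requires_grad = False
encoder.eval()

model = nn.Sequential(encoder, head)

# Hand only the trainable parameters to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.1)

# Snapshot the parameters so they can be compared after the update.
encoder_before = [p.detach().clone() for p in encoder.parameters()]
head_before = [p.detach().clone() for p in head.parameters()]

# Single update with random inputs and targets.
inputs = torch.randn(5, 6)
targets = torch.randn(5, 2)
loss = nn.functional.mse_loss(model(inputs), targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()

encoder_grads_are_none = all(p.grad is None for p in encoder.parameters())
encoder_unchanged = all(
    torch.equal(a, b) for a, b in zip(encoder_before, encoder.parameters())
)
head_changed = any(
    not torch.equal(a, b) for a, b in zip(head_before, head.parameters())
)
print(encoder_grads_are_none, encoder_unchanged, head_changed)
```

Note that calling model.train() on the combined model puts the encoder back into training mode, so encoder.eval() would need to be called again afterwards if dropout should stay disabled.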

Thanks a lot!!! That solved my problem
