Here’s the full model:
import torch
import torch.nn as nn


class AudioLSTM(nn.Module):
    """
    LSTM/GRU network for audio classification.
    """
    def __init__(
        self,
        input_size: int,
        hidden_size: int,
        dropout: float,
        num_layer: int,
        output_size: int,
        batch: bool = True,
        bidirectional: bool = True,
        RNN_TYPE: str = "LSTM",
    ) -> None:
"""
:param input_size: The number of expected features in the input x
:param hidden_size:The number of features in the hidden state h
:param num_layer: Number of recurrent layers. E.g., setting num_layers=2 would mean stacking two LSTMs together
to form a stacked LSTM, with the second LSTM taking in outputs of the first LSTM and computing the final results.
:param dropout:f non-zero, introduces a Dropout layer on the outputs of each LSTM layer except the last layer,
with dropout probability equal to dropout
:param output_size: Number of label prediction
:param batch If True, then the input and output tensors are provided as (batch, seq, feature)
:param bidirectional: If True, becomes a bidirectional LSTM
:param RNN_TYPE: Specify the type of RNN. Input takes two options LSTM and GRU
"""
        super(AudioLSTM, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.num_layer = num_layer
        self.bidirectional = bidirectional
        if RNN_TYPE == "LSTM":
            self.RNN_TYPE = nn.LSTM(
                input_size=input_size,
                hidden_size=hidden_size,
                num_layers=num_layer,
                dropout=dropout,
                batch_first=batch,
                bidirectional=bidirectional,
            )
        elif RNN_TYPE == "GRU":
            self.RNN_TYPE = nn.GRU(
                input_size=input_size,
                hidden_size=hidden_size,
                num_layers=num_layer,
                dropout=dropout,
                batch_first=batch,
                bidirectional=bidirectional,
            )
        else:
            raise ValueError(f"RNN_TYPE must be 'LSTM' or 'GRU', got {RNN_TYPE!r}")
        num_directions = 2 if bidirectional else 1
        # dropout layer
        self.dropout = nn.Dropout(dropout)
        # linear and sigmoid layers; a bidirectional RNN doubles the feature dimension
        self.fc = nn.Linear(hidden_size * num_directions, output_size)
        self.out = nn.Sigmoid()
    def forward(self, x, hidden):
        """
        :param x: input batch of shape (batch, input_size)
        :param hidden: initial hidden state, e.g. from init_hidden; pass None to start from zeros
        :return: sigmoid output for each sample and the final hidden state
        """
        batch_size = x.size(0)
        seq_count = x.shape[1]
        # collapse the batch into a single sequence of feature vectors (batch dim becomes 1),
        # so any provided hidden state must be built for a batch size of 1
        x = x.float().view(1, -1, seq_count)
        # an LSTM expects an (h, c) tuple here; a GRU expects a single hidden tensor
        lstm_out, hidden = self.RNN_TYPE(x, hidden)
        num_directions = 2 if self.bidirectional else 1
        lstm_out = lstm_out.contiguous().view(-1, self.hidden_size * num_directions)
        # dropout and fully-connected layer
        out = self.dropout(lstm_out)
        out = self.fc(out)
        sig_out = self.out(out)
        # reshape to be batch_size first
        sig_out = sig_out.view(batch_size, -1)
        sig_out = sig_out[:, -1]  # keep the last label prediction per sample
        return sig_out, hidden
    def init_hidden(self, batch_size: int):
        """
        Initializes the hidden state. Creates two new tensors of size
        (num_layer * num_directions) x batch_size x hidden_size, initialized to zero,
        for the hidden state and cell state of the LSTM (a GRU needs only the first).
        :param batch_size: batch size the hidden state is built for
        :return: tuple of zero tensors for the hidden and cell states
        """
        weight = next(self.parameters()).data
        num_directions = 2 if self.bidirectional else 1
        shape = (self.num_layer * num_directions, batch_size, self.hidden_size)
        # new_zeros allocates on the same device and dtype as the model weights,
        # so no explicit CUDA branch is needed
        hidden = (weight.new_zeros(shape), weight.new_zeros(shape))
        return hidden
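As a quick smoke test, here is a minimal sketch of how the model can be driven; the sizes below are illustrative, not from the original post. Because forward collapses the incoming batch into a single sequence, init_hidden is called with a batch size of 1:

import torch

# hypothetical sizes: 40 features (e.g. MFCCs) per clip, binary label
model = AudioLSTM(
    input_size=40,
    hidden_size=128,
    dropout=0.3,
    num_layer=2,
    output_size=1,
)
hidden = model.init_hidden(batch_size=1)  # forward re-batches to size 1
x = torch.randn(8, 40)                    # 8 clips, 40 features each
probs, hidden = model(x, hidden)
print(probs.shape)                        # torch.Size([8])

For RNN_TYPE="GRU", pass only hidden[0], since a GRU takes a single hidden tensor rather than an (h, c) tuple.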