Missing key(s) in state_dict

I'm trying to convert a PyTorch model for deployment (iOS), so I'm focused on converting the model to either Caffe2 or ONNX (TypeError: forward() missing 2 required positional arguments: 'cap_lens' and 'hidden' - #8 by rchavezj). My initial error came from having only one layer inside an LSTM, but then I ran into another problem.

I tried using two layers (nlayers=2) by instantiating a new RNN object called text_encoder.

Now I'm given an error saying that some key(s) in the state_dict are missing. This error doesn't occur with one layer, but then I get an error during the conversion instead (TypeError: forward() missing 2 required positional arguments: 'cap_lens' and 'hidden' - #8 by rchavezj).

I'm not sure if this is happening because the model I loaded only had one layer, or because of some other problem. How can I recover the missing keys while adding a new layer, or is that impossible?

If I understand your issue correctly, you are creating a two-layer RNN while loading a single-layer state_dict?
Do you want to initialize the second layer randomly while loading the parameters for the first one?
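If so, a minimal sketch could look like this (assuming AttnGAN/code is on the Python path; the checkpoint file name here is hypothetical):

import torch
from model import RNN_ENCODER  # AttnGAN/code/model.py

# two-layer encoder; both layers start randomly initialized
text_encoder = RNN_ENCODER(27297, nlayers=2)

# hypothetical checkpoint path for the pretrained single-layer encoder
state_dict = torch.load('text_encoder.pth', map_location=lambda storage, loc: storage)

# strict=False skips the keys that only exist in the two-layer model
# (rnn.weight_ih_l1, rnn.weight_hh_l1, ...), so layer 0 is loaded from the
# checkpoint while layer 1 keeps its random initialization
text_encoder.load_state_dict(state_dict, strict=False)

# sanity check: which keys were not covered by the checkpoint?
missing = set(text_encoder.state_dict().keys()) - set(state_dict.keys())
print(missing)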

Yes, that is correct. I forgot to mention it, but my main error is the following: "TypeError: forward() missing 2 required positional arguments: 'cap_lens' and 'hidden' - #2 by Diego".

[screenshot: forward function]

There are two cases I'm looking at as the root of the problem.

(1st Case) Diego from the other post (TypeError: forward() missing 2 required positional arguments: 'cap_lens' and 'hidden' - #2 by Diego) said “I think the problem is, you are using dropout with only one layer. You need at least 2 layers to apply dropout if you are using the LSTM class (torch.nn — PyTorch 2.1 documentation).”
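For reference, with one layer this combination only produces a warning rather than an error, roughly like this:

import torch.nn as nn

# with a single layer, the dropout argument only triggers a warning along the lines of:
# "dropout option adds dropout after all but last recurrent layer, so non-zero dropout
#  expects num_layers greater than 1, but got dropout=0.5 and num_layers=1"
rnn = nn.LSTM(input_size=300, hidden_size=128, num_layers=1,
              batch_first=True, dropout=0.5)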


I wasn't sure if this was a big deal, because I was only given a warning when I created one layer, and the error doesn't mention it at all. After adding a second layer I started missing key(s) inside the state_dict. Your solution to initialize a second layer randomly while loading the parameters for the first one sounds awesome, but I want your opinion on (Case 2) and whether we really need to solve (Case 1). I'm mostly concerned with the root of the problem. Sorry for not mentioning (Case 2) until now.

(2nd Case): If I stick with one layer, I successfully load the state_dict with no missing keys; I'm only given a warning for creating a one-layer LSTM with dropout. However, I think the main root of the problem is the lack of arguments passed into text_encoder: the forward function isn't getting cap_lens and hidden. This case is a lot more extreme, since I don't know the true origin of the two variables. I'm using this git repo (GitHub - taoxugit/AttnGAN) for cap_lens and the hidden variable. They're located inside AttnGAN/code/pretrain_DAMSM.py @line_65

[screenshot: cap_lens in pretrain_DAMSM.py]

However, they are generated (via prepare_data) from the class in AttnGAN/code/datasets.py @line_28.

I tried to replicate prepare_data to create a new cap_lens, but I keep ending up with empty data.

It looks like the cap_lengths are created in the TextDataset's get_caption method.
I think it’s worth trying to fix this problem first.
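A rough sketch of what get_caption seems to do (simplified, with hypothetical values; the real code also randomly subsamples words when a caption is longer than the limit):

import numpy as np

words_num = 15                                    # cfg.TEXT.WORDS_NUM in the repo (assumed value)
sent_caption = np.asarray([12, 5, 833, 2, 67])    # hypothetical token indices for one caption

# pad the caption to a fixed number of words and keep its real length
x = np.zeros((words_num, 1), dtype='int64')
x_len = min(len(sent_caption), words_num)
x[:x_len, 0] = sent_caption[:x_len]

print(x_len)   # 5 -> returned to __getitem__ as cap_len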

Just wanted to make sure: you're saying cap_length is the same as cap_lens, correct?

Based on the code, it looks like in get_caption x_len is calculated, then returned to __getitem__ as cap_len.
prepare_data gets a new sample from TextDataset (so from its __getitem__), and returns sorted_cap_lens, which is finally renamed to cap_lens.

I see your confusion and think the naming in the repo could be a bit more consistent, but maybe there is a good reason to rename the same variables.
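To make the renaming chain a bit more concrete, here is a minimal sketch of the sorting step in prepare_data that produces cap_lens (with hypothetical lengths):

import torch

# hypothetical unsorted caption lengths for a batch of 6, as they come out of __getitem__
captions_lens = torch.LongTensor([7, 15, 9, 13, 11, 8])

# prepare_data sorts the batch by length in descending order (which is what
# pack_padded_sequence expects); the sorted lengths are what pretrain_DAMSM.py
# then calls cap_lens
sorted_cap_lens, sorted_cap_indices = torch.sort(captions_lens, 0, True)

print(sorted_cap_lens)     # tensor([15, 13, 11,  9,  8,  7])
print(sorted_cap_indices)  # tensor([1, 3, 4, 2, 5, 0])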


Link to my conversion (ConvertML_Models/convert.ipynb at master · rchavezj/ConvertML_Models · GitHub)

So I tried using the prepare_data function, and it looks like my cap_lens is getting a new matrix of data from the dataloader. I'm having trouble wrapping my head around why the hidden matrix keeps returning nothing but zeros. One of two cases comes to mind:

  1. The pre-trained model I loaded doesn't have any hidden content.
  2. The way I'm loading it makes the hidden state disappear.

At least now I'm getting an error that looks reasonable. When I try to create a fake input and feed it into text_encoder to perform the CoreML conversion, I get an error about argument 1 not having the proper data type.

The initial hidden state might be all zeros, so I don't think it's a bug.
I haven't compared your code to the other code base, but this line of code seems to confirm my assumption.

The error message states that indices should be provided as torch.long.
Could you try to cast x using x = x.long()?
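For example (a minimal sketch with a hypothetical fake input; the all-zero hidden state is expected, since init_hidden in the repo creates zero tensors):

import torch

# a fake caption input created without specifying a dtype is a FloatTensor
x = torch.Tensor(48, 15).random_(0, 27297)
print(x.dtype)   # torch.float32

# the embedding layer expects integer indices, so cast it to long
x = x.long()
print(x.dtype)   # torch.int64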


It looks like there's something wrong with the input dimensions, since I'm getting a message that I'm out of bounds, like in this forum post (Embeddings index out of range error).

I honestly thought the first layer in the picture below showed the required dimensions. Unless I need to put my random torch input into some sort of embedding format that I'm not aware of.

[screenshot: input dimensions]

The Embedding layer works a bit differently than e.g. Linear.
You specify the num_embeddings, i.e. the size of the dictionary, and the embedding_dim, i.e. the size of each embedding vector.

In your case num_embeddings=27297 means that your input tensor should store indices in the range [0, 27296].

Have a look at this small example:

import torch
import torch.nn as nn

num_embeddings = 27297  # size of the dictionary
embedding_dim = 300     # size of each embedding vector
emb = nn.Embedding(
    num_embeddings=num_embeddings,
    embedding_dim=embedding_dim
)

batch_size = 100
# indices must be torch.long and lie in [0, num_embeddings - 1]
x = torch.empty(batch_size, dtype=torch.long).random_(num_embeddings)

output = emb(x)
print(output.shape)  # torch.Size([100, 300])

I grew up learning concepts visually. Taking your advice, does that mean I'm given a one-dimensional (1x300) input that is then embedded to look like (27297 x 300), compared to the diagram I found on the internet (S x B x I)? The example I found looks three-dimensional. Does B = num_embeddings and S = embedding_dim?

nn.Embedding basically keeps your input dimensions and adds the embedding_dim to it.
So if you provide a two-dimensional input, you will get a three-dimensional output.
I’m not familiar with your example, but it looks like you transpose your input to get the dimensions [sequence, batch_size] and pass it into the embedding layer.
The I in your diagram would therefore correspond to the embedding_dim.
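A quick sketch of that shape behavior, reusing the sizes from above:

import torch
import torch.nn as nn

emb = nn.Embedding(num_embeddings=27297, embedding_dim=300)

# two-dimensional input of indices: [sequence, batch_size]
x = torch.empty(15, 48, dtype=torch.long).random_(27297)

output = emb(x)
print(x.shape)       # torch.Size([15, 48])
print(output.shape)  # torch.Size([15, 48, 300])  -> [S, B, I]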


Okay, this is my overview of everything I've learned from converting a PyTorch model to ONNX. Before I converted the PyTorch model, I wanted to make sure the dimensions for captions, cap_lens and hidden went through the forward function correctly, and no errors! :slight_smile:


However, I have a new problem: I get an error when exporting the model using the exact same inputs.

“TypeError: wrapPyFuncWithSymbolic(): incompatible function arguments. The following argument types are supported: (self: torch._C.Graph, arg0: function, arg1: List[torch::jit::Value], arg2: int, arg3: function) → iterator”.

I tried passing all 3 inputs (captions, cap_lens, hidden) as a tuple to the ONNX converter, yet I get some sort of data type error. Before showing the output terminal from the conversion, I want to show what all three inputs look like. I came to the conclusion that I need to convert all three inputs into either float or long dtype, and I don't know how to properly convert dtypes.

captions is a (48, 15) tensor with torch.LongTensor data type
[screenshot: captions]

cap_lens is a (48,) tensor with torch.LongTensor data type
[screenshot: cap_lens]

and lastly hidden is a tuple of two (2, 48, 128) tensors with torch.FloatTensor data type
[screenshot: hidden]

# Export the model
torch_out = torch.onnx._export(text_encoder,                 # model being run
                               (captions_fake_input, cap_lens, hidden), # model input (or a tuple for multiple inputs)
                               "kol.onnx",      # where to save the model (can be a file or file-like object)
                               export_params=True)           # store the trained parameter weights inside the model file

[output]

TypeError Traceback (most recent call last)
in ()
3 (captions_fake_input, cap_lens, hidden), # model input (or a tuple for multiple inputs)
4 “kol.onnx”, # where to save the model (can be a file or file-like object)
----> 5 export_params=True) # store the trained parameter weights inside the model file

~/anaconda/lib/python3.6/site-packages/torch/onnx/init.py in _export(*args, **kwargs)
18 def _export(*args, **kwargs):
19 from torch.onnx import utils
—> 20 return utils._export(*args, **kwargs)
21
22

~/anaconda/lib/python3.6/site-packages/torch/onnx/utils.py in _export(model, args, f, export_params, verbose, training, input_names, output_names, aten, export_type)
132 # training mode was.)
133 with set_training(model, training):
→ 134 trace, torch_out = torch.jit.get_trace_graph(model, args)
135
136 if orig_state_dict_keys != _unique_state_dict(model).keys():

~/anaconda/lib/python3.6/site-packages/torch/jit/init.py in get_trace_graph(f, args, kwargs, nderivs)
253 if not isinstance(args, tuple):
254 args = (args,)
→ 255 return LegacyTracedModule(f, nderivs=nderivs)(*args, **kwargs)
256
257

~/anaconda/lib/python3.6/site-packages/torch/nn/modules/module.py in call(self, *input, **kwargs)
489 result = self._slow_forward(*input, **kwargs)
490 else:
→ 491 result = self.forward(*input, **kwargs)
492 for hook in self._forward_hooks.values():
493 hook_result = hook(self, input, result)

~/anaconda/lib/python3.6/site-packages/torch/jit/init.py in forward(self, *args)
286 _tracing = True
287 trace_inputs = _unflatten(all_trace_inputs[:len(in_vars)], in_desc)
→ 288 out = self.inner(*trace_inputs)
289 out_vars, _ = _flatten(out)
290 _tracing = False

~/anaconda/lib/python3.6/site-packages/torch/nn/modules/module.py in call(self, *input, **kwargs)
487 hook(self, input)
488 if torch.jit._tracing:
→ 489 result = self._slow_forward(*input, **kwargs)
490 else:
491 result = self.forward(*input, **kwargs)

~/anaconda/lib/python3.6/site-packages/torch/nn/modules/module.py in _slow_forward(self, *input, **kwargs)
477 tracing_state._traced_module_stack.append(self)
478 try:
→ 479 result = self.forward(*input, **kwargs)
480 finally:
481 tracing_state.pop_scope()

~/Desktop/text-to-image-transcribed/code/model.py in forward(self, captions, cap_lens, hidden, mask)
153
154 #print("======= Packed emb ====== ")
→ 155 emb = pack_padded_sequence(emb, cap_lens, batch_first=True)
156 #print("emb: ", emb)
157 #print("emb shape: ", emb.shape)

~/anaconda/lib/python3.6/site-packages/torch/onnx/init.py in wrapper(*args, **kwargs)
71
72 symbolic_args = function._unflatten(arg_values, args)
—> 73 output_vals = symbolic_fn(tstate.graph(), *symbolic_args, **kwargs)
74
75 for var, val in zip(

~/anaconda/lib/python3.6/site-packages/torch/nn/utils/rnn.py in _symbolic_pack_padded_sequence(g, input, lengths, batch_first, padding_value, total_length)
144 outputs = g.wrapPyFuncWithSymbolic(
145 pack_padded_sequence_trace_wrapper, [input, lengths], 2,
→ 146 _onnx_symbolic_pack_padded_sequence)
147 return tuple(o for o in outputs)
148

TypeError: wrapPyFuncWithSymbolic(): incompatible function arguments. The following argument types are supported:
1. (self: torch._C.Graph, arg0: function, arg1: List[torch::jit::Value], arg2: int, arg3: function) → iterator

Invoked with: graph(%0 : Long(48, 15)
%1 : Long(48)
%2 : Float(2, 48, 128)
%3 : Float(2, 48, 128)
%4 : Float(27297, 300)
%5 : Float(512, 300)
%6 : Float(512, 128)
%7 : Float(512)
%8 : Float(512)
%9 : Float(512, 300)
%10 : Float(512, 128)
%11 : Float(512)
%12 : Float(512)) {
%13 : Float(48, 15, 300) = aten::embedding[padding_idx=-1, scale_grad_by_freq=0, sparse=0](%4, %0), scope: RNN_ENCODER/Embedding[encoder]
%16 : Float(48, 15, 300), %17 : Handle = ^Dropout(0.5, False, False)(%13), scope: RNN_ENCODER/Dropout[drop]
%15 : Float(48, 15, 300) = aten::slicedim=0, start=0, end=9223372036854775807, step=1, scope: RNN_ENCODER/Dropout[drop]
%14 : Float(48, 15, 300) = aten::as_stridedsize=[48, 15, 300], stride=[4500, 300, 1], storage_offset=0, scope: RNN_ENCODER/Dropout[drop]
%18 : Long(48) = prim::Constantvalue=, scope: RNN_ENCODER
%76 : Float(502, 300), %77 : Long(15), %78 : Handle = ^PackPadded(True)(%16, %18), scope: RNN_ENCODER
%19 : Float(15!, 48!, 300) = aten::transposedim0=0, dim1=1, scope: RNN_ENCODER
%21 : Long() = aten::selectdim=0, index=47, scope: RNN_ENCODER
%20 : Long() = aten::as_stridedsize=[], stride=[], storage_offset=47, scope: RNN_ENCODER
%22 : Byte() = aten::leother={0}, scope: RNN_ENCODER
%24 : Float(7!, 48!, 300) = aten::slicedim=0, start=0, end=7, step=1, scope: RNN_ENCODER
%23 : Float(7!, 48!, 300) = aten::as_stridedsize=[7, 48, 300], stride=[300, 4500, 1], storage_offset=0, scope: RNN_ENCODER
%25 : Float(7, 48, 300) = aten::clone(%24), scope: RNN_ENCODER
%26 : Float(336, 300) = aten::viewsize=[-1, 300], scope: RNN_ENCODER
%28 : Float(1!, 48!, 300) = aten::slicedim=0, start=7, end=8, step=1, scope: RNN_ENCODER
%27 : Float(1!, 48!, 300) = aten::as_stridedsize=[1, 48, 300], stride=[300, 4500, 1], storage_offset=2100, scope: RNN_ENCODER
%30 : Float(1!, 46!, 300) = aten::slicedim=1, start=0, end=46, step=1, scope: RNN_ENCODER
%29 : Float(1!, 46!, 300) = aten::as_stridedsize=[1, 46, 300], stride=[300, 4500, 1], storage_offset=2100, scope: RNN_ENCODER
%31 : Float(1, 46, 300) = aten::clone(%30), scope: RNN_ENCODER
%32 : Float(46, 300) = aten::viewsize=[-1, 300], scope: RNN_ENCODER
%34 : Float(1!, 48!, 300) = aten::slicedim=0, start=8, end=9, step=1, scope: RNN_ENCODER
%33 : Float(1!, 48!, 300) = aten::as_stridedsize=[1, 48, 300], stride=[300, 4500, 1], storage_offset=2400, scope: RNN_ENCODER
%36 : Float(1!, 43!, 300) = aten::slicedim=1, start=0, end=43, step=1, scope: RNN_ENCODER
%35 : Float(1!, 43!, 300) = aten::as_stridedsize=[1, 43, 300], stride=[300, 4500, 1], storage_offset=2400, scope: RNN_ENCODER
%37 : Float(1, 43, 300) = aten::clone(%36), scope: RNN_ENCODER
%38 : Float(43, 300) = aten::viewsize=[-1, 300], scope: RNN_ENCODER
%40 : Float(1!, 48!, 300) = aten::slicedim=0, start=9, end=10, step=1, scope: RNN_ENCODER
%39 : Float(1!, 48!, 300) = aten::as_stridedsize=[1, 48, 300], stride=[300, 4500, 1], storage_offset=2700, scope: RNN_ENCODER
%42 : Float(1!, 29!, 300) = aten::slicedim=1, start=0, end=29, step=1, scope: RNN_ENCODER
%41 : Float(1!, 29!, 300) = aten::as_stridedsize=[1, 29, 300], stride=[300, 4500, 1], storage_offset=2700, scope: RNN_ENCODER
%43 : Float(1, 29, 300) = aten::clone(%42), scope: RNN_ENCODER
%44 : Float(29, 300) = aten::viewsize=[-1, 300], scope: RNN_ENCODER
%46 : Float(1!, 48!, 300) = aten::slicedim=0, start=10, end=11, step=1, scope: RNN_ENCODER
%45 : Float(1!, 48!, 300) = aten::as_stridedsize=[1, 48, 300], stride=[300, 4500, 1], storage_offset=3000, scope: RNN_ENCODER
%48 : Float(1!, 20!, 300) = aten::slicedim=1, start=0, end=20, step=1, scope: RNN_ENCODER
%47 : Float(1!, 20!, 300) = aten::as_stridedsize=[1, 20, 300], stride=[300, 4500, 1], storage_offset=3000, scope: RNN_ENCODER
%49 : Float(1, 20, 300) = aten::clone(%48), scope: RNN_ENCODER
%50 : Float(20, 300) = aten::viewsize=[-1, 300], scope: RNN_ENCODER
%52 : Float(1!, 48!, 300) = aten::slicedim=0, start=11, end=12, step=1, scope: RNN_ENCODER
%51 : Float(1!, 48!, 300) = aten::as_stridedsize=[1, 48, 300], stride=[300, 4500, 1], storage_offset=3300, scope: RNN_ENCODER
%54 : Float(1!, 12!, 300) = aten::slicedim=1, start=0, end=12, step=1, scope: RNN_ENCODER
%53 : Float(1!, 12!, 300) = aten::as_stridedsize=[1, 12, 300], stride=[300, 4500, 1], storage_offset=3300, scope: RNN_ENCODER
%55 : Float(1, 12, 300) = aten::clone(%54), scope: RNN_ENCODER
%56 : Float(12, 300) = aten::viewsize=[-1, 300], scope: RNN_ENCODER
%58 : Float(1!, 48!, 300) = aten::slicedim=0, start=12, end=13, step=1, scope: RNN_ENCODER
%57 : Float(1!, 48!, 300) = aten::as_stridedsize=[1, 48, 300], stride=[300, 4500, 1], storage_offset=3600, scope: RNN_ENCODER
%60 : Float(1!, 10!, 300) = aten::slicedim=1, start=0, end=10, step=1, scope: RNN_ENCODER
%59 : Float(1!, 10!, 300) = aten::as_stridedsize=[1, 10, 300], stride=[300, 4500, 1], storage_offset=3600, scope: RNN_ENCODER
%61 : Float(1, 10, 300) = aten::clone(%60), scope: RNN_ENCODER
%62 : Float(10, 300) = aten::viewsize=[-1, 300], scope: RNN_ENCODER
%64 : Float(1!, 48!, 300) = aten::slicedim=0, start=13, end=14, step=1, scope: RNN_ENCODER
%63 : Float(1!, 48!, 300) = aten::as_stridedsize=[1, 48, 300], stride=[300, 4500, 1], storage_offset=3900, scope: RNN_ENCODER
%66 : Float(1!, 4!, 300) = aten::slicedim=1, start=0, end=4, step=1, scope: RNN_ENCODER
%65 : Float(1!, 4!, 300) = aten::as_stridedsize=[1, 4, 300], stride=[300, 4500, 1], storage_offset=3900, scope: RNN_ENCODER
%67 : Float(1, 4, 300) = aten::clone(%66), scope: RNN_ENCODER
%68 : Float(4, 300) = aten::viewsize=[-1, 300], scope: RNN_ENCODER
%70 : Float(1!, 48!, 300) = aten::slicedim=0, start=14, end=15, step=1, scope: RNN_ENCODER
%69 : Float(1!, 48!, 300) = aten::as_stridedsize=[1, 48, 300], stride=[300, 4500, 1], storage_offset=4200, scope: RNN_ENCODER
%72 : Float(1!, 2!, 300) = aten::slicedim=1, start=0, end=2, step=1, scope: RNN_ENCODER
%71 : Float(1!, 2!, 300) = aten::as_stridedsize=[1, 2, 300], stride=[300, 4500, 1], storage_offset=4200, scope: RNN_ENCODER
%73 : Float(1, 2, 300) = aten::clone(%72), scope: RNN_ENCODER
%74 : Float(2, 300) = aten::viewsize=[-1, 300], scope: RNN_ENCODER
%75 : Float(502, 300) = aten::cat[dim=0](%26, %32, %38, %44, %50, %56, %62, %68, %74), scope: RNN_ENCODER
return ();
}
, <function _symbolic_pack_padded_sequence..pack_padded_sequence_trace_wrapper at 0x1c24e95950>, [16 defined in (%16 : Float(48, 15, 300), %17 : Handle = ^Dropout(0.5, False, False)(%13), scope: RNN_ENCODER/Dropout[drop]
), [15, 15, 14, 14, 13, 13, 13, 13, 13, 13, 12, 12, 11, 11, 11, 11, 11, 11, 11, 11, 10, 10, 10, 10, 10, 10, 10, 10, 10, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 8, 8, 8, 7, 7]], 2, <function _symbolic_pack_padded_sequence.._onnx_symbolic_pack_padded_sequence at 0x1c1be48378>

Could you post the code to text_encoder?
In your notebook it’s loaded from models.py, which seems to be missing.

I'm not that familiar with ONNX, but is there a reason you are using _export instead of .export?
Exporting a model with an Embedding layer and multiple outputs works, so I would have to see your whole model to find out why it's failing.
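For example, something along these lines (TinyModel is just a hypothetical stand-in, not your model):

import torch
import torch.nn as nn
import torch.onnx

# a tiny model with an Embedding layer and two outputs,
# just to illustrate that this pattern exports on its own
class TinyModel(nn.Module):
    def __init__(self):
        super(TinyModel, self).__init__()
        self.emb = nn.Embedding(100, 8)
        self.fc = nn.Linear(8, 4)

    def forward(self, x):
        e = self.emb(x)
        return e, self.fc(e)

model = TinyModel()
x = torch.empty(5, dtype=torch.long).random_(100)
torch.onnx.export(model, x, 'tiny.onnx', export_params=True)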

I created a separate notebook in the same repo with models.py (imported) and the rest of the AttnGAN project. My notebook is similar to pretrain_DAMSM.py, but modified so I can export the model for production. I also switched to .export and got the same results. I'll post the code in my git repo, but you'll need to download the COCO data, just FYI. I didn't commit the AttnGAN project, by the way.

^ My project is in the comment above. Below is another developer contributing to AttnGAN.

Sorry, maybe I’m blind, but I still couldn’t find your model definition.
I just took the one defined in the other repo.
After some minor modifications, this code works for me:
EDIT: Sorry, my mistake. The code throws the same error and does not work!

import torch
import torch.nn as nn
import torch.onnx
from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence


# ############## Text2Image Encoder-Decoder #######
class RNN_ENCODER(nn.Module):
    def __init__(self, ntoken, ninput=300, drop_prob=0.5,
                 nhidden=128, nlayers=1, bidirectional=False):
        super(RNN_ENCODER, self).__init__()
        self.n_steps = 10
        self.ntoken = ntoken  # size of the dictionary
        self.ninput = ninput  # size of each embedding vector
        self.drop_prob = drop_prob  # probability of an element to be zeroed
        self.nlayers = nlayers  # Number of recurrent layers
        self.bidirectional = bidirectional
        self.rnn_type = 'LSTM'
        if bidirectional:
            self.num_directions = 2
        else:
            self.num_directions = 1
        # number of features in the hidden state
        self.nhidden = nhidden // self.num_directions

        self.define_module()
        self.init_weights()

    def define_module(self):
        self.encoder = nn.Embedding(self.ntoken, self.ninput)
        self.drop = nn.Dropout(self.drop_prob)
        if self.rnn_type == 'LSTM':
            # dropout: If non-zero, introduces a dropout layer on
            # the outputs of each RNN layer except the last layer
            self.rnn = nn.LSTM(self.ninput, self.nhidden,
                               self.nlayers, batch_first=True,
                               dropout=self.drop_prob,
                               bidirectional=self.bidirectional)
        elif self.rnn_type == 'GRU':
            self.rnn = nn.GRU(self.ninput, self.nhidden,
                              self.nlayers, batch_first=True,
                              dropout=self.drop_prob,
                              bidirectional=self.bidirectional)
        else:
            raise NotImplementedError

    def init_weights(self):
        initrange = 0.1
        self.encoder.weight.data.uniform_(-initrange, initrange)
        # Do not need to initialize RNN parameters, which have been initialized
        # http://pytorch.org/docs/master/_modules/torch/nn/modules/rnn.html#LSTM
        # self.decoder.weight.data.uniform_(-initrange, initrange)
        # self.decoder.bias.data.fill_(0)

    def init_hidden(self, bsz):
        weight = next(self.parameters()).data
        if self.rnn_type == 'LSTM':
            return (weight.new(self.nlayers * self.num_directions,
                                        bsz, self.nhidden).zero_(),
                    weight.new(self.nlayers * self.num_directions,
                                        bsz, self.nhidden).zero_())
        else:
            return weight.new(self.nlayers * self.num_directions,
                                       bsz, self.nhidden).zero_()

    def forward(self, captions, cap_lens, hidden, mask=None):
        # input: torch.LongTensor of size batch x n_steps
        # --> emb: batch x n_steps x ninput
        emb = self.drop(self.encoder(captions))
        #
        # Returns: a PackedSequence object
        cap_lens = cap_lens.data.tolist()
        emb = pack_padded_sequence(emb, cap_lens, batch_first=True)
        # #hidden and memory (num_layers * num_directions, batch, hidden_size):
        # tensor containing the initial hidden state for each element in batch.
        # #output (batch, seq_len, hidden_size * num_directions)
        # #or a PackedSequence object:
        # tensor containing output features (h_t) from the last layer of RNN
        output, hidden = self.rnn(emb, hidden)
        # PackedSequence object
        # --> (batch, seq_len, hidden_size * num_directions)
        output = pad_packed_sequence(output, batch_first=True)[0]
        # output = self.drop(output)
        # --> batch x hidden_size*num_directions x seq_len
        words_emb = output.transpose(1, 2)
        # --> batch x num_directions*hidden_size
        if self.rnn_type == 'LSTM':
            sent_emb = hidden[0].transpose(0, 1).contiguous()
        else:
            sent_emb = hidden.transpose(0, 1).contiguous()
        sent_emb = sent_emb.view(-1, self.nhidden * self.num_directions)
        return words_emb, sent_emb


model = RNN_ENCODER(27297)
captions = torch.empty(48, 15, dtype=torch.long).random_(27297)
cap_lens = torch.sort(torch.empty(48, dtype=torch.long).random_(1, 15), descending=True)[0]
hidden = (torch.randn(1, 48, 128), torch.randn(1, 48, 128))

output = model(captions, cap_lens, hidden)

torch.onnx.export(model, (captions, cap_lens, hidden), 'test.proto', verbose=True, export_params=True)

Could you compare your code with this one?

I tested out your code and I'm still getting the same error. Am I dealing with a software package that I'm missing?

I’m currently using a PyTorch version compiled from master.
Let me check the code with 0.4.0.

EDIT: It’s also working on 0.4.0. Which PyTorch version do you have?
Could you update to the current stable release? You will find the install instructions on the website.

I’m currently on 0.4.0 as well. How do I check I’m on master?
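This is what I see when I print the version:

import torch
print(torch.__version__)  # e.g. '0.4.0' for the stable release; builds from master typically show a longer dev version string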

[screenshot: torch.__version__ output]