Hi
I am getting an error while implementing recurrence over a layer (a highway network layer in this case). The hidden state is initialized as:

```python
weight = next(model.parameters()).data
hidden = Variable(weight.new(args.bptt, args.batch_size, args.emsize).zero_())
```
The forward pass:

```python
def forward(self, sequence, hidden):
    transition = torch.FloatTensor(sequence[0].size())
    i = 0
    for x in sequence:
        if i == 0:
            hidden[i] = x  # initially it is the input
        else:
            hidden[i] = hidden[i - 1]  # otherwise the previous time step's output
        for layer in self.highway_layers:  # this is the recurrence over the layers
            transition = layer(hidden[i])  # doing hidden[i] = layer(hidden[i]) gives an error
            hidden[i] = transition
        i = i + 1
    return hidden, hidden  # should this be the output, as in an RNN?
```
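For context, one way to write the same loop without in-place writes is to collect each step's output in a Python list and `torch.stack` it at the end, instead of assigning into a preallocated `hidden` tensor. This is only a sketch: the sizes are arbitrary and plain `nn.Linear` layers stand in for the actual highway layers.

```python
import torch
import torch.nn as nn

class RecurrentHighway(nn.Module):
    """Sketch: recurrence over layers without in-place writes.

    The Linear layers below are placeholders for the real highway layers.
    """

    def __init__(self, size, num_layers=2):
        super().__init__()
        self.highway_layers = nn.ModuleList(
            nn.Linear(size, size) for _ in range(num_layers)
        )

    def forward(self, sequence):
        outputs = []  # collect per-step results in a list instead of
        state = None  # writing into slices of a preallocated tensor
        for x in sequence:  # sequence: (bptt, batch, emsize)
            h = x if state is None else state  # step 0: the input; later: previous output
            for layer in self.highway_layers:  # recurrence over the layers
                h = torch.relu(layer(h))  # out-of-place: each call returns a fresh tensor
            state = h
            outputs.append(h)
        return torch.stack(outputs), state  # stack once at the end
```

Because no tensor that autograd saved is ever mutated, `backward()` runs without the in-place errors below.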
I have two questions:

- What should be the output here? For a language model the output is usually a softmax over the vocabulary (I am extending the word-level language model from pytorch/examples).
- I get the error on the line `layer(hidden[i])`:

```
RuntimeError: in-place operations can be only used on variables that don't share storage with any other variables, but detected that there are 2 objects sharing it
```
Or sometimes:

```
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
```
I am stuck at this error while implementing the recurrence.
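For what it's worth, the second error can be reproduced in isolation: autograd saves some tensors for the backward pass, and mutating one of them in place invalidates the graph. Writing into `hidden[i]` slices of a tracked tensor trips the same check. A standalone sketch (unrelated to the model above):

```python
import torch

a = torch.randn(3, requires_grad=True)
b = a.exp()  # autograd saves b, since d(exp(a))/da = b
b.add_(1)    # in-place modification of the saved tensor
try:
    b.sum().backward()
    failed = False
except RuntimeError as e:
    failed = True  # message mentions "modified by an inplace operation"
```

Note that the in-place `add_` itself succeeds; the error only surfaces later, when `backward()` checks the tensor's version counter.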
Thanks for your help.