Ignite doesn't repackage LSTM hidden state properly? And slower to converge than a hand-written training loop?

I'm trying to bring Ignite into my training pipeline, but I've come across some issues with it; any help would be appreciated.

Not sure if I’m doing something wrong, but I tried to use Ignite with a basic LSTM. It kept throwing “RuntimeError: Trying to backward through the graph a second time, but the buffers have already been freed”, caused by:

def forward(self, sentence):
    num_timesteps = sentence.size()[1]
    lstm_out, self.hidden = self.lstm(
        sentence.view(num_timesteps, 1, -1), self.hidden)
    score_space = self.hidden2output(lstm_out.view(num_timesteps, -1))
    scores = torch.sigmoid(score_space).view(score_space.size()[1], score_space.size()[0])

    return scores

This was based on the PyTorch tutorial code, and it works fine with a training loop written in vanilla PyTorch.
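As far as I can tell, the error comes from carrying self.hidden over between iterations while it still references the graph built by the previous forward pass. A minimal sketch of that pattern (not my actual model, just an illustration) reproduces the same RuntimeError:

import torch
import torch.nn as nn

# Toy example: reusing the hidden state across iterations without
# detaching it. The second backward() tries to backprop through the
# graph built in the first iteration, whose buffers are already freed.
lstm = nn.LSTM(input_size=4, hidden_size=8)
hidden = (torch.zeros(1, 1, 8), torch.zeros(1, 1, 8))

for step in range(2):
    x = torch.randn(5, 1, 4)       # (seq_len, batch, input_size)
    out, hidden = lstm(x, hidden)  # hidden keeps a reference to the previous graph
    out.sum().backward()           # RuntimeError on the second iteration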

I fixed the error by changing it to:

def forward(self, sentence):
    num_timesteps = sentence.size()[1]
    lstm_out, hidden_temp = self.lstm(
        sentence.view(num_timesteps, 1, -1), (self.hidden, self.cell))
    self.hidden = hidden_temp[0]
    self.cell = hidden_temp[1]
    self.hidden = self.hidden.detach()
    self.cell = self.cell.detach()
    score_space = self.hidden2output(lstm_out.view(num_timesteps, -1))
    scores = torch.sigmoid(score_space).view(score_space.size()[1], score_space.size()[0])

    return scores
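For reference, my detach fix seems to do the same thing as the repackage_hidden idiom from the PyTorch word_language_model example; here is a sketch of that helper (the name and shape come from that example, not from my code):

import torch

def repackage_hidden(h):
    # Detach hidden state(s) from their history so backprop stops here.
    # Handles a plain tensor or an (h, c) tuple.
    if isinstance(h, torch.Tensor):
        return h.detach()
    return tuple(repackage_hidden(v) for v in h)

# e.g. at the start of each batch:
# self.hidden, self.cell = repackage_hidden((self.hidden, self.cell))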

Is this intended behaviour, and does my fix negatively impact something I haven’t taken into account?

Also, I find that it takes over 3x as many epochs to converge on a single sample with Ignite vs. my own training loop. Could someone shed light on that behaviour?

Thanks!

Thanks for the feedback, we’ll take a look: https://github.com/pytorch/ignite/issues/475

Could you please provide executable code to reproduce the issue?

The LSTM was just the PyTorch Sequence Tagging tutorial model, modified to be a bi-LSTM and to use sigmoid instead of softmax, and the Ignite loop was just the Quickstart code. The only difference was that I used an Adam optimizer instead of SGD. Everything worked again once I implemented the fix from my original post.
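From memory, the Ignite side looked roughly like the Quickstart with Adam swapped in; here is a sketch (the model and data below are stand-ins to show the shape of the loop, not my bi-LSTM):

import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset
from ignite.engine import Events, create_supervised_trainer

# Stand-in model and data, just to illustrate the setup
model = nn.Sequential(nn.Linear(10, 1), nn.Sigmoid())
optimizer = optim.Adam(model.parameters(), lr=1e-3)  # Adam instead of the Quickstart's SGD
criterion = nn.BCELoss()

trainer = create_supervised_trainer(model, optimizer, criterion)

@trainer.on(Events.ITERATION_COMPLETED)
def log_training_loss(engine):
    print(f"Epoch {engine.state.epoch} - loss: {engine.state.output:.4f}")

train_loader = DataLoader(TensorDataset(torch.randn(8, 10), torch.rand(8, 1)), batch_size=4)
trainer.run(train_loader, max_epochs=5)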

I ended up scrapping LSTMs for my project in favour of 1D CNNs, which have been working great with Ignite, but I can dig through my commit history for the original code and try it again if need be!

@LLYX thanks for the reply and glad that Ignite works for you!

We’ll try to port some NLP examples soon, check the issue you encountered, and update this post.