This discussion is a follow-up to a previous one, whose subject changed after my intervention.
I am continuing here so as not to bother all the participants of the previous discussion.
To answer Simon W.'s questions (at the bottom of the previous discussion):
I see. However, I quickly changed this, and the results on the training data improved: accuracy goes above 99% as early as the second iteration (taking padding into account).
char_hidden is initialised like this:
```python
self.char_hidden = (
    autograd.Variable(
        torch.zeros(self.num_directions, self.batch_size, self.char_hidden_dim).type(dtype),
        requires_grad=False, volatile=vflag),
    autograd.Variable(
        torch.zeros(self.num_directions, self.batch_size, self.char_hidden_dim).type(dtype),
        requires_grad=False, volatile=vflag),
)
```
dtype is torch.cuda.FloatTensor when CUDA is enabled (which is the case here); otherwise it is torch.FloatTensor.
vflag is a flag that is True when the model is run in testing mode.
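For reference, autograd.Variable and the volatile flag were removed in PyTorch 0.4+; a minimal modern equivalent of this initialisation, with made-up dimensions standing in for the model's attributes, would be:

```python
import torch

# Hypothetical dimensions, standing in for self.num_directions,
# self.batch_size, and self.char_hidden_dim in the original model.
num_directions, batch_size, char_hidden_dim = 2, 10, 50
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

def init_char_hidden():
    """Fresh (h_0, c_0) LSTM state: zero tensors with no gradient tracking."""
    h0 = torch.zeros(num_directions, batch_size, char_hidden_dim, device=device)
    c0 = torch.zeros(num_directions, batch_size, char_hidden_dim, device=device)
    return h0, c0

char_hidden = init_char_hidden()

# The old volatile=True flag is replaced by wrapping evaluation in no_grad:
with torch.no_grad():
    pass  # run the forward pass here in testing mode
```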
It’s actually both in different senses:
Memory usage in RAM is increasing: slightly, but steadily.
Memory usage on the GPU is large but constant: it takes 5.7 GB for the model plus a single batch of data (10 sentences).
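Purely as a hypothesis: one classic cause of RAM creeping up across batches in recurrent models is carrying a hidden state from one batch to the next without detaching it, so every batch's autograd graph stays reachable. A minimal sketch of the pattern and the fix, with made-up dimensions rather than the actual model:

```python
import torch
import torch.nn as nn

# Toy LSTM: input size 8, hidden size 16, batch of 4, sequences of length 5.
lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
h = (torch.zeros(1, 4, 16), torch.zeros(1, 4, 16))

for step in range(3):
    x = torch.randn(4, 5, 8)
    out, h = lstm(x, h)
    loss = out.sum()
    loss.backward()
    # Detach the carried-over state so the next backward() does not try to
    # traverse the previous batch's graph, and that graph can be freed.
    h = (h[0].detach(), h[1].detach())
```

Without the final detach, the second `backward()` would attempt to walk back through the first batch's (already freed) graph, and the retained graphs would show up as steadily growing memory.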
A colleague of mine loads his whole dataset onto the GPU (a larger dataset than mine), and his script takes only 2.5 GB.
And when I run without CUDA, my script takes 1 GB of RAM in total (so for the whole dataset, not just the current batch, plus the model and a few other things).
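To pin down where the 5.7 GB actually goes, it can help to log PyTorch's allocator counters between steps. A small helper, assuming a reasonably recent PyTorch (where `torch.cuda.memory_reserved` exists; on older versions it was called `memory_cached`):

```python
import torch

def cuda_memory_mb():
    """Return allocated and reserved CUDA memory in MB (zeros without CUDA)."""
    if not torch.cuda.is_available():
        return {"allocated": 0.0, "reserved": 0.0}
    return {
        "allocated": torch.cuda.memory_allocated() / 1024**2,
        "reserved": torch.cuda.memory_reserved() / 1024**2,
    }

# Call this before/after building the model and after the first
# forward/backward pass to see which step accounts for the bulk of the usage.
print(cuda_memory_mb())
```

Comparing "allocated" (tensors actually in use) against "reserved" (memory held by PyTorch's caching allocator) also shows whether the 5.7 GB is live data or just cached blocks that the allocator has not returned to the driver.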
Thank you once again for your answers.