I’m getting very strange behavior during evaluation when I call
forward in a for-loop. I either error with CUDA out-of-memory, or the evaluation function returns results that I know are wrong (it seems like the
`correct` variable gets reset or something). I think it’s related to this post,
but I couldn’t tell from the comments how to free up the graph. Specifically, someone says:
> This is because PyTorch will build the graph again and again, and all the intermediate states will be stored.
> In training, the states are cleared when you call backward.
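If I understand that correctly, the memory growth comes from each forward call retaining its autograd graph until something releases it. Here is a minimal standalone sketch (my own toy example with a small linear model, not code from the quoted post) of what I think is happening:

```python
import torch

model = torch.nn.Linear(10, 1)
x = torch.randn(4, 10)

# Keeping raw outputs keeps each forward pass's graph alive:
kept = [model(x).sum() for _ in range(3)]
print(all(t.requires_grad for t in kept))   # True: each result still carries a graph

# Under no_grad(), autograd records nothing, so nothing accumulates:
with torch.no_grad():
    freed = [model(x).sum() for _ in range(3)]
print(any(t.requires_grad for t in freed))  # False: no graph is retained
```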
So how do you clear the states during evaluation? My evaluation function is below:
```python
def evaluate(self, data):
    correct = 0
    total = 0
    loader = self.train_loader if data == "train" else self.test_loader
    for step, (story, question, answer) in enumerate(loader):
        story = Variable(story)
        question = Variable(question)
        answer = Variable(answer)
        _, answer = torch.max(answer, 1)
        if self.config.cuda:
            story = story.cuda()
            question = question.cuda()
            answer = answer.cuda()
        pred_prob = self.mem_n2n(story, question)
        _, output_max_index = torch.max(pred_prob, 1)
        toadd = (answer == output_max_index).float().sum().data
        correct = correct + toadd
        total = total + story.size(0)  # was captions.size(0), a typo
    acc = correct / total
    return acc
```
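From reading around, I suspect the answer involves wrapping the loop in `torch.no_grad()` (on newer PyTorch; I believe `volatile=True` on the Variables was the older equivalent), but I'm not certain that's right. Here's a stripped-down, runnable sketch of what I mean — the `model` and `loader` below are toy stand-ins for my actual network and data loader:

```python
import torch

# Hypothetical stand-ins, just to make this runnable:
model = torch.nn.Linear(8, 3)
loader = [(torch.randn(4, 8), torch.randint(0, 3, (4,))) for _ in range(5)]

def evaluate(model, loader):
    correct, total = 0, 0
    # no_grad() stops autograd from recording the graph, so the
    # memory from each forward pass is freed immediately.
    with torch.no_grad():
        for inputs, targets in loader:
            pred = model(inputs)
            correct += (pred.argmax(dim=1) == targets).sum().item()
            total += inputs.size(0)
    return correct / total

acc = evaluate(model, loader)
print(0.0 <= acc <= 1.0)  # True: accuracy is a plain float fraction
```

Is this the right way to clear the states, or is something else needed?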