Non scalar backward and self mini batch implementation

InnovArul (Arul) December 9, 2018, 6:35am 5

Yes. The gradients are accumulated across successive backward() calls until they are zeroed, as mentioned in the post.
The code in this linked topic looks fine:

How to implement accumulated gradient in pytorch (i.e. iter_size in caffe prototxt)

Here is the corrected code:

    for i in range(num_iters):
        optimizer.zero_grad()
        batch_loss_value = 0
        for m in range(M):
            (images, labels, indices) = train_loader.next()
            outputs = net(Variable(images.cuda()))
            loss = criterion(outputs, Variable(labels.cuda()))
            loss.backward()
            batch_loss_value += loss.cpu().numpy()[0]
        optimizer.step()
        …
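For anyone on PyTorch 0.4 or later, here is a minimal sketch of the same accumulation pattern without `Variable`, run on CPU so it is self-contained. The toy `nn.Linear` model, the random data, and `micro_bs` are assumptions made for illustration; they stand in for the `net`, `train_loader`, and batch size of the quoted snippet. Dividing each loss by `M` is also my choice, not part of the quoted code: it makes the accumulated gradient equal the gradient of the mean loss over the full effective batch. Without the division the gradient is simply `M` times larger.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical toy setup standing in for `net`, `criterion`, and the data loader.
net = nn.Linear(4, 2)
criterion = nn.CrossEntropyLoss()  # default reduction is "mean"
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)

M = 4          # micro-batches accumulated per optimizer step
micro_bs = 8   # samples per micro-batch (assumed value)
images = torch.randn(M * micro_bs, 4)
labels = torch.randint(0, 2, (M * micro_bs,))

# Reference: gradient of the mean loss over the whole effective batch.
optimizer.zero_grad()
criterion(net(images), labels).backward()
full_grad = net.weight.grad.clone()

# Accumulated version: each backward() call adds into .grad.
optimizer.zero_grad()
batch_loss_value = 0.0
for m in range(M):
    x = images[m * micro_bs:(m + 1) * micro_bs]
    y = labels[m * micro_bs:(m + 1) * micro_bs]
    loss = criterion(net(x), y) / M   # scale so the sum matches the full-batch mean
    loss.backward()
    batch_loss_value += loss.item()   # .item() replaces loss.cpu().numpy()[0]
accum_grad = net.weight.grad.clone()

# The two gradients should agree up to floating-point error.
grads_match = torch.allclose(full_grad, accum_grad, atol=1e-5)

optimizer.step()  # one parameter update for the M accumulated micro-batches
```

The key point is the same as in the quoted code: `zero_grad()` is called once per outer iteration, `backward()` once per micro-batch, and `step()` once after the inner loop.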
