@Hengck @smth Hi, I have a quick question. As mentioned in here,
loss += criterion(outputs, Variable(labels.cuda()))
this will keep building the graph again and again inside the loop, which may increase memory usage. So should I instead write
loss = criterion(outputs, Variable(labels.cuda()))
Will this also accumulate the gradients? I am confused about which one to use, "=" or "+=". I just want to get the effect of "iter_size" in Caffe to train large models. Thanks.
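To make my goal concrete, here is a sketch of the iter_size-style loop I have in mind (the model, toy data, and `iter_size = 4` are just placeholders I made up for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# toy data standing in for a real DataLoader (shapes are illustrative)
loader = [(torch.randn(8, 10), torch.randint(0, 2, (8,))) for _ in range(8)]

model = nn.Linear(10, 2)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

iter_size = 4          # accumulation factor, like Caffe's iter_size
num_steps = 0

optimizer.zero_grad()
for i, (inputs, labels) in enumerate(loader):
    outputs = model(inputs)
    # "=" each iteration: the graph is freed once backward() runs on it
    loss = criterion(outputs, labels) / iter_size  # scale so grads average over iter_size batches
    loss.backward()    # gradients accumulate in param.grad across iterations
    if (i + 1) % iter_size == 0:
        optimizer.step()           # one update per iter_size mini-batches
        optimizer.zero_grad()
        num_steps += 1
```

My understanding is that `backward()` already sums into `.grad`, so "=" on the loss plus a `backward()` every iteration should be enough, without keeping a running `loss += ...` sum alive.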