Hi,
I am working on a classification problem with a ResNet, and my goal is to backpropagate only the smallest loss in each batch. However, two seemingly equivalent expressions behave very differently in memory usage, and I don't understand why.
The first one is:
Loss_function = nn.CrossEntropyLoss(reduce=False)  # per-sample losses (reduction='none' in newer PyTorch)
for epoch in range(epoch_num):
    for iter_num, data in enumerate(train_loader):
        image, label = data
        image = Variable(image).cuda(cuda_id)
        label = Variable(label).cuda(cuda_id)
        out = resnet50(image)
        loss = Loss_function(out, label)        # shape: (batch_size,)
        loss_min = torch.min(loss)              # min over all elements
        optimizer.zero_grad()
        loss_min.backward()
        optimizer.step()
The second is:
for epoch in range(epoch_num):
    for iter_num, data in enumerate(train_loader):
        image, label = data
        image = Variable(image).cuda(cuda_id)
        label = Variable(label).cuda(cuda_id)
        out = resnet50(image)
        loss = Loss_function(out, label)
        loss_min, _ = torch.min(loss, 0)        # min along dim 0, returns (values, indices)
        optimizer.zero_grad()
        loss_min.backward()
        optimizer.step()
The first case takes about twice as much memory as the second. Why is there such a difference?
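For what it's worth, for a 1-D loss tensor the two expressions should select the same element; the only visible difference is the return type (`torch.min(t)` gives a single value, `torch.min(t, 0)` gives a `(values, indices)` pair). A plain-Python sketch of that selection logic, with made-up per-sample loss values:

```python
# Hypothetical per-sample losses for a batch of 4 (made-up numbers).
losses = [0.9, 0.3, 1.2, 0.7]

# Analogue of torch.min(loss): minimum over all elements -> one scalar.
min_all = min(losses)

# Analogue of torch.min(loss, 0): minimum along dim 0 -> (value, index).
min_val, min_idx = min((v, i) for i, v in enumerate(losses))

assert min_all == min_val  # same selected value either way
print(min_all, min_idx)    # 0.3 1
```

So both loops should backpropagate through the same single sample, which is why the memory gap is so puzzling to me.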