Here’s my code:
import torch

for epoch in range(10):
    net.train()
    # Good training: this part behaves normally.
    for data in trainloader:
        inputs, labels = data['images'], data['masks']
        # Process each batch in chunks of 7 samples.
        for idx in range(0, len(inputs), 7):
            optimizer.zero_grad()
            outputs = net(inputs[idx:idx + 7])
            loss = criterion(outputs, labels[idx:idx + 7])
            loss.backward()
            optimizer.step()

    # Bad validation: this is where memory keeps growing.
    net.eval()
    test_loss = 0.0
    test_times = 0
    for data in testloader:
        with torch.no_grad():
            inputs, labels = data['images'], data['masks']
            for idx in range(0, len(inputs), 7):
                outputs = net(inputs[idx:idx + 7])
                loss = criterion(outputs, labels[idx:idx + 7])
                test_loss += loss.item()
                test_times += 1
    test_loss /= test_times
If I use the torch.no_grad() block, CPU memory keeps increasing until the process gets OOM-killed. But once I remove no_grad, everything works fine.
I tried del loss and also moving the validation into its own function (roughly the sketch below), but the memory leak still happens.
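For reference, this is roughly what the function-wrapped attempt looked like. It's a sketch, not my exact code: validate and the chunk parameter are just names I'm using here, and the body mirrors the validation loop above.

import torch

def validate(net, criterion, testloader, chunk=7):
    # Sketch of the attempt: same chunked evaluation as above,
    # just wrapped in a function so locals are freed on return.
    net.eval()
    total_loss, batches = 0.0, 0
    with torch.no_grad():
        for data in testloader:
            inputs, labels = data['images'], data['masks']
            for idx in range(0, len(inputs), chunk):
                outputs = net(inputs[idx:idx + chunk])
                loss = criterion(outputs, labels[idx:idx + chunk])
                total_loss += loss.item()  # .item() yields a plain Python float
                del loss                   # the explicit del I also tried
                batches += 1
    return total_loss / batches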
Is my code wrong?