Recording loss history without I/O

Hi, I’d like to ask how to store CUDA tensors without performing I/O from the GPU at the end of every training step.

The snippet below clearly shows how things should not be done:

# We assume that loss_history is a Python list
# and loss is a CUDA tensor of size [1]
loss_history.append(loss.item())

Does the following implementation avoid the I/O problem?

loss_history += [loss]

Please advise! Sorry for being a PyTorch noob!

Indeed, this will not send data to the CPU. But you want to add a .detach() to make sure that the computational graph associated with loss is not kept around; otherwise your memory usage is going to explode quickly.
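
For example, here is a minimal sketch of that pattern (assuming a typical training loop where model, optimizer, criterion, and dataloader are already defined):

import torch

loss_history = []

for inputs, targets in dataloader:
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)
    loss.backward()
    optimizer.step()

    # .detach() drops the autograd graph so earlier iterations can be freed;
    # the value stays on the GPU, so no GPU-to-CPU copy happens here.
    loss_history.append(loss.detach())

# A single transfer at the end of training instead of one per step.
# (loss has size [1] as in the question; use torch.stack instead if it is 0-dimensional.)
loss_values = torch.cat(loss_history).cpu().tolist()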
