GPU allocated memory problem

Hi, a question has confused me for a long time. When I use batch gradient descent in PyTorch, the allocated GPU memory keeps growing with each training step within one batch. I cannot find where the mistake is; can someone help? Thank you very much. Below is the output of my code.

No.0*
allocated memory of GPU is:9.38Mb
cached memory of GPU is:30.00Mb
No.1*
allocated memory of GPU is:582.33Mb
cached memory of GPU is:602.00Mb
No.2*
allocated memory of GPU is:1157.52Mb
cached memory of GPU is:1174.00Mb
No.3*
allocated memory of GPU is:1730.43Mb
cached memory of GPU is:1748.00Mb
No.4*
allocated memory of GPU is:2304.48Mb
cached memory of GPU is:2320.00Mb
No.5*
allocated memory of GPU is:2878.59Mb
cached memory of GPU is:2890.00Mb
No.6*
allocated memory of GPU is:3455.23Mb
cached memory of GPU is:3482.00Mb
No.7*
allocated memory of GPU is:4029.53Mb
cached memory of GPU is:4054.00Mb
No.8*
allocated memory of GPU is:4604.16Mb
cached memory of GPU is:4628.00Mb
No.9*
allocated memory of GPU is:5179.34Mb
cached memory of GPU is:5200.00Mb
No.10*
allocated memory of GPU is:5754.85Mb
cached memory of GPU is:5774.00Mb
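
For context, the numbers above are printed after each training step with something along these lines (a rough sketch only; the model, data, and loop are placeholders, not my actual code, and the per-step bookkeeping from my real code is omitted):

```python
import torch
import torch.nn as nn

device = torch.device("cuda")
model = nn.Linear(1024, 1024).to(device)            # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.MSELoss()

for step in range(11):
    x = torch.randn(64, 1024, device=device)         # placeholder batch
    y = torch.randn(64, 1024, device=device)

    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()

    # (other per-step bookkeeping from my real code is omitted here)

    # memory statistics as reported in the output above
    print(f"No.{step}*")
    print(f"allocated memory of GPU is:{torch.cuda.memory_allocated() / 1024**2:.2f}Mb")
    print(f"cached memory of GPU is:{torch.cuda.memory_reserved() / 1024**2:.2f}Mb")
```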

This increase in memory usage is often caused by (accidentally) storing tensors that are still attached to the computation graph, which keeps the complete graph alive along with them.
Could you check whether you are adding the model output, the loss, or any other tensor with a valid .grad_fn to a list or any other container in each training step?
If so, call tensor.detach() or tensor.item() on it before appending it to the container.
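
As a minimal illustration (the list name `losses` is just an example, not taken from your code):

```python
losses = []

# Problematic: `loss` carries a grad_fn, so every append keeps that step's
# entire computation graph (and its activations) alive on the GPU.
losses.append(loss)

# Better: store a detached copy or a plain Python number instead.
losses.append(loss.detach())   # tensor cut from the graph
losses.append(loss.item())     # plain Python float
```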

Yes, sir. Your answer gave me great inspiration. A model output in my code was assigned requires_grad = True, which resulted in the allocated GPU memory accumulating with each training step. Thank you for your kindness in answering.