Memory Leak when Training Dynamic Network

Are you storing tensors in a container (e.g. a list or dict) without detaching them?
Could you post a (small) reproducible code snippet, so that we can have a look?
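
For reference, this is the kind of pattern I have in mind. It is only a minimal sketch: the model, the random data, and the list names (`leaky_losses`, `safe_losses`) are made up for illustration and are not taken from your code. Appending the loss tensor itself keeps its autograd history reachable, while `.item()` (or `.detach()`) stores just the value.

```python
import torch
import torch.nn as nn

# Hypothetical toy setup, only to illustrate the pattern.
model = nn.Linear(4, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

leaky_losses = []  # tensors that still reference their autograd graph
safe_losses = []   # plain Python floats, no graph attached

for step in range(5):
    x = torch.randn(8, 4)
    y = torch.randn(8, 1)

    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()

    # Problematic: the stored tensor keeps a grad_fn, so the graph
    # objects of every iteration stay reachable through the list.
    leaky_losses.append(loss)

    # Safer: store only the scalar value (loss.detach() also works
    # if you need a tensor rather than a float).
    safe_losses.append(loss.item())

print(all(l.grad_fn is not None for l in leaky_losses))
print(all(isinstance(v, float) for v in safe_losses))
```

If memory grows with the number of iterations, checking every place where you append to a list or accumulate with something like `total_loss += loss` is usually a good first step.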