I want to write a module to track and analyze the memory usage of model activations. Here is my current measurement code. Can it provide a reasonable estimate of the GPU memory consumed by activations? I think the result is only correct if torch.cuda.memory_allocated() synchronizes until the forward pass finishes.
memory_before_forward = torch.cuda.memory_allocated()
# forward pass
model(input)
memory_after_forward = torch.cuda.memory_allocated()
print("activation memory (GiB):", (memory_after_forward - memory_before_forward) / 1024**3)
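For context, here is a fuller sketch of the measurement I have in mind. The helper name measure_activation_memory is mine, and the toy model and input sizes are placeholders; it assumes a CUDA device is available. I also track the peak via torch.cuda.max_memory_allocated(), since some intermediate buffers are freed before the forward pass returns and a before/after diff alone would miss them:

```python
import torch

def measure_activation_memory(model, inp):
    """Estimate activation memory (bytes) held after a forward pass.

    Note: torch.cuda.memory_allocated() reads the caching allocator's
    host-side counters, which are updated when tensors are created or
    freed on the Python side, independent of kernel completion.
    The before/after diff only captures activations that stay alive
    (e.g. tensors saved for backward); under torch.no_grad() most of
    them are freed as soon as the next layer has consumed them.
    """
    torch.cuda.reset_peak_memory_stats()
    before = torch.cuda.memory_allocated()
    out = model(inp)  # forward pass; with grad enabled, activations are saved for backward
    after = torch.cuda.memory_allocated()
    peak = torch.cuda.max_memory_allocated()
    # (after - before): memory still held, mostly saved activations
    # (peak - before): high-water mark of extra memory during forward
    return after - before, peak - before, out

if torch.cuda.is_available():
    model = torch.nn.Sequential(
        torch.nn.Linear(1024, 4096),
        torch.nn.ReLU(),
        torch.nn.Linear(4096, 1024),
    ).cuda()
    x = torch.randn(64, 1024, device="cuda", requires_grad=True)
    held, peak, _ = measure_activation_memory(model, x)
    print(f"activations held: {held / 1024**2:.1f} MiB, "
          f"peak extra: {peak / 1024**2:.1f} MiB")
```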