The remaining memory is held by the cuBLAS workspace, which you can free via:
torch._C._cuda_clearCublasWorkspaces()
print('after clearing the cuBLAS workspace', torch.cuda.memory_allocated())
after clearing the cuBLAS workspace 0
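For anyone who wants to reproduce this end to end, here is a minimal sketch. The helper name `cublas_workspace_demo` and the 1024x1024 matmul are assumptions chosen only to get cuBLAS to allocate its workspace. Note that `torch._C._cuda_clearCublasWorkspaces()` is a private API and may change between releases. The function returns `None` if PyTorch or a CUDA device is not available.

```python
import gc

try:
    import torch
except ImportError:  # keep the sketch importable on machines without PyTorch
    torch = None


def cublas_workspace_demo():
    """Return memory_allocated() (before, after) clearing the workspace, or None without CUDA."""
    if torch is None or not torch.cuda.is_available():
        return None

    # Assumed workload: any matmul should make cuBLAS allocate its workspace.
    x = torch.randn(1024, 1024, device="cuda")
    y = x @ x
    torch.cuda.synchronize()

    # Drop the tensors so only the workspace (if any) remains allocated.
    del x, y
    gc.collect()
    before = torch.cuda.memory_allocated()

    # Private API: releases the cuBLAS workspaces back to the caching allocator.
    torch._C._cuda_clearCublasWorkspaces()
    after = torch.cuda.memory_allocated()
    return before, after


result = cublas_workspace_demo()
print("memory_allocated (before, after) clearing the cuBLAS workspace:", result)
```

If `before` is nonzero after all tensors are deleted, that is the workspace; `after` should then match the output shown above. Use `torch.cuda.empty_cache()` afterwards if you also want the cached blocks returned to the driver.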