What’s the recommended method for GPU profiling? I installed the latest version of PyTorch with conda; torch.__version__ reports 0.3.0.post4, but when I try to call torch.autograd.profiler.profile(use_cuda=True) I get the error __init__() got an unexpected keyword argument 'use_cuda'. Is this feature only available in the version from the GitHub repo?
The use_cuda parameter is only available in versions newer than 0.3.0, yes. Even then it adds some overhead. The recommended approach appears to be the emit_nvtx function:
with torch.cuda.profiler.profile():
    model(x)  # Warmup CUDA memory allocator and profiler
    with torch.autograd.profiler.emit_nvtx():
        model(x)
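To expand on that: emit_nvtx only emits NVTX range markers, so the script has to be run under nvprof for anything to actually be recorded. A minimal sketch of the full workflow (the nvprof invocation and the CUDA-availability guard are my additions, not from the post above):

```python
import torch

# Run this script under nvprof so the NVTX ranges are captured, e.g.:
#   nvprof --profile-from-start off -o trace.prof python script.py
if torch.cuda.is_available():
    x = torch.randn(5, 5, device='cuda')
    with torch.cuda.profiler.profile():
        y = x ** 2  # warmup pass for the CUDA allocator and profiler
        with torch.autograd.profiler.emit_nvtx():
            y = x ** 2
```

The resulting trace.prof can then be opened in the NVIDIA Visual Profiler, where each autograd op shows up as a named NVTX range.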
Trying to run that code gives me an error about the use_cuda flag (with version 0.3.1). For example:
import torch
from torch.autograd import Variable

x = Variable(torch.randn(5, 5), requires_grad=True).cuda()
with torch.autograd.profiler.profile() as prof:
    y = x**2
with torch.autograd.profiler.emit_nvtx():
    y = x**2
print(prof)
Gives:
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-18-c54fc33dff6e> in <module>()
      1 with torch.autograd.profiler.profile() as prof:
      2     y = x**2
----> 3 with torch.autograd.profiler.emit_nvtx():
      4     y = x**2
      5
~/.pyenv/versions/3.6.1/envs/phdnets2/lib/python3.6/site-packages/torch/autograd/profiler.py in __enter__(self)
    213         self.entered = True
    214         torch.cuda.synchronize()
--> 215         torch.autograd._enable_profiler(True)
    216         return self
    217
I tried running the script on 0.4.0 and it works fine with torch.autograd.profiler.profile(use_cuda=True). Upgrading to 0.4.0 should solve this problem.
import torch

cuda = torch.device('cuda')
x = torch.randn((1, 1), requires_grad=True)
print(x.device)
with torch.autograd.profiler.profile(use_cuda=True) as prof:
    y = x ** 2
    y.backward()
print(prof)
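As a side note, the autograd profiler also works without CUDA, which is handy for checking the workflow on a machine with no GPU. A minimal CPU-only sketch using key_averages() to aggregate the recorded events into a summary table (the sort_by argument names a column of the profiler's table() output):

```python
import torch

x = torch.randn(5, 5, requires_grad=True)
with torch.autograd.profiler.profile() as prof:
    y = (x ** 2).sum()
    y.backward()

# Aggregate events with the same name and print a summary table,
# slowest ops (by total CPU time) first
print(prof.key_averages().table(sort_by='cpu_time_total'))
```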