Profiling with Utils.bottleneck error: unqualified exec with nested function

Hi all,

We are trying to profile a model that gets around 40% GPU usage and above 600% CPU usage, but we get the following error:

Traceback (most recent call last):
  File "/usr/lib64/python2.7/", line 151, in _run_module_as_main
    mod_name, loader, code, fname = _get_module_details(mod_name)
  File "/usr/lib64/python2.7/", line 109, in _get_module_details
    return _get_module_details(pkg_main_name)
  File "/usr/lib64/python2.7/", line 113, in _get_module_details
    code = loader.get_code(mod_name)
  File "/usr/lib64/python2.7/", line 283, in get_code
    self.code = compile(source, self.filename, 'exec')
  File "/usr/lib64/python2.7/site-packages/torch/utils/bottleneck/", line 101
    exec(code, globs, None)
SyntaxError: unqualified exec is not allowed in function 'run_prof' it is a nested function

Here is a piece of code that can generate the error (filename

import torch
import torch.nn as nn
from torch.nn import functional as F
from torch.autograd import Variable
from torch import optim
import numpy as np
import math, random

# Generating a noisy multi-sin wave
def sine_2(X, signal_freq=60.):
    return (np.sin(2 * np.pi * (X) / signal_freq) + np.sin(4 * np.pi * (X) / signal_freq)) / 2.0

def noisy(Y, noise_range=(-0.05, 0.05)):
    noise = np.random.uniform(noise_range[0], noise_range[1], size=Y.shape)
    return Y + noise

def sample(sample_size):
    random_offset = random.randint(0, sample_size)
    X = np.arange(sample_size)
    Y = noisy(sine_2(X + random_offset))
    return Y

# Define the model
class SimpleRNN(nn.Module):

    def __init__(self, hidden_size):
        super(SimpleRNN, self).__init__()
        self.hidden_size = hidden_size
        self.inp = nn.Linear(1, hidden_size)
        self.rnn = nn.LSTM(hidden_size, hidden_size, 2, dropout=0.05)
        self.out = nn.Linear(hidden_size, 1)

    def step(self, input, hidden=None):
        input = self.inp(input.view(1, -1)).unsqueeze(1)
        output, hidden = self.rnn(input, hidden)
        output = self.out(output.squeeze(1))
        return output, hidden 

    def forward(self, inputs, hidden=None, force=True, steps=0):
        if force or steps == 0: steps = len(inputs)
        outputs = Variable(torch.zeros(steps, 1, 1))
        for i in range(steps):
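            if force or i == 0:
                # Teacher forcing (or first step): feed the ground-truth input
                input = inputs[i]
            else:
                # Otherwise feed the previous prediction back in
                input = output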
            output, hidden = self.step(input, hidden)
            outputs[i] = output
        return outputs, hidden

n_epochs = 100
n_iters = 50
hidden_size = 10
model = SimpleRNN(hidden_size)
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)
losses = np.zeros(n_epochs) # For plotting
for epoch in range(n_epochs):
    for iter in range(n_iters):
        _inputs = sample(50)
        inputs = Variable(torch.from_numpy(_inputs[:-1]).float())
        targets = Variable(torch.from_numpy(_inputs[1:]).float())
        # Use teacher forcing 50% of the time
        force = random.random() < 0.5
        outputs, hidden = model(inputs, None, force)
        loss = criterion(outputs, targets)
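        # The pasted code is cut off from here on; the remaining lines are an
        # assumed completion using the usual optimizer step and loss logging.
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        losses[epoch] += loss.item()
    if epoch > 0:
        print(epoch, loss.item())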

We launch this with:
python -m torch.utils.bottleneck

torch version : 1.0.1

Any help is welcome.


Are you using an older Python2.7 version?
If so, could you try to run the code with Python>=3.5?

Also note that Python 2 is deprecated now, so you should definitely switch to Python 3 if possible (assuming you really are using the legacy version).
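
To illustrate why the interpreter version matters here: in Python 2, `exec` is a statement with compile-time restrictions inside nested functions, which is what the SyntaxError above reports. In Python 3, `exec` is an ordinary built-in function, so the same call shape is accepted. The following is only a minimal sketch for Python 3; `make_runner` and `run` are hypothetical names, not the actual code in `torch.utils.bottleneck`.

```python
# Python 3 only: exec is a regular function, so calling it inside a
# nested function is legal. The names below are illustrative.

def make_runner(source):
    # Compile the source once, as a module-level code object.
    code = compile(source, "<snippet>", "exec")

    def run():
        # Fresh globals for the executed code; with locals=None,
        # assignments made by the code land in this dictionary.
        globs = {"__name__": "__main__"}
        exec(code, globs, None)  # same call shape as in the traceback
        return globs

    return run


namespace = make_runner("answer = 6 * 7")()
print(namespace["answer"])  # prints 42
```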


Thanks… this is quite a complex reinforcement learning model, so I’m not sure it will be easy to port to py3.
Do you think this is the reason why this does not work?
Is there any other method to GPU profile the code?
I’m pretty sure something is not being processed on the GPU as it should.
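
On the question of other ways to profile GPU work: one option that does not go through `torch.utils.bottleneck` is to use the autograd profiler directly inside the script. The sketch below is a minimal example under assumptions: the small LSTM and the random input are hypothetical stand-ins for the real model and data, and the device check at the end is just one quick way to see whether the parameters actually live on the GPU.

```python
import torch
import torch.nn as nn
from torch.autograd import profiler

# Hypothetical stand-in model and input; substitute the real ones.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.LSTM(10, 10, 2).to(device)
inputs = torch.randn(49, 1, 10, device=device)

# use_cuda=True additionally records CUDA kernel times; it is only
# enabled here when a GPU is actually available.
with profiler.profile(use_cuda=torch.cuda.is_available()) as prof:
    outputs, hidden = model(inputs)
    outputs.sum().backward()

# Aggregate events by operator name and sort by total CPU time.
report = prof.key_averages().table(sort_by="cpu_time_total")
print(report)

# Quick sanity check: which device types hold the model parameters?
param_devices = {p.device.type for p in model.parameters()}
print(param_devices)
```

If `param_devices` contains `cpu` on a machine with a GPU, the model (or part of it) was never moved to the GPU, which would be consistent with low GPU and high CPU utilization.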