Error related to in-place operation

scrungus · April 15, 2022, 1:31pm

I am getting a strange error related to in-place operations. I have a neural net:

import torch
from torch import nn
from torch.distributions import Categorical

class ActorNet(nn.Module):
    def __init__(self, obs_size, n_actions, depth, hidden_size=512):
        super().__init__()

        self.actor_conv = nn.Sequential(
            nn.Conv1d(in_channels=1, out_channels=32, kernel_size=8, stride=4, padding=7),
            nn.ReLU(),
            nn.Conv1d(in_channels=32, out_channels=64, kernel_size=4, stride=2, padding=3),
            nn.ReLU(),
            nn.Conv1d(in_channels=64, out_channels=64, kernel_size=3, stride=1, padding=2),
            nn.ReLU(),
        )

        self.actor = nn.Sequential(
            nn.Linear(1088, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, n_actions),
            )

        self.critic = nn.Sequential(
            nn.Linear(1088, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, 1)
        )

    def forward(self, x):
        if x.dim() == 1:
            convs = self.actor_conv(x[None,...][None,...])
        else:
            convs = self.actor_conv(x)
        convs = convs.view(convs.shape[0],-1)
        logits = self.actor(convs)
        logits = torch.nan_to_num(logits)
        dist = Categorical(logits=logits)
        action = dist.sample()

        value = self.critic(convs)

        return dist, action, value

And a function that generates samples:

def make_batch(self):
    for i in range(self.hparams.epoch_steps):
        dist, action, val = self.actor(self.state)
        probs = dist.log_prob(action)
        next_state, reward, done, _ = self.env.step(action.item())
...

When I then call loss.backward(), it throws this error:

RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [512, 19]], which is output 0 of AsStridedBackward0, is at version 2; expected version 1 instead.
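
For readers unfamiliar with the "is at version 2; expected version 1" part of the message: autograd records a version counter for each tensor it saves for the backward pass, and every in-place operation on that tensor increments the counter. The sketch below is a toy example, not the code from this post; the tensors `w`, `h` and `y` are made up for illustration. It shows the counter changing and the same class of error being raised.

```python
import torch

w = torch.ones(3, requires_grad=True)
h = w * 2                 # non-leaf tensor that is part of the graph
y = (h ** 2).sum()        # the power op saves h for its backward pass

version_before = h._version
h.add_(1)                 # in-place update of a tensor autograd has saved
version_after = h._version

try:
    y.backward()          # saved h no longer matches the recorded version
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)

print(version_before, version_after)
print(error_message)
```

The tensor named in the error message is the one whose version changed between the forward pass and the backward pass, which is usually the best clue for finding the offending in-place write.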

I have isolated the cause to this line:

probs = dist.log_prob(action)

The error does not occur when I remove it. Why would this line cause an error like this?

Thanks in advance for any help.

scrungus · April 15, 2022, 1:38pm

Update:

I’ve found that wrapping the log-probability computation in with torch.no_grad(): makes the error go away, but I don’t understand why.
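
A possible explanation, sketched with a toy model rather than the network from this thread: inside torch.no_grad() the result of log_prob is not attached to the autograd graph, so a later backward() never walks back through the actor's weights along that path and cannot notice that they changed. The small `nn.Linear` policy and the random observation below are stand-ins chosen for illustration.

```python
import torch
from torch import nn
from torch.distributions import Categorical

torch.manual_seed(0)

net = nn.Linear(4, 3)                 # hypothetical stand-in for the actor
obs = torch.randn(1, 4)               # hypothetical observation

dist = Categorical(logits=net(obs))
action = dist.sample()

tracked = dist.log_prob(action)       # connected to net's parameters
with torch.no_grad():
    untracked = dist.log_prob(action) # same value, but no graph attached

print(tracked.requires_grad, tracked.grad_fn is not None)
print(untracked.requires_grad, untracked.grad_fn is None)
```

If this is what is happening, no_grad removes the symptom rather than the cause: a loss built from the untracked log-probabilities treats them as constants, so no gradient reaches the policy through them, which is typically not what a policy-gradient loss wants.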

KFrank · April 16, 2022, 1:49am

Hi Scrungus!

First, strip down your code as much as possible while still keeping the
“inplace” issue.

Then post a simple-as-possible, fully-self-contained script that reproduces
your issue, together with the output you get when you run it.

(Since the script you post will be fully self-contained, it will include where
and how you calculate loss and call loss.backward().)

Best.

K. Frank
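
As an illustration of the kind of stripped-down script being asked for, here is a minimal sketch of one common way to hit this error. It rests on an assumption the thread does not confirm: that the log-probabilities are collected once and then used in backward() after an optimizer step has already updated the weights in place. The model sizes, the `rollout_log_probs` helper and the SGD optimizer are all hypothetical.

```python
import torch
from torch import nn
from torch.distributions import Categorical

torch.manual_seed(0)

# Hypothetical stand-in for the actor head (512 hidden units, 19 actions).
actor = nn.Sequential(nn.Linear(8, 512), nn.ReLU(), nn.Linear(512, 19))
optimizer = torch.optim.SGD(actor.parameters(), lr=0.1)

def rollout_log_probs(n_steps=4):
    """Collect log-probabilities with the graph attached (hypothetical helper)."""
    obs = torch.randn(n_steps, 8)
    dist = Categorical(logits=actor(obs))
    actions = dist.sample()
    return dist.log_prob(actions)

log_probs = rollout_log_probs()

# First update: the graph and the weights still agree, so this works.
optimizer.zero_grad()
(-log_probs.mean()).backward(retain_graph=True)
optimizer.step()                      # updates the weights in place

# Second backward through the *old* graph: the saved weights are stale.
try:
    (-log_probs.mean()).backward()
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)

print(error_message)
```

If a script like this matches the real training loop, the usual remedy is to recompute the log-probabilities with a fresh forward pass through the current weights before each backward() call, instead of reusing the ones stored during the rollout.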