Categorical distribution breaking with NaN logits

Hey, I am not sure what is going wrong with my code. I am using a categorical distribution and getting a fairly weird error, and I am uncertain as to why. I looked around online for a while, but nothing I found explained what exactly this issue is or how to get around it.

The error I am getting is as follows:

> ValueError: Expected parameter logits (Tensor of shape (1024, 6)) of distribution Categorical(logits: torch.Size([1024, 6])) to satisfy the constraint IndependentConstraint(Real(), 1), but found invalid values:
> tensor([[nan, nan, nan, nan, nan, nan],
>         [nan, nan, nan, nan, nan, nan],
>         [nan, nan, nan, nan, nan, nan],
>         ...,
>         [nan, nan, nan, nan, nan, nan],
>         [nan, nan, nan, nan, nan, nan],
>         [nan, nan, nan, nan, nan, nan]], device='cuda:0',
>        grad_fn=<SubBackward0>)

The code I am using to generate this is just a simple feed-forward network.

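import torch
import torch.nn as nn

# layer_init is the poster's own initialization helper; its definition was not included.
class actor(nn.Module):
    def __init__(self, input_size, n_actions):
        super(actor, self).__init__()

        self.base = layer_init(nn.Linear(input_size, 512))
        # The name of this second layer was lost in the original post; `out` is a placeholder.
        self.out = layer_init(nn.Linear(512, n_actions), std=0.01)

    def forward(self, x):
        x = x.clone()
        x = self.base(x)
        x = torch.tanh(x)
        x = self.out(x)  # raw logits for Categorical
        return x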

The output of this network is passed straight into a categorical distribution: probs = Categorical(logits=logits)
This seems to be where the error is occurring. The code does not break immediately; it takes a couple hundred thousand steps before it fails.
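
For context, `Categorical` validates its parameters when it is constructed, so the exception surfaces on that line even if the bad values originate earlier in training. The following is a minimal sketch with made-up tensors (not the network above) that reproduces the same kind of failure; `validate_args=True` is passed explicitly so the behaviour does not depend on the installed version's default.

```python
import torch
from torch.distributions import Categorical

# Finite logits: construction succeeds and the probabilities are uniform.
ok = Categorical(logits=torch.zeros(2, 6), validate_args=True)

# Made-up example: a single overflowed (inf) entry in otherwise fine logits.
bad_logits = torch.zeros(2, 6)
bad_logits[0, 0] = float("inf")

# One non-finite entry is enough for validation to reject the tensor.
try:
    Categorical(logits=bad_logits, validate_args=True)
    failed = False
except ValueError as err:
    failed = True
    print(type(err).__name__)
```

So the line that raises is likely only where the problem becomes visible, not necessarily where it starts.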

If anyone knows what the problem is and how to fix it, I would appreciate it immensely.

Based on the error message it seems the actor is creating NaN outputs after a few iterations of training. Are you seeing an increase in the value range of its output during training, which could then overflow after a while?
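
One way to check for this is to log the size of the logits at every step and stop as soon as something non-finite appears. The sketch below is only an illustration: `check_logits` is a hypothetical helper, `warn_above` is an arbitrary threshold, and the random tensor stands in for the actor's real output.

```python
import torch
from torch.distributions import Categorical


def check_logits(logits: torch.Tensor, step: int, warn_above: float = 1e3) -> float:
    """Return the largest absolute logit; raise early if any value is NaN or inf.

    `warn_above` is an arbitrary illustrative threshold, not a PyTorch default.
    """
    if not torch.isfinite(logits).all():
        raise FloatingPointError(f"non-finite logits at step {step}")
    max_abs = logits.detach().abs().max().item()
    if max_abs > warn_above:
        print(f"step {step}: logits are growing, max |logit| = {max_abs:.3e}")
    return max_abs


# Stand-in data; a real run would pass the actor's output instead.
logits = torch.randn(1024, 6)
max_abs = check_logits(logits, step=0)
dist = Categorical(logits=logits)  # logits were verified finite just above
action = dist.sample()
```

If the logged maximum climbs steadily over many thousands of steps, that points to a slowly diverging quantity (weights, loss terms) rather than a single bad batch.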

Hi, yeah, when I was debugging it yesterday I noticed that the augmentation I had in my loss function started to produce extremely large values. I changed it to clip the loss when it goes outside a certain range, and that seems to have fixed the error. So I think you are correct that overflow was causing the issue.
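
The exact loss was not posted, so the following is only a sketch of what clipping a runaway loss term can look like. Everything in it is an assumption: the small `model` stands in for the actor, `extra_term` is a deliberately huge hypothetical term, and the clamp range and `max_norm` are arbitrary. Gradient-norm clipping with `torch.nn.utils.clip_grad_norm_` is included as an additional common safeguard; it is not something mentioned in the thread.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for the actor network (assumption, not the original model).
model = nn.Sequential(nn.Linear(4, 512), nn.Tanh(), nn.Linear(512, 6))
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)

obs = torch.randn(32, 4)
logits = model(obs)

base_loss = logits.pow(2).mean()
# Hypothetical extra loss term that has grown far too large.
extra_term = 1e12 * logits.abs().mean()

# Clip the term to a fixed range. While the value is outside the range,
# clamp passes no gradient through it, so it stops driving the update.
extra_term = torch.clamp(extra_term, min=-10.0, max=10.0)
loss = base_loss + extra_term

optimizer.zero_grad()
loss.backward()
# Rescale gradients so their total norm is at most max_norm;
# the returned value is the norm measured before clipping.
grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=0.5)
optimizer.step()
```

Clamping hides the symptom rather than explaining why the term grows, so it is probably still worth checking what makes it blow up in the first place.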

Thank you for the assistance.