Hey, I'm not sure what is going wrong with my code. I am using a categorical distribution and getting a fairly weird error, and I can't work out why. I looked around online for a while, but nothing I found explained what this issue is or how to get around it.
The error I am getting is as follows:
> ValueError: Expected parameter logits (Tensor of shape (1024, 6)) of distribution Categorical(logits: torch.Size([1024, 6])) to satisfy the constraint IndependentConstraint(Real(), 1), but found invalid values:
> tensor([[nan, nan, nan, nan, nan, nan],
> [nan, nan, nan, nan, nan, nan],
> [nan, nan, nan, nan, nan, nan],
> ...,
> [nan, nan, nan, nan, nan, nan],
> [nan, nan, nan, nan, nan, nan],
> [nan, nan, nan, nan, nan, nan]], device='cuda:0',
> grad_fn=<SubBackward0>)
The model that produces these logits is just a simple feed-forward network:
import torch
import torch.nn as nn

# Note: layer_init is a helper that is not shown in this post.
class actor(nn.Module):
    def __init__(self, input_size, n_actions):
        super(actor, self).__init__()
        self.base = layer_init(nn.Linear(input_size, 512))
        self.actor = layer_init(nn.Linear(512, n_actions), std=0.01)

    def forward(self, x):
        x = x.clone()
        x = self.base(x)
        x = torch.tanh(x)
        x = self.actor(x)  # raw logits, one per action
        return x
The output of the network is passed straight into a categorical distribution: probs = Categorical(logits=logits)
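For reference, here is a minimal sketch of how the same kind of error can be triggered on purpose. It is not my training code: the tensor shapes and the names `good_logits` and `bad_logits` are made up for illustration, and `validate_args=True` is passed explicitly so the check runs regardless of the library default.

```python
import torch
from torch.distributions import Categorical

# Finite logits build a valid distribution that can be sampled from.
good_logits = torch.zeros(4, 6)
dist = Categorical(logits=good_logits, validate_args=True)
action = dist.sample()  # one action index per row

# Logits containing NaN fail argument validation in the constructor.
bad_logits = torch.full((4, 6), float("nan"))
try:
    Categorical(logits=bad_logits, validate_args=True)
    raised = False
except ValueError as err:
    raised = True
    message = str(err)
    print(message)
```

So the exception appears to be a symptom: by the time the distribution is built, the logits already contain NaN.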
Constructing the Categorical seems to be where the error occurs. The code does not break on the first run; it takes a couple hundred thousand steps before it breaks.
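Since the failure only shows up late in training, one thing that might help narrow it down is checking each step whether the inputs, the weights, or the logits are the first to become non-finite. The following is only a sketch under assumptions: `first_nonfinite` is a hypothetical helper, and `layer` and `obs` are stand-ins rather than my actual network and observations.

```python
import torch

def first_nonfinite(named_tensors):
    """Return the name of the first tensor holding NaN or Inf, else None."""
    for name, tensor in named_tensors:
        if not torch.isfinite(tensor).all():
            return name
    return None

# Hypothetical stand-ins for the network and one batch of observations.
layer = torch.nn.Linear(3, 2)
obs = torch.tensor([[0.0, 1.0, float("inf")]])  # deliberately bad input
logits = layer(obs)

# Check in the order the data flows, so the earliest culprit is reported.
checks = [("obs", obs), ("weight", layer.weight), ("logits", logits)]
culprit = first_nonfinite(checks)
print(culprit)
```

Running a check like this before building the distribution would show whether the NaNs come in with the observations or first appear in the parameters after an update.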
If anyone knows what the problem is and how to fix it, I would appreciate it immensely.