Hi,
I’m trying to build a simple NN for a categorical classification problem; however, I’m not able to get a usable value out of the loss. I’m not sure if something is wrong with my layers or if it’s just a syntax issue.
My dataset is a 5518x512 tensor (5518 observations, 512 features per observation) and my labels are a categorical 5518x1 tensor, which I converted into a 5518x5 one-hot encoded tensor (5 classes).
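The conversion I did is roughly this (a sketch; the labels name and the random data are placeholders for my real tensor):

import torch as tt

# labels: (5518, 1) tensor of class indices 0..4 (placeholder data)
labels = tt.randint(0, 5, (5518, 1))
# squeeze to shape (5518,) and one-hot encode into a (5518, 5) float tensor
y = tt.nn.functional.one_hot(labels.squeeze(1), num_classes=5).float()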
My model is as follows:
class NN(tt.nn.Module):
    def __init__(self):
        super().__init__()
        # single linear layer: 512 input features -> 5 class logits
        self.dense = tt.nn.Linear(in_features=512, out_features=5)

    def forward(self, x):
        y = self.dense(x)
        return y
My optimizer and loss functions:
criterion = tt.nn.CrossEntropyLoss()
optimizer = tt.optim.SGD(model.parameters(), lr=0.0001)
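For completeness, the step that produces the loss looks roughly like this (a sketch; x and y are the feature and one-hot label tensors described above):

# model = NN().to('cuda') was created before building the optimizer
yp = model(x)            # forward pass: (5518, 512) -> (5518, 5) logits
loss = criterion(yp, y)  # this is the call that returns nan
optimizer.zero_grad()
loss.backward()
optimizer.step()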
Now, when I initialize the model and do a forward pass, it works perfectly, but when I calculate the loss, I get nan every time, even on the first iteration.
For example, one forward pass gives me:
>>> yp
tensor([[-0.0309, -0.0312, -0.0166, -0.0349, 0.0427],
[-0.0257, -0.0310, -0.0115, -0.0375, 0.0446],
[-0.0281, -0.0321, -0.0115, -0.0370, 0.0461],
...,
[-0.0266, -0.0240, -0.0497, -0.0416, 0.0226],
[-0.0145, -0.0241, -0.0463, -0.0558, 0.0208],
[-0.0247, -0.0249, -0.0480, -0.0471, 0.0220]], device='cuda:0', grad_fn=<AddmmBackward0>)
And my training labels are:
>>> y
tensor([[1., 0., 0., 0., 0.],
[1., 0., 0., 0., 0.],
[1., 0., 0., 0., 0.],
...,
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.],
[0., 0., 0., 0., 1.]], device='cuda:0')
And after calculating the loss (criterion(yp, y)), I always get:
tensor(nan, device='cuda:0', grad_fn=<DivBackward1>)
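In case it helps narrow things down, this is the kind of sanity check I can run on the tensors (a sketch; x is the 5518x512 feature tensor):

print(tt.isnan(x).any())     # True if any input feature is nan
print(tt.isfinite(x).all())  # False if any feature is inf/-inf/nan
print(tt.isnan(y).any())     # same check on the one-hot labels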
Any idea what it could be?
Thanks in advance!
PS: I’m still not sure if this is the best way to do categorical classification (encoding, type of NN, etc.); I’ve found many very different solutions and I’m not sure which one would be best (any advice is more than welcome).
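For instance, I’ve seen examples that skip the one-hot encoding entirely and pass integer class indices straight to CrossEntropyLoss, like this (a sketch, reusing the y and yp tensors from above):

y_idx = y.argmax(dim=1)      # back from one-hot to class indices, shape (5518,)
loss = criterion(yp, y_idx)  # CrossEntropyLoss also accepts integer class indices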