Hi,
I am training a network for a video classification task using cross-entropy loss. The problem is that after many epochs the network's accuracy stays the same (1%) and the loss does not come down. I inspected the issue and noticed that the gradients are non-zero only for the layer right before the loss calculation. I also made sure that the `requires_grad` flag is `True` for all network parameters.
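For reference, this is roughly how I verify the `requires_grad` flags (a minimal sketch; `net` is my model):

```python
# Every parameter should participate in autograd
for name, param in net.named_parameters():
    assert param.requires_grad, f"{name} has requires_grad=False"
```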
Here is my training code:
```python
import torch
import torch.nn as nn
import torch.optim as optim

optimizer = optim.Adam(net.parameters(), lr=args.lr)
criterion = nn.CrossEntropyLoss()

for epoch in range(args.start_epoch, args.epochs):
    for i, data in enumerate(train_loader):
        frames, labels = data
        frames, labels = frames.cuda(), labels.cuda()
        inputs = frames

        optimizer.zero_grad()              # clear gradients from the previous step
        outputs = net(inputs)              # forward pass
        loss = criterion(outputs, labels)
        loss.backward()                    # backward pass
        optimizer.step()                   # update parameters
```
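To inspect the gradients, I run a check like this right after the first `loss.backward()` (a minimal sketch; the print format is just for illustration):

```python
# Print each parameter's gradient norm after the backward pass
for name, param in net.named_parameters():
    grad_norm = None if param.grad is None else param.grad.norm().item()
    print(f"{name}: grad norm = {grad_norm}")
```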
I am fairly sure the problem is not with the optimizer, since the issue appears right at the start of training, even before the first optimizer step is taken.
After the first backpropagation, `list(net.parameters())[-1]` (which corresponds to the bias of the last fully connected layer) has non-zero gradients, but the gradients of all the other parameters are zero.
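To be precise, the other parameters have gradient tensors that are exactly zero, not `None`. A sketch of how I check that distinction:

```python
# Distinguish all-zero gradients from missing gradients
for name, param in net.named_parameters():
    if param.grad is None:
        print(f"{name}: grad is None (not reached by backward)")
    elif torch.count_nonzero(param.grad) == 0:
        print(f"{name}: grad is all zeros")
```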
I would appreciate any suggestions about why this is happening.
Thanks in advance.