Here is my understanding:
for i in range(1, epoch + 1):
    model.train()
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()
    if val_print_show_time == i:
        with torch.no_grad():
            model.eval()
            val_outputs = model(val_inputs)
            # early stopping ......
I think I may have made a mistake: model.eval() should be moved outside of the with torch.no_grad(): block, like this:
if val_print_show_time == i:
    model.eval()
    with torch.no_grad():
        val_outputs = model(val_inputs)
        # early stopping ......
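For context, here is a minimal sketch of what an early-stopping check inside the training loop could look like (best_val_loss, epochs_no_improve, and patience are hypothetical names for illustration, not my actual variables):

# hypothetical early-stopping sketch; assumes val_labels and the
# counters best_val_loss / epochs_no_improve / patience are defined outside the loop
val_loss = criterion(val_outputs, val_labels).item()
if val_loss < best_val_loss:
    best_val_loss = val_loss
    epochs_no_improve = 0
    torch.save(model.state_dict(), 'best_model.pt')  # keep the best checkpoint
else:
    epochs_no_improve += 1
    if epochs_no_improve >= patience:
        break  # stop training once validation loss stops improving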
Even after this fix, my model’s performance is still not good.
For regression, especially for this kind of multi-value prediction problem, is there a better loss function or optimizer than RMSE and SGD (or Adam)?
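For reference, a minimal sketch of what swapping in PyTorch's built-in alternatives would look like (the learning rate is an illustrative default, not tuned):

import torch.nn as nn
import torch.optim as optim

# Huber loss (SmoothL1Loss) is less sensitive to outliers than (R)MSE
criterion = nn.SmoothL1Loss()

# Adam adapts per-parameter step sizes; lr=1e-3 is an illustrative default
optimizer = optim.Adam(net.parameters(), lr=1e-3)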
OTHER DETAILS
Here is my model's output:
out = torch.cat((x1, x2, x3, x4), 1)
# print('after_concat', out.shape)  # (batch_size, 4)
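A quick shape check of that concatenation, assuming each of the four heads produces a (batch_size, 1) tensor:

import torch

batch_size = 8  # illustrative
x1, x2, x3, x4 = (torch.randn(batch_size, 1) for _ in range(4))
out = torch.cat((x1, x2, x3, x4), 1)
print(out.shape)  # torch.Size([8, 4])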
Here is my loss function
def RMSELoss_new(yhat, y):
    # per-column RMSE (mean over the batch dimension), averaged across the outputs
    return torch.mean(torch.sqrt(torch.mean((yhat - y) ** 2, 0)))

criterion = RMSELoss_new
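As a sanity check, this loss behaves as expected on random tensors (shapes are illustrative):

import torch

yhat = torch.randn(8, 4)  # illustrative (batch_size, 4) predictions
y = torch.randn(8, 4)
print(RMSELoss_new(yhat, y))  # scalar tensor, roughly sqrt(2) for independent unit-variance noise
print(RMSELoss_new(y, y))     # tensor(0.) when predictions match targets exactly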
And I use the SGD optimizer; in the main training loop, I calculate the loss and optimize the model:
optimizer.zero_grad()
outputs = net(inputs, tr_angles) # (batch_size,4)
# loss = torch.sqrt(criterion(outputs, labels))
loss = criterion(outputs, labels)
# print(loss) # tensor(107.8965, device='cuda:0', grad_fn=<SqrtBackward>)
loss.backward()
optimizer.step()