Here is my understanding:

```
for i in range(1, epoch + 1):
    model.train()
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()
    if val_print_show_time == i:
        with torch.no_grad():
            model.eval()
            val_outputs = model(val_inputs)
            # early stopping ......
```

I think I may have made a mistake here: I should move `model.eval()` outside of the `with torch.no_grad():` block:

```
if val_print_show_time == i:
    model.eval()
    with torch.no_grad():
        val_outputs = model(val_inputs)
        # early stopping ......
```
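One thing worth adding (my assumption, not shown in the snippets above): after the validation pass, the model should be switched back to training mode, otherwise layers like dropout and batch norm stay in inference mode for the remaining epochs. A minimal sketch with a made-up model:

```python
import torch
import torch.nn as nn

# Toy model standing in for the real network (shapes are placeholders)
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Dropout(0.5), nn.Linear(16, 4))
val_inputs = torch.randn(5, 8)

model.eval()                       # disable dropout, use running batch-norm stats
with torch.no_grad():              # skip autograd bookkeeping for validation
    val_outputs = model(val_inputs)
model.train()                      # restore training mode for the next epoch

print(val_outputs.shape)  # torch.Size([5, 4])
```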

But my model’s performance is not good.

For **regression**, especially this kind of **multi-value prediction problem**, is there a better **loss function** or **optimizer** than **RMSE** and **SGD** (or Adam)?

**OTHER DETAILS**

Here is my **model output**

```
out = torch.cat((x1, x2, x3, x4), 1)
# print('after_concat', out.shape) # (batch_size, 4)
```
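For reference, `torch.cat` along `dim=1` places the per-head outputs side by side, so four `(batch_size, 1)` tensors become one `(batch_size, 4)` tensor (a toy sketch, the shapes here are assumed):

```python
import torch

batch_size = 3
# Assume each head emits one value per sample: shape (batch_size, 1)
x1, x2, x3, x4 = (torch.randn(batch_size, 1) for _ in range(4))

out = torch.cat((x1, x2, x3, x4), 1)
print(out.shape)  # torch.Size([3, 4])
```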

Here is my **loss function**

```
def RMSELoss_new(yhat, y):
    return torch.mean(torch.sqrt(torch.mean((yhat - y) ** 2, 0)))

criterion = RMSELoss_new
```
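Note what this computes: the MSE per output column (dim 0 is the batch), then the square root per column, then the mean of the four per-column RMSEs. A quick sanity check on hand-made data, just to confirm the per-column behaviour:

```python
import torch

def RMSELoss_new(yhat, y):
    # per-column MSE over the batch, per-column RMSE, then mean over columns
    return torch.mean(torch.sqrt(torch.mean((yhat - y) ** 2, 0)))

yhat = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
y    = torch.tensor([[1.0, 0.0], [3.0, 0.0]])
# column 0: errors (0, 0)  -> MSE 0            -> RMSE 0
# column 1: errors (2, 4)  -> MSE (4+16)/2=10  -> RMSE sqrt(10)
expected = (0.0 + 10.0 ** 0.5) / 2
assert torch.isclose(RMSELoss_new(yhat, y), torch.tensor(expected))
```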

And I use the `SGD` **optimizer**; in the main training loop, I *calculate the loss and optimize the model*:

```
optimizer.zero_grad()
outputs = net(inputs, tr_angles) # (batch_size,4)
# loss = torch.sqrt(criterion(outputs, labels))
loss = criterion(outputs, labels)
# print(loss) # tensor(107.8965, device='cuda:0', grad_fn=<SqrtBackward>)
loss.backward()
optimizer.step()
```
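On the loss/optimizer question itself: for multi-output regression, `nn.SmoothL1Loss` (Huber loss) is often more robust to outliers than RMSE, and Adam usually needs less learning-rate tuning than plain SGD. A hedged sketch of swapping both in (the model, data, and learning rate here are placeholders, not the original setup):

```python
import torch
import torch.nn as nn

net = nn.Linear(8, 4)                      # placeholder for the real model
inputs = torch.randn(16, 8)                # placeholder batch
labels = torch.randn(16, 4)                # (batch_size, 4) targets

criterion = nn.SmoothL1Loss()              # Huber: quadratic near 0, linear for large errors
optimizer = torch.optim.Adam(net.parameters(), lr=1e-3)

optimizer.zero_grad()
loss = criterion(net(inputs), labels)      # one training step, same structure as above
loss.backward()
optimizer.step()
print(loss.item())
```

Whether this actually helps depends on the error distribution; if large errors are rare and meaningful, MSE/RMSE may still be the right choice.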