Dear all!

I wrote this code to train a ResNet34 model on image data. I don’t understand some parts, in particular row 6: is it correct to calculate the loss in this way? I use a batch size of 4, and the same question applies to row 23.

Is it correct to calculate the ACC after training in this way?

Thanks in advance for any help

If you are referring to line 17, then yes, this is the standard way to calculate the loss:

```
loss = criterion(outputs_tr, labels_tr)
```
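For context, this is how the pattern typically looks end to end. This is a minimal sketch with dummy tensors, assuming a multi-class setup with `nn.CrossEntropyLoss`; your actual `criterion`, shapes, and number of classes may differ:

```
import torch
import torch.nn as nn

# hypothetical setup: a multi-class criterion and a dummy batch of 4 samples / 3 classes
criterion = nn.CrossEntropyLoss()
outputs_tr = torch.randn(4, 3, requires_grad=True)  # stand-in for model(images_tr)
labels_tr = torch.tensor([0, 2, 1, 0])               # stand-in for the target class indices

loss = criterion(outputs_tr, labels_tr)  # averaged over the batch by default
loss.backward()
print(loss.item())
```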

The script neither imports nor defines the metric functions (`mcor`, `acc`, `precision`, `recall`), so we cannot tell whether the implementation is correct.
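For reference, a common way to compute such metrics after training looks roughly like the following. This is a sketch assuming a test `DataLoader` that yields dicts with `data` and `label` entries and class-index targets (adjust it to your actual dataset and metric functions):

```
import torch
from sklearn.metrics import accuracy_score as acc
from sklearn.metrics import matthews_corrcoef as mcor
from sklearn.metrics import precision_score as precision
from sklearn.metrics import recall_score as recall

model.eval()
all_preds, all_targets = [], []
with torch.no_grad():
    for data_test in loader_test:
        images_ts = data_test["data"].to(device)
        labels_ts = torch.LongTensor(data_test["label"])
        outputs_ts = model(images_ts)
        preds = outputs_ts.argmax(dim=1).cpu()  # predicted class indices
        all_preds.append(preds)
        all_targets.append(labels_ts)

all_preds = torch.cat(all_preds).numpy()
all_targets = torch.cat(all_targets).numpy()

print("acc", acc(all_targets, all_preds))
print("mcor", mcor(all_targets, all_preds))
print("precision", precision(all_targets, all_preds, average="macro"))
print("recall", recall(all_targets, all_preds, average="macro"))
```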

Thanks so much for your kind help! I have added that information:

```
from sklearn.metrics import accuracy_score as acc
from sklearn.metrics import confusion_matrix
from sklearn.metrics import matthews_corrcoef as mcor
from sklearn.metrics import precision_score as precision
from sklearn.metrics import recall_score as recall
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms
```

I’m sorry, I didn’t explain it well… my problem refers to these lines, where I start to check the test set:

```
if j % int(len(loader_train) / 2) == 0 and j != 0:
    model.eval()
    with torch.no_grad():
```

In some scripts I found, this check is performed after 10 accumulation steps… what is the right way to do it if I have batches of 4 images at a time?

The training batch size doesn’t matter, since you are using `loader_test`.

The loss calculation looks wrong:

```
loss_test_avg = losses_sum / num_samples_test
mean_loss_train = losses_sum / (
    len(loader_train) * loader_train.batch_size
)
```

I’m not sure which `reduction` you are using in the `criterion`, but based on the variable names I would assume `reduction='sum'`?

On the other hand, since you are dividing by `num_samples_test` (which is in fact the number of batches), you might be using the default `reduction='mean'`?

The `mean_loss_train` calculation doesn’t seem to be correct, since you are dividing the test loss by the number of training samples.
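To make the difference explicit, here is a small self-contained sketch with dummy data showing how the averaging differs between the two reductions (the loader iteration is simulated; plug the matching variant into your test loop):

```
import torch
import torch.nn as nn

# reduction='mean' (default): the criterion returns the per-sample average of each batch,
# so you sum these averages and divide by the number of batches
criterion_mean = nn.CrossEntropyLoss(reduction='mean')

# reduction='sum': the criterion returns the summed loss of each batch,
# so you divide by the total number of samples instead
criterion_sum = nn.CrossEntropyLoss(reduction='sum')

losses_mean, losses_sum = 0.0, 0.0
num_batches, num_samples = 0, 0

for _ in range(3):  # stand-in for iterating loader_test
    outputs_ts = torch.randn(4, 3)         # dummy logits for a batch of 4 samples, 3 classes
    labels_ts = torch.randint(0, 3, (4,))  # dummy targets

    losses_mean += criterion_mean(outputs_ts, labels_ts).item()
    num_batches += 1

    losses_sum += criterion_sum(outputs_ts, labels_ts).item()
    num_samples += labels_ts.size(0)

loss_avg_from_mean = losses_mean / num_batches  # correct if all batches have the same size
loss_avg_from_sum = losses_sum / num_samples    # correct regardless of the batch sizes
print(loss_avg_from_mean, loss_avg_from_sum)
```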

I didn’t find a **reduction** argument. So is the loss calculation correct in this way?

```
for epoch in range(EPOCHS):
    for j, data in enumerate(loader_train):
        global_i += 1
        if j % 10 == 0:
            print(time.time() - start_time)
            start_time = time.time()

        # training step
        optimizer.zero_grad()
        images_tr = data["data"].to(device)
        labels_tr = torch.LongTensor(data["label"]).to(device)
        outputs_tr = model(images_tr).to(device)
        # backward
        loss = criterion(outputs_tr, labels_tr)
        loss.backward()
        optimizer.step()

        # check test set twice per epoch
        if j % int(len(loader_train) / 2) == 0 and j != 0:
            model.eval()
            with torch.no_grad():
                losses_sum = 0
                num_samples_test = 0
                for data_test in loader_test:
                    images_ts = data_test["data"].to(device)
                    labels_ts = torch.LongTensor(data_test["label"]).to(device)
                    outputs_ts = model.forward(images_ts)
                    loss_test_sum = criterion(outputs_ts, labels_ts).item()
                    losses_sum += loss_test_sum
                    num_samples_test += 1

                loss_test_avg = losses_sum / num_samples_test
                last_loss_test = loss_test_avg

            val_epoch_tr_loss = loss.item() / len(loader_train)
            losses_tr.append(val_epoch_tr_loss)
            losses_ts.append(loss_test_avg)
            del images_ts, labels_ts
            iteration += 1

        del images_tr, labels_tr
        gc.collect()
        model.train()
```

Is it correct? The training loss is calculated using the number of elements in the training set, and the test loss is also calculated using the number of training elements.

Thanks for the kind help!

In the transfer learning tutorial I also found this way of calculating the loss:

`running_loss += loss.item() * inputs.size(0)`

So I don’t understand which way is the correct one. Does `inputs.size(0)` correspond to the batch size that I use?

The `running_loss` calculation multiplies the averaged batch loss (`loss`) with the current batch size, and divides this sum by the total number of samples.

In your example you are summing the averaged batch losses and dividing by the number of batches. This might create an offset, if your last batch is smaller than the others.

Your code snippet looks alright, if you want to ignore the potential offset in the loss calculation.

However, `val_epoch_tr_loss` uses the loss of the last batch during training and divides it by the number of batches in the training `DataLoader`, which still seems to be wrong.
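If you want the exact per-sample average without that offset, the weighted accumulation from the tutorial could be applied to your training loop roughly like this. This is a sketch reusing the names from your snippet (`loader_train`, `model`, `criterion`, `optimizer`, `device`) and assuming the default `reduction='mean'`:

```
# exact per-sample average of the training loss for one epoch
running_loss_tr = 0.0
num_samples_tr = 0

for j, data in enumerate(loader_train):
    images_tr = data["data"].to(device)
    labels_tr = torch.LongTensor(data["label"]).to(device)

    optimizer.zero_grad()
    outputs_tr = model(images_tr)
    loss = criterion(outputs_tr, labels_tr)  # averaged over the current batch
    loss.backward()
    optimizer.step()

    running_loss_tr += loss.item() * images_tr.size(0)  # undo the batch average
    num_samples_tr += images_tr.size(0)

mean_loss_train = running_loss_tr / num_samples_tr  # exact average over all samples
```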

Thanks for your help. So you suggest doing only this calculation:

```
losses_tr.append(loss.item())
print(
    "Train_loss:{:.4f} Test_loss {:.4f}".format(
        train_loss_t, last_loss_test
    )
)
```

In the usual use case you would compute the average of both the training and the test loss.

Your suggested approach of using:

```
loss = criterion(outputs_ts, labels_ts)
losses_sum += loss.item()
num_samples += 1
...
loss_test_avg = losses_sum / num_samples
```

would work for both losses. Although you might see a slight bias (as explained before), it could be “good enough”.

Alternatively, you could use the AverageMeter from the ImageNet example and update the loss as seen here.
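For reference, the `AverageMeter` from the ImageNet example looks roughly like this; you would then update it with the batch loss and the batch size after each iteration:

```
class AverageMeter(object):
    """Computes and stores the average and current value."""
    def __init__(self):
        self.reset()

    def reset(self):
        self.val = 0
        self.avg = 0
        self.sum = 0
        self.count = 0

    def update(self, val, n=1):
        self.val = val
        self.sum += val * n      # weight the value by the number of samples in the batch
        self.count += n
        self.avg = self.sum / self.count


# usage sketch inside the loop:
# losses = AverageMeter()
# losses.update(loss.item(), images_tr.size(0))
# losses.avg  -> running per-sample average of the loss
```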