Hi,
I’ve been struggling with collecting many different metrics in PyTorch train/val loops. I always end up with a lot of boilerplate that looks like this:
```python
for epoch in range(num_epochs):
    metric1_train = []
    metric2_train = []
    metric3_train = []
    ...
    model.train()
    for batch in train_loader:
        ...
        metric1_train.append(...)
        metric2_train.append(...)
        metric3_train.append(...)
        ...
    wandb.log({
        "train/metric1": np.mean(metric1_train),
        ...
    })

    model.eval()
    with torch.no_grad():
        metric1_val = []
        metric2_val = []
        metric3_val = []
        ...
        for batch in val_loader:
            ...
            metric1_val.append(...)
            metric2_val.append(...)
            metric3_val.append(...)
            ...
        wandb.log({
            "val/metric1": np.mean(metric1_val),
            ...
        })
```
I want to avoid repeating all of that code in the validation part. I tried tackling this in an OOP way, but it didn’t end up pretty. I’d also like it to be easy to add new metrics, which I couldn’t handle well either.
Is there a standard, best-practice way to do something like this that scales to tens of metrics?
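For context, the kind of abstraction I attempted looks roughly like this (the class and metric names are just placeholders I made up, not from any library):

```python
from collections import defaultdict

import numpy as np


class MetricTracker:
    """Accumulates per-batch metric values and reports epoch means."""

    def __init__(self, prefix):
        self.prefix = prefix  # e.g. "train" or "val"
        self.values = defaultdict(list)

    def update(self, **metrics):
        # Append one batch's worth of scalar metric values.
        for name, value in metrics.items():
            self.values[name].append(float(value))

    def means(self):
        # Return {"<prefix>/<name>": epoch mean}, suitable for wandb.log,
        # then reset the buffers for the next epoch.
        out = {
            f"{self.prefix}/{name}": float(np.mean(vals))
            for name, vals in self.values.items()
        }
        self.values.clear()
        return out
```

With that, each loop body shrinks to `tracker.update(loss=..., acc=...)` followed by `wandb.log(tracker.means())` after the epoch, but it still feels clunky once metrics need extra state (e.g. running confusion matrices), so I’m not sure it’s the right design.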
Thanks!