I have a feed-forward NN that I want to train several times and then keep the best model.
To do that, I am doing something like this:
# Read data and store it in a Variable.
# Define model and train it.
for n in times_to_train:
    model = define_model()
    train()
My problem is that the performance is quite different when I use this loop compared with training one model at a time (that is, running the code several times instead of using the for loop).
My question is: should I clear any variables or reset something before training inside the loop?
What have you defined in train()?
If you are creating a new model, you should also create a new optimizer.
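To illustrate the point about the optimizer, here is a minimal sketch of a "train several times, keep the best" loop in which every run builds both a new model and a new optimizer, so no optimizer state carries over between runs. The network in `define_model`, the helper `train_once`, the synthetic data, and the hyperparameters are all assumptions made for the example; none of them come from the original post.

```python
import copy
import torch

def define_model():
    # Hypothetical small feed-forward net; the real architecture is not shown in the thread.
    return torch.nn.Sequential(
        torch.nn.Linear(1, 8), torch.nn.Tanh(), torch.nn.Linear(8, 1)
    )

def train_once(x_train, y_train, x_val, y_val, max_it=50):
    model = define_model()                       # fresh weights for this run
    opt = torch.optim.Rprop(model.parameters())  # fresh optimizer state (default lr, an assumption)
    loss_fn = torch.nn.MSELoss()
    for _ in range(max_it):
        model.train()
        opt.zero_grad()
        loss_fn(model(x_train), y_train).backward()
        opt.step()
    model.eval()
    with torch.no_grad():
        val_loss = loss_fn(model(x_val), y_val).item()
    return model, val_loss

# Synthetic regression data, split into interleaved train/validation points.
x = torch.linspace(-1, 1, 40).unsqueeze(1)
y = 2 * x + 1
x_train, y_train = x[::2], y[::2]
x_val, y_val = x[1::2], y[1::2]

best_loss, best_state, losses = float("inf"), None, []
for run in range(3):
    model, val_loss = train_once(x_train, y_train, x_val, y_val)
    losses.append(val_loss)
    if val_loss < best_loss:
        best_loss = val_loss
        # Copy the weights so later runs cannot overwrite the stored best.
        best_state = copy.deepcopy(model.state_dict())

# Rebuild the best model from its saved weights.
best_model = define_model()
best_model.load_state_dict(best_state)
best_model.eval()
```

Keeping only the `state_dict` of the best run, rather than the model object, is one possible design choice; it keeps the selection step independent of whatever the next run does.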
I don’t understand the issue completely.
If you use the for loop you’ll get several different models.
Now if you unroll the loop and train several models, you get a completely different result?
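One way to make such a comparison controlled is to fix the random seed right before each model is built, so that a run inside the loop and a separately executed run start from the same initial weights. The sketch below only looks at initialization; `define_model` and `init_weights` are hypothetical stand-ins, not code from the thread.

```python
import torch

def define_model():
    # Hypothetical stand-in for the real model definition.
    return torch.nn.Linear(4, 1)

def init_weights(seed):
    # Seed PyTorch's global RNG, then build the model and copy its initial weights.
    torch.manual_seed(seed)
    return define_model().weight.detach().clone()

# Same seed: the two models start from identical weights.
same_a = init_weights(0)
same_b = init_weights(0)

# Different seed: a different starting point.
other = init_weights(1)

# Seeding once and building two models in a row: the second model draws
# later values from the same random stream, so it does not match the first.
torch.manual_seed(0)
first_in_loop = define_model().weight.detach().clone()
second_in_loop = define_model().weight.detach().clone()

print(torch.equal(same_a, same_b))
print(torch.equal(same_a, first_in_loop))
print(torch.equal(first_in_loop, second_in_loop))
```

If the looped and the separate runs still diverge with matching seeds, the difference is more likely to come from the data or the training procedure than from initialization.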
I have just checked what you suggested and used two data sets for comparison. The very different training and validation errors appear for only one of the sets, the one with fewer data points, so I guess the problem is with my data.
Thank you very much for the answer.
I am also posting the code of train() in case there is something wrong with it.
import torch

def train(x_train, y_train, x_val, y_val, model, max_it):
    loss = torch.nn.MSELoss(reduction="mean")  # mean squared error averaged over all elements
    opt = torch.optim.Rprop(model.parameters(), lr=0.5)  # optimizer is created anew on every call
    for epoch in range(max_it):
        tr_loss = train_epoch(model, x_train, y_train, loss, opt)
        v_loss = val_epoch(model, x_val, y_val, loss)
        # some clauses for exiting if overfitting or convergence is reached.

def train_epoch(model, x, y, loss, opt):
    model.train()  # enable training-mode behavior (e.g. dropout)
    y_pred = model(x)
    loss_tr = loss(y_pred, y)
    opt.zero_grad()  # clear gradients left over from the previous step
    loss_tr.backward()
    opt.step()
    return loss_tr.item()
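
def val_epoch(model, x, y, loss):
    model.eval()  # switch layers such as dropout to inference behavior
    with torch.no_grad():  # gradients are not needed for validation
        y_pred = model(x)
        loss_val = loss(y_pred, y)
    return loss_val.item()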