Hi, I am trying out a new incremental learning method, which requires the following steps:
The paper is here: ActiveLink: Deep Active Learning for Link Prediction in Knowledge Graphs
(At each iteration i, there are i windows of data.)
- Temporarily update the model on the loss on Window i, starting from the current model parameters.
- Use these temporary parameters to accumulate the loss on Windows 0-i.
- This accumulated loss is then used to update the original parameters.
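To make sure I understand the procedure, here is a minimal, self-contained sketch of the three steps on a toy linear model with made-up "windows" of data (the model, optimizers, and window tensors are all hypothetical stand-ins, not from the paper):

```python
import copy
import torch

# Hypothetical toy setup: a linear model and three "windows" of data.
torch.manual_seed(0)
model = torch.nn.Linear(2, 1)
windows = [(torch.randn(4, 2), torch.randn(4, 1)) for _ in range(3)]
loss_fn = torch.nn.MSELoss()

inner_opt = torch.optim.SGD(model.parameters(), lr=0.1)   # temporary update
outer_opt = torch.optim.SGD(model.parameters(), lr=0.01)  # real update

i = 2  # current iteration / newest window

# Save the original parameters. A deep copy is needed: state_dict()
# returns references to the live tensors, which the in-place update
# below would otherwise mutate.
previous_param = copy.deepcopy(model.state_dict())

# Step 1: temporary update on Window i only.
x, y = windows[i]
inner_opt.zero_grad()
loss_fn(model(x), y).backward()
inner_opt.step()

# Step 2: accumulate gradients of the loss on Windows 0..i at the
# temporary parameters (.grad buffers sum across backward() calls).
outer_opt.zero_grad()
total_loss = 0.0
for x, y in windows[: i + 1]:
    loss = loss_fn(model(x), y)
    loss.backward()
    total_loss += loss.item()

# Step 3: restore the original parameters, then apply the accumulated
# gradients to them. load_state_dict copies values in place, so the
# optimizer's parameter references and the .grad buffers stay valid.
model.load_state_dict(previous_param)
outer_opt.step()
```

This is only my reading of the steps, not the paper's reference implementation.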
To achieve this, my code is below:
```python
import copy

model.train()
model.zero_grad()
feed_dict = data_iterator.windows[i]

# Deep copy: state_dict().copy() is only a shallow copy, so the saved
# tensors would otherwise be mutated by the in-place temporary update.
previous_param = copy.deepcopy(model.state_dict())

# Step 1: temporarily update the model on Window i.
_temporary_update(model, feed_dict, inner_optimizer)

def _closure():
    """Accumulate the loss on Windows 0..i using the temporary
    parameters, then restore the original parameters before the
    outer optimizer step."""
    model.zero_grad()
    total_loss = 0
    for window_dict in data_iterator.iter_from_list(i, window_limit):
        loss = model.loss(window_dict)
        loss.backward()
        total_loss += loss.item()
    model.load_state_dict(previous_param)
    return total_loss

# Step 3: update the original parameters with the accumulated gradients.
optimizer.step(closure=_closure)
```
My question is about the `_closure()` function. In Step 3, it actually updates the model with `total_loss`, which has been accumulated in the graph. Is this the correct way to do it? Will the optimizer update the model with the gradients accumulated across all the windows?
Thank you in advance.