PyTorch to skorch: zero_grad()

I’m looking to migrate an embedding model from pure PyTorch to skorch so I can grid search for the best hyperparameters. In PyTorch, I set the embedding size, learning rate, number of epochs, and batch size, and I set the skorch model up to do the same. Both models run and produce embeddings, but the embeddings they produce are noticeably different.
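
For context, the skorch side is set up roughly like this (a simplified sketch; EmbeddingNet, the loss, and the grid values are placeholders rather than my actual model):

import torch
import torch.nn as nn
from skorch import NeuralNet
from sklearn.model_selection import GridSearchCV

device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Placeholder module standing in for my real embedding model
class EmbeddingNet(nn.Module):
    def __init__(self, num_items=1000, embedding_dim=32):
        super().__init__()
        self.embedding = nn.Embedding(num_items, embedding_dim)
        self.out = nn.Linear(embedding_dim, 1)

    def forward(self, x):
        # x: LongTensor of ids, shape (batch, seq_len)
        return self.out(self.embedding(x).mean(dim=1))

net = NeuralNet(
    EmbeddingNet,
    criterion=nn.MSELoss,
    optimizer=torch.optim.Adam,
    lr=0.01,
    max_epochs=10,
    batch_size=64,
    device=device,
)

# module__* prefixes route parameters to EmbeddingNet.__init__ during the search
params = {
    'module__embedding_dim': [16, 32, 64],
    'lr': [0.001, 0.01],
    'max_epochs': [10, 20],
    'batch_size': [32, 64],
}
gs = GridSearchCV(net, params, cv=3, scoring='neg_mean_squared_error')
# gs.fit(X, y)  # X: LongTensor of ids, y: targets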

Going back through the loop versus the skorch model, the only difference I can come up with is that the pure PyTorch version explicitly resets the gradients to zero at the start of each batch. I don’t think skorch is doing this, and I’m wondering if there is a way to implement it.

def fit(iterator, model, optimizer, criterion):
    train_loss = 0.0
    for x, y in iterator:
        optimizer.zero_grad()                     # reset gradients left over from the previous batch
        y_hat = model(x.to(device))
        loss = criterion(y_hat, y.to(device))
        train_loss += loss.item() * x.shape[0]    # accumulate summed loss over the batch
        loss.backward()
        optimizer.step()
    return train_loss / len(iterator.dataset)     # mean loss per sample
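
If skorch really weren’t resetting the gradients, my understanding from the docs is that I could subclass NeuralNet and override train_step to do it explicitly, something like this (an untested sketch; it drops the closure that skorch normally passes to optimizer.step(), so it only suits plain optimizers like SGD or Adam):

from skorch import NeuralNet

class ZeroGradNet(NeuralNet):                      # name is just for illustration
    def train_step(self, batch, **fit_params):
        Xi, yi = batch
        self.module_.train()                       # make sure dropout/batchnorm are in train mode
        self.optimizer_.zero_grad()                # explicit reset, same as the pure PyTorch loop
        y_pred = self.infer(Xi, **fit_params)
        loss = self.get_loss(y_pred, yi, X=Xi, training=True)
        loss.backward()
        self.optimizer_.step()
        return {'loss': loss, 'y_pred': y_pred}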


It seems skorch does zero_grad() as well: looking at its fit loop, every batch goes through train_step, and that is where the gradients get reset.
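
As far as I can tell, the per-batch logic boils down to something like this (my own paraphrase of train_step, not the actual skorch source):

# Rough paraphrase of skorch's per-batch training step (not the real source)
def train_step(net, batch):
    Xi, yi = batch
    def step_fn():
        net.optimizer_.zero_grad()                 # gradients are reset here on every batch
        y_pred = net.infer(Xi)
        loss = net.get_loss(y_pred, yi, X=Xi, training=True)
        loss.backward()
        return loss
    net.optimizer_.step(step_fn)                   # closure form, so closure-based optimizers like LBFGS also work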