Given a trained model M, I want to score the utility of each new (unseen) example in a pool for an active learning task. To do this, I need the magnitude (norm) of the gradient of the loss with respect to a layer’s weights for each example individually, i.e. the gradient that a single training step on that example alone would produce. In code, it is something like:

```
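losses, grads = [], []
for i in range(X_pool.shape[0]):
    # Forward pass on a single example (slicing keeps the batch dimension)
    pred = model(X_pool[i:i+1])
    loss = loss_func(pred, y_pool[i:i+1])
    # Clear gradients left over from the previous example
    model.zero_grad()
    loss.backward()
    # Store plain floats so the autograd graph is not kept alive
    losses.append(loss.item())
    grads.append(layer.weight.grad.norm().item())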
```

However, this is quite slow for a large pool, since it runs one forward and one backward pass per example, and it will sit in an inner loop in my scenario. How can I make this code more efficient?
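For reference, one possible direction is to compute per-example gradients in a single vectorized call with `torch.func` instead of looping. The following is only a minimal sketch, assuming PyTorch 2.x; the small model, the loss, the random pool, and the choice of the `"0.weight"` layer are hypothetical stand-ins for my actual setup, and I have not benchmarked it.

```python
import torch
import torch.nn as nn
from torch.func import functional_call, grad, vmap

torch.manual_seed(0)

# Hypothetical stand-ins for the model, loss, and pool in the question.
model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 3))
loss_func = nn.CrossEntropyLoss()
X_pool = torch.randn(32, 10)
y_pool = torch.randint(0, 3, (32,))

# Detached parameters keyed by name, as expected by functional_call.
params = {name: p.detach() for name, p in model.named_parameters()}

def sample_loss(params, x, y):
    # vmap removes the batch dimension, so add a batch of size 1 back.
    pred = functional_call(model, params, (x.unsqueeze(0),))
    return loss_func(pred, y.unsqueeze(0))

# grad differentiates w.r.t. the first argument (params);
# vmap maps over dim 0 of x and y while sharing the same params.
per_sample_grads = vmap(grad(sample_loss), in_dims=(None, 0, 0))(
    params, X_pool, y_pool
)

# "0.weight" is the first Linear layer in this toy model (an assumption);
# pick the entry that corresponds to the layer of interest.
g = per_sample_grads["0.weight"]                  # shape: (32, 16, 10)
grad_norms = g.flatten(start_dim=1).norm(dim=1)   # shape: (32,)
```

Here `grad_norms[i]` should match `layer.weight.grad.norm()` from iteration `i` of the loop above, up to floating-point error. The trade-off is memory: the per-example gradients for the whole pool are materialized at once, so a very large pool would probably need to be processed in chunks.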

I appreciate any suggestions or pointers. Thanks!