Hey everyone!
I’ve been experimenting with a custom training loop in PyTorch and hit a conceptual snag. I understand the basics, but I’d like a clearer picture of how the optimizer interacts with the gradients during training.
Here’s the basic structure I’m using:
```python
for data, target in loader:
    optimizer.zero_grad()             # reset gradients left over from the previous step
    output = model(data)              # forward pass
    loss = criterion(output, target)  # compute the loss
    loss.backward()                   # backprop: fills param.grad for every parameter
    optimizer.step()                  # update the weights using those gradients
```
What I’m curious about: does optimizer.step() immediately update the model weights using the gradients that loss.backward() computed? And if I modify the gradients in between (for example, by editing each param.grad directly), will optimizer.step() use those modified values?
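To make the second question concrete, here is a rough sketch of the kind of modification I mean. It slots into the loop above between loss.backward() and optimizer.step(), and the mul_(0.5) scaling is just a placeholder for whatever edit I might actually want to make:

```python
loss.backward()  # gradients now live in param.grad for each parameter

# hypothetical tweak: scale the gradients in place before the optimizer reads them
for param in model.parameters():
    if param.grad is not None:
        param.grad.mul_(0.5)  # placeholder: halve every gradient

optimizer.step()  # question: does this step see the modified values?
```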
I’ve come across some posts here and there but would appreciate insights from those who have built custom training logic in more complex workflows.
Also, I’m currently learning about MLOps and automation. If you have any advice or resource recommendations on becoming a DevOps engineer with a deep learning background, I’d be grateful!
Thanks in advance for your help!
Regards
pasiho