How do optimizers in torch.optim work? How do they locate the parameters?

import torch
import torch.nn as nn

class LinearModel:

    def __init__(self, train_x, train_y):
        self.train_x = train_x
        self.train_y = train_y
        # Leaf tensors with requires_grad=True act as trainable parameters
        self.W = torch.tensor([0.5], requires_grad=True)
        self.b = torch.tensor([0.5], requires_grad=True)
        self.params = [self.W, self.b]
    
    def forward(self, x):
        self.params = [self.W, self.b]  # redundant: already set in __init__
        return self.W * x + self.b
    
    def train(self):
        criterion = nn.MSELoss()
        optimizer = torch.optim.SGD(self.params, lr=0.01)
        for i in range(10):
            print(self.W, self.b)
            optimizer.zero_grad()                # clear old gradients
            y_pred = self.forward(self.train_x)
            loss = criterion(y_pred, self.train_y)
            loss.backward()                      # fill .grad for W and b
            optimizer.step()                     # update W and b in place

train_x = torch.Tensor([0,1,2,3])
train_y = torch.Tensor([1,3,5,7])

model = LinearModel(train_x, train_y)
model.train()
This prints:

tensor([0.5000], requires_grad=True) tensor([0.5000], requires_grad=True)
tensor([0.6200], requires_grad=True) tensor([0.5550], requires_grad=True)
tensor([0.7300], requires_grad=True) tensor([0.6053], requires_grad=True)
tensor([0.8307], requires_grad=True) tensor([0.6513], requires_grad=True)
tensor([0.9230], requires_grad=True) tensor([0.6933], requires_grad=True)
tensor([1.0076], requires_grad=True) tensor([0.7318], requires_grad=True)
tensor([1.0851], requires_grad=True) tensor([0.7669], requires_grad=True)
tensor([1.1561], requires_grad=True) tensor([0.7990], requires_grad=True)
tensor([1.2212], requires_grad=True) tensor([0.8284], requires_grad=True)
tensor([1.2809], requires_grad=True) tensor([0.8552], requires_grad=True)

This is the code and output from my simple linear model. My question is: why do the values of self.W and self.b get updated?

Only the values of self.W and self.b are put into self.params, so how can the optimizer locate self.W and self.b to update them?

You are passing self.params to the optimizer in:

optimizer = torch.optim.SGD(self.params, lr=0.01)

which is why they get updated. The key point is that the list holds references, not copies: self.params[0] is the very same tensor object as self.W, so the optimizer operates on the model's tensors directly. loss.backward() writes each tensor's gradient into its .grad attribute, and optimizer.step() then uses that .grad to modify the tensor in place.
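
Conceptually, step() just loops over the tensors it was handed and nudges each one in place using its .grad attribute. Here is a minimal sketch of that update rule, ignoring momentum, weight decay, and the other options the real torch.optim.SGD supports:

import torch

def sgd_step(params, lr):
    # `params` holds references to the model's tensors, not copies,
    # so these in-place updates are visible through self.W and self.b
    with torch.no_grad():
        for p in params:
            if p.grad is not None:
                p -= lr * p.grad

W = torch.tensor([0.5], requires_grad=True)
b = torch.tensor([0.5], requires_grad=True)
params = [W, b]
print(params[0] is W)  # True: the list and the attribute reference the same tensor

loss = ((W * 2 + b) - 5.0) ** 2  # toy single-element loss
loss.backward()
sgd_step(params, lr=0.01)
print(W, b)  # both tensors changed in place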

As a side note: you could create the trainable parameters via nn.Parameter, which would properly register them inside an nn.Module, and then pass all of them to the optimizer via model.parameters(), which avoids building the list manually.
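
Something like this, adapted from your code (a minimal sketch; the initial values and hyperparameters are just your originals):

import torch
import torch.nn as nn

class LinearModel(nn.Module):

    def __init__(self):
        super().__init__()
        # nn.Parameter registers the tensors with the module,
        # so model.parameters() finds them automatically
        self.W = nn.Parameter(torch.tensor([0.5]))
        self.b = nn.Parameter(torch.tensor([0.5]))

    def forward(self, x):
        return self.W * x + self.b

train_x = torch.tensor([0., 1., 2., 3.])
train_y = torch.tensor([1., 3., 5., 7.])

model = LinearModel()
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for _ in range(10):
    optimizer.zero_grad()
    loss = criterion(model(train_x), train_y)
    loss.backward()
    optimizer.step()

The training loop lives outside the class here on purpose: nn.Module already defines a train() method that toggles training mode, so it is best not to override it with a custom loop. Registering the parameters this way also means they are moved by model.to(device) and included in model.state_dict() automatically.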