Training a weight matrix

S_I_R_U_S · June 1, 2022, 6:15pm

Hi,

I am trying to train a weight matrix, but I run into trouble when the matrix is large (say 500x500 or more). I have the following equation:

x(t+dt) = x(t) * (1-dt)+sigma(W * x(t) - u)*dt

Here, dt and u are constants (u is called mu in the code below). W is the n x n matrix that I want to train and x(t) is an n x 1 vector. sigma is the sigmoid function, applied to every element of the vector W * x(t) - u. For context, I want to train W (initially random) so that it converges to a specific target matrix. To find x(t+dt), I need the value of x(t). For small matrices, my code converges to a matrix close to the target. However, when working with a bigger matrix, I am unable to recover the correct matrix. Here's my code:

I begin by creating a random matrix of size nxn:

    import torch
    import torch.nn as nn
    # nn.Parameter sets requires_grad=True on its own, so the flag is not needed on randn
    W = nn.Parameter(torch.randn(n, n, dtype=torch.float32))

Then I set up my model:

    model = RNNModel(W=W, dt=dt, mu=mu)
    optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
    criterion = nn.MSELoss()

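RNNModel is a class that I have created:

    class RNNModel(nn.Module):
        def __init__(self, W, dt, mu):
            super().__init__()
            self.W = W      # trainable n x n matrix (an nn.Parameter)
            self.dt = dt    # time step
            self.mu = mu    # constant offset (u in the equation)

        def forward(self, x):
            # One step of the update: x(t+dt) = x(t)*(1-dt) + sigmoid(W x(t) - mu)*dt
            return x * (1 - self.dt) + torch.sigmoid(torch.matmul(self.W, x) - self.mu) * self.dt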
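
A minimal sketch of how this one-step update can be iterated to produce a whole trajectory, in case it helps to reproduce the setup. The helper names `step` and `rollout`, and the values of n, dt, mu and n_steps, are illustrative assumptions and not part of the original script.

```python
import torch

def step(W, x, dt, mu):
    # One update: x(t+dt) = x(t)*(1-dt) + sigmoid(W x(t) - mu)*dt
    return x * (1 - dt) + torch.sigmoid(W @ x - mu) * dt

def rollout(W, x0, dt, mu, n_steps):
    # Apply the update n_steps times and stack all states,
    # giving a tensor of shape (n_steps + 1, n, 1).
    xs = [x0]
    for _ in range(n_steps):
        xs.append(step(W, xs[-1], dt, mu))
    return torch.stack(xs)

torch.manual_seed(0)
n, dt, mu = 5, 0.1, 0.5      # illustrative values (assumptions)
W = torch.randn(n, n)        # random n x n matrix
x0 = torch.rand(n, 1)        # initial state with entries in [0, 1)
traj = rollout(W, x0, dt, mu, n_steps=20)
print(traj.shape)            # torch.Size([21, 5, 1])
```

Since each step is a weighted average of x(t) and a sigmoid output when 0 < dt < 1, a state that starts in [0, 1] stays in [0, 1].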

Then I do the training:

    # Training
    for epoch in range(n_iters):
        # Forward pass
        # I call my model to find x_pred
        # Loss
        loss = criterion(x, x_pred)

        # Gradient
        loss.backward()

        # Update the weights, then clear the gradients for the next iteration
        optimizer.step()
        optimizer.zero_grad()
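
For anyone who wants to run something end to end, here is a self-contained sketch of the same loop with the forward pass filled in. It rests on assumptions that are not in my actual script: the training data are random states x(t) stored as columns, the targets x(t+dt) come from a hypothetical known matrix `W_true`, and the values of n, dt, mu, the learning rate and the number of iterations are only illustrative.

```python
import torch
import torch.nn as nn

class RNNModel(nn.Module):
    def __init__(self, W, dt, mu):
        super().__init__()
        self.W = W
        self.dt = dt
        self.mu = mu

    def forward(self, x):
        # x(t+dt) = x(t)*(1-dt) + sigmoid(W x(t) - mu)*dt
        return x * (1 - self.dt) + torch.sigmoid(self.W @ x - self.mu) * self.dt

torch.manual_seed(0)
n, dt, mu = 10, 0.1, 0.5            # illustrative values (assumptions)
learning_rate, n_iters = 1.0, 200   # illustrative values (assumptions)

# Hypothetical data: 64 random states as columns, with one-step targets
# generated by a known "teacher" matrix W_true.
W_true = torch.randn(n, n)
x_t = torch.rand(n, 64)
with torch.no_grad():
    x_next = x_t * (1 - dt) + torch.sigmoid(W_true @ x_t - mu) * dt

model = RNNModel(W=nn.Parameter(torch.randn(n, n)), dt=dt, mu=mu)
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
criterion = nn.MSELoss()

losses = []
for epoch in range(n_iters):
    x_pred = model(x_t)                 # forward pass: predict x(t+dt) from x(t)
    loss = criterion(x_pred, x_next)    # compare with the teacher's next state
    optimizer.zero_grad()               # clear gradients from the previous step
    loss.backward()                     # gradient of the loss with respect to W
    optimizer.step()                    # update W
    losses.append(loss.item())

print(f"first loss {losses[0]:.3e}, last loss {losses[-1]:.3e}")
```

In this one-step setup the prediction error equals dt times the difference of the two sigmoid terms, so the MSE and its gradient both scale with dt squared. With a small dt the gradients under plain SGD are therefore very small.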

As I mentioned above, I compute x_pred from x, and then I update my matrix W. Is there a better way to do this, or do you have any tips to help me?

Thanks in advance!