I am trying to train a weight matrix, but I am having a hard time when the matrix is larger (say 500x500 or more). I have the following equation:
x(t + dt) = x(t) * (1 - dt) + sigma(W * x(t) - u) * dt
Here, dt and u are constants (u is called mu in the code below), W is an nxn matrix that I want to train, and x(t) is an nx1 vector. sigma is the sigmoid function, applied element-wise. For context: I want to train a matrix W (initially random) so that it converges to a specific target matrix. To compute the value x(t + dt), I need the value x(t). My code converges close to the target matrix for small sizes, but with a bigger matrix I am unable to obtain the correct matrix. Here's my code:
I begin by creating a random matrix of size nxn:
```python
W = torch.randn(n, n, dtype=torch.float32, requires_grad=True)
W = nn.Parameter(W)
```
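Since the trouble only shows up at larger n, I also experimented with scaling the initial weights (this scaling is my own guess, not part of the setup above), so that the entries of W * x stay roughly O(1) and the sigmoid does not saturate:

```python
import math
import torch
import torch.nn as nn

n = 500
# Hypothetical variant: divide by sqrt(n) so each entry of W @ x has
# roughly unit variance regardless of n, keeping the sigmoid inputs in
# its responsive range instead of the saturated tails.
W = nn.Parameter(torch.randn(n, n, dtype=torch.float32) / math.sqrt(n))
```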
Then I set up my model:

```python
model = RNNModel(W=W, dt=dt, mu=mu)
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
criterion = nn.MSELoss()
```
RNNModel is a class that I have created:

```python
class RNNModel(nn.Module):
    def __init__(self, W, dt, mu):
        super(RNNModel, self).__init__()
        self.W = W
        self.dt = dt
        self.mu = mu

    def forward(self, x):
        return x * (1 - self.dt) + torch.sigmoid(torch.matmul(self.W, x) - self.mu) * self.dt
```
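To convince myself the forward method matches the equation, I checked one step on a tiny, self-contained example (the values of n, dt and mu here are placeholders, not my real settings):

```python
import torch
import torch.nn as nn

class RNNModel(nn.Module):
    def __init__(self, W, dt, mu):
        super().__init__()
        self.W = W
        self.dt = dt
        self.mu = mu

    def forward(self, x):
        return x * (1 - self.dt) + torch.sigmoid(torch.matmul(self.W, x) - self.mu) * self.dt

torch.manual_seed(0)
n, dt, mu = 3, 0.1, 0.5           # placeholder values
W = nn.Parameter(torch.randn(n, n))
model = RNNModel(W, dt, mu)

x = torch.randn(n, 1)             # state x(t)
x_next = model(x)                 # state x(t + dt)

# The update preserves the shape, so it can be iterated in time.
assert x_next.shape == (n, 1)
```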
Then I do the training:

```python
# Training
for epoch in range(n_iters):
    # Forward pass: call the model to find x_pred
    x_pred = model(x)
    # Loss
    loss = criterion(x, x_pred)
    # Gradient
    loss.backward()
    # Update the weights
    optimizer.step()
    optimizer.zero_grad()
```
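For reference, here is a self-contained toy version of the whole loop. The target dynamics (a fixed W_target generating the "true" next state) and the way I build the training pairs are assumptions just to make it runnable, not my actual data:

```python
import torch
import torch.nn as nn

class RNNModel(nn.Module):
    def __init__(self, W, dt, mu):
        super().__init__()
        self.W = W
        self.dt = dt
        self.mu = mu

    def forward(self, x):
        return x * (1 - self.dt) + torch.sigmoid(torch.matmul(self.W, x) - self.mu) * self.dt

torch.manual_seed(0)
n, dt, mu = 10, 0.1, 0.5                # placeholder values
n_iters, learning_rate = 200, 0.5

W_target = torch.randn(n, n)            # hypothetical matrix to recover

W = nn.Parameter(torch.randn(n, n))
model = RNNModel(W, dt, mu)
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
criterion = nn.MSELoss()

for epoch in range(n_iters):
    x = torch.randn(n, 1)               # state at time t
    with torch.no_grad():               # "true" next state from W_target
        x_next = x * (1 - dt) + torch.sigmoid(W_target @ x - mu) * dt
    x_pred = model(x)                   # predicted next state
    loss = criterion(x_pred, x_next)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```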
As I mentioned up top, I compute my value x_pred from x, and then I update my matrix W. Do you see a better way to do this, or do you have any tips to help me?
Thanks in advance!