I am trying to train a weight matrix, but I am having a hard time when the matrix is larger (say 500x500 or more). I have the following equation:
x(t + dt) = x(t) * (1 - dt) + sigma(W * x(t) - u) * dt
Here, dt and u are constants (u is called mu in the code below), W is an nxn matrix that I want to train, and x(t) is an nx1 vector. sigma is the sigmoid function, applied element-wise. For context: I want to train a matrix W (initially random) so that it converges to a specific target matrix. To compute the value x(t + dt), I need the value x(t). My code converges close to the target matrix for small sizes, but with a bigger matrix I am unable to obtain the correct matrix. Here's my code:
I begin by creating a random matrix of size nxn:
```python
W = torch.randn(n, n, dtype=torch.float32, requires_grad=True)
W = nn.Parameter(W)
```
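Since the trouble only shows up at larger n, I also experimented with scaling the initial weights (this scaling is my own guess, not part of the setup above), so that the entries of W * x stay roughly O(1) and the sigmoid does not saturate:

```python
import math
import torch
import torch.nn as nn

n = 500
# Hypothetical variant: divide by sqrt(n) so each entry of W @ x has
# roughly unit variance regardless of n, keeping the sigmoid inputs in
# its responsive range instead of the saturated tails.
W = nn.Parameter(torch.randn(n, n, dtype=torch.float32) / math.sqrt(n))
```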
Then I set up my model:

```python
model = RNNModel(W=W, dt=dt, mu=mu)
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
criterion = nn.MSELoss()
```
RNNModel is a class that I have created:

```python
class RNNModel(nn.Module):
    def __init__(self, W, dt, mu):
        super(RNNModel, self).__init__()
        self.W = W
        self.dt = dt
        self.mu = mu

    def forward(self, x):
        return x * (1 - self.dt) + torch.sigmoid(torch.matmul(self.W, x) - self.mu) * self.dt
```
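To convince myself the forward method matches the equation, I checked one step on a tiny, self-contained example (the values of n, dt and mu here are placeholders, not my real settings):

```python
import torch
import torch.nn as nn

class RNNModel(nn.Module):
    def __init__(self, W, dt, mu):
        super().__init__()
        self.W = W
        self.dt = dt
        self.mu = mu

    def forward(self, x):
        return x * (1 - self.dt) + torch.sigmoid(torch.matmul(self.W, x) - self.mu) * self.dt

torch.manual_seed(0)
n, dt, mu = 3, 0.1, 0.5           # placeholder values
W = nn.Parameter(torch.randn(n, n))
model = RNNModel(W, dt, mu)

x = torch.randn(n, 1)             # state x(t)
x_next = model(x)                 # state x(t + dt)

# The update preserves the shape, so it can be iterated in time.
assert x_next.shape == (n, 1)
```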
Then I do the training:

```python
# Training
for epoch in range(n_iters):
    # Forward pass: call the model to find x_pred
    x_pred = model(x)
    # Loss
    loss = criterion(x, x_pred)
    # Gradient
    loss.backward()
    # Update the weights
    optimizer.step()
    optimizer.zero_grad()
```
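For reference, here is a self-contained toy version of the whole loop. The target dynamics (a fixed W_target generating the "true" next state) and the way I build the training pairs are assumptions just to make it runnable, not my actual data:

```python
import torch
import torch.nn as nn

class RNNModel(nn.Module):
    def __init__(self, W, dt, mu):
        super().__init__()
        self.W = W
        self.dt = dt
        self.mu = mu

    def forward(self, x):
        return x * (1 - self.dt) + torch.sigmoid(torch.matmul(self.W, x) - self.mu) * self.dt

torch.manual_seed(0)
n, dt, mu = 10, 0.1, 0.5                # placeholder values
n_iters, learning_rate = 200, 0.5

W_target = torch.randn(n, n)            # hypothetical matrix to recover

W = nn.Parameter(torch.randn(n, n))
model = RNNModel(W, dt, mu)
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
criterion = nn.MSELoss()

for epoch in range(n_iters):
    x = torch.randn(n, 1)               # state at time t
    with torch.no_grad():               # "true" next state from W_target
        x_next = x * (1 - dt) + torch.sigmoid(W_target @ x - mu) * dt
    x_pred = model(x)                   # predicted next state
    loss = criterion(x_pred, x_next)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```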
As I mentioned up top, I compute my value x_pred from x, and then I update my matrix W. Do you see a better way to do this, or do you have any tips to help me?
Thanks in advance!