# Gradient seems too small when training parameters

Hi,

I am trying to train an n×n matrix in what looks like a simple scenario. However, I am struggling to obtain a well-trained matrix when n is large (say, above 30). The model is y = sigmoid(Wx), where W is an n×n matrix, x is the input vector (n×1) and y is the output vector (also n×1). The sigmoid is applied element-wise to Wx. With a small n, I have no problem converging back to the original matrix. With a larger n, PyTorch seems to have difficulty converging. Here is my simple class and my code:

```python
import numpy as np
import torch
import torch.nn as nn


class MyModel(nn.Module):

    def __init__(self, W):
        super(MyModel, self).__init__()
        self.W = W

    def forward(self, x):
        # y = sigmoid(W x), applied element-wise
        return torch.sigmoid(self.W @ x)
```

```python
# Initialise true values
x = np.random.rand(1000, 100)        # Input
W_true = np.random.randn(100, 100)   # True matrix
y = np.zeros((x.shape[0], x.shape[1]))
sigma = lambda x: (1 + np.exp(-x)) ** (-1)  # sigmoid

for i in range(1000):
    y[i] = sigma(W_true @ x[i])      # True output

# Set up
x = torch.tensor(x, dtype=torch.float32)
W_true = torch.tensor(W_true, dtype=torch.float32)
W = torch.randn(100, 100, dtype=torch.float32, requires_grad=True)
W = nn.Parameter(W)
y = torch.tensor(y, dtype=torch.float32)

learning_rate = 100

model = MyModel(W=W)
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
loss = nn.MSELoss()

n_iters = 300

for epoch in range(n_iters):
    # Forward pass
    y_pred = torch.zeros(x.shape[0], x.shape[1])
    for i in range(1000):
        y_pred[i] = model(x[i])

    # Loss
    l = loss(y, y_pred)

    # Backward pass and parameter update (assumed; the usual PyTorch steps)
    l.backward()
    optimizer.step()
    optimizer.zero_grad()
```
I tried playing with the learning rate (which needs to be quite high, which seems odd) and with momentum, but neither seems to improve the situation. Any ideas?
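
To check whether the gradient really does shrink as n grows, here is a minimal sketch that computes the gradient of the loss at a random initial W for a few sizes. It is only a diagnostic, not a fix. The helper name `grad_norm_at_init`, the seed and the sizes tried are choices made for illustration. It also swaps the per-sample loop for a single batched product `x @ W.T`, which should produce the same rows. Two things may be worth keeping in mind when reading the numbers: the mean-squared-error loss averages over all `n_samples * n` entries by default, and with these distributions the pre-activations Wx have a standard deviation of about sqrt(n/3), so for n = 100 many sigmoid units are probably saturated.

```python
import torch


def grad_norm_at_init(n, n_samples=1000, seed=0):
    """Return (loss, norm of dLoss/dW) at a random initial W for an n x n problem."""
    g = torch.Generator().manual_seed(seed)
    x = torch.rand(n_samples, n, generator=g)     # inputs in [0, 1)
    W_true = torch.randn(n, n, generator=g)       # target matrix
    # Batched forward pass: row i equals sigmoid(W_true @ x[i])
    y = torch.sigmoid(x @ W_true.T)

    W = torch.randn(n, n, generator=g, requires_grad=True)
    y_pred = torch.sigmoid(x @ W.T)
    # Default reduction is "mean": the sum of squared errors is divided by n_samples * n
    loss = torch.nn.functional.mse_loss(y_pred, y)
    loss.backward()
    return loss.item(), W.grad.norm().item()


for n in (5, 30, 100):
    loss_value, grad_norm = grad_norm_at_init(n)
    print(f"n={n:3d}  loss={loss_value:.4f}  grad norm={grad_norm:.2e}")
```

If the printed gradient norm drops sharply as n increases, that would be consistent with needing a very large learning rate for plain SGD.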