How can I reduce the loss any further in a simple training loop?

Here is my simplified code:

```python
import torch
import torch.nn.functional as F
import numpy as np


def argmax(x, axis=-1):
    # One-hot encoding of the arg-max along `axis` (not differentiable)
    return F.one_hot(torch.argmax(x, dim=axis), list(x.shape)[axis]).float()


loss_function = torch.nn.MSELoss()


def optimization(V, U, E, beta, learning_rate):
    V = V.cuda()
    U = U.cuda()
    E = E.cuda()

    # Optimize a detached copy of V; it must be a leaf tensor that requires grad
    V_optim = V.detach().clone().requires_grad_(True)

    optimizer = torch.optim.Adam([V_optim], lr=learning_rate)

    num_epochs = 200
    for epoch in range(num_epochs):
        # Q_R is an (n, n) one-hot matrix selecting the best column in each row
        Q_R = argmax(U + beta * torch.transpose(V_optim, 0, 1))
        loss = loss_function(V_optim, torch.matmul(Q_R * U, E) + beta * torch.matmul(Q_R, V_optim))
        # Note: optimizer.zero_grad() is never called, so gradients accumulate across epochs
        loss.backward(retain_graph=True)
        optimizer.step()

        print(f"Epoch {epoch}, Loss: {loss}")

    return V_optim


if __name__ == "__main__":
    n = 100
    beta = 0.98
    alpha = 0.03
    delta = 1

    kss = ((1 / beta - (1 - delta)) / alpha) ** (1 / (alpha - 1))
    k = np.linspace(0.5 * kss, 1.4 * kss, n)

    k_reshaped = k.reshape(-1, 1)
    c = k_reshaped ** alpha + (1 - delta) * k_reshaped - k
    c[c < 0] = 1e-11
    c = np.log(c)

    # Constant initial guess; float() keeps the tensor in float32
    V = float(np.log(kss ** alpha - delta * kss) / (1 - beta)) * torch.ones(n, 1)
    U = torch.tensor(c, dtype=torch.float32)
    E = torch.ones(n, 1)

    learning_rate = 0.002

    optimal_V = optimization(V, U, E, beta, learning_rate)
    print(optimal_V)
```
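To make explicit what `Q_R` looks like, here is a small CPU-only check of the `argmax` helper. The 2×3 `scores` tensor is made up for illustration and is not part of the original run.

```python
import torch
import torch.nn.functional as F


def argmax(x, axis=-1):
    # Same helper as in the post: one-hot encoding of the arg-max along `axis`
    return F.one_hot(torch.argmax(x, dim=axis), list(x.shape)[axis]).float()


# Hypothetical toy input, chosen only to show the shape and values of the result
scores = torch.tensor([[0.1, 0.9, 0.3],
                       [2.0, -1.0, 0.5]], requires_grad=True)

selection = argmax(scores)
print(selection)                # one 1.0 per row, at the position of the row maximum
print(selection.requires_grad)  # False: torch.argmax returns integer indices
```

Since the one-hot matrix carries no gradient, `V_optim` receives gradients only through the terms where it appears explicitly in the loss, not through `Q_R`.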

With the code above, the (mean) loss after 200 iterations is:

```
Epoch 199, Loss: 3.335637765999877e-09
```

However, the largest single loss element in the last iteration is:

```
tensor(0.0006, device='cuda:0', grad_fn=<MaxBackward1>)
```
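The post does not show how the largest element was computed, so the following is only a sketch of one possible way to get both kinds of number from the same pair of tensors. The helper name `loss_summary` and the toy values are assumptions for illustration. `reduction="none"` keeps one squared error per element instead of averaging, and the absolute residual is included because a per-element figure may be either squared or absolute.

```python
import torch
import torch.nn.functional as F


def loss_summary(prediction, target):
    """Return (mean squared error, largest squared error, largest absolute error)."""
    # One squared error per element, without averaging
    per_element = F.mse_loss(prediction, target, reduction="none")
    # Largest absolute residual, for comparison with the squared figures
    max_abs = (prediction - target).abs().max()
    return per_element.mean(), per_element.max(), max_abs


# Toy tensors (made up, not the values from the run above)
prediction = torch.tensor([[1.0], [2.0], [3.0]])
target = torch.tensor([[1.0], [2.0], [5.0]])

mean_sq, max_sq, max_abs = loss_summary(prediction, target)
print(mean_sq.item(), max_sq.item(), max_abs.item())
```

In the training loop, `prediction` would correspond to `V_optim` and `target` to the right-hand side passed to `loss_function`. Reporting the mean and the maximum from the same per-element tensor makes the two figures directly comparable.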

Based on this simple training run, I have two questions:

(1) Is there any way to reduce the loss any further, especially the largest single loss element?

(2) Is this a problem with the choice of loss function, or with the format of the inputs?

Many thanks.