I’m trying to figure out how to run a tensor through a function, do some math inside that function, and compare the output to a predetermined “fitness goal” in the loss function. I started with a very simple example, but have already run into issues.
This example is a bread-shopping simulation where every 3rd day bread goes on sale for 90% off. The input and output values just determine what percentage of your budget to spend on bread each day. Ideally the output should end up as [0, 0, 1, 0, 0, 1, 0, 0, 1] and so forth.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

def buybread(args):
    budget = 1000
    totalbread = 0
    for i in range(len(args)):
        if (i + 1) % 3 == 0:
            cost = 1   # discount day
        else:
            cost = 10
        spend = budget
        bought = int((args[i] * spend) / cost)
        totalbread += bought
        budget -= bought * cost
    argsum = torch.sum(args)
    fitness = torch.add(argsum, totalbread)
    return fitness

class TwoLayerNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(21, 100)
        self.fc2 = nn.Linear(100, 21)

    def forward(self, x):
        x = F.sigmoid(self.fc1(x))
        x = F.sigmoid(self.fc2(x))
        return x

m = TwoLayerNet()
loss_fn = nn.MSELoss()
optimizer = optim.Adam(m.parameters(), lr=0.01)
training_epochs = 30
fitnessgoal = torch.tensor(1000)
blanktensor = torch.zeros([1, 21])

for i in range(training_epochs):
    fpass = m(blanktensor)
    # flatten the [1, 21] output to a 1-D tensor of 21 daily fractions
    fitness = buybread(fpass.squeeze(0))
    loss = loss_fn(fitness, fitnessgoal.float())
    print(i, fitness, loss)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```
Ideally it should learn to buy bread only on the discount days, which is every 3 days when a loaf costs 10x less. If played perfectly, the optimal result is buying 1,000 loaves for $1 each.
Instead, it converges on 129 loaves, no matter what I set the fitness goal to. Oftentimes it will buy 260 or more loaves on the first epoch, just by virtue of randomness, but usually settles on 129 within ten epochs.
I assume that what’s actually happening is that the variable totalbread is being disregarded entirely, and PyTorch is optimizing for the highest value of argsum. When I add print(fpass) to the loop, I generally get something like this on the last epoch:
tensor([[0.9997, 0.9997, 0.9996, 0.9996, 0.9996, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9997, 0.9996, 0.9996, 0.9996, 0.9996, 0.9996, 0.9996, 0.9996, 0.9996, 0.9996, 0.9996]], grad_fn=<SigmoidBackward>)
So it’s trying to spend almost the maximum allowed budget every day, including day 0 where most of it gets used up.
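A minimal sketch of what I think is going on (my own reduction, not the code above): int() converts a tensor into a plain Python number, which detaches it from the autograd graph, so anything built from it contributes nothing to the gradient.

```python
import torch

# One parameter standing in for one day's spending fraction
x = torch.tensor([0.5], requires_grad=True)

bought = int(x * 100)    # plain Python int: no grad_fn, no history
argsum = torch.sum(x)    # still attached to the graph

fitness = torch.add(argsum, bought)  # gradient flows only through argsum
fitness.backward()

print(x.grad)  # tensor([1.]) -- the gradient of argsum alone; 'bought' is invisible
```

If that holds, the optimizer can only see the argsum term, which would explain why it pushes every output toward 1.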
I had previously tried simply setting fitness = totalbread, but that ran into all sorts of problems with totalbread being the wrong type of variable (a plain Python int rather than a tensor), and eventually the loss not changing at all.
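For what it’s worth, my current guess at a workaround (names and details are my own, untested beyond this sketch) is to drop the int() cast so bought stays a tensor, letting totalbread carry gradient history back to every day’s spending fraction:

```python
import torch

# Hypothetical differentiable rewrite of buybread: fractional loaves,
# but every operation stays inside the autograd graph.
def buybread_soft(args):
    budget = torch.tensor(1000.0)
    totalbread = torch.tensor(0.0)
    for i in range(len(args)):
        cost = 1.0 if (i + 1) % 3 == 0 else 10.0
        bought = (args[i] * budget) / cost   # tensor, not int
        totalbread = totalbread + bought
        budget = budget - bought * cost
    return totalbread

args = torch.full((9,), 0.5, requires_grad=True)
total = buybread_soft(args)
total.backward()
print(args.grad)  # non-None: every day now receives a gradient signal
```

I don’t know if the fractional-loaf relaxation is acceptable, but it at least keeps the graph intact.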