I’m trying to figure out how to run a tensor through a function, do some math inside that function, and compare the result to a predetermined “fitness goal” in the loss function. I started with a very simple example, but have already run into issues.
The example is a bread shopping simulation where every third day bread goes on sale for 90% off. Each output value determines what percentage of the remaining budget to spend on bread that day. Ideally the output should converge to [0, 0, 1, 0, 0, 1, 0, 0, 1] and so forth.
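For reference, the best possible outcome under those rules can be sanity-checked with a quick standalone simulation (a sketch assuming a 21-day horizon, matching the network’s output size below):

```python
# Simulate the ideal spending pattern [0, 0, 1, 0, 0, 1, ...]:
# spend nothing at full price, spend everything on sale days.
budget = 1000
total = 0
for i in range(21):
    cost = 1 if (i + 1) % 3 == 0 else 10   # every third day is 90% off
    frac = 1.0 if (i + 1) % 3 == 0 else 0.0  # buy only on sale days
    bought = int(frac * budget / cost)
    total += bought
    budget -= bought * cost
print(total)  # 1000: the entire budget goes into $1 loaves on the first sale day
```

All 1,000 loaves get bought on day 2 (the first sale day), after which the budget is exhausted, so the later sale days contribute nothing.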
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

def buybread(args):
    budget = 1000
    cost = 10
    totalbread = 0
    days = range(len(args[0]))
    for i in days:
        # every third day bread is 90% off
        if (i + 1) % 3 == 0:
            cost = 1
        else:
            cost = 10
        spend = budget
        bought = int((args[0][i] * spend) / cost)
        totalbread += bought
        budget -= bought * cost
    argsum = torch.sum(args)
    fitness = torch.add(argsum, totalbread)
    return fitness

class TwoLayerNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(21, 100)
        self.fc2 = nn.Linear(100, 21)

    def forward(self, x):
        x = F.sigmoid(self.fc1(x))
        x = F.sigmoid(self.fc2(x))
        return x

m = TwoLayerNet()
loss_fn = nn.MSELoss()
optimizer = optim.Adam(m.parameters(), lr=0.01)
training_epochs = 30
fitnessgoal = torch.tensor(1000)
blanktensor = torch.zeros([1, 21])

for i in range(training_epochs):
    fpass = m(blanktensor)
    fitness = buybread(fpass)
    loss = loss_fn(fitness, fitnessgoal.float())
    print(i, fitness, loss)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
Ideally it should learn to buy bread only on the discount days, i.e. every third day, when a loaf costs a tenth as much. Played perfectly, the optimal result is buying 1,000 loaves at $1 each.
Instead, it converges on 129 loaves, no matter what I set the fitness goal to. Oftentimes it will buy 260 or more loaves on the first epoch, just by virtue of randomness, but usually settles on 129 within ten epochs.
I assume that what’s actually happening is that the variable totalbread is being disregarded entirely, and PyTorch is optimizing for the highest value of argsum. When I add print(fpass) to the loop, I generally get something like this for the last epoch:
tensor([[0.9997, 0.9997, 0.9996, 0.9996, 0.9996, 0.9997, 0.9997, 0.9997, 0.9997,
0.9997, 0.9997, 0.9996, 0.9996, 0.9996, 0.9996, 0.9996, 0.9996, 0.9996,
0.9996, 0.9996, 0.9996]], grad_fn=<SigmoidBackward>)
So it’s trying to spend almost the maximum allowed budget every day, including day 0, where most of the budget gets used up at full price.
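Feeding a near-saturated output straight into the fitness function reproduces the number I keep converging to (a sketch that restates buybread from above with an all-0.9997 input, matching the printout):

```python
import torch

# Same logic as buybread above, restated here so the sketch is self-contained.
def buybread(args):
    budget = 1000
    totalbread = 0
    for i in range(len(args[0])):
        cost = 1 if (i + 1) % 3 == 0 else 10
        bought = int((args[0][i] * budget) / cost)
        totalbread += bought
        budget -= bought * cost
    return torch.add(torch.sum(args), totalbread)

near_ones = torch.full((1, 21), 0.9997)  # roughly what the saturated sigmoid emits
print(buybread(near_ones))  # ~129: totalbread = 108 loaves plus argsum of about 21
```

Spending everything on day 0 buys 99 loaves at $10, leaving only $10 for the rest of the run, so totalbread lands at 108 and the remaining ~21 of the fitness value is just the sum of the outputs.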
I had previously tried simply setting fitness = totalbread, but that ran into all sorts of problems with totalbread being the wrong type of variable (it’s a plain Python int, not a tensor), and eventually the loss stopped changing at all.
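A minimal sketch seems to confirm my suspicion that the int() conversion is what detaches totalbread from the graph — autograd can’t track plain Python numbers, so totalbread acts as a constant during backward():

```python
import torch

x = torch.tensor([0.5], requires_grad=True)
bought = int(x[0] * 100)   # plain Python int: no longer part of the autograd graph
y = torch.sum(x) + bought  # `bought` behaves like the constant 50 here
y.backward()
print(x.grad)  # tensor([1.]): the gradient comes only from torch.sum(x)
```

The gradient with respect to x is exactly 1 (the derivative of torch.sum(x)); the 50 loaves bought contribute nothing, which would explain why training only pushes argsum up.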