I built a deep neural network; let's call its class Net. I train it like so:
```python
import torch.nn as nn
import torch.optim as optim

net = Net()  # Net is my model class; epochs and minibatches are defined elsewhere
net.cuda()
loss = nn.BCEWithLogitsLoss()
optimizer = optim.Adamax(net.parameters(), lr=0.001, weight_decay=1.0)

for epoch in range(epochs):
    running_cost = 0
    for i, minibatch in enumerate(minibatches):
        X, Y = minibatch
        X, Y = X.cuda(), Y.cuda()
        optimizer.zero_grad()  # clear gradients left over from the previous step
        out = net(X)
        cost = loss(out, Y)
        cost.backward()
        optimizer.step()
        running_cost += cost.item()
        if (i + 1) % 1000 == 0:
            print(running_cost / 1000)  # average cost over the last 1000 minibatches
            running_cost = 0
```
I find that no matter what I set the weight_decay parameter to, running_cost does not change. Is that expected behavior? I would have thought that cost would include the weight-decay penalty, so that a large weight_decay would produce a correspondingly large cost. In other words, I expected something like the sketch below.
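To make my expectation concrete, here is a minimal sketch (hypothetical, not what my code actually computes; it reuses net, loss, out, and Y from the training loop above) of the penalized cost I assumed was being reported:

```python
# Hypothetical: the cost I expected to see reported, i.e. the data loss
# plus an L2 penalty over all parameters scaled by weight_decay.
weight_decay = 1.0
l2_penalty = sum(p.pow(2).sum() for p in net.parameters())
expected_cost = loss(out, Y) + weight_decay * l2_penalty
print(expected_cost.item())
```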