Choosing the correct loss function

Hello,

I currently have a network that runs, but the final outputs don't match what they should be, and I think the issue is my loss function. The network takes a 1x12 binary vector as input and outputs a 1x30 vector. Here's my code:

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

#Define a Net class that models the architecture described above
#This will be a subclass of the generic nn.Module class
class Net(nn.Module):

    #constructor method includes definitions of weight matrices for the layers
    #these matrices will be available by calling the parameters() instance function
    def __init__(self):
        super(Net, self).__init__()

        #define the operations y = Wx + b associated with the connection weights
        self.item_to_rep = nn.Linear(8, 8)         #maps 8 inputs to 8 representation nodes
        self.rep_rel_to_hidden = nn.Linear(12, 15) #maps 12 representation & relation nodes to 15 hidden nodes
        self.hidden_to_out = nn.Linear(15, 30)     #maps 15 hidden nodes to 30 output nodes
        self.loss = nn.BCELoss()                   #uses BCE loss function
   
    #propagates inputs to outputs using current weights in computation
    #backward method is not defined in this class - it's incorporated into PyTorch
    def forward(self, x):

        #split input into item and relation nodes
        item = x[:8]
        rel = x[8:]

        rep = F.relu(self.item_to_rep(item))
        temp = torch.cat([rep, rel], -1)
        hidden = F.relu(self.rep_rel_to_hidden(temp))
        output = torch.sigmoid(self.hidden_to_out(hidden))

        return output

#main computation
def buildAndTrainNetwork():
    net = Net()                                      #initialize network
    optimizer = optim.SGD(net.parameters(), lr=0.01) #specifies learning algorithm, rate

    targets, x = textToTensor()

    #this function isn't finished, but here's what I have so far:
    for n in range(5000):
        for i in range(len(x)):          #iterate over the training examples
            y = net(x[i])                   #forward-propagates inputs to outputs using current weights
            optimizer.zero_grad()           #clears gradient records to prepare for weights update
            error = net.loss(y, targets[i]) #computes output loss for the current training instance
            error.backward()                #triggers the backpropagation of errors
            optimizer.step()                #updates weights based on backpropagated errors
        if n % 500 == 0:
            print(error.item())
    print(y, targets[i])

I’ve tried switching to a bunch of different loss functions, but haven’t had any luck.

Here’s a screenshot of my activation output vs target output after 5,000 epochs:

The activation of the 8th node is highest by a lot, but still isn’t close enough to 1.

I’m not really sure what’s going on here since I’m pretty new to this, so any insight would be greatly appreciated. Thanks!

Try training with mini-batches instead of one sample at a time.
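
For what it's worth, here's a minimal sketch of what that could look like with the code above. It assumes x is an [N, 12] float tensor and targets is an [N, 30] float tensor (textToTensor() isn't shown, so those shapes are a guess), and the BatchedNet / buildAndTrainNetworkBatched names are just for illustration. Also note that your forward slices with x[:8], which only works on a single 1-D sample; for a [B, 12] batch the split has to happen along the last (feature) dimension:

from torch.utils.data import TensorDataset, DataLoader

#reuses Net, F, and optim from the code above
class BatchedNet(Net):
    def forward(self, x):
        item = x[..., :8]   #slice along the feature dim so batches work too
        rel = x[..., 8:]
        rep = F.relu(self.item_to_rep(item))
        temp = torch.cat([rep, rel], -1)
        hidden = F.relu(self.rep_rel_to_hidden(temp))
        return torch.sigmoid(self.hidden_to_out(hidden))

def buildAndTrainNetworkBatched(x, targets, batch_size=4):
    net = BatchedNet()
    optimizer = optim.SGD(net.parameters(), lr=0.01)
    loader = DataLoader(TensorDataset(x, targets), batch_size=batch_size, shuffle=True)

    for n in range(5000):
        for xb, tb in loader:
            optimizer.zero_grad()
            y = net(xb)              #forward pass on the whole mini-batch at once
            error = net.loss(y, tb)  #BCELoss averages over batch elements by default
            error.backward()
            optimizer.step()
        if n % 500 == 0:
            print(error.item())

DataLoader handles the shuffling and batching for you, and since BCELoss averages the loss over the batch by default, each optimizer step follows a gradient averaged over batch_size examples instead of a single one, which usually gives a much less noisy training signal.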