Simple Model not learning

I am making a model to learn the sine function on the domain x = [0, 2 * pi]. It is very simple, and I have done it before with C++ frameworks as a test to see whether I know how to use a framework.

My model always returns a straight line at y = 0 after training, and a straight line at some constant value between -1 and 1 before training. The model's structure is:

Linear(1, 20) -> Sigmoid -> Linear(20, 50) -> Sigmoid -> Linear(50, 50) -> Sigmoid -> Linear(50, 1).

This structure has worked for me in other frameworks, so I know that using Sigmoid isn't the problem, and neither is the number of parameters. I believe the problem is in my usage of autograd.

Here is the code for my Net:

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.model = nn.Sequential(
            nn.Linear(1, 20),
            nn.Sigmoid(),
            nn.Linear(20, 50),
            nn.Sigmoid(),
            nn.Linear(50, 50),
            nn.Sigmoid(),
            nn.Linear(50, 1)
        )

    def forward(self, x):
        return self.model(x)
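
As a quick sanity check, the forward pass itself produces the expected shape:

net = Net()
out = net(torch.rand(8, 1))  # batch of 8 one-dimensional inputs
print(out.shape)             # torch.Size([8, 1])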

Here is my training loop:

def train(net, trainloader, valloader, learningrate, n_epochs):
    net = net.train()
    loss = nn.MSELoss()
    optimizer = torch.optim.SGD(net.parameters(), lr = learningrate)

    for epoch in range(n_epochs):

        # Training pass
        for X, y in trainloader:
            X = X.reshape(-1, 1)
            optimizer.zero_grad()

            outputs = net.forward(X)
            error   = loss(outputs, y)
            error.backward()
            optimizer.step()

        # Validation pass
        total_loss = 0
        for X, y in valloader:
            X = X.reshape(-1, 1).float()
            outputs = net(X)
            error   = loss(outputs, y.float())
            total_loss += error.item()

        print('Val loss for epoch', epoch, 'is', total_loss / len(valloader))

Where trainloader and valloader are the data loaders for the training and validation data.
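
For reference, one raw batch from the loader looks like this (assuming Dataset yields scalar (x, y) pairs, so the loader stacks them into 1-D tensors):

X, y = next(iter(trainloader))
print(X.shape, y.shape)  # e.g. torch.Size([64]) torch.Size([64])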

Here is my data creation:

import numpy as np
from sklearn.model_selection import train_test_split
from torch.utils.data import DataLoader

# Each x value in [0, 2*pi) with step .001 is repeated 7 times
data = np.repeat(np.arange(0, 2 * np.pi, step = .001), 7)

train_X, test_X, train_y, test_y = train_test_split(data, np.sin(data), test_size = 1/5, random_state = 9122019, shuffle = True)
train_X, val_X, train_y, val_y = train_test_split(train_X, train_y, train_size = .8, shuffle = False)

train_data  = Dataset(train_X, train_y)
val_data    = Dataset(val_X, val_y)
test_data   = Dataset(test_X, test_y)

trainloader = DataLoader(train_data, batch_size = 64, shuffle = True)
valloader   = DataLoader(val_data, batch_size = 64, shuffle = True)
testloader  = DataLoader(test_data, batch_size = 64, shuffle = True)
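
Where Dataset is a minimal custom wrapper along these lines (a sketch; assuming it just pairs each input with its target and converts the float64 arrays to float32):

import torch
from torch.utils.data import Dataset as TorchDataset

class Dataset(TorchDataset):
    # Sketch of the custom wrapper; the assumption is that it pairs each
    # input with its target and handles the numpy-to-tensor conversion.
    def __init__(self, X, y):
        self.X = torch.from_numpy(X).float()
        self.y = torch.from_numpy(y).float()

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        return self.X[idx], self.y[idx]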

After a large number of epochs, the result is still a line at y = 0. In general, the final output is always a straight line at the mean of the target data: if I set y = x, the final result is a horizontal line at (X.max() + X.min()) / 2.

Can someone tell me where my code is wrong? Thank you.

Edit: My train function is called as follows:

net = Net()
train(net, trainloader, valloader, .0001, n_epochs = 4)

You are accidentally broadcasting the output and target tensors inside your training loop; this should also raise a warning in the latest stable PyTorch version.
Add y = y.view(-1, 1) to your training and validation loops and try running the code again.
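Here is a minimal illustration of what goes wrong, using the batch size of 64 from your loaders:

import torch

outputs = torch.randn(64, 1)  # model output: [batch_size, 1]
y = torch.randn(64)           # target from the DataLoader: [batch_size]

# (outputs - y) broadcasts to [64, 64], so nn.MSELoss averages over
# 64 * 64 pairwise differences instead of 64 matched pairs.
print((outputs - y).shape)              # torch.Size([64, 64])
print((outputs - y.view(-1, 1)).shape)  # torch.Size([64, 1]) after the fix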
PS: If you are using an older PyTorch version, I would recommend updating to the latest release. :wink:
Also, call the model directly as model(data) instead of model.forward(data), as this makes sure that registered hooks etc. are run.
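For example, a forward hook registered on the module only fires when you go through __call__:

import torch

net = Net()
handle = net.register_forward_hook(lambda module, inp, out: print('hook fired'))
net(torch.rand(4, 1))          # prints 'hook fired'
net.forward(torch.rand(4, 1))  # silently skips the hook
handle.remove()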

Added, but it still doesn't change what's happening. My new training/validation code is as follows (the only change is the two added y = y.view(-1, 1) lines):

def train(net, trainloader, valloader, learningrate, n_epochs):
    net = net.train()
    loss = nn.MSELoss()
    optimizer = torch.optim.SGD(net.parameters(), lr = learningrate)

    for epoch in range(n_epochs):

        for X, y in trainloader:
            X = X.reshape(-1, 1)
            y = y.view(-1, 1) #Added line
            optimizer.zero_grad()

            outputs = net(X)

            error   = loss(outputs, y)
            error.backward()
            optimizer.step()
                
        total_loss = 0
        for X, y in valloader:
            X = X.reshape(-1, 1).float()
            y = y.view(-1, 1) #Added line
            outputs = net(X)
            error   = loss(outputs, y)
            total_loss += error.item()

        print('Val loss for epoch', epoch, 'is', total_loss / len(valloader) )