Detect if an input number is odd or even

Hello everyone,

I’m trying to create a simple example with PyTorch that detects whether an input number is odd or even.

I don’t know if this is possible. If my understanding is correct, PyTorch uses a linear function whose coefficients are adjusted during training, followed by a sigmoid function that fires or not to produce the classification.

I started coding something that creates the dataset and then performs the training.

Note that I use an EarlyStopping algorithm to stop the learning when the error starts to increase.

import numpy as np
import torch
import torch.nn as nn

# split_sequences and EarlyStopping are helpers defined elsewhere in my code
X, y = split_sequences(dataset, 1)
model = nn.Sequential(
    nn.Linear(1, 1),
    nn.Sigmoid())

patience = 3
early_stopping = EarlyStopping(patience=patience, verbose=True)
train_losses = []
valid_losses = []
avg_train_losses = []
avg_valid_losses = []

criterion = nn.BCELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.003)
epochs = 1000
for e in range(epochs):
    running_loss = 0
    model.train()
    batchi1 = 1
    for batchi in range(5000, len(X), 5000):
        # training pass over the current chunk of samples
        for x in range(batchi1, batchi):
            line = torch.tensor([X[x]], dtype=torch.float32)
            out = torch.tensor([y[x]], dtype=torch.float32)
            optimizer.zero_grad()
            output = model(line)
            loss = criterion(output, out)
            train_losses.append(loss.item())
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
        # validation pass (here over the same chunk, without gradients)
        model.eval()
        with torch.no_grad():
            for x in range(batchi1, batchi):
                line = torch.tensor([X[x]], dtype=torch.float32)
                out = torch.tensor([y[x]], dtype=torch.float32)
                output = model(line)
                loss = criterion(output, out)
                valid_losses.append(loss.item())
        batchi1 = batchi  # start of the next chunk
    train_loss = np.average(train_losses)
    valid_loss = np.average(valid_losses)
    avg_train_losses.append(train_loss)
    avg_valid_losses.append(valid_loss)

    epoch_len = len(str(epochs))
    print_msg = (f'[{e:>{epoch_len}}/{epochs:>{epoch_len}}] '
                 f'train_loss: {train_loss:.5f} '
                 f'valid_loss: {valid_loss:.5f}')  # built for optional logging
    train_losses = []
    valid_losses = []
    early_stopping.trace_func = lambda *args, **kwargs: None  # silence checkpoint messages
    early_stopping.path = 'NEURALNN.pt'
    early_stopping(valid_loss, model)
    print(f"Training loss: {running_loss/len(X)}")

    if early_stopping.early_stop:
        print("Early stopping")
        break
model.load_state_dict(torch.load('NEURALNN.pt'))

My example doesn’t really work because I’m getting a very high error:

Training loss: 24.53215186495483
Training loss: 24.53096993828714
Training loss: 24.530958538565038
Training loss: 24.537694978424906
Training loss: 24.537682025301457
Training loss: 24.53767285807431
Training loss: 24.53766483396888
Training loss: 24.537656717956065
Training loss: 24.53767231979668
Training loss: 24.537667768600585
Training loss: 24.537658959439398
Training loss: 24.537649419358374
Early stopping

<All keys matched successfully>

And with a test:

print(model(torch.tensor([[1]],dtype=torch.float32) ))
print(model(torch.tensor([[2]],dtype=torch.float32) ))
print(model(torch.tensor([[3]],dtype=torch.float32) ))
print(model(torch.tensor([[4]],dtype=torch.float32) ))
print(model(torch.tensor([[5]],dtype=torch.float32) ))

I get:

tensor([[0.4762]], grad_fn=<SigmoidBackward>)
tensor([[0.5165]], grad_fn=<SigmoidBackward>)
tensor([[0.5567]], grad_fn=<SigmoidBackward>)
tensor([[0.5961]], grad_fn=<SigmoidBackward>)
tensor([[0.6343]], grad_fn=<SigmoidBackward>)

Can you give me some advice on how to approach this problem?

Of course, I am a beginner looking to learn PyTorch, so thank you for your indulgence.

Thank you in advance.

Note that the code to make my dataset is:

dataset = []
for A in range(10000):
    B = 2
    C = 0
    if A % B == 0:
        C = 1  # label 1 for even numbers, 0 for odd
    dataset.append([A, C])
dataset = np.array(dataset)
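
Note that split_sequences is a helper from my own code and isn’t shown here. Assuming that with a window size of 1 it simply separates the inputs from the labels, a minimal stand-in would be:

def split_sequences(data, n_steps):
    # stand-in: with n_steps == 1, just split the columns into
    # inputs (the numbers) and labels (their 0/1 parity)
    return data[:, 0], data[:, 1]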

Based on your current model implementation (a single linear layer with one weight and one bias value), I doubt the model will be able to learn the dataset.

The increasing input numbers in [0, 10000] would only be multiplied by the weight value (thus scaled) and then the bias would be added (thus shifted). I don’t see how this operation could predict an oscillating target, so you would need to change the model architecture (e.g. by adding hidden layers).
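
For illustration, a deeper architecture along these lines might look like the sketch below (the layer sizes here are arbitrary choices, not a recommendation):

import torch.nn as nn

# hidden layers with nonlinearities in between, so the output is no longer
# a monotonic scale-and-shift of the input
model = nn.Sequential(
    nn.Linear(1, 32),
    nn.ReLU(),
    nn.Linear(32, 32),
    nn.ReLU(),
    nn.Linear(32, 1),
    nn.Sigmoid())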

Hello,

@ptrblck Thank you for your answer. I took your advice and created a model that uses the sine function, with a trainable parameter for the frequency of the oscillation. The objective of training is then to find the right frequency; for even vs. odd it is pi/2. Here is my code below.
Do you think this could be made simpler?
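
As a quick sanity check of the idea: |sin(pi/2 * n)| is 1 for odd n and 0 for even n (up to floating-point rounding).

import numpy as np

# |sin(pi/2 * n)| alternates 0, 1, 0, 1, ... as n increases
for n in range(6):
    print(n, abs(np.sin(np.pi / 2 * n)))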

import torch
import torch.nn as nn
import torch.optim as optim
from torch.nn.parameter import Parameter
from collections import OrderedDict
import numpy as np


class AbsSinTensorFunction(nn.Module):
    def __init__(self, alpha=None):
        super().__init__()
        # alpha is a learnable parameter (nn.Parameter requires grad by default)
        if alpha is None:
            self.alpha = Parameter(torch.tensor(1.0))
        else:
            self.alpha = Parameter(torch.tensor(alpha))

    def forward(self, input):
        # returns 0 for even inputs and 1 for odd inputs when alpha is pi/2
        return torch.abs(torch.sin(self.alpha * input))


## CASE of even/odd
absSinInput = AbsSinTensorFunction()  # oscillating activation |sin(alpha * x)|

model = nn.Sequential(OrderedDict([
    ('absSinInput', absSinInput)
]))
optimizer = optim.Adam(model.parameters(), lr=0.001)
optimizer.zero_grad()
loss = nn.BCELoss()  # binary target, so binary cross-entropy
##

savedalpha = absSinInput.alpha.detach().clone()
print("**************************************")
print("Start alpha = " + str(savedalpha.item()))
print("**************************************")
# training should stop when alpha is close to a multiple of pi/2
step = 0
bestvalidloss = np.inf
for epoch in range(50):
    for i in range(1, 200):
        inputz = torch.tensor([[float(i)]], dtype=torch.float)
        target = torch.tensor([[float(i % 2)]])
        bestloss = np.inf
        print("###############learn with " + str(i) + "###############")
        for improve in range(10):  # for each sample, try to improve alpha
            result = model(inputz)
            lossoutput = loss(result, target)
            lossoutput.backward()
            optimizer.step()
            step += 1
            print("loss output = " + str(lossoutput.item()) + "#alpha = " + str(absSinInput.alpha.item()))
            if lossoutput.item() < bestloss:
                bestloss = lossoutput.item()
                savedalpha = absSinInput.alpha.detach().clone()  # keep a copy, not a reference
            else:
                with torch.no_grad():
                    absSinInput.alpha.copy_(savedalpha)  # restore the best alpha so far
                break
        validpred = model(torch.tensor([[555.0]]))  # 555 is odd, so the target is 1
        validloss = loss(validpred, torch.tensor([[1.0]]))
        print("**************************************")
        print("valid loss = " + str(validloss.item()))
        print("**************************************")
        if validloss.item() < bestvalidloss:
            bestvalidloss = validloss.item()
        else:
            break
print("**************************************")
print("Best alpha = " + str(absSinInput.alpha.item()))
print("pi/2 = " + str(np.pi/2))
print("**************************************")
print("Predict(1) = " + str(model(torch.tensor([[1.0]])).item()))
print("Predict(2) = " + str(model(torch.tensor([[2.0]])).item()))
print("Predict(3) = " + str(model(torch.tensor([[3.0]])).item()))
print("Predict(4) = " + str(model(torch.tensor([[4.0]])).item()))
print("Predict(5) = " + str(model(torch.tensor([[5.0]])).item()))
print("Predict(11111111) = " + str(model(torch.tensor([[11111111.0]])).item()))
print("Predict(990) = " + str(model(torch.tensor([[990.0]])).item()))

I get the output below: close to 1 when the input is an odd number and close to 0 for an even number. (The very large input 11111111 is the exception: the small error between alpha and pi/2 gets multiplied by the input, which scrambles the phase.)

**************************************
Best alpha = 1.5897823572158813
pi/2 = 1.5707963267948966
**************************************
Predict(1) = 0.9998197555541992
Predict(2) = 0.037962935864925385
Predict(3) = 0.998378336429596
Predict(4) = 0.07587113976478577
Predict(5) = 0.9954975247383118
Predict(11111111) = 0.6603634357452393
Predict(990) = 0.053372591733932495

I was wondering if you’re setting model.eval() but not back to model.train() for the next batch.

Although the Universal Approximation Theorem says that we should be able to approximate most functions with just one hidden layer, your architecture looks too simple to learn even vs. odd. So, as mentioned by @ptrblck, try increasing the layer size or adding more linear layers and see if the results improve.
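
Regarding train() and eval(): the usual pattern is to switch back to model.train() at the start of each epoch and to run validation under model.eval() together with torch.no_grad(). A minimal sketch, where train_batches and valid_batches are placeholder iterables of (input, target) pairs:

for e in range(epochs):
    model.train()               # training-mode behavior (matters for layers like dropout/batchnorm)
    for line, out in train_batches:
        optimizer.zero_grad()
        lossval = criterion(model(line), out)
        lossval.backward()
        optimizer.step()

    model.eval()                # inference-mode behavior
    with torch.no_grad():       # no autograd graph needed for validation
        for line, out in valid_batches:
            valid_losses.append(criterion(model(line), out).item())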

Hello,
I increased the layer size to six and it is not really better:

**************************************
valid loss = 0.8264439105987549
**************************************
**************************************
Best alpha = Parameter containing:
tensor([1.0429, 0.9948, 0.9344, 0.9563, 0.9674, 1.0093], requires_grad=True)
pi/2 = 1.5707963267948966
**************************************
Predict([1.00,2.00,3.00,4.00,5.00,6.00]) = tensor([[0.8639, 0.9135, 0.3321, 0.6317, 0.9923, 0.2253]],
      grad_fn=<AbsBackward>)
Predict([7.00,8.00,9.00,10.00,11.00,12.00]) = tensor([[0.8506, 0.9945, 0.8497, 0.1382, 0.9379, 0.4391]],
      grad_fn=<AbsBackward>)
Predict([11.00,21.00,31.00,41.00,51.00,61.00]) = tensor([[0.8886, 0.8909, 0.6373, 0.9982, 0.8008, 0.9532]],
      grad_fn=<AbsBackward>)
Predict([122.00,222.00,333.00,444.00,564.00,688.00]) = tensor([[1.0000, 0.8100, 0.1261, 0.4811, 0.8569, 0.1180]],
      grad_fn=<AbsBackward>)
Predict([122.00,222.00,333.00,444.00,564.00,688.00]) = tensor([[0.0077, 0.5432, 0.9584, 0.3508, 0.9991, 0.3045]],
      grad_fn=<AbsBackward>)
Predict([1223.00,211.00,3334.00,411.00,544.00,644.00]) = tensor([[0.9194, 0.9865, 0.5616, 0.3956, 0.8123, 0.5632]],
      grad_fn=<AbsBackward>)

I added train() and eval() on the model; what are they for?

Another try, with a Linear(6, 6) layer before the sine layer:

**************************************
Best alpha = Parameter containing:
tensor([0.9966, 1.0089, 1.0270, 1.0027, 0.9761, 1.0345], requires_grad=True)
pi/2 = 1.5707963267948966
**************************************
Predict([1.00,2.00,3.00,4.00,5.00,6.00]) = tensor([[0.6157, 0.0282, 0.8108, 0.4695, 0.9369, 0.5707]],
       grad_fn=<AbsBackward>)
Predict([7.00,8.00,9.00,10.00,11.00,12.00]) = tensor([[0.2550, 0.4073, 0.5556, 0.7958, 0.4690, 0.8168]],
       grad_fn=<AbsBackward>)
Predict([11.00,21.00,31.00,41.00,51.00,61.00]) = tensor([[0.7027, 0.7106, 0.2578, 0.8775, 0.8660, 0.9601]],
       grad_fn=<AbsBackward>)
Predict([122.00,222.00,333.00,444.00,564.00,688.00]) = tensor([[0.9732, 0.3742, 0.2200, 0.0033, 0.2818, 0.1370]],
       grad_fn=<AbsBackward>)
Predict([122.00,222.00,333.00,444.00,564.00,688.00]) = tensor([[0.6943, 0.5853, 0.9817, 0.3463, 0.6813, 0.2033]],
       grad_fn=<AbsBackward>)

The code is kind of confusing and hard to debug. Make sure to call optimizer.zero_grad() within the loop each time before calling lossoutput.backward(); otherwise the gradients accumulate across iterations.
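
Applied to your inner loop, the ordering would look like this:

for improve in range(10):
    optimizer.zero_grad()       # clear the gradients from the previous iteration
    result = model(inputz)
    lossoutput = loss(result, target)
    lossoutput.backward()       # compute fresh gradients for alpha
    optimizer.step()            # update alpha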