How to resolve target size being different from input size?

Hi, I’m new to ML and DL from a coding perspective.

I built the following regression NN to generate an estimate based on 8 features with 1134 datapoints.

import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
import torch.utils.data
import torch.optim as optim
import pandas as pd 
from sklearn.preprocessing import StandardScaler

#### IMPORT THE DATASET ####

dataset = pd.read_csv('./Mel.csv')
# columns of input data
maxCol = 8
startCol = 0
# redraw the dataset
datasetInterim = dataset.iloc[:, 0:9].values
# set up independent variables
x_temp = datasetInterim[:, startCol:maxCol]
sc = StandardScaler()
x_temp = sc.fit_transform(x_temp)
# set up dependent values
y_temp = datasetInterim[:, maxCol]

datasetFinal = torch.FloatTensor(datasetInterim)
X_train = torch.FloatTensor(x_temp)
Y_train = torch.FloatTensor(y_temp)

# batch the data
trainset = torch.utils.data.DataLoader(datasetFinal, batch_size = 1, shuffle = True)
testset = torch.utils.data.DataLoader(datasetFinal, batch_size = 1, shuffle = True)

# no errors above

#### MODEL THE ARCHITECTURE ####

class Net(nn.Module):
	def __init__(self):
		super().__init__()
		self.layer1 = nn.Linear(8, 64)
		self.layer2 = nn.Linear(64, 32)
		self.layerOutput = nn.Linear(32, 1)

	def forward(self, x):
		y_pred1 = F.relu(self.layer1(x))
		y_pred2 = F.relu(self.layer2(y_pred1))
		y_output_pred = self.layerOutput(y_pred2)

		return y_output_pred

net = Net()

# no errors above

#### START THE ENGINES ####

loss_function = nn.MSELoss()

optimizer = optim.Adam(net.parameters(), lr = 0.01)

EPOCHS = 500

for epoch in range(EPOCHS):
	y_output_pred = net(X_train)
	loss = loss_function(y_output_pred, Y_train)
	optimizer.zero_grad()
	loss.backward()
	print(loss)
	optimizer.step()

My code largely works, and the loss hovers around 260. However, I have run into the following problems:

  1. I get a warning about using a target size (torch.Size([1133])) that is different to the input size (torch.Size([1133, 1])), where 1133 is the sample size. I can’t seem to resolve this.
  2. I don’t know how to use the trained model. It’s good to see the loss converging, but I don’t know how to test it on datapoints.

Could you give me a hand with this? I’m very new to this. Big thank you!

  1. Your Y_train is the entire label column from your pandas dataset, so it has the shape {1133} (the number of samples, which is also your batch size here because you pass everything in at once). The output of your network has the shape {1133, 1}, because the last nn.Linear layer takes a 2D tensor of shape {batch_size, in_features}, in this case {1133, 32}, and returns a tensor of shape {batch_size, out_features}, so {1133, 1}. [self.layerOutput = nn.Linear(32, 1)]
    If you calculate the loss with these shapes, you get a warning, and the two tensors are broadcast against each other to {1133, 1133}, so the loss is averaged over every prediction/label pair instead of only the matching ones. Either add a new dimension at the end of your Y_train tensor using:

    Y_train = torch.FloatTensor(y_temp)[..., None]
    OR
    Y_train = torch.FloatTensor(y_temp).unsqueeze(-1)
    

    or remove the last dimension of your output

    y_output_pred = y_output_pred[..., 0]
    OR
    y_output_pred = y_output_pred.squeeze(-1)
    
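  2. To validate your network, you normally set it to evaluation mode (this switches layers such as Dropout and BatchNorm to their inference behavior) and deactivate gradient tracking. Here X_test and y_test stand for held-out data prepared the same way as X_train and Y_train:

    net.eval()
    with torch.no_grad():
        test_out = net(X_test)
        loss = loss_function(test_out, y_test)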
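
To make the shape issue from point 1 concrete, here is a small sketch. It uses 5 random values as a stand-in for the 1133 samples (the data and the variable names `pred` and `target` are illustrative assumptions, not taken from the original code). It shows that subtracting a `[n]` tensor from a `[n, 1]` tensor silently produces an `[n, n]` matrix, and that both suggested fixes give the same loss.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

n = 5  # small stand-in for the 1133 samples in the question
pred = torch.randn(n, 1)   # same layout as the network output: [n, 1]
target = torch.randn(n)    # same layout as the raw label column: [n]

# Broadcasting expands [n, 1] against [n] into an [n, n] matrix,
# so every prediction is compared with every label.
diff_shape = (pred - target).shape

# Mean over all n * n pairs; this is what the mismatched shapes end up computing.
loss_broadcast = ((pred - target) ** 2).mean()

loss_fn = nn.MSELoss()
# Fix A: give the target a trailing dimension so it becomes [n, 1].
loss_unsqueezed = loss_fn(pred, target.unsqueeze(-1))
# Fix B: drop the trailing dimension of the prediction so it becomes [n].
loss_squeezed = loss_fn(pred.squeeze(-1), target)

print(diff_shape)
print(loss_broadcast.item(), loss_unsqueezed.item(), loss_squeezed.item())
```

Either fix works; the only requirement is that prediction and target have identical shapes before they reach the loss function.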
    

Just one question: why don’t you use the DataLoader you defined?


Thanks for helping me out on point 1.
As for the data loader, I have changed the code under #### START THE ENGINES #### as follows:

for epoch in range(EPOCHS):
	for data in trainset:
		x_temp = data[:, 0:8]
		#x_temp = sc.fit_transform(x_temp)
		y_temp = data[:, 8][..., None]

		y_output_pred = net(x_temp)
		loss = loss_function(y_output_pred, y_temp)
		optimizer.zero_grad()
		loss.backward()
		optimizer.step()
	print(loss)

Basically, I moved x_temp and y_temp down into the epoch loop.
Now the problem is that the loss just won’t decrease anymore; instead it jumps all over the place. Is there any way to solve that?
Lastly, I’m fairly sure that my architecture here is not well designed at all. Do you have any suggestions?

If it’s jittering all over the place, this could indicate that the model isn’t learning at all, but I can’t really see a mistake in the training loop. Maybe you should increase the batch_size in your DataLoader, because at first your batch was the entire dataset and now it’s only one sample at a time.
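
A rough sketch of a mini-batch loop follows. Two things may be worth checking beyond the batch size: the DataLoader in the question wraps datasetFinal, which holds the raw columns rather than the scaled X_train, and `print(loss)` at the end of an epoch reports only the last batch, which with batch_size = 1 is a single sample. The example assumes synthetic data in place of Mel.csv, a batch size of 32, and a plain mean/std standardization standing in for StandardScaler; all of these are illustrative choices, not values from the thread.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in for the CSV (assumption): 256 rows, 8 features, 1 target.
X_raw = torch.randn(256, 8) * 50 + 10
y = X_raw[:, :1] * 0.1 + 3.0          # target already has shape [256, 1]

# Standardize each feature column (same idea as StandardScaler).
X = (X_raw - X_raw.mean(dim=0)) / X_raw.std(dim=0, unbiased=False)

# TensorDataset keeps each scaled feature row paired with its target.
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

# Same layer sizes as the Net class in the question.
net = nn.Sequential(
    nn.Linear(8, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 1),
)
loss_function = nn.MSELoss()
optimizer = torch.optim.Adam(net.parameters(), lr=0.01)

epoch_losses = []
for epoch in range(20):
    running = 0.0
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_function(net(xb), yb)
        loss.backward()
        optimizer.step()
        # Weight each batch loss by its size to get a per-sample epoch average.
        running += loss.item() * xb.size(0)
    epoch_losses.append(running / len(loader.dataset))

print(epoch_losses[0], epoch_losses[-1])
```

Reporting the epoch average instead of the last batch's loss usually gives a much smoother curve, which makes it easier to tell whether the model is learning.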


Well, you could make your model deeper (more layers) or wider (more ‘neurons’ per layer), but only if the data really requires a more complex model and you also have enough data! For now, I would leave it as it is.
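
One rough way to judge whether a bigger model is warranted is to hold out part of the data and compare training and validation loss. The sketch below assumes synthetic data and an 80/20 split (both illustrative, not from the thread) and also shows how to get a prediction for a single datapoint.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic stand-in data (assumption): 200 rows, 8 features, 1 target.
X = torch.randn(200, 8)
y = X.sum(dim=1, keepdim=True) + 0.1 * torch.randn(200, 1)

# Hold out the last 20% of rows; they are never used for weight updates.
n_train = 160
X_train, y_train = X[:n_train], y[:n_train]
X_val, y_val = X[n_train:], y[n_train:]

# Same layer sizes as the Net class in the question.
net = nn.Sequential(
    nn.Linear(8, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 1),
)
loss_function = nn.MSELoss()
optimizer = torch.optim.Adam(net.parameters(), lr=0.01)

for epoch in range(200):
    net.train()
    optimizer.zero_grad()
    loss = loss_function(net(X_train), y_train)
    loss.backward()
    optimizer.step()

# Evaluation: no dropout/batch-norm updates, no gradient tracking.
net.eval()
with torch.no_grad():
    train_loss = loss_function(net(X_train), y_train).item()
    val_loss = loss_function(net(X_val), y_val).item()
    # A single datapoint still needs a batch dimension: shape [1, 8].
    single_pred = net(X_val[:1])

print(train_loss, val_loss, tuple(single_pred.shape))
```

As a general heuristic, if both losses stay high the model may be too simple, while a low training loss paired with a much higher validation loss suggests it is already fitting noise and more capacity would not help.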