I have been trying to run a regression on the data in my HDF5 file (attached below) to predict two outputs, but I get different loss values on different executions, and on most runs the model predicts 0 for one or both of the outputs.
The true outputs (targets) are the following:
label_batch tensor([[1.4000, 1.6000],
[1.4000, 1.6000],
[1.4000, 1.6000],
[1.4000, 1.6000],
[1.4000, 1.6000],
[1.4000, 1.6000],
[1.4000, 1.6000],
[1.4000, 1.6000],
[1.4000, 1.6000]])
Output for Run 1, whose predictions are the closest to the targets:
Epoch: 1/10.. Training Loss: 0.9370870.. Test Loss: 0.0499674..
Epoch: 2/10.. Training Loss: 0.0297776.. Test Loss: 0.0184589..
Epoch: 3/10.. Training Loss: 0.0117876.. Test Loss: 0.0087056..
Epoch: 4/10.. Training Loss: 0.0054326.. Test Loss: 0.0034268..
Epoch: 5/10.. Training Loss: 0.0020547.. Test Loss: 0.0008109..
Epoch: 6/10.. Training Loss: 0.0004802.. Test Loss: 0.0001630..
Epoch: 7/10.. Training Loss: 0.0001397.. Test Loss: 0.0000947..
Epoch: 8/10.. Training Loss: 0.0000927.. Test Loss: 0.0000680..
Epoch: 9/10.. Training Loss: 0.0000720.. Test Loss: 0.0000503..
output prediction
tensor([[1.4031, 1.6043],
[1.3971, 1.5961],
[1.3949, 1.5948],
[1.4034, 1.6042],
[1.3858, 1.5825],
[1.3948, 1.5934],
[1.3978, 1.5992],
[1.4024, 1.6062],
[1.4026, 1.6032]], grad_fn=<ReluBackward0>)
Epoch: 10/10.. Training Loss: 0.0000559.. Test Loss: 0.0000394..
Output for Run 2:
Epoch: 1/10.. Training Loss: 1.3082893.. Test Loss: 0.9945563..
Epoch: 2/10.. Training Loss: 0.9902789.. Test Loss: 0.9884600..
Epoch: 3/10.. Training Loss: 0.9855704.. Test Loss: 0.9838181..
Epoch: 4/10.. Training Loss: 0.9823556.. Test Loss: 0.9815999..
Epoch: 5/10.. Training Loss: 0.9810459.. Test Loss: 0.9806178..
Epoch: 6/10.. Training Loss: 0.9803946.. Test Loss: 0.9801530..
Epoch: 7/10.. Training Loss: 0.9801204.. Test Loss: 0.9800670..
Epoch: 8/10.. Training Loss: 0.9800622.. Test Loss: 0.9800397..
Epoch: 9/10.. Training Loss: 0.9800435.. Test Loss: 0.9800270..
output prediction
tensor([[0.0000, 1.5930],
[0.0000, 1.5946],
[0.0000, 1.5916],
[0.0000, 1.5956],
[0.0000, 1.5970],
[0.0000, 1.5951],
[0.0000, 1.5919],
[0.0000, 1.5887],
[0.0000, 1.5913]], grad_fn=<ReluBackward0>)
Epoch: 10/10.. Training Loss: 0.9800317.. Test Loss: 0.9800202..
Output for Run 3:
Epoch: 1/10.. Training Loss: 2.2600000.. Test Loss: 2.2600000..
Epoch: 2/10.. Training Loss: 2.2600000.. Test Loss: 2.2600000..
Epoch: 3/10.. Training Loss: 2.2600000.. Test Loss: 2.2600000..
Epoch: 4/10.. Training Loss: 2.2600000.. Test Loss: 2.2600000..
Epoch: 5/10.. Training Loss: 2.2600000.. Test Loss: 2.2600000..
Epoch: 6/10.. Training Loss: 2.2600000.. Test Loss: 2.2600000..
Epoch: 7/10.. Training Loss: 2.2600000.. Test Loss: 2.2600000..
Epoch: 8/10.. Training Loss: 2.2600000.. Test Loss: 2.2600000..
Epoch: 9/10.. Training Loss: 2.2600000.. Test Loss: 2.2600000..
output prediction
tensor([[0., 0.],
[0., 0.],
[0., 0.],
[0., 0.],
[0., 0.],
[0., 0.],
[0., 0.],
[0., 0.],
[0., 0.]], grad_fn=<ReluBackward0>)
Epoch: 10/10.. Training Loss: 2.2600000.. Test Loss: 2.2600000..
Output for Run 4:
Epoch: 1/10.. Training Loss: 1.3272166.. Test Loss: 0.9984156..
Epoch: 2/10.. Training Loss: 0.9895578.. Test Loss: 0.9840302..
Epoch: 3/10.. Training Loss: 0.9825955.. Test Loss: 0.9813274..
Epoch: 4/10.. Training Loss: 0.9807327.. Test Loss: 0.9804195..
Epoch: 5/10.. Training Loss: 0.9801892.. Test Loss: 0.9802050..
Epoch: 6/10.. Training Loss: 0.9801012.. Test Loss: 0.9801544..
Epoch: 7/10.. Training Loss: 0.9800708.. Test Loss: 0.9801207..
Epoch: 8/10.. Training Loss: 0.9800515.. Test Loss: 0.9800962..
Epoch: 9/10.. Training Loss: 0.9800386.. Test Loss: 0.9800771..
output prediction
tensor([[0.0000, 1.5929],
[0.0000, 1.5888],
[0.0000, 1.6203],
[0.0000, 1.6003],
[0.0000, 1.6016],
[0.0000, 1.5979],
[0.0000, 1.6009],
[0.0000, 1.5887],
[0.0000, 1.5899]], grad_fn=<ReluBackward0>)
Epoch: 10/10.. Training Loss: 0.9800294.. Test Loss: 0.9800624..
I assume this has something to do with the torch random seed, because when I add
torch.manual_seed(0)
the first of the two output values is always 0, i.e. the output always resembles that of Run 4 above:
Epoch: 1/10.. Training Loss: 1.3272166.. Test Loss: 0.9984156..
Epoch: 2/10.. Training Loss: 0.9895578.. Test Loss: 0.9840302..
Epoch: 3/10.. Training Loss: 0.9825955.. Test Loss: 0.9813274..
Epoch: 4/10.. Training Loss: 0.9807327.. Test Loss: 0.9804195..
Epoch: 5/10.. Training Loss: 0.9801892.. Test Loss: 0.9802050..
Epoch: 6/10.. Training Loss: 0.9801012.. Test Loss: 0.9801544..
Epoch: 7/10.. Training Loss: 0.9800708.. Test Loss: 0.9801207..
Epoch: 8/10.. Training Loss: 0.9800515.. Test Loss: 0.9800962..
Epoch: 9/10.. Training Loss: 0.9800386.. Test Loss: 0.9800771..
output prediction
tensor([[0.0000, 1.5929],
[0.0000, 1.5888],
[0.0000, 1.6203],
[0.0000, 1.6003],
[0.0000, 1.6016],
[0.0000, 1.5979],
[0.0000, 1.6009],
[0.0000, 1.5887],
[0.0000, 1.5899]], grad_fn=<ReluBackward0>)
Epoch: 10/10.. Training Loss: 0.9800294.. Test Loss: 0.9800624..
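For completeness, this is everything I seed when I want runs to be comparable (a minimal sketch; the random_state arguments are an addition that is not in my code below):

import random
import numpy as np
import torch

random.seed(0)
np.random.seed(0)      # sklearn's shuffle and train_test_split use NumPy's global RNG by default
torch.manual_seed(0)   # controls PyTorch's weight initialization

# pass random_state explicitly rather than relying on the global NumPy seed
data = shuffle(data, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    data, targets, test_size=0.2, random_state=0)

With torch.manual_seed(0) alone the runs are already repeatable for me, so seeding seems to select which of the above behaviours I get rather than prevent the zeros.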
I am attaching my code and my HDF5 file here for reproducibility.
My code:
from pathlib import Path
import numpy as np
#np.random.seed(0)
import pandas as pd
import torch
#torch.manual_seed(0)
import matplotlib.pyplot as plt
from torch import nn, optim
from torch.utils.data import DataLoader, Dataset
import torch.nn.functional as F
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
import sys
from sklearn.utils import shuffle
class Regressor(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 144)
        self.fc2 = nn.Linear(144, 72)
        self.fc3 = nn.Linear(72, 18)
        self.fc4 = nn.Linear(18, 2)

    def forward(self, x):
        #print("fc1", x.shape)
        x = F.relu(self.fc1(x))
        #print("fc2", x.shape)
        x = F.relu(self.fc2(x))
        #print("fc3", x.shape)
        x = F.relu(self.fc3(x))
        #print("fc4", x.shape)
        x = F.relu(self.fc4(x))
        #print("last", x.shape)
        return x
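# NOTE: the output of fc4 above also passes through F.relu, so the network
# can only ever produce non-negative predictions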
p = Path.cwd()
fpath = p/"Nov20_2019_romean_entries/SigxFactor1.4_SigyFactor1.6_Nov20_2019.h5"
data = pd.read_hdf(str(fpath), key="df")
data = shuffle(data)
print(data.columns)
print(data.isnull().values.any())
targets = data[["x_val", "y_val"]]
print(targets)
data = data.drop(["x_val","y_val"], axis=1)
columns = data.columns
print("data b4 minmax")
print(data.head())
print("columns shape ", len(columns))
print("data shape ",data.shape)
scaler = MinMaxScaler()
data = pd.DataFrame(scaler.fit_transform(data), columns = columns)
#data['SalePrice'] = sale_price
print(data.head())
#sys.exit()
X_train, X_val, y_train, y_val = train_test_split(data, targets, test_size=0.2)
#print("feature shape ", X_train.shape)
#print(X_val.shape)
#
#print("target shape ", y_train.shape)
#print(y_val.shape)
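# manual mini-batching: split the training set into 50 chunks instead of using a DataLoader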
train_batch = np.array_split(X_train, 50)
label_batch = np.array_split(y_train, 50)
print("train batch len ", len(train_batch))
print("label batch len ", len(label_batch))
#print(train_batch[49])
#print(train_batch[49].to_numpy().shape)
print("label batch")
print(label_batch[49].to_numpy().shape)
print(label_batch[49])
for i in range(len(train_batch)):
    train_batch[i] = torch.from_numpy(train_batch[i].to_numpy()).float()

for i in range(len(label_batch)):
    label_batch[i] = torch.from_numpy(label_batch[i].to_numpy()).float()
    #label_batch[i] = torch.from_numpy(label_batch[i].to_numpy()).float().view(-1, 2)
print("label_batch ", label_batch[49])
print("label_batch shape ", label_batch[49].shape)
X_val = torch.from_numpy(X_val.to_numpy()).float()
y_val = torch.from_numpy(y_val.to_numpy()).float()
#y_val = torch.from_numpy(y_val.to_numpy()).float().view(-1, 2)
#print(len(train_batch))
#sys.exit()
#device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
#device = torch.device("cpu")
model = Regressor()
#model.to(dtype= torch.float64, device = device)
#ps = model(train_batch[0])
#print(ps.shape)
#print(ps)
#sys.exit()
#model = Regressor()
criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
epochs = 10
#device =
train_losses, test_losses = [], []
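# NOTE: the else below is attached to the inner for loop (for/else), so the
# validation block runs once per epoch, after all 50 batches have been processed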
for e in range(epochs):
    model.train()
    train_loss = 0
    for i in range(len(train_batch)):
        optimizer.zero_grad()
        #model.to(device)
        output = model(train_batch[i])
        #output = model(train_batch[i].to(dtype= torch.float64, device= device))
        loss = criterion(output, label_batch[i])
        #loss = criterion(output, label_batch[i].to(dtype=torch.float64, device = device))
        loss.backward()
        optimizer.step()
        train_loss += loss.item()
        if e==9 and i==49:
            print("output prediction")
            print(output)
    else:
        test_loss = 0
        accuracy = 0
        with torch.no_grad():
            model.eval()
            predictions = model(X_val)
            #predictions = model(X_val.to(dtype= torch.float64, device= device))
            #if i==49:
            #    print("inside")
            #    print(predictions)
            #    print(predictions.shape)
            #test_loss += torch.sqrt(criterion(torch.log(predictions), torch.log(y_val)))
            test_loss += criterion(predictions, y_val)
        train_losses.append(train_loss/len(train_batch))
        test_losses.append(test_loss)
        print("Epoch: {}/{}.. ".format(e+1, epochs),
              "Training Loss: {:.7f}.. ".format(train_loss/len(train_batch)),
              "Test Loss: {:.7f}.. ".format(test_loss))
I am wondering what the exact reason behind this anomaly might be.
Thank you.