Dear all,
I am trying to train a model on a dataframe.
I was following a tutorial at this address:
https://medium.com/swlh/my-first-work-with-pytorch-eea3bc82068
This is my code up to the point where the error occurs:
from datetime import time
import torch
import torch.nn as nn
import torch.nn.functional as F
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
import kaggle
price = pd.read_csv('D:/Kaggle Datasets/Beer/price.csv')
X = price.drop('price', axis=1).values
y = price['price'].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
X_train = torch.FloatTensor(X_train)
X_test = torch.FloatTensor(X_test)
y_train = torch.LongTensor(y_train)
y_test = torch.LongTensor(y_test)
print(X_train)
print(len(X_train))
print(y_train)
class ANN(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(in_features=5, out_features=10)
        self.fc2 = nn.Linear(in_features=10, out_features=4)
        self.output = nn.Linear(in_features=4, out_features=1)
    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.output(x)
        return x
model = ANN()
print(model)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
print(time)
epochs = 5
loss_arr = []
for i in range(epochs):
    y_hat = model.forward(X_train)
    loss = criterion(y_hat, y_train)
    loss_arr.append(loss)
    if i % 10 == 0:
        print(f'Epoch: {i} Loss: {loss}')
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
My dataframe is different from the dataframe in the reference. Mine looks like this:
A,B,C,D,E,price
70.00,1.30,6.00,132.00,66.00,55528.00
52.00,1.30,15.00,512.00,600.00,285792.00
53.00,1.00,9.00,105.00,44.00,56700.00
65.00,1.00,22.00,215.00,115.00,73068.00
58.00,1.30,6.00,186.00,112.00,101293.00
73.00,1.30,4.00,104.00,31.00,79100.00
77.00,1.15,11.00,36.00,71.00,74100.00
The goal is to predict the price column, so this is a regression problem, not a classification problem.
This is the error:
ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
IndexError: Target 55528is out of bounds.
As you can see, the error points to the first label in my dataframe, which also happens to be the first label of my training data (y_train).
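To narrow it down, here is a minimal sketch that I believe triggers the same error without my CSV file. It assumes the problem comes only from the combination of nn.CrossEntropyLoss, a single output unit, and a label of 55528. The names logits, target, and raised are placeholders I made up for this snippet; they are not in my script.

```python
import torch
import torch.nn as nn

# One sample with one output unit, the same per-row shape my model produces.
logits = torch.zeros(1, 1)
# The first training label from my data, passed as an integer (Long) target.
target = torch.tensor([55528])

criterion = nn.CrossEntropyLoss()

try:
    criterion(logits, target)
    raised = False
except IndexError as err:
    # CrossEntropyLoss treats the target as a class index, and with one
    # output unit the only valid index is 0.
    raised = True
    print(err)
```

If this is right, the loss is reading 55528 as a class number rather than as a price, which would match the fact that I am not doing classification.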
Maybe I should normalize the data, but I am not sure, since my experience and knowledge are limited.
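By normalization I mean something like the sketch below, which standardizes the price values from the sample rows above to zero mean and unit standard deviation. This is only an assumption about what might help; I have not confirmed that it fixes the error. It uses plain Python so it is easy to check, and the names prices, mu, sigma, scaled, and restored are made up for this snippet.

```python
from statistics import mean, stdev

# Price column from the sample rows shown above.
prices = [55528.0, 285792.0, 56700.0, 73068.0, 101293.0, 79100.0, 74100.0]

mu = mean(prices)
sigma = stdev(prices)  # sample standard deviation

# Standardize: subtract the mean, divide by the standard deviation.
scaled = [(p - mu) / sigma for p in prices]

# Invert the transform to get prices back from scaled values.
restored = [s * sigma + mu for s in scaled]
```

In my script the same transform would be applied to y_train, with mu and sigma computed from the training split only and reused for y_test.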
I appreciate your help.