Help for converting sklearn model to PyTorch model

I want to implement a multi-class classifier for categorizing sentences into class labels 0, 1 and 2 using PyTorch. The input to the neural network will be an average of the word embeddings (vectors of 300 dimensions) of all the words that form a sentence. Hence the input size is 300.
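(For reference, this is roughly how I build the sentence vectors; `word_vectors` here is just a placeholder for my embedding lookup, a dict from word to a 300-dim NumPy array.)

import numpy as np

def sentence_embedding(sentence, word_vectors):
    # average the 300-dim word vectors of all words in the sentence
    return np.mean([word_vectors[w] for w in sentence.split()], axis=0)  # shape: (300,)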

This is the sklearn code for the same which I found here: https://github.com/mdvu15/CS488-Senior-Capstone/blob/master/classifierTrain.py

(typing out the code snippet)
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X_train, X_test, y_train, y_test = train_test_split(V, y, test_size=0.25)  # 25% of data set aside for testing

mlp = MLPClassifier(hidden_layer_sizes=(500, 20, 20, 20), max_iter=1000, batch_size=32,
                    warm_start=True, early_stopping=True)  # classifier object

mlp.fit(X_train, y_train)

This is the PyTorch code I have so far:

import numpy as np
import torch

class Linear_Model(torch.nn.Module):

    def __init__(self):
        super().__init__()
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

        self.fc1 = torch.nn.Linear(300, 500).to(self.device)
        self.fc2 = torch.nn.Linear(500, 20).to(self.device)
        self.fc3 = torch.nn.Linear(20, 20).to(self.device)
        self.fc4 = torch.nn.Linear(20, 20).to(self.device)
        self.fc5 = torch.nn.Linear(20, 3).to(self.device)

        # activation functions (parameter-free, so no .to(device) needed)
        self.relu = torch.nn.ReLU()
        self.softmax = torch.nn.Softmax(dim=1)

    def forward(self, input, flag):
        fc_out = self.relu(self.fc1(input))
        fc_out = self.relu(self.fc2(fc_out))
        fc_out = self.relu(self.fc3(fc_out))
        fc_out = self.relu(self.fc4(fc_out))
        fc_out = self.softmax(self.fc5(fc_out))

        return fc_out

net = Linear_Model()

criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(net.parameters(), lr = 0.001)

# Training

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

for epoch in range(n_epochs):
    train_accuracy = []  # per-batch accuracies for one epoch
    train_losses = []    # per-batch losses for one epoch
    net.train()
    for inputs, labels in trainloader:
        inputs, labels = inputs.to(device), labels.to(device)
        output = net(inputs.float(), 0)
        loss = criterion(output.squeeze(), labels)
        train_losses.append(loss.item())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()  # stochastic gradient descent on mini-batches

        # calculate training accuracy
        pred = torch.argmax(output.squeeze(), dim=1)
        print(pred)
        correct_tensor = pred.eq(labels.view_as(pred))
        correct = np.squeeze(correct_tensor.cpu().numpy())  # .cpu() is a no-op on CPU tensors
        num_correct = np.sum(correct)
        train_accuracy += [num_correct / batch_size]

Is this code correct? If not, could the corrections (and any other suggestions) be pointed out? Thanks!

nn.CrossEntropyLoss expects raw logits, as F.log_softmax will be applied internally, so remove the softmax and rerun the code. :wink:
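I.e. something like this for your model (keeping your signature; only the last line changes):

def forward(self, input, flag):
    fc_out = self.relu(self.fc1(input))
    fc_out = self.relu(self.fc2(fc_out))
    fc_out = self.relu(self.fc3(fc_out))
    fc_out = self.relu(self.fc4(fc_out))
    return self.fc5(fc_out)  # raw logits; nn.CrossEntropyLoss applies log_softmax + NLLLoss internally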


Thank you so much for your reply! I will do that. Although, how do I calculate training accuracy from raw logits? Because the 3 values that are returned can be either positive or negative (I guess).

The outputs will represent logits, so torch.argmax(output, 1) will yield the same prediction as you would get using softmax.
The higher the logit, the higher the probability.
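A quick sketch to see this: softmax is monotonically increasing, so it preserves the ordering of the logits, and argmax picks the same index either way:

import torch

logits = torch.tensor([[ 1.2, -0.3,  0.5],
                       [-2.0,  0.1, -0.7]])
probs = torch.softmax(logits, dim=1)
print(torch.argmax(logits, dim=1))  # tensor([0, 1])
print(torch.argmax(probs, dim=1))   # tensor([0, 1])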

Done. The model is converging much faster now. Thank you so much!