The data in my train.csv looks like this:
[
[0, 0,……, 0],
[0, 1,……, 0],
[0, 2,……, 0],
[0, 3,……, 0],
[1, 0,……, 0],
[2, 0,……, 0],
[3, 0,……, 0],
[1, 1,……, 1],
[2, 1,……, 1],
[1, 2,……, 1],
[3, 1,……, 1],
]
The training set has 800,000 rows and the test set has 20,000 rows.
The values of y_train are only 0 and 1, and the CSV has more than 150 columns.
I want to use PyTorch to predict the probabilities of 0 and 1 for x_test. How can I do this?
Here is my code:
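import pandas as pd
train = pd.read_csv(r'train.csv')
x_test = pd.read_csv(r'test.csv')
# A DataFrame needs .iloc for positional slicing; train[:, :-1] raises an error
x_train = train.iloc[:, :-1]  # all columns except the last (features)
y_train = train.iloc[:, -1]   # last column (0/1 label)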
I want two columns of probabilities: the first column is the probability of 0 and the second column is the probability of 1, like this:
[[0.23334,0.76267]
……
[0.84984,0.15685]
[0.16663,0.83291]]
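For reference, a two-column output of this kind is usually obtained by applying softmax to a pair of raw scores per row, which makes each row non-negative and sum to 1. A minimal NumPy sketch of that step, where the `softmax` helper and the `logits` values are made up for illustration and are not taken from the data above:

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax: turns raw scores into probabilities that sum to 1."""
    # Subtracting the row maximum avoids overflow in exp() without changing the result.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

# Hypothetical raw scores for three samples: column 0 is class 0, column 1 is class 1.
logits = np.array([
    [-0.6, 0.6],
    [1.2, -0.5],
    [-0.8, 0.8],
])

probs = softmax(logits)
print(probs)  # shape (3, 2); each row sums to 1
```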
With Keras, I can do it like this:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense
from keras.utils import to_categorical
from sklearn.preprocessing import LabelEncoder

# Toy training data: two feature columns, label (0 or 1) in the last column
data_train = np.array([
    [0, 0, 0],
    [0, 1, 0],
    [0, 2, 0],
    [0, 3, 0],
    [1, 0, 0],
    [2, 0, 0],
    [3, 0, 0],
    [1, 1, 1],
    [2, 1, 1],
    [1, 2, 1],
    [3, 1, 1],
])
# Toy test data: features only
data_test = np.array([
    [1, 3],
    [0, 4],
    [5, 0]
])
x_train = data_train[:, :-1]
y_train = data_train[:, -1]
x_test = data_test

# Encode the labels as integers, then one-hot encode them to shape (n, 2)
encoder = LabelEncoder()
encoded_y = encoder.fit_transform(y_train)
y_train = to_categorical(encoded_y)

# Fully connected network ending in a 2-unit softmax layer
model = Sequential()
model.add(Dense(512, activation='relu', input_dim=2))
model.add(Dense(200, activation='relu'))
model.add(Dense(200, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dense(2, activation='softmax'))
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['binary_accuracy'])
model.fit(x_train, y_train, epochs=5, batch_size=1, verbose=1)

# predict() returns the softmax output: one row per sample, one column per class
predict = model.predict(x_test, batch_size=1)
print(predict)
Result:
[[0.46373594 0.53626406]
[0.99079037 0.00920963]
[0.72976 0.27024 ]]
But how can I do the same thing with PyTorch?
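One possible way to mirror the Keras model, given as a minimal sketch rather than a definitive answer. It assumes the same toy arrays as above, keeps the final layer as raw logits, trains with `nn.CrossEntropyLoss` (which takes integer class labels, so no one-hot encoding is needed), and applies softmax only at prediction time. The learning rate, batch size, and seed are arbitrary choices for illustration:

```python
import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)  # arbitrary seed, only to make the sketch repeatable

# Toy data from the question: two feature columns, label in the last column.
data_train = np.array([
    [0, 0, 0], [0, 1, 0], [0, 2, 0], [0, 3, 0],
    [1, 0, 0], [2, 0, 0], [3, 0, 0],
    [1, 1, 1], [2, 1, 1], [1, 2, 1], [3, 1, 1],
])
data_test = np.array([[1, 3], [0, 4], [5, 0]])

x_train = torch.tensor(data_train[:, :-1], dtype=torch.float32)
y_train = torch.tensor(data_train[:, -1], dtype=torch.long)  # class indices 0/1
x_test = torch.tensor(data_test, dtype=torch.float32)

# Same layer sizes as the Keras model; the last layer outputs raw logits.
model = nn.Sequential(
    nn.Linear(2, 512), nn.ReLU(),
    nn.Linear(512, 200), nn.ReLU(),
    nn.Linear(200, 200), nn.ReLU(),
    nn.Linear(200, 128), nn.ReLU(),
    nn.Linear(128, 2),
)

loss_fn = nn.CrossEntropyLoss()  # applies log-softmax internally
optimizer = torch.optim.RMSprop(model.parameters(), lr=1e-3)
loader = DataLoader(TensorDataset(x_train, y_train), batch_size=4, shuffle=True)

for epoch in range(5):
    model.train()
    for xb, yb in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()

# Softmax over dim=1 turns the two logits per row into [P(0), P(1)].
model.eval()
with torch.no_grad():
    predict = torch.softmax(model(x_test), dim=1).numpy()
print(predict)
```

For the real data, the first `nn.Linear` would take the actual number of feature columns instead of 2, and with 800,000 rows a much larger `batch_size` than in this toy example is probably more practical.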