[Beginner] Data loading, weights initialization

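import random

import torch
from torch.autograd import Variable
from torch.utils.data import DataLoader, TensorDataset

training_samples = TensorDataset(X_train, y_train)
test_samples = TensorDataset(X_test, y_test)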

train_loader = DataLoader(training_samples, batch_size=64, shuffle=True)
valid_loader = DataLoader(test_samples, batch_size=64, shuffle=True)

class DynamicNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):

        super(DynamicNet, self).__init__()
        self.input_linear = torch.nn.Linear(D_in, H)
        self.middle_linear = torch.nn.Linear(H, H)
        self.output_linear = torch.nn.Linear(H, D_out)
        self.init_hidden()
        self.init_weights()


    def forward(self, x):

        h_relu = self.input_linear(x).clamp(min=0)
        for _ in range(random.randint(0, 3)):
            h_relu = self.middle_linear(h_relu).clamp(min=0)
        y_pred = self.output_linear(h_relu)
        return y_pred
      

N, D_in, H, D_out = 64, 30, 100, 19

model = DynamicNet(D_in, H, D_out)

loss_fn = torch.nn.MSELoss(size_average=False)

learning_rate = 1e-4
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

for epoch in range(2):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(train_loader, 0):
        # get the inputs
        inputs, labels = data

        # wrap them in Variable
        inputs, labels = Variable(inputs), Variable(labels)

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = model(inputs)
        loss = loss_fn(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.data[0]
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')

Hey, I’m a beginner in deep learning. I have experience with scikit-learn and am moving to PyTorch for deep learning. I tried running this code but I’m getting the error below. Could anyone point me to what I should do?

TypeError: torch.addmm received an invalid combination of arguments - got (int, torch.FloatTensor, int, torch.DoubleTensor, torch.FloatTensor, out=torch.FloatTensor), but expected one of:
 * (torch.FloatTensor source, torch.FloatTensor mat1, torch.FloatTensor mat2, *, torch.FloatTensor out)
 * (torch.FloatTensor source, torch.SparseFloatTensor mat1, torch.FloatTensor mat2, *, torch.FloatTensor out)
 * (float beta, torch.FloatTensor source, torch.FloatTensor mat1, torch.FloatTensor mat2, *, torch.FloatTensor out)
 * (torch.FloatTensor source, float alpha, torch.FloatTensor mat1, torch.FloatTensor mat2, *, torch.FloatTensor out)
 * (float beta, torch.FloatTensor source, torch.SparseFloatTensor mat1, torch.FloatTensor mat2, *, torch.FloatTensor out)
 * (torch.FloatTensor source, float alpha, torch.SparseFloatTensor mat1, torch.FloatTensor mat2, *, torch.FloatTensor out)
 * (float beta, torch.FloatTensor source, float alpha, torch.FloatTensor mat1, torch.FloatTensor mat2, *, torch.FloatTensor out)
      didn't match because some of the arguments have invalid types: (int, torch.FloatTensor, int, torch.DoubleTensor, torch.FloatTensor, out=torch.FloatTensor)
 * (float beta, torch.FloatTensor source, float alpha, torch.SparseFloatTensor mat1, torch.FloatTensor mat2, *, torch.FloatTensor out)
      didn't match because some of the arguments have invalid types: (int, torch.FloatTensor, int, torch.DoubleTensor, torch.FloatTensor, out=torch.FloatTensor)

My goal is to do what one usually does in scikit-learn: fit and predict. I just want a single hidden layer with 100 neurons. The output isn’t a one-hot vector; it is one of 19 classes labelled 0, 1, 2 … 18. Thanks in advance.

Hi,

From the error message, the problem is that some operation is mixing single precision and double precision numbers.
If you did not change the default tensor type, your network is in single precision. Is your dataset in double precision?
If so, do one of the two things below, depending on what you want.
To change the dataset to single precision, do X_train, y_train = X_train.float(), y_train.float(). To change the model to double precision, do model = model.double().
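A minimal sketch of the two options above, assuming a recent PyTorch release and using a random NumPy array as a hypothetical stand-in for the real features (NumPy defaults to float64, which is a common source of this mismatch):

```python
import numpy as np
import torch

# NumPy arrays default to float64, so tensors built from them are double precision.
X_np = np.random.rand(8, 30)        # hypothetical stand-in for the real features
X_train = torch.from_numpy(X_np)    # dtype: torch.float64

# Layer parameters use the default tensor type, torch.float32 unless changed.
model = torch.nn.Linear(30, 100)

# Option 1: cast the data down to single precision.
X_single = X_train.float()
out_single = model(X_single)

# Option 2: cast a model up to double precision and feed it the double data.
model_double = torch.nn.Linear(30, 100).double()
out_double = model_double(X_train)

print(X_train.dtype, out_single.dtype, out_double.dtype)
```

Either option works as long as the inputs and the parameters end up with the same dtype. Single precision uses half the memory, which is why it is the default.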

Hey,
That seemed to be the problem. I switched to model.double(), which cleared one hurdle, but now I’ve hit another: the way I’m passing variables into loss_fn seems to be wrong. Can you help me out?

class DynamicNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):

        super(DynamicNet, self).__init__()
        self.input_linear = torch.nn.Linear(D_in, H)
        self.middle_linear = torch.nn.Linear(H, H)
        self.output_linear = torch.nn.Linear(H, D_out)


    def forward(self, x):

        h_relu = self.input_linear(x).clamp(min=0)
        for _ in range(np.random.randint(0, 3)):
            h_relu = self.middle_linear(h_relu).clamp(min=0)
        y_pred = self.output_linear(h_relu)
        return y_pred
      

N, D_in, H, D_out = 64, 30, 100, 19

# model = torch.nn.Sequential(
#     torch.nn.Linear(D_in, H),
#     torch.nn.ReLU(),
#     torch.nn.Linear(H, D_out),
# )

model = DynamicNet(D_in, H, D_out)
model = model.double()

loss_fn = torch.nn.MSELoss(size_average=False)

learning_rate = 1e-4
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

for epoch in range(2):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(train_loader, 0):
        # get the inputs
        inputs, labels = data

        # wrap them in Variable
        inputs, labels = Variable(inputs), Variable(labels)

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = model(inputs)
        loss = loss_fn(outputs, labels)
        loss.backward()
        optimizer.step()

        # print statistics
        running_loss += loss.data[0]
        if i % 2000 == 1999:    # print every 2000 mini-batches
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')

That’s the updated script, and the error is:

TypeError: DoubleMSECriterion_updateOutput received an invalid combination of arguments - got (int, torch.DoubleTensor, torch.LongTensor, torch.DoubleTensor, bool), but expected (int state, torch.DoubleTensor input, torch.DoubleTensor target, torch.DoubleTensor output, bool sizeAverage)

Yes,
In the error message, it says that it expects something like (int, torch.DoubleTensor, torch.LongTensor, torch.DoubleTensor, bool) and got (int state, torch.DoubleTensor input, torch.DoubleTensor target, torch.DoubleTensor output, bool sizeAverage). You can see that the argument called target was given as a torch.DoubleTensor while it was expecting a torch.LongTensor. Indeed, the labels for the MSELoss should be indices (integer values) and thus be passed as LongTensors.
In your case, you can change all labels of your dataset by doing y_train = y_train.long() at the beginning of your script. Or if you want to keep it as double precision floating point number to be used somewhere else, you can change it at the last moment with loss = loss_fn(outputs, labels.long()) as well.

EDIT: I read the error message in the wrong order. The target is currently a LongTensor but MSELoss expects a DoubleTensor here, so use y_train = y_train.double(). Sorry.

I tried that by converting y_train with y_train = y_train.long(), but I’m still getting the same error, with torch.LongTensor highlighted in red.

I read the error message too quickly, sorry. I edited my message above.

That solved that issue, thanks once again.

I’m back with another error. I know what’s causing it:

RuntimeError: input and target have different number of elements: input[64 x 19] has 1216 elements, while target[64] has 64 elements at /opt/conda/conda-bld/pytorch_1503965122592/work/torch/lib/THNN/generic/MSECriterion.c:12

The message helped me find the problem: my outputs are of size 64x19, while my labels are of size 64. I know the mistake is in my declaration. Can you help me find where it is, so that outputs is of size 64?

The code is in my post above. Thanks in advance!

I think the problem is that the loss you’re using does not match what you want to do.
From your previous message, it seems like you want to do multiclass classification and that labels holds the index of the class.
The loss you’re using is mean squared error, which just compares the values of each element with a 2-norm. In that case, both inputs need to have the same type and the same size.
For multiclass classification, the classical loss is cross entropy loss, which is a combination of a softmax and a negative log likelihood loss. This loss takes two inputs. The first is the scores for each class, of floating point type (FloatTensor or DoubleTensor) and of size Batch x nb_classes. The second is the label, which has to be of integer type (LongTensor here) and of size Batch, containing the index of the correct class with values in [0, nb_classes-1].
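The shapes and types described above can be sketched as follows. This assumes a recent PyTorch release, uses random tensors as hypothetical stand-ins for one mini-batch, and uses a plain one-hidden-layer network rather than the DynamicNet from the question:

```python
import torch

torch.manual_seed(0)
batch_size, D_in, H, nb_classes = 64, 30, 100, 19

# Hypothetical random data standing in for one mini-batch.
inputs = torch.randn(batch_size, D_in)                 # float32 features
labels = torch.randint(0, nb_classes, (batch_size,))   # int64 class indices in [0, 18]

# One hidden layer of 100 units; the output layer emits raw scores (no softmax).
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, nb_classes),
)

# CrossEntropyLoss applies log-softmax and negative log likelihood internally.
loss_fn = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# One training step.
optimizer.zero_grad()
scores = model(inputs)            # shape: (64, 19), floating point
loss = loss_fn(scores, labels)    # labels shape: (64,), integer
loss.backward()
optimizer.step()

print(scores.shape, labels.shape, loss.item())
```

Note that the network output stays at size Batch x nb_classes. The labels are never one-hot encoded, and the scores are never reduced to size Batch before the loss.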

Thanks a ton!! That actually worked. My model got trained; at least it printed Finished Training. I didn’t know that a multiclass classification problem should be approached with CrossEntropyLoss.

Before leaving I have a few more questions. I hope I’m not asking too much.

Since I’m writing a custom net, I now need to test it on the validation set, which looks like this:

test_samples = TensorDataset(X_test, y_test)
valid_loader = DataLoader(test_samples, batch_size=64, shuffle=True)

But I tried it straight away with

# pass it through the model
prediction = model(X_test)

# get the result out and reshape it
cpu_pred = prediction.cpu()
result = cpu_pred.data.numpy()
print(result)

as shown here.

I’m getting the following error:

RuntimeError: save_for_backward can only save input or output tensors, but argument 0 doesn't satisfy this condition

How should I proceed to make predictions for my X_test? If there are any other links, those would help me too. Again, thanks in advance.

The error message is not really helpful here.
I guess the problem is that X_test is a Tensor and not a Variable?
Note that when you do not need gradients, you can create Variables with X_test_var = Variable(X_test, volatile=True) to notify the autograd engine that no gradient will be computed (and thus it can do some extra optimizations).
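For readers on PyTorch 0.4 or later, where Tensors and Variables were merged and the volatile flag was removed, the same idea is expressed with torch.no_grad(). The sketch below assumes such a version and uses a hypothetical random X_test and an untrained stand-in model. It also takes the argmax over the class dimension, since the network outputs one score per class rather than a label:

```python
import torch

# Untrained stand-in for the trained network (30 inputs, 19 classes).
model = torch.nn.Sequential(
    torch.nn.Linear(30, 100),
    torch.nn.ReLU(),
    torch.nn.Linear(100, 19),
)
X_test = torch.randn(10, 30)   # hypothetical test features

model.eval()                   # switch layers such as dropout to inference mode
with torch.no_grad():          # no graph is built, so no gradients are tracked
    scores = model(X_test)             # shape: (10, 19)
    predicted = scores.argmax(dim=1)   # index of the highest-scoring class per row

# Move to CPU and convert to a NumPy array, as in the question.
result = predicted.cpu().numpy()
print(result)
```

The result array holds one class index per test sample, which is the closest analogue to what predict returns in scikit-learn.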

Thanks, that actually solved the issue.

Yeah, it was my mistake: I passed a Tensor instead of a Variable. Thanks also for the tip regarding optimization, I will keep it in mind. The error is gone now.