RuntimeError: expected scalar type Double but found Float when training a GRU

I got this error:
RuntimeError: expected scalar type Double but found Float
even though my input is already torch.float64 and so are my model parameters. Why does this happen? Can anyone explain?

train_df = df_final.filter(col('TIMESTEP')<90).toPandas()
test_df = df_final.filter(col('TIMESTEP')>=90).toPandas()

model = GRUModel(input_dim, hidden_dim, num_layers, output_dim).double()
weights = None

# convert model parameters to double
for param in model.parameters():
    param.data = param.data.double()

for timestep in sorted(train_df['TIMESTEP'].unique()):
    
    print('FITTING TIMESTEP: ' + str(timestep))
    
    train = train_df[train_df['TIMESTEP']==timestep]
    train_X = train[X].astype(np.float64)
    train_y = train[y].astype(np.float64)
    
    scaler_train_std = StandardScaler()  
    scaler_train_minmax = MinMaxScaler()
    scaler_train_y_minmax = MinMaxScaler()
    
    train_X[scaling_mapping['std']] = scaler_train_std.fit_transform(train_X[scaling_mapping['std']])
    train_X[scaling_mapping['MinMax']] = scaler_train_minmax.fit_transform(train_X[scaling_mapping['MinMax']])
    train_y[y] = scaler_train_y_minmax.fit_transform(train_y[y])
    
    data_X_tensor = torch.tensor(train_X.values, dtype=torch.double)
    data_y_tensor = torch.tensor(train_y.values, dtype=torch.double)
    X_tensor = data_X_tensor.view(-1, 3, len(train_X.columns))
    y_tensor = data_y_tensor.view(-1, 3, len(train_y.columns))
    
    # creating the model
    
    # defining the loss function and optimizer
    criterion = nn.MSELoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

    # training the model
    for epoch in range(num_epochs):
        # forward pass
        
        outputs = model(X_tensor)
        loss = criterion(outputs[:, -1, :], y_tensor[:, -1, :])
        
        # backward pass and parameter update
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        
        # print the loss
        if (epoch+1) % 10 == 0:
            print(f'Epoch {epoch+1}/{num_epochs}, Loss: {loss.item():.4f}')
    
    weights = model.state_dict()

As you can see, I added several redundant lines trying to solve the problem, but they don't help.

Could you post a minimal and executable code snippet reproducing the issue, please?
I don't see any obvious issues in your code, but the model definition and other parts are missing, so I cannot debug it.
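For example, something self-contained along these lines (the sizes and shapes are placeholders) that I could copy, run, and see the error with:

import torch
import torch.nn as nn

# paste your actual GRUModel definition here
model = GRUModel(4, 8, 1, 1).double()          # placeholder sizes
x = torch.randn(2, 3, 4, dtype=torch.double)   # (batch, seq_len, features)
out = model(x)                                 # the call that should reproduce the error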

Hi,
thanks for replying. Here's the full traceback:

RuntimeError                              Traceback (most recent call last)
File <command-3581638359988375>:41
     37 # training the model
     38 for epoch in range(num_epochs):
     39     # forward pass
---> 41     outputs = model(X_tensor)
     42     loss = criterion(outputs[:, -1, :], y_tensor[:, -1, :])
     44     # backward pass and parameter update

File /databricks/python/lib/python3.8/site-packages/torch/nn/modules/module.py:1051, in Module._call_impl(self, *input, **kwargs)
   1047 # If we don't have any hooks, we want to skip the rest of the logic in
   1048 # this function, and just call forward.
   1049 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051     return forward_call(*input, **kwargs)
   1052 # Do not call functions when jit is used
   1053 full_backward_hooks, non_full_backward_hooks = [], []

File <command-3296532906983130>:22, in GRUModel.forward(self, x)
     20 h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size)#.to(device) 
     21 # Forward propagate 
---> 22 out, _ = self.gru(x, h0)  # out: tensor of shape (batch_size, seq_length, hidden_size)
     23 # Decode the hidden state of the last time step
     24 out = self.fc(out[:,-1,:])

File /databricks/python/lib/python3.8/site-packages/torch/nn/modules/module.py:1051, in Module._call_impl(self, *input, **kwargs)
   1047 # If we don't have any hooks, we want to skip the rest of the logic in
   1048 # this function, and just call forward.
   1049 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051     return forward_call(*input, **kwargs)
   1052 # Do not call functions when jit is used
   1053 full_backward_hooks, non_full_backward_hooks = [], []

File /databricks/python/lib/python3.8/site-packages/torch/nn/modules/rnn.py:837, in GRU.forward(self, input, hx)
    835 self.check_forward_args(input, hx, batch_sizes)
    836 if batch_sizes is None:
--> 837     result = _VF.gru(input, hx, self._flat_weights, self.bias, self.num_layers,
    838                      self.dropout, self.training, self.bidirectional, self.batch_first)
    839 else:
    840     result = _VF.gru(input, batch_sizes, hx, self._flat_weights, self.bias,
    841                      self.num_layers, self.dropout, self.training, self.bidirectional)

and here’s the model:

input_dim = len(X) # number of input features
output_dim = 1
hidden_dim = 20 # number of hidden units in GRU
num_layers = 1 # number of GRU layers
num_epochs = 50 # number of training epochs
learning_rate = 0.001 # learning rate of the optimizer

# defining the GRU model
class GRUModel(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, output_size):
        super(GRUModel, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.output_size = output_size
        self.gru = nn.GRU(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        # Set initial hidden and cell states 
        h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size)#.to(device) 
        # Forward propagate 
        out, _ = self.gru(x, h0)  # out: tensor of shape (batch_size, seq_length, hidden_size)
        # Decode the hidden state of the last time step
        out = self.fc(out[:,-1,:])
        #out = out.squeeze(dim=1)
        return out

Thanks for helping me!

Thanks for the code!
Your model works for me (with the default float32 dtype for both the model and the input):

model = GRUModel(1, 1, 1, 1)
x = torch.randn(1, 1, 1)
out = model(x)
print(out)
# tensor([[0.6389]], grad_fn=<AddmmBackward0>)

OK, thanks.
I'm working on Databricks; do you think the error could be related to some dependency or package version?

No, I don't think so, but feel free to add the missing parts to make your code executable so that I can debug it.

Honestly, there's no other code to add, apart from the preprocessing that produces the final input dataframe. If I check the input tensor's dtype I get float64, and the same for the parameters, so I don't know why I keep getting that error.
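This is roughly how I'm checking (a quick sketch):

print(X_tensor.dtype)                         # torch.float64
print({p.dtype for p in model.parameters()})  # {torch.float64}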

Your code is not executable, so parts are obviously missing.
In any case, the initialization of h0 is wrong: torch.zeros creates it as float32 by default, while your model and input are float64.
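Creating h0 with the input's dtype and device should fix it, e.g. (a minimal sketch of your forward):

def forward(self, x):
    # create h0 with the same dtype and device as the input, so the
    # .double() model no longer receives a float32 hidden state
    h0 = torch.zeros(self.num_layers, x.size(0), self.hidden_size,
                     dtype=x.dtype, device=x.device)
    out, _ = self.gru(x, h0)  # out: (batch_size, seq_length, hidden_size)
    out = self.fc(out[:, -1, :])
    return out

Alternatively, you can drop h0 entirely: nn.GRU initializes the hidden state to zeros with the input's dtype and device when none is passed.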