Did you check the type of your gradients? My guess is that they are doubles (`torch.float64`). If that is the case, you can cast your model to single precision with `model.float()` so that the gradients come out as the expected type.
When you are training your model, after you call `loss.backward()` but before you call `optimizer.step()`, you can inspect the gradients of any parameter of the model that you are interested in.
You can do this by calling `model.named_parameters()`. This returns an iterator over `(name, parameter)` pairs, which you can loop through to find any of the model's parameters you would like to look at. Once you have a parameter tensor, access its `.grad` attribute to view the gradient, and check `.grad.dtype` to see whether your gradients are doubles or floats.
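A minimal sketch of this inspection loop, using a small made-up model (the `nn.Linear` layer and dummy data are just placeholders for illustration):

```python
import torch
import torch.nn as nn

# Hypothetical tiny model and training setup for illustration.
model = nn.Linear(4, 2)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(8, 4)
y = torch.randn(8, 2)

optimizer.zero_grad()
loss = criterion(model(x), y)
loss.backward()  # gradients are populated here

# Inspect each parameter's gradient dtype before optimizer.step().
for name, param in model.named_parameters():
    print(name, param.grad.dtype)  # e.g. torch.float32

optimizer.step()
```

If the printed dtype is `torch.float64`, your parameters (and therefore their gradients) are doubles.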
Alternatively, you can just call `model = model.float()` on your model before training, but this may raise a dtype-mismatch error if your input data is still double precision, so you may need to cast your inputs as well.
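A short sketch of that fix, assuming a model that somehow ended up in double precision:

```python
import torch
import torch.nn as nn

model = nn.Linear(4, 2).double()  # suppose the model ended up as float64
model = model.float()             # cast parameters (and buffers) to float32

x = torch.randn(8, 4, dtype=torch.float64)  # double-precision input
out = model(x.float())            # cast the input too, or PyTorch raises a dtype mismatch
print(out.dtype)                  # torch.float32
```

Casting both the model and its inputs keeps everything in `float32` end to end, so the gradients will be floats as well.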