Trained ASR model on the GPU and trying to evaluate but getting the error "RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same" - how do I move the data to the GPU?

I’m trying to evaluate an ASR model that I trained on a GPU. From what I’ve read, I also need to move the data to the GPU for evaluation, but I’m struggling to do that. I’m evaluating the model by decoding the test data with pyctcdecode. The code I’m trying to implement is here. Specifically, the code below:

import soundfile as sf
arr, _ = sf.read('1919-142785-0028.wav')
input_values = asr_processor(arr, return_tensors="pt", sampling_rate=16000).input_values  # Batch size 1
logits = asr_model(input_values).logits.cpu().detach().numpy()[0]

When I run the last line of code I get the following error:

RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

After some research, I think I have to move the data to the GPU as well, but that’s where I’m struggling. I’ve tried the following:

logits = model(input_values.to("cuda")).logits.cpu().detach().numpy()[0]
logits = model(input_values.to("cuda")).logits

That runs without producing the error, but all the values of the logits are NaN, so I feel like I’m still not doing it correctly.

array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       ...,
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan]], dtype=float32)

From the error message, it looks like your input is already on GPU, but your model is on CPU? What happens if you do model = model.to('cuda') before running inference?
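
To illustrate the suggestion, here is a minimal sketch of keeping the weights and the input on the same device before running inference. It is not your actual setup: the small `nn.Conv1d` is a hypothetical stand-in for your trained ASR model, and the random tensor is a stand-in for the `input_values` returned by your processor. The final check is only a rough way to see whether NaNs could already be present in the weights before the forward pass.

```python
import torch
from torch import nn

# Hypothetical stand-in for the trained ASR model; any nn.Module moves the same way.
model = nn.Conv1d(in_channels=1, out_channels=4, kernel_size=3)

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Move the weights to the chosen device and switch to evaluation mode.
model = model.to(device)
model.eval()

# Dummy one-second batch at 16 kHz, standing in for input_values (an assumption).
input_values = torch.randn(1, 1, 16000).to(device)

# No gradients are needed for evaluation.
with torch.no_grad():
    logits = model(input_values)

# Weights and input should report the same device; a mismatch raises the RuntimeError.
weight_device = next(model.parameters()).device
print("weights on:", weight_device, "| input on:", input_values.device)

# Rough check: do any of the weights already contain NaN?
has_nan_weights = any(torch.isnan(p).any().item() for p in model.parameters())
print("NaN in weights:", has_nan_weights)

# Bring the result back to the CPU as a NumPy array, as in the original snippet.
logits_np = logits.cpu().numpy()[0]
```

If the weights check reports NaN on your real model, the problem is probably in the trained checkpoint rather than in how the data is moved, but that is only a guess without seeing the training run.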