I’m trying to evaluate an ASR model that I trained on a GPU. From what I’ve read, to evaluate it I also need to move the data to the GPU, but I’m struggling to do that. I’m evaluating the model by decoding the test data with pyctcdecode. The code I’m trying to implement is here. Specifically, the code below:
import soundfile as sf
arr, _ = sf.read('1919-142785-0028.wav')
input_values = asr_processor(arr, return_tensors="pt", sampling_rate=16000).input_values # Batch size 1
logits = asr_model(input_values).logits.cpu().detach().numpy()[0]
When I run the last line of code I get the following error:
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same
After some research, I think I have to move the data to the GPU as well, but that’s where I’m struggling. I’ve tried the following:
logits = model(input_values.to("cuda")).logits.cpu().detach().numpy()[0]
logits = model(input_values.to("cuda")).logits
That runs without producing the error, but all the values of the logits are NaN, so I feel like I’m still not doing it correctly.
array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       ...,
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan]], dtype=float32)
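For reference, here is a minimal sketch of the device handling the error message is about: the model weights and the input tensor have to live on the same device before the forward pass. It is not a confirmed fix for the NaN values. `TinyAcousticModel` is a hypothetical stand-in for `asr_model` so the snippet runs without downloading any weights, the random array stands in for the audio returned by `sf.read`, and unlike the real model the stand-in returns the logits tensor directly rather than an object with a `.logits` attribute.

```python
import numpy as np
import torch
from torch import nn


class TinyAcousticModel(nn.Module):
    """Hypothetical stand-in for asr_model (an assumption, not the real network)."""

    def __init__(self, vocab_size=32):
        super().__init__()
        self.conv = nn.Conv1d(1, 8, kernel_size=400, stride=320)
        self.head = nn.Linear(8, vocab_size)

    def forward(self, input_values):
        # (batch, samples) -> (batch, 1, samples) -> (batch, frames, 8)
        x = self.conv(input_values.unsqueeze(1)).transpose(1, 2)
        # (batch, frames, 8) -> (batch, frames, vocab_size)
        return self.head(torch.relu(x))


# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Move the weights to the chosen device and switch to inference mode.
model = TinyAcousticModel().to(device).eval()

# One second of fake 16 kHz audio; float64, like sf.read returns by default.
arr = np.random.default_rng(0).standard_normal(16000)

# Cast to float32, add a batch dimension, and move to the same device as the model.
input_values = torch.from_numpy(arr).float().unsqueeze(0).to(device)

# No gradients are needed for evaluation.
with torch.no_grad():
    logits = model(input_values)

# Check for NaNs before leaving the device, then convert for the decoder.
has_nan = torch.isnan(logits).any().item()
logits_np = logits.cpu().numpy()[0]
print(logits_np.shape, has_nan)
```

If a check like `has_nan` is still true with the real model once both sides are on the same device, the device placement is probably not the cause. Running the same check on the input array and on the model parameters should show where the NaNs first appear.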