Hello Juan!
First of all, thank you for taking the time to read and reply; I truly appreciate it.
After reading your response, I changed my program so that, instead of predicting, it prints whether the extracted features contain any NaN.
I have 2 folders, ‘train’ and ‘val’, each with 2 folders inside, one for each class (in my case ‘nv’ and ‘v’, which stand for not violent and violent, as I’m using the models to determine whether a given scene is violent). Each of these folders has 50 videos, so 200 videos in total.
This whole situation occurred with the videos in the ‘train’ folder, and since the videos were randomly selected, I tested all 100 of them, from both the ‘nv’ and ‘v’ folders, extracting their features with the following code:
import librosa
import librosa.display
import numpy as np

def audio_feature_extractor(video_name):
    # get_img_from_fig is my own helper, defined elsewhere in the program
    y, sr = librosa.load(video_name)  # load file
    mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20, norm='ortho')  # extract mfccs
    mfccs = mfccs / np.linalg.norm(mfccs)  # normalize array (20,x)
    quadmesh = librosa.display.specshow(mfccs)  # convert array to quadmesh
    fig = quadmesh.get_figure()  # get figure from quadmesh
    inputs = get_img_from_fig(fig)  # get img from figure (3,224,224)
    inputs = inputs / 255  # divide the image values by 255
    return inputs
I then checked whether the resulting tensor contained any NaNs with the following code:
# AUDIO ANALYSIS
audio_features = audio_feature_extractor(video_name)
print(torch.isnan(audio_features).any())
Unfortunately, none of the clips produced any NaNs, so this does not appear to be the problem here.
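For completeness, here is a minimal sketch of a slightly stricter check that could be applied to the intermediate MFCC array as well as to the final tensor. The helper name `has_bad_values` is hypothetical and not part of my program, and the all-zero array is synthetic test data, not real MFCC output. It flags infinities in addition to NaNs and only needs NumPy:

```python
import numpy as np

def has_bad_values(array):
    """Return True if the array contains any NaN or infinite value."""
    values = np.asarray(array, dtype=np.float64)
    return not np.isfinite(values).all()

# Synthetic example (an assumption for illustration): dividing an all-zero
# array by its norm (0.0) gives 0/0, which NumPy evaluates to NaN.
zeros = np.zeros((20, 5))
with np.errstate(invalid="ignore", divide="ignore"):
    normalized = zeros / np.linalg.norm(zeros)

print(has_bad_values(normalized))        # True
print(has_bad_values(np.ones((20, 5))))  # False
```

Running the same kind of check right after the normalization line would show whether anything goes wrong before the array is turned into an image.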
If you have any idea what else might be causing this, I’d love to test it, since I’ve been stuck on this problem for the last 2 weeks.
Kind Regards,
Francisco