Hello,
I am extracting MFCC features with librosa and saving them as PNG images with matplotlib, to use for audio classification. However, saving these images takes a long time.
Also, when I use the pre-trained weights of Efficient-Net_v2_m, I get very high accuracy both during validation and on unseen data.
I have two questions:
Q1- How can I save the MFCC images faster?
This is how I am generating the MFCC images:
import librosa
import librosa.display
import matplotlib.pyplot as plt

y, sr = librosa.load('Down/test.wav')
MFCC = librosa.feature.mfcc(y=y, sr=sr)
img = librosa.display.specshow(MFCC)  # plot the MFCC matrix
plt.axis('off')
plt.savefig("Down/out.png", bbox_inches='tight', pad_inches=0)
plt.show()
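On Q1, one direction that might help, shown here only as a minimal sketch: the snippet above builds a matplotlib figure for every file, and `plt.show()` also renders it on screen. Writing the MFCC array directly with `plt.imsave` avoids the figure altogether. The helper name `save_mfcc_png`, the random stand-in array, and the `viridis` colormap are assumptions for illustration. The resulting PNG has one pixel per coefficient and frame, so it will not look identical to the `specshow` output and would probably need resizing before being fed to the network.

```python
import os
import tempfile

import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window is opened
import matplotlib.pyplot as plt


def save_mfcc_png(mfcc, path, cmap="viridis"):
    """Write a 2-D MFCC array to a PNG, one pixel per (coefficient, frame).

    Hypothetical helper, not part of librosa or matplotlib.
    """
    # imsave maps the array through the colormap and writes the file
    # directly, without creating a Figure or Axes.
    plt.imsave(path, mfcc, cmap=cmap, origin="lower")


# Stand-in for the output of librosa.feature.mfcc (assumed shape:
# 20 coefficients x 87 frames), so the sketch runs without an audio file.
rng = np.random.default_rng(0)
fake_mfcc = rng.normal(size=(20, 87))

out_path = os.path.join(tempfile.mkdtemp(), "out.png")
save_mfcc_png(fake_mfcc, out_path)

saved = plt.imread(out_path)
print(saved.shape)  # (height, width, channels) of the written image
```

If the `specshow` rendering has to be kept as is, dropping `plt.show()` and calling `plt.close()` after each `plt.savefig` may already reduce the time per file, though I have not measured that.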
I then split the images into training and testing sets for k-fold cross-validation, and keep a separate unseen set for checking the accuracy of the model at the end.
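To make the intended split concrete, here is a minimal sketch of the scheme described above, not the actual splitting code. The function name `split_unseen_and_folds`, the 10% unseen fraction, and the sample count are assumptions for illustration. The unseen indices are removed first, so they can never appear in any fold.

```python
import random


def split_unseen_and_folds(n_samples, n_folds=10, unseen_fraction=0.1, seed=0):
    """Return (unseen_indices, [(train_indices, test_indices), ...])."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    # Hold out the unseen set before building any fold.
    n_unseen = int(n_samples * unseen_fraction)
    unseen = indices[:n_unseen]
    remaining = indices[n_unseen:]

    folds = []
    for k in range(n_folds):
        # Every n_folds-th remaining sample, starting at k, is the test part.
        test_idx = remaining[k::n_folds]
        test_set = set(test_idx)
        train_idx = [i for i in remaining if i not in test_set]
        folds.append((train_idx, test_idx))
    return unseen, folds


unseen, folds = split_unseen_and_folds(n_samples=1000)
for train_idx, test_idx in folds:
    # No sample may be in both the train and test part of a fold,
    # and no unseen sample may be in either.
    assert not set(train_idx) & set(test_idx)
    assert not set(unseen) & (set(train_idx) | set(test_idx))

print(len(unseen), len(folds), len(folds[0][0]), len(folds[0][1]))
# prints: 100 10 810 90
```

If several images come from the same recording, the split would presumably have to be made per recording rather than per image; otherwise near-identical images can land on both sides of a fold.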
import torch
from tqdm import tqdm

# `device` is assumed to be defined elsewhere, e.g. torch.device("cuda")
def train(model, optimizer, cost, epochs, dataloader):
    model.train()
    total_loss = 0.0
    for epoch in tqdm(range(0, epochs), colour="yellow"):
        print(f'****** Starting Epoch {epoch+1}: ******')
        for i, data in enumerate(dataloader):
            inputs = data["image"]
            targets = data["target"]
            inputs = inputs.to(device, dtype=torch.float)
            targets = targets.to(device)
            optimizer.zero_grad()
            outputs = model(inputs)
            loss = cost(outputs, targets)
            loss.backward()
            optimizer.step()
            # Log the current batch loss every 100 batches
            if i % 100 == 0:
                total_loss, current = loss.item(), i * len(inputs)
                print(f'loss:{total_loss:>7f} [{current:>5d}/{len(dataloader.dataset):>5d}]')
    print('Training process has finished. Saving trained model.')

def val(fold, model, dataloader, results):
    model.eval()
    print('---- |||| Starting Validation |||| ----')
    # Save the model trained on this fold before evaluating it
    save_path = f'models/model_fold_No_{fold+1}.pth'
    torch.save(model.state_dict(), save_path)
    correct, total = 0, 0
    with torch.no_grad():
        for i, data in tqdm(enumerate(dataloader), colour="green"):
            inputs = data["image"]
            targets = data["target"]
            inputs = inputs.to(device, dtype=torch.float)
            targets = targets.to(device)
            outputs = model(inputs)
            # Predicted class = index of the largest output per sample
            _, predicted = torch.max(outputs, 1)
            total += targets.size(0)
            correct += (predicted == targets).sum().item()
    # %.2f keeps the decimals; %d would truncate the accuracy to an integer
    print('Accuracy for fold [%d]: %.2f %%' % (fold, 100.0 * correct / total))
    print('--------------------------------')
    results[fold] = 100.0 * (correct / total)
Q2- Are my accuracy calculation and k-fold validation correct?
#############################
Please find training and validation information below:
MFCC was used instead of Spectrogram
10 Epochs
10 Folds
Results:
Fold 0: 99.75 %
Fold 1: 100.0 %
Fold 2: 99.92857142857143 %
Fold 3: 99.96428571428572 %
Fold 4: 100.0 %
Fold 5: 100.0 %
Fold 6: 100.0 %
Fold 7: 99.96428571428572 %
Fold 8: 99.96428571428572 %
Fold 9: 99.96428571428572 %
Average: 99.95357142857145 %
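As a quick arithmetic check, the reported average can be reproduced from the per-fold values listed above. The list below is copied from those results; the standard deviation is added only for illustration and was not part of the original output.

```python
from statistics import mean, pstdev

# Per-fold accuracies (in %), copied from the results above.
fold_accuracy = [
    99.75,
    100.0,
    99.92857142857143,
    99.96428571428572,
    100.0,
    100.0,
    100.0,
    99.96428571428572,
    99.96428571428572,
    99.96428571428572,
]

average = mean(fold_accuracy)
spread = pstdev(fold_accuracy)  # population standard deviation across folds

print(f"Average: {average:.4f} %")
print(f"Std dev: {spread:.4f} %")
```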
(base)$ python3 src/eval.py
$$$$$$$$$$$$$$$ Model Evaulations: $$$$$$$$$$$$
---- |||| Starting Validation |||| ----
Please Wait …: 100%|██████████████████████████████| 500/500 [00:36<00:00, 13.64it/s]
Accuracy for Model resulted from Fold [1]: 99.83 %
--------------------------------
---- |||| Starting Validation |||| ----
Please Wait …: 100%|██████████████████████████████| 500/500 [00:35<00:00, 13.94it/s]
Accuracy for Model resulted from Fold [2]: 99.97 %
##############################
I would appreciate any thoughts or ideas.
Thanks in advance.