user12233
(user12233)
June 17, 2022, 4:48pm
1
I’m trying to implement the code from here using a custom dataset. The code runs fine with the LibriSpeech dataset, but when I use my own dataset I get the following:
Train Epoch: 1 [0/2875 (0%)] Loss: 10.740855
The next loss value is then NaN.
Any help is appreciated!
I added gradient clipping here:
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), 5)
optimizer.step()
Once I do that I get the following:
Train Epoch: 1 [0/2875 (0%)] Loss: 10.740855
Segmentation fault (core dumped)
My dataset has clips from 3 to 14 seconds.
What loss function are you using? Can you also post the code block showing how you pass arguments to it?
user12233
(user12233)
June 20, 2022, 7:17pm
3
Thanks for the reply! I’m using the CTC loss function:
criterion = nn.CTCLoss(blank=0).to(device)
Below is the training code, which shows the arguments I pass to the loss function:
def train(model, device, train_loader, criterion, optimizer, scheduler, epoch, iter_meter, experiment):
    model.train()
    data_len = len(train_loader.dataset)
    with experiment.train():
        for batch_idx, _data in enumerate(train_loader):
            spectrograms, labels, input_lengths, label_lengths = _data
            spectrograms, labels = spectrograms.to(device), labels.to(device)
            optimizer.zero_grad()
            output = model(spectrograms)  # (batch, time, n_class)
            output = F.log_softmax(output, dim=2)
            output = output.transpose(0, 1)  # (time, batch, n_class)
            loss = criterion(output, labels, input_lengths, label_lengths)
            loss.backward()
            torch.nn.utils.clip_grad_norm_(model.parameters(), 5)
            experiment.log_metric('loss', loss.item(), step=iter_meter.get())
            experiment.log_metric('learning_rate', scheduler.get_lr(), step=iter_meter.get())
            optimizer.step()
            scheduler.step()
            iter_meter.step()
            if batch_idx % 100 == 0 or batch_idx == data_len:
                print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                    epoch, batch_idx * len(spectrograms), data_len,
                    100. * batch_idx / len(train_loader), loss.item()))
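Since the loop above passes four separate tensors or lists to the CTC loss, one way to narrow down a NaN or a crash is to validate each batch just before the `criterion(...)` call. The following is only a minimal sketch, not a confirmed diagnosis of this thread's problem. The helper name `check_ctc_batch` is hypothetical, and it assumes `labels` is a padded `(batch, max_label_len)` tensor and that `input_lengths` and `label_lengths` are lists or tensors of integers, as in the loop above.

```python
import torch
import torch.nn.functional as F


def check_ctc_batch(log_probs, labels, input_lengths, label_lengths):
    """Return a list of problems found in one batch destined for nn.CTCLoss.

    log_probs: (time, batch, n_class) log-probabilities.
    labels:    (batch, max_label_len) padded integer targets (assumed layout).
    """
    problems = []
    time_steps, _, n_class = log_probs.shape
    input_lengths = torch.as_tensor(input_lengths).cpu()
    label_lengths = torch.as_tensor(label_lengths).cpu()
    labels = labels.detach().cpu()

    if not torch.isfinite(log_probs).all():
        problems.append("log_probs contains NaN or Inf")
    if (input_lengths > time_steps).any():
        problems.append("an input length exceeds the time dimension")
    if (label_lengths > labels.shape[1]).any():
        problems.append("a label length exceeds the padded label width")
    if (label_lengths > input_lengths).any():
        # CTC cannot align a target that is longer than its input sequence.
        problems.append("a target is longer than its input")

    # Only inspect the label positions that fall inside each declared length.
    idx = torch.arange(labels.shape[1]).unsqueeze(0)   # (1, max_label_len)
    valid = labels[idx < label_lengths.unsqueeze(1)]   # 1-D tensor of real labels
    if valid.numel() > 0 and (valid.min() < 0 or valid.max() >= n_class):
        problems.append("a label index is outside [0, n_class)")
    return problems


# Illustrative batch: 50 time steps, 2 utterances, 29 classes (made-up sizes).
log_probs = F.log_softmax(torch.randn(50, 2, 29), dim=2)
labels = torch.randint(1, 29, (2, 60))
print(check_ctc_batch(log_probs, labels, [50, 50], [60, 10]))
```

In the loop above, this could be called as `check_ctc_batch(output, labels, input_lengths, label_lengths)` right after the transpose, printing or raising when the returned list is non-empty. Separately, `nn.CTCLoss` accepts a `zero_infinity=True` argument that zeroes out infinite losses and their gradients; whether that is appropriate here depends on why the loss becomes infinite in the first place.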