UnboundLocalError: local variable 'data' referenced before assignment

I am running into an UnboundLocalError for the local variable 'data' when training our model. The error looks like this:

Traceback (most recent call last):
  File "/media/speech70809/Data01/speech_donoiser_new/main/nnet/trainnew.py", line 92, in <module>
    run(args)
  File "/media/speech70809/Data01/speech_donoiser_new/main/nnet/trainnew.py", line 54, in run
    trainer.run(train_loader, dev_loader, num_epochs=args.epochs)
  File "/media/speech70809/Data01/speech_donoiser_new/main/nnet/libs/trainer.py", line 221, in run
    cv = self.eval(dev_loader)
  File "/media/speech70809/Data01/speech_donoiser_new/main/nnet/libs/trainer.py", line 208, in eval
    for egs in data_loader:
  File "/media/speech70809/Data01/speech_donoiser_new/main/nnet/libs/dataset.py", line 143, in __iter__
    for chunks in self.eg_loader:
  File "/home/speech70809/anaconda3/envs/speech_denoiser/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/home/speech70809/anaconda3/envs/speech_denoiser/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 561, in _next_data
    data = self._dataset_fetcher.fetch(index) # may raise StopIteration
  File "/home/speech70809/anaconda3/envs/speech_denoiser/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/speech70809/anaconda3/envs/speech_denoiser/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/media/speech70809/Data01/speech_donoiser_new/main/nnet/libs/dataset.py", line 42, in __getitem__
    ref = [reader[key] for reader in self.ref]
  File "/media/speech70809/Data01/speech_donoiser_new/main/nnet/libs/dataset.py", line 42, in <listcomp>
    ref = [reader[key] for reader in self.ref]
  File "/media/speech70809/Data01/speech_donoiser_new/main/nnet/libs/audio.py", line 120, in __getitem__
    return self._load(index)
  File "/media/speech70809/Data01/speech_donoiser_new/main/nnet/libs/audio.py", line 138, in _load
    samp_rate, samps = read_wav(
  File "/media/speech70809/Data01/speech_donoiser_new/main/nnet/libs/audio.py", line 38, in read_wav
    samp_rate, samps_int16 = wf.read(fname)
  File "/home/speech70809/anaconda3/envs/speech_denoiser/lib/python3.9/site-packages/scipy/io/wavfile.py", line 707, in read
    return fs, data
UnboundLocalError: local variable 'data' referenced before assignment

The error message looks like a bug in scipy/io/wavfile.py, possibly triggered by a corrupted wav file (assuming at least some of your files can be read).
You could try to loop over your dataset to find the problematic indices and delete those from the dataset.
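A minimal sketch of such a loop, assuming your dataset object supports len() and integer indexing. The name find_bad_indices is made up for illustration; it is not part of PyTorch or of your code base:

```python
def find_bad_indices(dataset):
    """Try to load every item and collect the indices that raise an error.

    `dataset` is assumed to support len() and dataset[idx].
    Returns a list of (index, error description) pairs.
    """
    bad = []
    for idx in range(len(dataset)):
        try:
            dataset[idx]  # triggers the same loading code the DataLoader uses
        except Exception as exc:  # UnboundLocalError is caught here too
            bad.append((idx, repr(exc)))
    return bad
```

How you map a failing index back to a file name depends on how your dataset class stores its file list.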

Best regards

Thomas

Thank you so much for your reply. But I am a beginner, so how can I track down the corrupted files?

Personally, I'd aim for the low-tech solution: write a short program that loops over the dataset and records which indices fail, remove the files that cause the failures, and repeat. If that is too much work (which it is if there are many corrupted files), an alternative is to use Python's os.walk to find all wav files, try to read each one, and collect a list of the files that fail so you can remove them from the dataset.
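The os.walk variant could look roughly like this. It is only a sketch: find_unreadable_wavs and its read_fn parameter are hypothetical names, and by default it reads files with scipy.io.wavfile.read, the same call that fails in the traceback above:

```python
import os


def find_unreadable_wavs(root, read_fn=None):
    """Walk `root` recursively and return (path, error) pairs for every
    .wav file that cannot be read.

    `read_fn` is an assumed hook so a different reader can be swapped in;
    if it is None, scipy.io.wavfile.read is used.
    """
    if read_fn is None:
        from scipy.io import wavfile  # imported lazily, only when needed
        read_fn = wavfile.read

    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.lower().endswith(".wav"):
                continue
            path = os.path.join(dirpath, name)
            try:
                read_fn(path)
            except Exception as exc:  # any read failure marks the file as bad
                bad.append((path, repr(exc)))
    return bad


# Example (the directory is a placeholder):
# for path, err in find_unreadable_wavs("/path/to/your/wav/folder"):
#     print(path, err)
```

Printing the list first and deleting files by hand afterwards is the safer order, since a file that fails to read might still be repairable.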

Best regards

Thomas