How to store tensors as tensors instead of strings?

I have 3 columns in my pandas dataframe: waveform, sample rate, and score. Each entry in waveform is an n-dimensional torch tensor. The sample rate entries are simple scalar values, and so are the scores.

I tried to store the tensors in a CSV file, but the waveform values are getting stored as strings. How can I store torch tensors in a CSV file? Or is there a better alternative to CSV?

To solve this, I tried converting the waveform values to NumPy ndarrays, but they still get stored as strings in the CSV.

Please help!

Why would you save a waveform as a CSV? A waveform is a sequence of N numbers, and N is usually huge. CSVs are text files, so instead of taking 16 bits (2 bytes) per int16 sample, you use far more disk space, because every digit of every number is stored as a character.
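To make the size argument concrete, here is a rough sketch that compares the raw binary size of an int16 waveform with the size of the same samples written out as comma-separated text. The waveform is made-up random data (1 second at an assumed 16 kHz), so the exact text size will vary with the values.

```python
import numpy as np

# Hypothetical waveform: 1 second of 16-bit audio at 16 kHz (random, made-up values).
rng = np.random.default_rng(0)
waveform = rng.integers(-32768, 32767, size=16000, dtype=np.int16)

# Binary storage: exactly 2 bytes per int16 sample.
binary_bytes = waveform.nbytes

# Text storage: every sample becomes its decimal digits (plus a sign) and a comma.
text = ",".join(str(v) for v in waveform.tolist())
text_bytes = len(text.encode("utf-8"))

print(f"binary: {binary_bytes} bytes, text: {text_bytes} bytes")
```

On top of the size, the text version has to be parsed back into numbers on every load, which is where the "stored as strings" problem comes from.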

You can just save .wav files. The format itself stores the sample rate, and you can encode the score in the filename.
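A minimal sketch of that idea, assuming a mono (1-D) int16 waveform. The helper names `save_example` and `load_example` and the `<index>_score-<score>.wav` naming scheme are made up for illustration. A CPU torch tensor would first be converted with `tensor.numpy()`; a NumPy array stands in for it here.

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile


def save_example(directory, index, waveform, sample_rate, score):
    # Hypothetical naming scheme: <index>_score-<score>.wav
    path = os.path.join(directory, f"{index:05d}_score-{score:.3f}.wav")
    wavfile.write(path, sample_rate, waveform)  # sample rate goes into the WAV header
    return path


def load_example(path):
    sample_rate, waveform = wavfile.read(path)
    # Recover the score from the part of the filename after "_score-".
    stem = os.path.splitext(os.path.basename(path))[0]
    score = float(stem.split("_score-")[1])
    return waveform, sample_rate, score


# Made-up data: a 440 Hz tone, 1 second at 16 kHz, stored as int16.
t = np.arange(16000) / 16000.0
waveform = (0.5 * np.sin(2 * np.pi * 440.0 * t) * 32767).astype(np.int16)

with tempfile.TemporaryDirectory() as tmp_dir:
    wav_path = save_example(tmp_dir, 0, waveform, 16000, 0.875)
    loaded_waveform, loaded_rate, loaded_score = load_example(wav_path)

print(loaded_rate, loaded_score, loaded_waveform.dtype)
```

Note that the score is only kept to the number of decimals written into the filename (three in this sketch), so a separate small table of scores may be a better fit if full precision matters.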

Alternatively, HDF5 files would be a good option, since they let you store each item independently as raw data.
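For the HDF5 route, a rough sketch using the `h5py` package (an assumption, since no specific library was named here). Each waveform goes into its own dataset, with the sample rate and score attached as attributes. The dataset names `example_0`, `example_1` and all the data are made up.

```python
import os
import tempfile

import h5py
import numpy as np

# Made-up data: two waveforms of different lengths, with sample rates and scores.
rng = np.random.default_rng(0)
waveforms = [rng.standard_normal(n).astype(np.float32) for n in (16000, 8000)]
sample_rates = [16000, 8000]
scores = [0.9, 0.4]

h5_path = os.path.join(tempfile.mkdtemp(), "dataset.h5")

# Write: one dataset per waveform, metadata stored as attributes on that dataset.
with h5py.File(h5_path, "w") as f:
    for i, (w, sr, s) in enumerate(zip(waveforms, sample_rates, scores)):
        dset = f.create_dataset(f"example_{i}", data=w)
        dset.attrs["sample_rate"] = sr
        dset.attrs["score"] = s

# Read back a single example without touching the others.
with h5py.File(h5_path, "r") as f:
    dset = f["example_1"]
    restored = dset[:]  # raw float32 samples as a NumPy array
    restored_rate = int(dset.attrs["sample_rate"])
    restored_score = float(dset.attrs["score"])

print(restored.shape, restored.dtype, restored_rate, restored_score)
```

The restored array can be turned back into a tensor with `torch.from_numpy(restored)` if needed.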

But honestly, .wav files + scipy is the best you can do here, as it’s super fast to load and you can use memory mapping too.
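As a sketch of the memory-mapping part: `scipy.io.wavfile.read` takes an `mmap` argument, so only the slice you actually index is read from disk. The file name `long_clip.wav` and the tone data are made up.

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile

# Made-up data: 10 seconds of a 440 Hz tone at 16 kHz, stored as int16.
sample_rate_in = 16000
t = np.arange(10 * sample_rate_in) / sample_rate_in
waveform = (0.5 * np.sin(2 * np.pi * 440.0 * t) * 32767).astype(np.int16)

wav_path = os.path.join(tempfile.mkdtemp(), "long_clip.wav")
wavfile.write(wav_path, sample_rate_in, waveform)

# mmap=True returns an array backed by the file instead of loading it all into RAM.
sample_rate, data = wavfile.read(wav_path, mmap=True)

# Copy only the samples that are needed into a regular in-memory array.
chunk = np.array(data[1000:2000])

print(sample_rate, chunk.shape, chunk.dtype)
```

This mostly pays off for long recordings where each training step only needs a short crop.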


How do I use .wav files with scipy? Are there any links you can point me to, please?

https://docs.scipy.org/doc/scipy/reference/generated/scipy.io.wavfile.read.html

https://docs.scipy.org/doc/scipy/reference/generated/scipy.io.wavfile.write.html
