How does torchaudio.transforms.DownmixMono work?


I am confused about how torchaudio.transforms.DownmixMono works.
First, I load my data with sound = torchaudio.load(). As expected, sound[0] is two-channel data with torch.Size([2, 132300]), and sound[1] = 22050 is the sample rate. Then I use soundData = torchaudio.transforms.DownmixMono(sound[0]) to downmix to mono. But the result looks weird, with torch.Size([2, 1]). If I understand correctly, soundData should have only one channel. What is going wrong?

I checked the documentation as well. The input format should be: tensor (Tensor): Tensor of audio of size (c x n) or (n x c). What does (c x n) mean?

It looks like you are passing your data as [channels, length], so you should pass channels_first=True. The docs say:

channels_first (bool): Downmix across channels dimension. Default: True

However, the default actually seems to be None, which results in dim1 being used as the channel dimension:

channels_first = None
ch_dim = int(not channels_first)
> 1
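
The snippet above can be checked for all three possible values in plain Python. This is only a small sketch: channel_dim is a hypothetical helper name, not part of torchaudio, and it simply wraps the expression quoted above.

```python
# Hypothetical helper wrapping the expression from the snippet above.
def channel_dim(channels_first):
    # `not None` evaluates to True, so None behaves like channels_first=False.
    return int(not channels_first)

for value in (None, True, False):
    print(value, "->", channel_dim(value))
```

So only an explicit channels_first=True selects dim0 as the channel dimension.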

c x n should correspond to [channels, length].
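
To see why the wrong channel dimension produces torch.Size([2, 1]), here is a minimal sketch that uses plain torch instead of torchaudio. It assumes that downmixing amounts to averaging over the channel dimension; the random tensor is just a stand-in for the loaded audio.

```python
import torch

# Stand-in for the loaded stereo clip: [channels, length] = [2, 132300].
waveform = torch.randn(2, 132300)

# Averaging over dim 0 (the channel dimension) yields a single mono channel.
mono = waveform.mean(dim=0, keepdim=True)

# Averaging over dim 1 (the time dimension) collapses the samples instead,
# which reproduces the unexpected shape from the question.
wrong = waveform.mean(dim=1, keepdim=True)

print(mono.shape)   # shape of the correct mono signal
print(wrong.shape)  # shape obtained with the wrong dimension
```

With [channels, length] data, the reduction has to run over dim0 to end up with one channel of 132300 samples.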

Thanks for reporting this problem! I’ve created an issue here.

Thanks for your reply! This is exactly where the problem is!

Hi @ptrblck,
How should I pass the channels_first keyword to the DownmixMono function?

I get the following error.
torchaudio.transforms.DownmixMono()(sound[0],channels_first = True)
__init__() got an unexpected keyword argument 'channels_first'

However this works but I get wrong dimensions as mentioned by OP:

You should pass it while initializing the transformation:

transform = torchaudio.transforms.DownmixMono(channels_first=True)

@ptrblck Thanks for the reply.
I tried that too, but I can’t get it to work:

transform = torchaudio.transforms.DownmixMono(channels_first=True)
__init__() got an unexpected keyword argument 'channels_first'

Just to add: I installed torchaudio as follows:

!git clone
!git checkout 301e2e9
!python install

This argument was introduced after your specified commit hash.
If you look at the file, you’ll see that the __init__ method just contains a pass statement.
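
The difference can be illustrated with two stand-in classes. These are hypothetical sketches, not the actual torchaudio source: OldDownmixMono mimics an __init__ that only contains pass, and NewDownmixMono assumes a channels_first argument with a default of None, as described earlier in this thread.

```python
# Hypothetical stand-in for the transform at the older commit.
class OldDownmixMono:
    def __init__(self):
        pass

# Hypothetical stand-in for the transform after the argument was added.
class NewDownmixMono:
    def __init__(self, channels_first=None):
        # Same expression as quoted earlier in the thread.
        self.ch_dim = int(not channels_first)

try:
    OldDownmixMono(channels_first=True)
except TypeError as err:
    # The old constructor accepts no keyword arguments.
    old_error = str(err)
    print("old version:", old_error)

transform = NewDownmixMono(channels_first=True)
print("new version ch_dim:", transform.ch_dim)
```

The keyword only works on a version whose constructor actually declares it, which is why a newer commit is needed.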

Sure… got it! Thanks for the help, I will look for more recent hashes.

@ptrblck Tried all possible hashes but didn’t succeed.

Could you point me to a stable version/hash? I am on Google Colab.

Isn’t the current master working (5c9d33d)?

Here’s what I tried:

!apt-get install sox libsox-dev libsox-fmt-all
!git clone
import os
!git checkout 5c9d33d  #301e2e9 d92de5b
!python install
import torchaudio

This gives me the following error:
RuntimeError: Failed to parse the argument list of a type annotation: name 'Optional' is not defined

I just installed the current master without a problem, so your error might be related to this issue.
