HuBERT ONNX model inference problem

Hi,

I am converting the HuBERT model to ONNX format with this script:

import torch
import torchaudio
import numpy as np
import soundfile as sf
import torch.nn.functional as F

import onnx
import onnxruntime

device="cpu"
# https://pytorch.org/audio/stable/pipelines.html#hubert-large
bundle = torchaudio.pipelines.HUBERT_LARGE
model = bundle.get_model().to(device)

audio_file = "sample.wav"   # shape: torch.Size([1, 101467])
x, sr = sf.read(audio_file, dtype='float32')
x = torch.Tensor(x).unsqueeze(0).cpu()
x = F.layer_norm(x, x.shape)

model_path = "torchaudio_hubert_large.onnx"
torch.onnx.export(model, x, model_path, input_names=['input'], output_names=['output'])
model = onnx.load(model_path)
model.graph.input[0].type.tensor_type.shape.dim[1].dim_param = '?'
onnx.save(model, model_path.replace(".onnx", "_dyn.onnx"))
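Renaming the input dimension after export only changes the declared input shape; any shapes that were fixed inside the graph at export time stay as they are. A common alternative is to declare the variable axes at export time through the `dynamic_axes` argument of `torch.onnx.export`. The following is a minimal sketch, not a confirmed fix for this model: the helper names `dynamic_axes_spec` and `export_with_dynamic_axes` and the axis labels `num_samples` and `num_frames` are made up for illustration, and it assumes axis 1 is the time axis of both the input and the output.

```python
def dynamic_axes_spec(input_name="input", output_name="output"):
    """Mark the time dimension (axis 1) of the input and output as variable."""
    return {
        input_name: {1: "num_samples"},   # waveform length
        output_name: {1: "num_frames"},   # number of feature frames
    }


def export_with_dynamic_axes(model, example, path):
    """Export `model` to ONNX with a symbolic time axis (hypothetical helper)."""
    import torch  # imported lazily so dynamic_axes_spec has no dependencies

    model.eval()
    torch.onnx.export(
        model,
        example,
        path,
        input_names=["input"],
        output_names=["output"],
        dynamic_axes=dynamic_axes_spec(),
    )
```

With the variables from the script above, this would be called as `export_with_dynamic_axes(model, x, "torchaudio_hubert_large_dyn.onnx")`, which replaces the manual `dim_param` edit.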

Then I try to run inference on a sample with this code:

model_path = "torchaudio_hubert_large_dyn.onnx"
ort_session = onnxruntime.InferenceSession(model_path)
feat = ort_session.run(None, {'input': x.numpy().astype(np.float32)})

At this step, the following error occurs:

---------------------------------------------------------------------------
RuntimeException                          Traceback (most recent call last)
/tmp/ipykernel_13203/1463404001.py in <module>
      1 print(x.shape)
----> 2 feat = ort_session.run(None, {'input': x.numpy().astype(np.float32)})[0]
      3 print(feat.shape)

/path/to/miniconda3/envs/onnx/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py in run(self, output_names, input_feed, run_options)
    190             output_names = [output.name for output in self._outputs_meta]
    191         try:
--> 192             return self._sess.run(output_names, input_feed, run_options)
    193         except C.EPFail as err:
    194             if self._enable_fallback:

RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Reshape node. Name:'Reshape_208' Status Message: /onnxruntime_src/onnxruntime/core/providers/cpu/tensor/reshape_helper.h:41 onnxruntime::ReshapeHelper::ReshapeHelper(const onnxruntime::TensorShape&, std::vector<long int>&, bool) gsl::narrow_cast<int64_t>(input_shape.Size()) == size was false. The input tensor cannot be reshaped to the requested shape. Input shape:{1,316,1024}, requested shape:{1,49,16,64}

How can I solve this?
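The two shapes in the error message can be related to input lengths with a little arithmetic. The sketch below assumes the standard HuBERT/wav2vec 2.0 convolutional feature extractor (seven unpadded layers with the kernel sizes and strides listed in `CONV_LAYERS`); that layout is an assumption, not something stated in this thread. Under that assumption, the 101467-sample file yields 316 frames, while a shape of 49 frames corresponds to a much shorter input, for example one second of 16 kHz audio. This would suggest the graph was traced with a different input length than the one used at inference time.

```python
# Assumed HuBERT/wav2vec 2.0 feature-extractor layout (not taken from the post):
# one (kernel_size, stride) pair per convolution layer, no padding.
CONV_LAYERS = [(10, 5), (3, 2), (3, 2), (3, 2), (3, 2), (2, 2), (2, 2)]


def num_frames(num_samples: int) -> int:
    """Number of feature frames produced for a waveform of `num_samples` samples."""
    length = num_samples
    for kernel, stride in CONV_LAYERS:
        length = (length - kernel) // stride + 1
    return length


print(num_frames(101467))  # 316, as in "Input shape:{1,316,1024}"
print(num_frames(16000))   # 49, as in "requested shape:{1,49,16,64}"
print(16 * 64)             # 1024, heads * head_dim matches the last input dimension
```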

Hi @yunusemre, thanks for raising this issue. I tried the code on my local machine and your script runs successfully for me.
Could you post your onnx, torch, and Python versions as well? You can collect the environment information with this script: https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
Thanks!
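For a quick check that also covers onnx and onnxruntime, the installed versions can be read with the standard library alone. This is a minimal sketch and not a replacement for collect_env.py; the function name `package_versions` and the default package list are chosen here for illustration, and it needs Python 3.8 or newer for `importlib.metadata`.

```python
from importlib import metadata
import platform


def package_versions(names=("torch", "torchaudio", "onnx", "onnxruntime", "numpy")):
    """Map each distribution name to its installed version, or 'not installed'."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


print("python", platform.python_version())
for name, version in package_versions().items():
    print(name, version)
```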

My environment is:

PyTorch version: 1.7.1
Is debug build: False
CUDA used to build PyTorch: 11.0

Python version: 3.9.2
Is CUDA available: True
CUDA runtime version: 9.1.85

Versions of relevant libraries:
[pip3] numpy==1.21.0
[pip3] pytorch-lightning==1.1.6
[pip3] torch==1.7.1
[pip3] torch-lr-finder==0.2.1
[pip3] torch-tb-profiler==0.3.1
[pip3] torchaudio==0.7.0a0+a853dff
[pip3] torchmetrics==0.3.2
[pip3] torchvision==0.8.2
[conda] blas 1.0 mkl anaconda
[conda] cudatoolkit 11.0.3 h15472ef_8 conda-forge
[conda] mkl 2020.2 256 anaconda
[conda] numpy 1.21.0 pypi_0 pypi
[conda] pytorch 1.7.1 py3.9_cuda11.0.221_cudnn8.0.5_0 pytorch
[conda] pytorch-lightning 1.1.6 pypi_0 pypi
[conda] torch-lr-finder 0.2.1 pypi_0 pypi
[conda] torch-tb-profiler 0.3.1 pypi_0 pypi
[conda] torchaudio 0.7.2 py39 pytorch
[conda] torchmetrics 0.3.2 pypi_0 pypi
[conda] torchvision 0.8.2 py39_cu110 pytorch

onnx: 1.10.2
onnxruntime: 1.10.0

Thanks. It looks like torchaudio.pipelines was introduced in version 0.10.0. Could you update PyTorch and torchaudio and re-run the script?

Sorry for the late reply. I reported the wrong environment earlier, sorry about that. I have since updated the torch packages to match torchaudio==0.10.1, but nothing changed.

My environment is now:

Versions of relevant libraries:
[pip3] numpy==1.21.2
[pip3] torch==1.10.1
[pip3] torchaudio==0.10.1
[pip3] torchvision==0.11.2
[conda] blas                      1.0                         mkl  
[conda] cudatoolkit               11.3.1               h2bc3f7f_2  
[conda] ffmpeg                    4.3                  hf484d3e_0    pytorch
[conda] mkl                       2021.4.0           h06a4308_640  
[conda] mkl-service               2.4.0            py38h7f8727e_0  
[conda] mkl_fft                   1.3.1            py38hd3c417c_0  
[conda] mkl_random                1.2.2            py38h51133e4_0  
[conda] numpy                     1.17.4                   pypi_0    pypi
[conda] numpy-base                1.21.2           py38h79a1101_0  
[conda] pytorch                   1.10.1          py3.8_cuda11.3_cudnn8.2.0_0    pytorch
[conda] pytorch-mutex             1.0                        cuda    pytorch
[conda] torch                     1.7.1                    pypi_0    pypi
[conda] torchaudio                0.10.1               py38_cu113    pytorch
[conda] torchvision               0.11.2               py38_cu113    pytorch

Could you share your environment info, so that I can change my packages one by one to match yours?
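The listing above shows both a conda `pytorch 1.10.1` and a pip-installed `torch 1.7.1`, as well as two numpy versions, so it may be worth confirming which copy the interpreter actually picks up. A minimal sketch using only the standard library follows; the helper name `module_origin` is made up for illustration, and the check reports where each module would be loaded from without importing it.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None if absent."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


for name in ("torch", "torchaudio", "numpy", "onnx", "onnxruntime"):
    print(name, module_origin(name))
```

If the printed path for `torch` points at an unexpected site-packages directory, the version reported by `torch.__version__` at runtime may differ from the one intended.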