Source and weight input channels mismatch

HurstSi · May 25, 2022, 6:15am

I set the device to ‘mps’
but it shows:

/AppleInternal/Library/BuildRoots/b6051351-c030-11ec-96e9-3e7866fcf3a1/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Runtimes/MPSRuntime/Operations/GPUConv2DOps.mm:214: failed assertion `Source and weight input channels mismatch'

and then the program quit.

It also kept showing me this warning:

anaconda3/envs/Unet/lib/python3.8/site-packages/torch/amp/autocast_mode.py:198: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
  warnings.warn('User provided device_type of \'cuda\', but CUDA is not available. Disabling')

What’s wrong here? How can I solve it? Thank you.

HurstSi · May 25, 2022, 6:16am

/AppleInternal/Library/BuildRoots/b6051351-c030-11ec-96e9-3e7866fcf3a1/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Runtimes/MPSRuntime/Operations/GPUConv2DOps.mm:214: failed assertion `Source and weight input channels mismatch'

albanD · May 26, 2022, 1:20pm

Hi,

Are you using AMP? I’m afraid it does not support MPS right now (we haven’t had time to work on that yet).
Also, the channel mismatch error seems unrelated to MPS. Do you get the same error if you run on CPU?

HurstSi · May 27, 2022, 5:20am

Yes, I’m using AMP. I’m really thankful for your contribution.
Maybe there’s something wrong with my code, but I didn’t get the mismatch error when I used the “cpu” device.

albanD · May 27, 2022, 11:55am

Can you try to run without AMP on MPS and see if that works?
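
A minimal sketch of what running without AMP on MPS could look like, assuming the training loop wraps its forward pass in `torch.autocast`. The helper name `maybe_autocast` is hypothetical and not taken from the original code. It enables autocast only when CUDA is actually available and returns a no-op context otherwise, which should also avoid the `device_type of 'cuda'` warning quoted earlier in the thread.

```python
import contextlib

import torch


def maybe_autocast(device: torch.device):
    """Return an autocast context on CUDA and a no-op context elsewhere.

    Hypothetical helper: the name and the CUDA-only policy are assumptions.
    """
    if device.type == "cuda" and torch.cuda.is_available():
        return torch.autocast(device_type="cuda")
    # On "mps" or "cpu" the forward pass runs in full precision.
    return contextlib.nullcontext()


# Use MPS when it is available, otherwise fall back to the CPU.
device = torch.device("mps" if torch.backends.mps.is_available() else "cpu")

conv = torch.nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3).to(device)
x = torch.rand(1, 1, 10, 10, device=device)

with maybe_autocast(device):
    out = conv(x)

print(out.shape, out.dtype)
```

With a 3x3 kernel and no padding, the 10x10 input yields an 8x8 output. If the assertion still fires on MPS with autocast out of the picture, AMP is unlikely to be the cause.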

JeremyM · August 26, 2022, 11:18pm

I am getting the same error. As far as I’m aware, I’m not using AMP. I do not get the error if I run on CPU; everything runs fine until I use MPS with 2D convolution.

JeremyM · August 26, 2022, 11:46pm

I was able to write a minimal reproducer.

import numpy as np
device = torch.device("mps")

conv = torch.nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3).to(device)

data = torch.tensor(np.random.uniform(size=(1, 10, 10, 1)), dtype=torch.float32).to(device)
optimizer = torch.optim.Adam(conv.parameters(), lr=0.1)
x = data.permute(0, 3, 1, 2)
out = torch.sum(conv(x))
loss = torch.nn.MSELoss()(out, torch.zeros_like(out))
optimizer.zero_grad()
out.backward()
optimizer.step()

my output is

$ python3 reproducer.py
/AppleInternal/Library/BuildRoots/560148d7-a559-11ec-8c96-4add460b61a6/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Runtimes/MPSRuntime/Operations/GPUConv2DOps.mm:214: failed assertion `Source and weight input channels mismatch'
zsh: abort      python3 reproducer.py

If I switch to the “cpu” device it runs without error. My torch versions are

torch 1.13.0.dev20220826
torchaudio 0.13.0.dev20220826
torchvision 0.14.0.dev20220826
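
For anyone comparing setups, here is a small sketch that prints the details that matter for this error: the PyTorch version and whether the MPS and CUDA backends are usable. It assumes a PyTorch build recent enough to ship `torch.backends.mps` (1.12 or later); the `info` dictionary is only an illustrative name.

```python
import torch

# Collect the backend facts relevant to the MPS convolution crash.
info = {
    "torch": torch.__version__,
    # True if this PyTorch binary was compiled with MPS support.
    "mps_built": torch.backends.mps.is_built(),
    # True if the MPS device can actually be used on this machine.
    "mps_available": torch.backends.mps.is_available(),
    "cuda_available": torch.cuda.is_available(),
}

for name, value in info.items():
    print(f"{name}: {value}")
```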

zhaoBowen612 · September 2, 2022, 4:24pm

You forgot to import torch.
I can also reproduce this issue. Could you please open an issue for it in the PyTorch repository on GitHub?

JeremyM · September 3, 2022, 12:29am

github.com/pytorch/pytorch

source and weight input channels mismatch when using Conv2D on mps

opened 05:27PM - 29 Aug 22 UTC

mcgibbon

module: convolution triaged module: mps

### 🐛 Describe the bug

When using 2D convolution on an M1 macbook with the mps device, python crashes. This can be reproduced with:

```python
import numpy as np
import torch
device = torch.device("mps")

conv = torch.nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3).to(device)

data = torch.tensor(np.random.uniform(size=(1, 10, 10, 1)), dtype=torch.float32).to(device)
optimizer = torch.optim.Adam(conv.parameters(), lr=0.1)
x = data.permute(0, 3, 1, 2)
out = torch.sum(conv(x))
loss = torch.nn.MSELoss()(out, torch.zeros_like(out))
optimizer.zero_grad()
out.backward()
optimizer.step()
```

Which on my machine outputs:

```bash
$ python3 reproducer.py
/AppleInternal/Library/BuildRoots/560148d7-a559-11ec-8c96-4add460b61a6/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Runtimes/MPSRuntime/Operations/GPUConv2DOps.mm:214: failed assertion `Source and weight input channels mismatch'
zsh: abort      python3 reproducer.py
```

If I run the example above with the "cpu" device, it exits without error.

### Versions

```bash
$ python3 collect_env.py
Collecting environment information...
PyTorch version: 1.13.0.dev20220826
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 12.3.1 (arm64)
GCC version: Could not collect
Clang version: 13.1.6 (clang-1316.0.21.2)
CMake version: version 3.23.1
Libc version: N/A

Python version: 3.8.13 | packaged by conda-forge | (default, Mar 25 2022, 06:04:14) [Clang 12.0.1 ] (64-bit runtime)
Python platform: macOS-12.3.1-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.21.5
[pip3] torch==1.13.0.dev20220826
[pip3] torchaudio==0.13.0.dev20220826
[pip3] torchvision==0.14.0.dev20220826
[conda] numpy 1.21.5 py38hb29071a_0 conda-forge
[conda] torch 1.13.0.dev20220826 pypi_0 pypi
[conda] torchaudio 0.13.0.dev20220826 pypi_0 pypi
[conda] torchvision 0.14.0.dev20220826 pypi_0 pypi
```

cc @kulinseth @albanD

Suryalok1 · October 29, 2022, 8:05am

I am also seeing the same issue.
It seems like the issue is with the permute operation.

Try the below:

import numpy as np
import torch

device = torch.device("mps")

conv = torch.nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3).to(device)

# Build the input directly in (N, C, H, W) order so no permute is needed.
data = torch.tensor(np.random.uniform(size=(1, 1, 10, 10)), dtype=torch.float32).to(device)

optimizer = torch.optim.Adam(conv.parameters(), lr=0.1)
x = data
out = torch.sum(conv(x))
# loss is unused, as in the original reproducer; backward runs on out.
loss = torch.nn.MSELoss()(out, torch.zeros_like(out))
optimizer.zero_grad()
out.backward()
optimizer.step()
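
To illustrate why the permute is a plausible suspect, here is a small CPU-only sketch comparing the permuted input from the reproducer with a freshly allocated tensor holding the same values. Both have the same (N, C, H, W) shape, so the channel count seen by `Conv2d` is identical; what usually differs is the stride layout in memory, which the script prints. The variable names are illustrative, and the sketch does not test whether any particular layout avoids the MPS assertion.

```python
import torch

torch.manual_seed(0)
conv = torch.nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3)

# Channels-last data as in the reproducer: (N, H, W, C).
nhwc = torch.rand(1, 10, 10, 1)

# Permuted view in (N, C, H, W) order; permute does not copy data.
permuted = nhwc.permute(0, 3, 1, 2)

# Newly allocated tensor with the same values in (N, C, H, W) order.
direct = permuted.clone(memory_format=torch.contiguous_format)

# Same logical shape, possibly different strides.
print("permuted:", tuple(permuted.shape), permuted.stride())
print("direct:  ", tuple(direct.shape), direct.stride())

# On CPU the convolution treats both inputs the same way.
same_shape = permuted.shape == direct.shape
same_output = torch.allclose(conv(permuted), conv(direct))
print("same shape:", same_shape, "| same output:", same_output)
```

Since the shapes match and the CPU results agree, the "channels mismatch" message on MPS does not seem to reflect a real shape error in the model or the data.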