# RuntimeError: Given groups=1, weight of size [1, 1, 512, 0], expected input[2, 10, 1, 500] to have 1 channels, but got 10 channels instead. Trying to compute the cross-correlation

Hello, I'm facing an issue; this line raises the message: corr_idex = F.conv2d(signal, reference_signal_tensor) RuntimeError: Given groups=1, weight of size [1, 1, 512, 0], expected input[2, 10, 1, 500] to have 1 channels, but got 10 channels instead.
As a reminder, I'm trying to compute the cross-correlation with a one-layer conv2d network, aiming to align the original signal with the reference signal. Unfortunately, I can't run my code because the tensor sizes of my original signal and reference signal don't match.
Here are the tensor sizes:
original signal: torch.Size([2, 10, 1, 500])
reference signal: torch.Size([1, 1, 512, 0])

Therefore, my question is: how can I reshape the tensors so that they match? I don't know if the problem is that I didn't unsqueeze the signal correctly.
I need your help. Thanks.

Here is my cross-correlation code:

def _signal_align(self, signal, reference_signal):
        signal = torch.tensor(signal)
        reference_signal_tensor = torch.tensor(reference_signal)
        # add a singleton dimension to the signal
        signal = signal.unsqueeze(2)
        # add batch and channel dimensions to the reference
        reference_signal_tensor = reference_signal_tensor.unsqueeze(0).unsqueeze(0)
        # flip the reference along dim 2
        reference_signal_tensor = torch.flip(reference_signal_tensor, dims=[2])
        # cross-correlation
        corr_idex = F.conv2d(signal, reference_signal_tensor)
        # find the index of the maximum correlation
        max_corr_idx = torch.argmax(corr_idex)
        # shift between signal and reference
        shift_signal = max_corr_idx - (len(signal) - 1)
        # roll shifts the signal by the calculated amount
        align_signal = torch.roll(signal, shifts=int(shift_signal), dims=0)
        return align_signal

Your reference is empty:

x = torch.randn([1, 1, 512, 0])
print(x)
# tensor([], size=(1, 1, 512, 0))

so the output will also be empty.

After some changes, I got this message: RuntimeError: Given groups=1, weight of size [1, 1, 512, 511], expected input[2, 10, 1, 500] to have 1 channels, but got 10 channels instead. So now my reference signal is no longer empty, right? How can I match the channels and the dimensions? Just by reshaping with signal.reshape(-1) or signal.flatten(-1), or do you have any other suggestion? Thanks


No, .reshape or .view operations could interleave your tensor and create an invalid input.
Use .permute instead to permute the dimensions of a tensor.
Assuming dim2 in [2, 10, 1, 500] represents the channel dimension, you could use:

x = x.permute(0, 2, 1, 3)

but make sure it is indeed the channel dimension.
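
To make the difference concrete, here is a small sketch with a toy 2x3 tensor (the values are hypothetical, just for illustration): .permute only reorders the axes, while .reshape keeps the flat memory order and reinterprets it, mixing values across dimensions:

import torch

x = torch.arange(6).reshape(2, 3)
# tensor([[0, 1, 2],
#         [3, 4, 5]])

print(x.permute(1, 0))
# tensor([[0, 3],
#         [1, 4],
#         [2, 5]])   -> axes swapped, rows stay intact as columns

print(x.reshape(3, 2))
# tensor([[0, 1],
#         [2, 3],
#         [4, 5]])   -> values interleaved across the new rows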

Sir, this is after using .permute(). Unfortunately, I got this: corr_idex = F.conv2d(signal, reference_signal_tensor) RuntimeError: Calculated padded input size per channel: (10 x 500). Kernel size: (512 x 511). Kernel size can't be greater than actual input size. What would be the difference if I used F.conv1d() rather than F.conv2d()? If I switch to F.conv1d(), I get this message instead: RuntimeError: Expected 2D (unbatched) or 3D (batched) input to conv1d, but got input of size: [2, 1, 10, 500], where I use the dimensions of the reference_signal after flipping at dim=1. How can I make it run correctly?

    def _signal_align(self, signal, reference_signal):
       
        signal = torch.tensor(signal)
        reference_signal_tensor = torch.tensor(reference_signal)
        signal = signal.unsqueeze(2)
        signal = torch.mean(signal, dim=2, keepdim=True)
        signal = signal.permute(0, 2, 1, 3)
        print(signal.size())
        reference_signal_tensor = reference_signal_tensor.unsqueeze(0).unsqueeze(0)
        print(reference_signal_tensor.size())
        reference_signal_tensor = torch.flip(reference_signal_tensor, dims=[2])
        print(reference_signal_tensor.size())
        # cross-correlation
        corr_idex = F.conv2d(signal, reference_signal_tensor)
        # find the index of the maximum correlation
        max_corr_idx = torch.argmax(corr_idex)
        # shift between signal and reference
        shift_signal = max_corr_idx - (len(signal) - 1)
        # roll shifts the signal by the calculated amount
        align_signal = torch.roll(signal, shifts=int(shift_signal), dims=0)
        return align_signal

Try to change the order of input arguments as your current weight is larger than the input:

a = torch.randn([1, 1, 512, 511])
b = torch.randn([2, 1, 10, 500])

# works since kernel is smaller than input
F.conv2d(a, b)

# fails
F.conv2d(b, a)
# RuntimeError: Calculated padded input size per channel: (10 x 500). Kernel size: (512 x 511). Kernel size can't be greater than actual input size

F.conv2d expects a 4-dimensional input in the shape [batch_size, channels, height, width] and uses a 4-dimensional filter in the shape [out_channels, in_channels, height, width], while F.conv1d uses 3-dimensional inputs and filters, replacing height and width with the sequence_length.
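
To illustrate, here is a minimal F.conv1d sketch, assuming a single-channel batch of signals of length 500 and a shorter reference of length 64 (both lengths are made up for this example). Note that PyTorch's conv ops already compute a cross-correlation internally, so the reference can be used directly as the filter weight:

import torch
import torch.nn.functional as F

signal = torch.randn(2, 1, 500)    # [batch_size, channels, sequence_length]
reference = torch.randn(1, 1, 64)  # [out_channels, in_channels, kernel_length]

corr = F.conv1d(signal, reference)
print(corr.shape)
# torch.Size([2, 1, 437])  -> one score per valid overlap position (500 - 64 + 1)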

I got this after changing the order: corr_idex = F.conv2d(reference_signal_tensor, signal) RuntimeError: Input type (torch.FloatTensor) and weight type (CPUComplexDoubleType) should be the same or input should be a MKLDNN tensor and weight is a dense tensor

Please Sir, can you check this update:

  def _signal_align(self, signal, reference_signal):
        # Convert the input signals to tensors and reshape them
        signal = torch.tensor(signal).unsqueeze(-1)
        reference_signal_tensor = torch.tensor(reference_signal)
        reference_signal_tensor = reference_signal_tensor.flip(0).unsqueeze(0).unsqueeze(-1).float()
        signal = signal.permute(0, 2, 1, 3)
        corr_idex = F.conv2d(reference_signal_tensor, signal, padding=1, stride=1)
        # Find the index of the maximum correlation
        max_corr_idx = torch.argmax(torch.abs(corr_idex))
        # Compute the signal shift and align the signal
        shift_signal = max_corr_idx - (len(signal) - 1)
        align_signal = torch.roll(signal, shifts=int(shift_signal), dims=0)
        # Transpose the aligned signal tensor back to the original shape
        align_signal = align_signal.transpose(2, 3).transpose(1, 2).squeeze(-1)

        return align_signal

But got this error: corr_idex = F.conv2d(reference_signal_tensor, signal, padding=1, stride=1) RuntimeError: Given groups=1, weight of size [2, 500, 10, 1], expected input[1, 512, 511, 1] to have 500 channels, but got 512 channels instead

The previous error is raised by a dtype and device mismatch, but it seems you have already solved it.
The new one is raised by an invalid shape as F.conv2d expects an input in the shape [batch_size, channels, height, width] and a filter weight in the shape [out_channels, in_channels, height, width] in the default setup.
Could you explain what each dimension of your input and filter represents?
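
For comparison, here is a toy call where the shapes do satisfy this layout (all sizes are hypothetical):

import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 32, 32)  # input:  [batch_size, in_channels, height, width]
w = torch.randn(8, 3, 5, 5)    # weight: [out_channels, in_channels, kH, kW]

out = F.conv2d(x, w, padding=1, stride=1)
print(out.shape)
# torch.Size([1, 8, 30, 30])  -> in_channels of input and weight must agree (3 == 3)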

Hi Sir. As I said before, this script should align the signals in order to compute the cross-correlation. The input signal is nothing more than a signal collected from a device; it is 2-dimensional, which is why I perform the convolution with conv2d. In the script I call the convolution directly, without designing a proper or complete model as we usually do. The line corr_idex = F.conv2d(reference_signal_tensor, signal, padding=1, stride=1) is called directly on both the reference and the original signal. Unfortunately, after the calculation the reference signal ends up with dim=1, which may mismatch the original_signal.

So, if I understand correctly, it isn't possible to do this by directly calling the convolution itself, right? Should I design a small network with only one layer, apply it to the signal first, and then start the align-and-correlate step? Does this require a huge amount of data, or can a one-layer network give good results when processing the signals one by one? The goal here is to align every single signal, one by one, before the last step, which uses a network to classify all the data for prediction and analysis.

I would appreciate your suggestions. Or should I consider traditional methods to align and correlate the signals? I need your help.
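
For what it's worth, here is a minimal sketch of the one-by-one alignment idea using F.conv1d on a single 1D signal; the shapes, the full padding, and the toy check below are assumptions for illustration, not taken from the thread:

import torch
import torch.nn.functional as F

def align_by_cross_correlation(signal, reference):
    # signal and reference are 1D tensors
    x = signal.float().view(1, 1, -1)     # [batch, channels, length]
    w = reference.float().view(1, 1, -1)  # [out_channels, in_channels, kernel]
    # pad with kernel_length - 1 so every overlap position gets a score
    corr = F.conv1d(x, w, padding=w.shape[-1] - 1)
    # lag 0 means the signals are already aligned
    lag = torch.argmax(corr) - (w.shape[-1] - 1)
    return torch.roll(signal, shifts=-int(lag))

# toy check: a circularly delayed copy should be shifted back into place
torch.manual_seed(0)
ref = torch.randn(64)
sig = torch.roll(ref, shifts=10)
aligned = align_by_cross_correlation(sig, ref)
print(torch.allclose(aligned, ref))  # True for this toy case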