Channel mismatch in Conv1d

Hi, dear community.

I’m fairly new to the field of ML and I’m trying to build a 1D autoencoder.

My data consists of 100 rows with 21 columns. In this scenario, each row is an individual sample.

My goal is to create a 1D autoencoder that maps each row into a 19-column representation and then back to 21 columns, but I’m getting the following channel mismatch error:

Code by: Caue Evangelista
Fully based on the code by Sebastian Raschka
Link: https://github.com/rasbt/stat453-deep-learning-ss21/tree/main/L16
Torch version:  2.0.1

Device: CPU
 
Dataset size:  100
Train size:  80
Test size:  20

Training Set:
Tracksjets batch dimensions: torch.Size([32, 21])
Label dimensions: torch.Size([32])
 
Testing Set:
Tracksjets batch dimensions: torch.Size([20, 21])
Label dimensions: torch.Size([20])
 
[W NNPACK.cpp:64] Could not initialize NNPACK! Reason: Unsupported hardware.
Epoch: 001/020 | Batch 0000/0003 | Loss: 14.0723
Traceback (most recent call last):
  File "/home/ecaue/JetTrack/AutoEncoder/AutoencoderJetTrack/autoencoder.py", line 110, in <module>
    log_dict = train_autoencoder_v1(num_epochs=NUM_EPOCHS, model=model, 
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ecaue/JetTrack/AutoEncoder/AutoencoderJetTrack/helpers/helper_train.py", line 36, in train_autoencoder_v1
    logits = model(features)
             ^^^^^^^^^^^^^^^
  File "/home/ecaue/.conda/envs/AnalysisEnv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ecaue/JetTrack/AutoEncoder/AutoencoderJetTrack/autoencoder.py", line 98, in forward
    x = self.encoder(x)
        ^^^^^^^^^^^^^^^
  File "/home/ecaue/.conda/envs/AnalysisEnv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ecaue/.conda/envs/AnalysisEnv/lib/python3.11/site-packages/torch/nn/modules/container.py", line 217, in forward
    input = module(input)
            ^^^^^^^^^^^^^
  File "/home/ecaue/.conda/envs/AnalysisEnv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ecaue/.conda/envs/AnalysisEnv/lib/python3.11/site-packages/torch/nn/modules/conv.py", line 313, in forward
    return self._conv_forward(input, self.weight, self.bias)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ecaue/.conda/envs/AnalysisEnv/lib/python3.11/site-packages/torch/nn/modules/conv.py", line 309, in _conv_forward
    return F.conv1d(input, weight, bias, self.stride,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Given groups=1, weight of size [32, 32, 2], expected input[1, 16, 21] to have 32 channels, but got 16 channels instead

I’m very confused because I have a Google Colab notebook for quick testing where the code runs fine, but every time I try to run my official script, I get these messages.

This is the code in my “official script”:

# Hyperparameters
RANDOM_SEED = 123
LEARNING_RATE = 0.0005
BATCH_SIZE = 32
NUM_EPOCHS = 20
NUM_CLASSES = 3

#Kernel and Stride
kernel=2
stride=1

set_deterministic()
set_all_seeds(RANDOM_SEED)

##########################
### DATASET
##########################


ds = JetTrackDataset()
train_loader, test_loader = get_dataloaders_jetrack(ds, batch_size=BATCH_SIZE, num_workers=2)


# Checking the dataset
print('Training Set:')
for tracksjets, labels in train_loader:  
    print('Tracksjets batch dimensions:', tracksjets.size())
    print('Label dimensions:', labels.size())
    break

print(" ")

# Checking the dataset
print('Testing Set:')
for tracksjets, labels in test_loader:  
    print('Tracksjets batch dimensions:', tracksjets.size())
    print('Label dimensions:', labels.size())
    break

print(" ")
##########################
### MODEL
##########################

class AutoEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        
        self.encoder = nn.Sequential(
            nn.Conv1d(BATCH_SIZE, BATCH_SIZE, kernel_size=kernel, stride=stride),
            nn.LeakyReLU(0.01)
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(BATCH_SIZE, BATCH_SIZE, kernel_size=kernel, stride=stride),
            nn.LeakyReLU(0.01)
        )

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return x

and this is the code in my Colab notebook:

#Number of samples in each batch
batch_size = 32

#Number of columns in each sample
columns = 21

kernel = 2
stride = 1

encoder = nn.Sequential(
                        nn.Conv1d(batch_size, batch_size,  kernel_size=kernel, stride=stride),
                        nn.LeakyReLU(0.01),
                        nn.Conv1d(batch_size, batch_size,  kernel_size=kernel, stride=stride)
                       )

decoder = nn.Sequential(
                        nn.ConvTranspose1d(batch_size, batch_size, kernel_size=kernel, stride=stride),
                        nn.LeakyReLU(0.01),
                        nn.ConvTranspose1d(batch_size, batch_size, kernel_size=kernel, stride=stride)
                       )

input = torch.randn(batch_size, columns)
print("Input:", input.size())

code = encoder(input)
print("Code:", code.size())

output = decoder(code)
print("Output:", output.size())

which produces:

Input: torch.Size([32, 21])
Code: torch.Size([32, 19])
Output: torch.Size([32, 21])

I’ve read quite a few other posts about this, but I’m still very confused about all of the arguments (in_channels, out_channels, batch_size, what the right order is, how many channels I need to do what I want, etc.), and I apologize in advance if my question is not appropriate. I really appreciate any insight, help, or recommendation.

You are initializing the conv layers with the batch_size for their input and output channels, which is wrong.
Conv layers should define in_channels as the number of channels of their input activation and out_channels as the number of channels (or feature maps) of their output activation.
The input activation should have the shape [batch_size, channels, seq_length] for nn.Conv1d, where again channels of the input corresponds to in_channels of the conv layer.

Your current input is only 2-dimensional and actually lacks a dimension as described before.
However, in recent PyTorch versions nn.Conv1d also accepts unbatched inputs in the shape [channels, seq_length]. In this case the batch dimension will be unsqueezed for you and the input will be treated as a single sample.
This also does not match your use case, since you are explicitly defining your input as [batch_size, columns], so a dimension (either channels or sequence length) is missing.
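
A minimal sketch of the expected layout for this use case (assuming each 21-value row is treated as a single-channel sequence of length 21; the out_channels value is only illustrative):

import torch
import torch.nn as nn

x = torch.randn(32, 21)        # current input: [batch_size, seq_length]
x = x.unsqueeze(1)             # add a channel dim -> [32, 1, 21] = [batch_size, channels, seq_length]

conv = nn.Conv1d(in_channels=1, out_channels=32, kernel_size=2, stride=1)
out = conv(x)
print(out.shape)               # torch.Size([32, 32, 20])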

1 Like

@ptrblck I’m sorry, I don’t know if I understood you correctly (most probably not).

Something is lacking in my input, so maybe the problem is with the data loaders? I’m creating the datasets using the class and function:

class JetTrackDataset(Dataset):
    # This loads the data and converts it
    def __init__(self):
        # Load data
        df = pd.read_csv("encoder_demo.txt", sep=" ")

        # Substitute label letters by numbers (l=0, c=1, b=2)
        df['flavour'] = df['flavour'].replace(['l', 'c', 'b'], ['0', '1', '2'])

        # Extract labels
        self.df_labels = df[['flavour']]
        self.df = df.drop(columns=['flavour'])

        # Convert to torch dtypes
        self.dataset = torch.tensor(self.df.to_numpy()).float()
        self.labels  = torch.tensor(self.df_labels.to_numpy().reshape(-1)).long()

    # This returns the total amount of samples in the dataset
    def __len__(self):
        return len(self.dataset)

    # This returns given an index the i-th sample and label
    def __getitem__(self, idx):
        return self.dataset[idx],self.labels[idx]

def get_dataloaders_jetrack(dataset, batch_size, num_workers):

    train_size = int(0.8 * len(dataset))
    test_size  = len(dataset) - train_size

    print(" ")
    print("Dataset size: ", len(dataset))
    print("Train size: ", train_size)
    print("Teste size: ",test_size)
    print("")


    train_dataset, test_dataset  = torch.utils.data.random_split(dataset,[train_size,test_size])

    train_loader = DataLoader(dataset=train_dataset, batch_size=batch_size, num_workers=num_workers, shuffle=True)
    test_loader  = DataLoader(dataset=test_dataset, batch_size=batch_size,  num_workers=num_workers, shuffle=False)

    return train_loader, test_loader

I also changed in_channels and out_channels to 1, by analogy with the single channel of a grey-scale image, but got a different mismatch:

RuntimeError: Given groups=1, weight of size [1, 1, 2], expected input[1, 32, 21] to have 1 channels, but got 32 channels instead

You are still passing 2-dimensional inputs which are treated as unbatched inputs in [channels, seq_length]. Take a look at this example:

batch_size = 2
channels = 1
seq_length = 16

conv = nn.Conv1d(channels, 32, 3, 1, 1)

# batched input with an explicit batch dimension
x = torch.randn(batch_size, channels, seq_length)

out = conv(x)
print(out.shape)
# torch.Size([2, 32, 16]) = [batch_size, out_channels, seq_length]

# unbatched input without a batch dimension
x = torch.randn(channels, seq_length)
out = conv(x)
print(out.shape)
# torch.Size([32, 16]) = [out_channels, seq_length]

# wrong unbatched input with a missing dimension
x = torch.randn(batch_size, seq_length)
out = conv(x)
# RuntimeError: Given groups=1, weight of size [32, 1, 3], expected input[1, 2, 16] to have 1 channels, but got 2 channels instead

which explains the expected input shape described before.

1 Like

First, let’s check on your data. Conv1d layers are appropriate for sequential data, which means order matters along that dimension. For example, if this were a sentence, order would be important, unless you want to sound like Yoda. Compare that with channels, where order is irrelevant: in an image, it doesn’t matter whether you store RGB, GBR, BRG, etc., so red, green and blue meet the definition of channels.

So let me ask: along the 100 rows, is this sequential data?

When putting the data into the model, as Patrick mentioned, the dims should be [batch_size, channels, sequence].

If the rows are a sequence, that means the columns are likely channels (unless order matters for those too, in which case you should use a Conv2d). So your in_channels should be 21, and since you want 19 channels internally, your first layer’s out_channels should be 19. The second layer should then be 19 and 21, respectively.
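
A minimal sketch of that layout (assuming the 100 rows really do form one sequence, so a batch has the shape [batch_size, channels=21, seq_length=100]; kernel size and stride are only illustrative):

import torch
import torch.nn as nn

x = torch.randn(32, 21, 100)   # [batch_size, channels, seq_length]

encoder = nn.Conv1d(in_channels=21, out_channels=19, kernel_size=2, stride=1)
decoder = nn.ConvTranspose1d(in_channels=19, out_channels=21, kernel_size=2, stride=1)

code = encoder(x)
print(code.shape)              # torch.Size([32, 19, 99])
out = decoder(code)
print(out.shape)               # torch.Size([32, 21, 100])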

1 Like

You seem to be missing the 100 rows of data. Maybe you’re just getting the first row?

1 Like

Thank you @ptrblck, this gave me something more concrete to compare with what I’m trying to do.

Hi @J_Johnson!

Each row is the output of 21 sensors in a particle detector, so we can think of each row as a different measurement, where the label is the type of measurement obtained from simulation software (in this case, Monte Carlo truth).

What I’m trying to do is reduce the dimension of my input successively (21 to 19 in one layer, 19 to 17 in the next, …) to obtain, in the end, a latent representation that lets me distinguish between the types of measurements (I know some types are very similar, but not quite the same).

Since each data point is a row, I thought that maybe Conv1D would do the trick.

Reading what @ptrblck and you said, I think the issue is in my Dataset class / DataLoader function, since its output is something like

torch.Size([32, 21])

when it should be

torch.Size([32, 1, 1, 21]) = torch.Size([batch_size, channels_per_sample, rows_in_sample, columns_in_sample])

but I’m still trying to figure out where things have gone south.

It sounds like the output should be [32, 21, 100].

If you plan to feed the data one time step at a time, for example with size [32, 21, 1], then you might want to look at recurrent neural networks (e.g. LSTM, GRU, RNN).

A Conv1d can handle time-sequential data fine as well, but it just needs to be fed all time steps.

However, if there are no time steps, i.e. the 100 rows are independent samples, then you might just need a few Linear layers.
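
In that case, a minimal Linear-layer sketch (the sizes simply mirror the 21 to 19 to 21 mapping from the original post) could look like:

import torch
import torch.nn as nn

class LinearAutoEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(21, 19), nn.LeakyReLU(0.01))
        self.decoder = nn.Linear(19, 21)

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = LinearAutoEncoder()
x = torch.randn(32, 21)        # [batch_size, features], no channel dimension needed
print(model(x).shape)          # torch.Size([32, 21])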

1 Like

Hi @J_Johnson!

I think I’ve finally wrapped my head around this issue.

My Dataset class was producing 2D inputs of the form [batch_size, 21]. I’ve modified it to produce [batch_size, 1, 21] objects, since I have just one channel, similar to a grey-scale image:

        # Fix the dim of the data (add a channel dimension)
        self.dataset = self.dataset[:, None, :]
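
A quick shape check after this change (a sketch; it assumes the JetTrackDataset and get_dataloaders_jetrack posted above):

ds = JetTrackDataset()
sample, label = ds[0]
print(sample.shape)            # torch.Size([1, 21])

train_loader, test_loader = get_dataloaders_jetrack(ds, batch_size=32, num_workers=2)
batch, labels = next(iter(train_loader))
print(batch.shape)             # torch.Size([32, 1, 21]) = [batch_size, channels, seq_length]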

In parallel I’ve changed my model:

class AutoEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        
        self.encoder = nn.Sequential(
            nn.Conv1d(in_channels=1, out_channels=32, kernel_size=kernel, stride=stride),
            nn.LeakyReLU(0.01)
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(in_channels=32, out_channels=1, kernel_size=kernel, stride=stride),
            nn.LeakyReLU(0.01)
        )

    def forward(self, x):
        x = self.encoder(x)
        x = self.decoder(x)
        return x
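
A quick sanity check of the updated model (a sketch, using the kernel=2 and stride=1 values defined earlier):

model = AutoEncoder()
x = torch.randn(32, 1, 21)     # [batch_size, channels, seq_length]
print(model(x).shape)          # torch.Size([32, 1, 21])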

Now things are running and I can implement a more realistic autoencoder.

Thanks a lot, @ptrblck and @J_Johnson !!!