How to make my CNN batch size independent


I am trying to build a CNN, but somehow the batch_size is interpreted as part of the input dimension.

Here is my main function.

if __name__=='__main__':
    # Set options
    device = 'cpu'

    # Parameters
    learning_rate = 0.01
    batch_size = 2
    epochs = 4

    # Generate training and validation data
    N = [10, 10] # 10 waveforms, 10 noise
    trainingData = generate_data(N)
    validationData = generate_data(N)
    # Get dataset objects
    TrainDS = Dataset(*trainingData)
    ValidDS = Dataset(*validationData)

    # Get Dataloaders
    TrainDL = torch.utils.data.DataLoader(TrainDS, batch_size=batch_size, shuffle=True)
    ValidDL = torch.utils.data.DataLoader(ValidDS, batch_size=batch_size, shuffle=True)
    # Get model
    model = get_model().to(device)

    # Get loss function
    loss_fn = nn.BCEWithLogitsLoss()

    # Get optimizer
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
    # Epochs
    for t in range(epochs):
        print(f"\nEpoch {t+1} of {epochs}:")
        train(TrainDL, model, loss_fn, optimizer)
        evaluation(ValidDL, model)


My goal is to detect a signal inside noise. I won’t explain the details of the generated data since they don’t matter, but in the end we get a time series representing 1 s of signal at a sampling rate of 2048 Hz. Meaning: one sample is a vector of size 2048.

generate_data() takes an argument N, which is a pair of counts. With N = [10, 10] we get 10 time series representing a waveform that’s only noise and 10 time series representing a waveform that is a noisy signal.
generate_data() returns samples, labels. A label tells us if the corresponding sample contains a signal or not.

So trainingData is a tuple (samples, labels) of type <class 'tuple'>, and samples and labels are each of length 10+10=20 and of type <class 'list'>.

Now the Dataset Object is very simple

class Dataset(torch.utils.data.Dataset):
    def __init__(self, samples, labels):
        assert len(samples) == len(labels)
        self.samples = samples
        self.labels = labels
    def __len__(self):
        return len(self.samples)
    def __getitem__(self, i):
        return self.samples[i], self.labels[i]

After we have the data sets, we get the data loaders.

Note: I know that I could generate the data directly in the data set object “constructor”.

Then we get the model, loss function as well as an optimizer and make a loop for epochs.

def get_model():
    # Sample rate
    sr = 2048
    return torch.nn.Sequential(        # Shapes
        nn.Conv1d(1, 32, 16),    # 32 x (sr - 15)
        nn.Linear(sr, 2),
    )

Note: The dimensions here probably are wrong. I’m still a bit confused about the CNN. I’ll approach this problem once I solved the one described below.
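For what it's worth, here is one possible way to make the layer shapes line up. It is only a sketch, not the model from this thread: the name get_model_sketch, the ReLU, the Flatten and the single-logit output are my assumptions. Without padding, Conv1d shortens the time axis to sr - kernel_size + 1, so the Linear layer has to be sized for the flattened conv output.

```python
import torch
from torch import nn


def get_model_sketch(sr: int = 2048, kernel_size: int = 16) -> nn.Sequential:
    """Hypothetical shape-consistent variant of get_model() (an assumption)."""
    # Without padding, the conv output length is sr - kernel_size + 1.
    conv_out_len = sr - kernel_size + 1
    return nn.Sequential(                     # Shapes
        nn.Conv1d(1, 32, kernel_size),        # (N, 1, sr) -> (N, 32, conv_out_len)
        nn.ReLU(),                            # assumed non-linearity
        nn.Flatten(),                         # (N, 32 * conv_out_len)
        nn.Linear(32 * conv_out_len, 1),      # (N, 1): one logit per sample
    )


model = get_model_sketch()
for n in (1, 2, 5):
    out = model(torch.randn(n, 1, 2048))
    print(n, tuple(out.shape))  # the batch dimension n passes straight through
```

With a single logit per sample, nn.BCEWithLogitsLoss expects float labels of the same shape (N, 1).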

In each epoch, we train and evaluate the training.

def train(TrainDL, model, loss_fn, optimizer):
    # Put model in train mode
    model.train()
    # TrainDL is an iterator. Each iteration gives us a bunch of
    # (samples, labels). The size of (samples, labels) depends on batch_size.
    # We put the TrainDL iterator in an enumerate to get the key/value pair.
    # i_batch describes which batch we currently work through.
    # (samples, labels) is the actual data
    for i_batch, (samples, labels) in enumerate(TrainDL):
        print(f"Batch number {i_batch}")
        # Send data to device
        samples = samples.to(device)
        labels = labels.to(device)

        # Reset gradients TODO: Why?
        optimizer.zero_grad()

        # Compute prediction
        labels_pred = model(samples)

        # Compute loss
        loss = loss_fn(labels_pred, labels)

        # Backpropagation
        loss.backward()

        # Make a step in the optimizer
        optimizer.step()
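Regarding the "Reset gradients TODO: Why?" comment in the training loop: PyTorch adds new gradients to whatever is already stored in .grad on each backward() call, so without a reset the gradients of earlier batches leak into the current step. A tiny sketch with a single made-up parameter w (not part of the code above) shows the effect:

```python
import torch

# One toy parameter; the "loss" is 2 * w, so d(loss)/dw = 2.
w = torch.tensor([1.0], requires_grad=True)
opt = torch.optim.SGD([w], lr=0.1)

(2 * w).sum().backward()
first = w.grad.clone()          # gradient of the first backward pass

(2 * w).sum().backward()        # no zero_grad() in between
accumulated = w.grad.clone()    # the second gradient was added to the first

opt.zero_grad()                 # reset before the next backward pass
(2 * w).sum().backward()
after_reset = w.grad.clone()

print(first.item(), accumulated.item(), after_reset.item())  # 2.0 4.0 2.0
```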

Now I get the following error:

RuntimeError: Expected 3-dimensional input for 3-dimensional weight [32, 1, 16], but got 2-dimensional input of size [2, 2048] instead

Note that I have set the batch size to 2 and each sample is of size 2048, so each step of the iterator from our data loader returns two of our samples, i.e. a tensor of shape [2, 2048].

samples in the training loop has type <class 'torch.Tensor'>.

So we actually pass something of shape [2, 2048] to our CNN, but I can’t see how this is wrong. I somehow assumed there is some magic going on such that our model would know that we have a batch size of 2.

So I’m confused about the error.

1D convolutions expect a 3D input of shape NxCxT (batch x channels x time).
In your case N=2 and C=1.
You can simply expand the tensor with an extra dimension of size 1. As an analogy, you can draw any line in a 2D plane, but the same line can also live in an N-dimensional space; adding an axis changes the shape, not the data.
In short,

class Dataset(torch.utils.data.Dataset):
    def __init__(self, samples, labels):
        assert len(samples) == len(labels)
        self.samples = samples
        self.labels = labels
    def __len__(self):
        return len(self.samples)
    def __getitem__(self, i):
        return self.samples[i].unsqueeze(0), self.labels[i].unsqueeze(0)

Your signal, which is a 1-D signal of size 2048, must be converted into a 2-D signal of size 1x2048 to follow the convention ChannelsxTime. You must do this even if you have a single channel, as in your case.
Lastly, the dataloader stacks several of these 2-D signals to create a batch of size Nx1x2048.
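To check the shapes, here is a minimal sketch of the above. It assumes the samples and labels are already torch tensors (random stand-ins here, since generate_data() isn't shown), and ChannelFirstDataset is just an illustrative name:

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ChannelFirstDataset(Dataset):
    """Returns each sample as (channels, time) = (1, 2048)."""

    def __init__(self, samples, labels):
        assert len(samples) == len(labels)
        self.samples = samples
        self.labels = labels

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, i):
        # Add the channel axis per sample; the DataLoader adds the batch axis.
        return self.samples[i].unsqueeze(0), self.labels[i].unsqueeze(0)


# Stand-in data: 20 random "time series" with alternating 0/1 labels.
samples = [torch.randn(2048) for _ in range(20)]
labels = [torch.tensor(float(i % 2)) for i in range(20)]

loader = DataLoader(ChannelFirstDataset(samples, labels), batch_size=2, shuffle=True)
batch_samples, batch_labels = next(iter(loader))
print(batch_samples.shape)  # torch.Size([2, 1, 2048])
print(batch_labels.shape)   # torch.Size([2, 1])
```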

All conv layers are independent of the batch size.

nn.Conv1d(1, 32, 16) means 1 input channel, 32 output channels, kernel size = 16. Thus it expects a tensor with shape (X, 1, (at least 16)), where X is the number of elements in the batch (at least 1), 1 is the number of input channels, and (at least 16) is the length of your input data per channel, which must be equal to or larger than the kernel size.
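A quick way to convince yourself is to push batches of different sizes through the same layer. This is just a sketch with random inputs; only the first dimension of the output changes:

```python
import torch
from torch import nn

conv = nn.Conv1d(1, 32, 16)
print(tuple(conv.weight.shape))  # (32, 1, 16), the weight named in the error message

shapes = {}
for n in (1, 2, 7):
    # Input is (batch, channels, time); the same layer handles any batch size.
    shapes[n] = tuple(conv(torch.randn(n, 1, 2048)).shape)
    print(n, shapes[n])  # (n, 32, 2033), since 2048 - 16 + 1 = 2033
```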

You are passing a batch of 2 elements, each with shape (2048). Call .unsqueeze(1) on this tensor to insert a dimension at dim=1, which converts your input data to shape (2, 1, 2048).

PS. Kernel size 16 is rather large

Thanks a lot. It seems I got it working now. I’ll reread a bit more about it but most of the confusion should be solved. :)

Thanks. Don’t you think the axis should be 0 when unsqueezing?

If you unsqueeze the batch at dim=0, your shape will be (1, 2, 2048), i.e. a batch of 1 with 2 channels and 2048 inputs.
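A small sketch of the difference, using a random tensor in place of a real batch from the dataloader: the layer rejects the dim=0 variant because it sees 2 channels where it expects 1.

```python
import torch
from torch import nn

conv = nn.Conv1d(1, 32, 16)
batch = torch.randn(2, 2048)      # what the dataloader returns without a channel axis

wrong = batch.unsqueeze(0)        # (1, 2, 2048): batch of 1, 2 channels
right = batch.unsqueeze(1)        # (2, 1, 2048): batch of 2, 1 channel

try:
    conv(wrong)
    wrong_accepted = True
except RuntimeError as err:
    wrong_accepted = False
    print("dim=0 fails:", err)    # channel mismatch: layer expects 1 channel, got 2

out = conv(right)
print(tuple(out.shape))           # (2, 32, 2033)
```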

With unsqueeze(0) I get torch.Size([3, 1, 2048]) which looks good to me?

So putting the unsqueeze in the dataset class is not the same as applying it afterwards.
The dataloader stacks several samples to form a batch.
So if you put it in the dataset class, you are stacking 2 tensors of shape 1x2048, which gives 2x1x2048.
If you unsqueeze after the dataloader has returned the samples, you start from a batch of shape 2x2048. In that case it should be unsqueeze(1).
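Both placements end up with the same tensor, as this sketch suggests. It uses torch.stack to mimic what the dataloader's default collation does with tensors, and random stand-in samples:

```python
import torch

samples = [torch.randn(2048) for _ in range(2)]

# Option A: unsqueeze(0) per sample (as in __getitem__), then stack into a batch.
per_sample = torch.stack([s.unsqueeze(0) for s in samples])   # (2, 1, 2048)

# Option B: stack into a batch first, then unsqueeze(1) on the whole batch.
per_batch = torch.stack(samples).unsqueeze(1)                 # (2, 1, 2048)

same = torch.equal(per_sample, per_batch)
print(per_sample.shape, per_batch.shape, same)
```

Just make sure only one of the two is active, otherwise you end up with a 4D tensor.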

Yup, you’re probably unsqueezing somewhere else :)

Yes, I got it, thanks.