Dataset with last dimension as channel

Hi, I have a dataset that has the channel dimension last, and PyTorch does not accept it for the CNN that I'm using:

self.conv2d = nn.Sequential(
    nn.Conv2d(1, 64, (3, 6), (1, 1)),
    nn.ReLU()
)

The dataset has dimensions samples x 10 x 6 x 1.

What should I do? For now I'm transposing the data, but the network is not performing well.

Permuting the data is the right approach.
What do you mean by “not performing well”? Is the model bad regarding the speed or accuracy?

You could use the channels_last memory format as described here, which internally permutes the data to the channels last format. Note that the shape of the tensor would still indicate the standard contiguous format (channels first).
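To illustrate the point about the shape staying channels-first, here is a minimal sketch. The tensor sizes are made up for the example, and 3 channels are used only so that the strides are unambiguous:

```python
import torch

# Illustrative NCHW batch: 4 samples, 3 channels, height 10, width 6.
x = torch.randn(4, 3, 10, 6)

# Convert the memory layout only; the logical shape is left alone.
x_cl = x.to(memory_format=torch.channels_last)

print(x.shape, x.stride())        # torch.Size([4, 3, 10, 6]) (180, 60, 6, 1)
print(x_cl.shape, x_cl.stride())  # torch.Size([4, 3, 10, 6]) (180, 1, 18, 3)

# The values are identical; only their order in memory differs.
print(torch.equal(x, x_cl))       # True
```

The channel stride becomes 1 after the conversion, which is what "channels last" means in memory, while indexing still follows [batch, channels, height, width].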

Thanks for answering. The model is bad regarding the accuracy.
So by using channels_last I just need the dataset without permuting it, right?

Yes, as shown in the linked tutorial.

Note that changing the memory layout will not fix the accuracy issue, so you might want to fix this first e.g. by playing around with some hyperparameters.

I will try to do that too, right away. For now, thanks.

So I'm using the channels_last memory format in my custom dataloader, but I got this error:
RuntimeError: Given groups=1, weight of size 64 1 3 6, expected input[2048, 10, 6, 1] to have 1 channels, but got 10 channels instead
So apparently it is not working.

For reference, this is my dataloader:

import h5py as h5  # provides h5.File used below
import torch
from torch.utils import data

class Hdf5_dl(data.Dataset):
    """HDF5 datasets"""

    def __init__(self, archive, transform = None):
        self.archive = h5.File(archive, 'r')
        self.labels = torch.tensor(self.archive['labels'])
        self.data = torch.tensor(self.archive['data']).contiguous(memory_format=torch.channels_last)
        self.transform = transform

    def __getitem__(self, index):
        sample = self.data[index]

        if self.transform is not None:
            sample = self.transform(sample)

        return sample, self.labels[index]

    def __len__(self):
        return len(self.labels)

    def close(self):
        self.archive.close()

It’s working for me:

import torch
import torch.nn as nn

conv = nn.Conv2d(1, 64, (3, 6)).cuda().to(memory_format=torch.channels_last)
x = torch.randn(2048, 1, 10, 6).cuda().to(memory_format=torch.channels_last)
out = conv(x)
print(out.shape)
# torch.Size([2048, 64, 8, 1])

As said before, you should not manually permute the tensor, but handle it in the “standard” contiguous layout (channels-first or NCHW).

So right now I'm trying to mimic your code like this:

model = Network()
model = model.to(memory_format=torch.channels_last)
model = model.double()
criterion = nn.BCELoss()
optimizer = optim.Adam(params = model.parameters(), lr = 0.01)

for epoch in range(epochs):  # loop over the dataset multiple times

    model.train()
    for j, data in enumerate(trainloader, 0):
        # Get the inputs; data is a list of [inputs, labels]
        inputs, labels = data
        print(inputs.stride())
        inputs = inputs.to(memory_format=torch.channels_last)
        print(inputs.stride())

        # Zero the parameter gradients
        optimizer.zero_grad()

        # Forward + Backward + Optimize
        outputs = model(inputs.double())
        loss = criterion(outputs, labels.double())
        loss.backward()
        optimizer.step()

        print("epoch\t{}\t\tbatch\t{}\nloss\t{}\n---".format(epoch, j, loss.item()))

It is still not working. I even tried printing the stride before and after the channels_last operation, and this is what I get:

(60, 6, 1, 1)
(60, 6, 1, 1)

Error:

RuntimeError: Given groups=1, weight of size 64 1 3 6, expected input[2048, 10, 6, 1] to have 1 channels, but got 10 channels instead

Could you print the shape of inputs? It should be [batch_size, channels, height, width].

So the shape of the input is [2048, 10, 6, 1], even if I convert the memory format like this:

for j, data in enumerate(trainloader, 0):
    inputs, labels = data
    inputs = inputs.to(memory_format=torch.channels_last)
    print("\ninput: ", inputs.shape)

As described before: your input has to be in the shape [batch_size, channels, height, width] before using to(memory_format=torch.channels_last) (have another look at my code snippet).
You should not manually permute the tensor to the channels-last format, the to() operation will internally handle it for you.

Since the conv layer has in_channels=1, your data should have the shape [2048, 1, 10, 6].
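A small sketch of why the error appears, using a reduced batch size of 8 (an arbitrary choice for the example): nn.Conv2d always reads dimension 1 as the channel dimension, and to(memory_format=...) does not change the shape, so a [batch, 10, 6, 1] tensor is still seen as having 10 channels.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(1, 64, (3, 6))

# Data laid out as [batch, height, width, channels], as stored on disk.
wrong_shape = torch.randn(8, 10, 6, 1)
converted = wrong_shape.to(memory_format=torch.channels_last)
print(converted.shape)  # torch.Size([8, 10, 6, 1]), the shape is unchanged

# The conv layer treats dim 1 (size 10) as channels and rejects the input.
try:
    conv(converted)
    error_message = None
except RuntimeError as err:
    error_message = str(err)
print(error_message)

# With the shape [batch, channels, height, width] the same layer works.
right_shape = torch.randn(8, 1, 10, 6).to(memory_format=torch.channels_last)
out = conv(right_shape)
print(out.shape)  # torch.Size([8, 64, 8, 1])
```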

I think I didn't explain myself properly: my dataset is [batch_size, height, width, channels] from the beginning.

So the original shape of the data is [2048, 10, 6, 1].

In that case you have to permute it so that the shape is channels-first, and apply the memory_format afterwards.
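One way this could look inside a dataset is sketched below. ChannelsFirstDataset is a hypothetical stand-in for the HDF5 dataset above: it holds random in-memory tensors instead of reading a file, and the sizes are chosen only for the example.

```python
import torch
from torch.utils import data


class ChannelsFirstDataset(data.Dataset):
    """Hypothetical dataset: stores samples as [H, W, C], returns them as [C, H, W]."""

    def __init__(self, samples, labels):
        self.samples = samples  # shape [N, H, W, C]
        self.labels = labels    # shape [N]

    def __getitem__(self, index):
        # A single sample is [H, W, C]; move the channel dimension to the front.
        sample = self.samples[index].permute(2, 0, 1)
        return sample, self.labels[index]

    def __len__(self):
        return len(self.labels)


# Random stand-in data with the layout from the thread: [N, 10, 6, 1].
samples = torch.randn(32, 10, 6, 1)
labels = torch.randint(0, 2, (32,))

loader = data.DataLoader(ChannelsFirstDataset(samples, labels), batch_size=8)
inputs, targets = next(iter(loader))

# The batch is now [batch, channels, height, width]; the memory format is applied last.
inputs = inputs.to(memory_format=torch.channels_last)
print(inputs.shape)  # torch.Size([8, 1, 10, 6])
```

Permuting the whole tensor once in `__init__` with permute(0, 3, 1, 2) would work as well; doing it per sample is just one possible design.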

Assuming the original inputs are contiguous, the tensors will automatically be in the channels_last format after the permute call:

inputs, labels = data
inputs = inputs.permute(0,3,1,2)
print("\ninput: ", inputs.shape, inputs.is_contiguous(memory_format=torch.channels_last))
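The claim can be checked with a short sketch. The first part uses 3 channels (an illustrative choice, not the thread's data) so that the strides are unambiguous; the second part uses the thread's shape with a batch size of 16 picked for the example.

```python
import torch
import torch.nn as nn

# Contiguous data stored as [batch, height, width, channels].
nhwc = torch.randn(16, 10, 6, 3)

# permute returns a view: no data is copied, only shape and strides change.
nchw = nhwc.permute(0, 3, 1, 2)
print(nchw.shape)                                             # torch.Size([16, 3, 10, 6])
print(nchw.stride())                                          # (180, 1, 18, 3)
print(nchw.data_ptr() == nhwc.data_ptr())                     # True
print(nchw.is_contiguous())                                   # False
print(nchw.is_contiguous(memory_format=torch.channels_last))  # True

# The thread's case: one channel, conv layer with in_channels=1.
conv = nn.Conv2d(1, 64, (3, 6))
inputs = torch.randn(16, 10, 6, 1).permute(0, 3, 1, 2)
out = conv(inputs)
print(out.shape)                                              # torch.Size([16, 64, 8, 1])
```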