CIFAR10 Image Channels Mismatch. Need Help

Hi, I use unpickle to read the CIFAR-10 image data and put it together into an <np.ndarray> with shape = (50000, 3). The three columns hold the data, the label, and the category respectively.
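
For context, here is a rough sketch of how I built that array. The unpickle helper is the standard one from the CIFAR-10 download page; batch_to_rows, the label_names lookup and the HWC transpose are just an approximation of how I arranged things, not my exact script:

import pickle
import numpy as np

def unpickle(file):
    # Standard CIFAR-10 loader: returns a dict with b'data' (10000 x 3072 uint8)
    # and b'labels' (a list of 10000 ints).
    with open(file, 'rb') as fo:
        batch = pickle.load(fo, encoding='bytes')
    return batch

def batch_to_rows(batch, label_names):
    # Turn one batch into a (10000, 3) object array of (image, label, category).
    # The raw rows are channel-first, so reshape to (3, 32, 32) and transpose
    # to (32, 32, 3); this HWC layout is where my channel problem shows up later.
    images = batch[b'data'].reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)
    labels = batch[b'labels']
    rows = [(img, lab, label_names[lab]) for img, lab in zip(images, labels)]
    return np.array(rows, dtype=object)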

I build my own cifar10_data_loader following this roadmap:
ndarray -> tensor -> Dataset -> DataLoader

Here is my code for preprocessing image data:

import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import TensorDataset, DataLoader


def ndarray_to_tensor(ndarray):
    """
    It is a function to transform an <np.ndarray> into a <tensor>.
    :param ndarray: np.ndarray
    :return: tensor object
    """
    out = torch.Tensor(list(ndarray))
    return out


def form_dataset(X_tensor, y_tensor):
    """
    It is a function to form dataset.
    :param X_tensor: tensor x
    :param y_tensor: tensor y
    :return: dataset
    """
    dataset = TensorDataset(X_tensor, y_tensor)
    return dataset


def form_dataloader(tensor_dataset, batch_size, num_workers):
    """
    It is a function to form a data loader from a tensor dataset.
    :param tensor_dataset: tensor dataset
    :param batch_size: batch size
    :param num_workers: number of worker processes
    :return: data loader
    """
    data_loader = DataLoader(tensor_dataset, batch_size=batch_size, num_workers=num_workers, shuffle=True)
    return data_loader


def train_target_model(ndarray_path, trainset_size, net, num_epochs, criterion, optimizer, device, dataset_path):
    """
    It is a function to train the target model.
    :param ndarray_path: path to the saved normalized batch array
    :param trainset_size: number of training samples to use
    :param net: neural network
    :param num_epochs: number of epochs
    :param criterion: loss function
    :param optimizer: optimizer
    :param device: device (cuda or cpu)
    :param dataset_path: dataset path (not used inside this function)
    """
    # 1. Load norm_all_batch_data array
    norm_all_data_array = np.load(ndarray_path, allow_pickle=True)     # (5, 10000, 3)
    concatenate_norm_array = np.concatenate(norm_all_data_array)       # (50000, 3)

    # 2. Extract train set array
    trainset_array = concatenate_norm_array[:trainset_size, :]      # (trainset_size, 3)
    # print(trainset_array)

    X = trainset_array[:, 0]            # Data, shape = (trainset_size, )
    y = trainset_array[:, 1]            # Label, shape = (trainset_size, )

    # 3. ndarray -> tensor
    X_tensor = ndarray_to_tensor(ndarray=X)
    y_tensor = ndarray_to_tensor(ndarray=y)

    # 4. tensor -> dataset
    target_train_dataset = form_dataset(X_tensor=X_tensor, y_tensor=y_tensor)

    # 5. dataset -> data loader
    target_train_loader = form_dataloader(tensor_dataset=target_train_dataset,
                                          batch_size=batch_size,
                                          num_workers=num_workers)
    # 6. train
    train(net=net,
          train_loader=target_train_loader,
          num_epochs=num_epochs,
          criterion=criterion,
          optimizer=optimizer,
          device=device)


def main():

    # 1. Define target neural network
    target_net = CNN_CIFAR10()

    # 2. Define loss function
    criterion = nn.CrossEntropyLoss()

    # 3. Define optimizer
    optimizer = optim.Adam(target_net.parameters(), lr=learning_rate)

    # 4. Train target model
    train_target_model(ndarray_path="cifar10/norm_all_batch_data.npy",
                       trainset_size=2500,
                       net=target_net,
                       num_epochs=num_epochs,
                       criterion=criterion,
                       optimizer=optimizer,
                       device=device,
                       dataset_path=dataset_path)

Here is my CNN model, which is similar to LeNet-5:

class CNN_CIFAR10(nn.Module):
    """
    Local Target Model: A standard convolutional neural network.
    - 2 conv layers
    - 2 max-pooling layers
    - 2 FC layers
    - 1 softmax layer
    - Tanh as the activation function
    """
    def __init__(self):
        super(CNN_CIFAR10, self).__init__()

        self.convnet = nn.Sequential(
            # Conv1
            nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5),
            nn.Tanh(),

            # MaxPool1
            nn.MaxPool2d(kernel_size=2, stride=2),

            # Conv2
            nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5),
            nn.Tanh(),

            # MaxPool2
            nn.MaxPool2d(kernel_size=2, stride=2))

        self.fc = nn.Sequential(
            # FC1
            nn.Linear(in_features=16 * 5 * 5, out_features=128),
            nn.Tanh(),

            # FC2
            nn.Linear(in_features=128, out_features=10),
            # Note: nn.CrossEntropyLoss already applies log-softmax internally,
            # so this Softmax is redundant when training with that criterion.
            nn.Softmax(dim=1))

    def forward(self, x):
        output = self.convnet(x)
        output = output.view(output.size(0), -1)
        output = self.fc(output)

        return output

However, when I run the main() function above, I encounter this error: RuntimeError: Given groups=1, weight of size 6 3 5 5, expected input[100, 32, 32, 3] to have 3 channels, but got 32 channels instead

Could somebody help me fix this issue? Thanks in advance!

I think the input should be of size [100, 3, 32, 32], i.e. (batch_size, channels, height, width).
nn.Conv2d with in_channels=3 expects it in this order.
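
To see the difference, here is a small standalone check with just the first conv layer from your model and made-up random data:

import torch
import torch.nn as nn

conv1 = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5)

nchw = torch.randn(100, 3, 32, 32)    # (batch, channels, height, width)
print(conv1(nchw).shape)              # torch.Size([100, 6, 28, 28])

nhwc = torch.randn(100, 32, 32, 3)    # channels last, like your current X_tensor
# conv1(nhwc)  # RuntimeError: expected input to have 3 channels, but got 32 channels instead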

Thanks for your help. I’m new to PyTorch; could you please tell me where to modify my code?

My array X = trainset_array[:, 0] has the shape (2500,), and each element holds the 32x32x3 image info.

When I use torch.Tensor(list(ndarray)), X_tensor.size() is torch.Size([2500, 32, 32, 3]). I wonder how I can turn those 32 channels into the in_channels=3 that the model expects?

If I don’t want to modify the model code, how can I modify the array? Thanks in advance!

I think permute could be used here:
X_tensor = X_tensor.permute(0, 3, 1, 2)
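
Concretely, one place to put it is right after the ndarray -> tensor step in train_target_model, roughly like below. The .long() conversion is an extra assumption on my part: CrossEntropyLoss expects integer class labels, and torch.Tensor gives you floats.

    # 3. ndarray -> tensor
    X_tensor = ndarray_to_tensor(ndarray=X)           # (trainset_size, 32, 32, 3), NHWC
    y_tensor = ndarray_to_tensor(ndarray=y).long()    # integer class labels for CrossEntropyLoss

    # Reorder to NCHW so Conv2d sees 3 input channels
    X_tensor = X_tensor.permute(0, 3, 1, 2).contiguous()   # (trainset_size, 3, 32, 32)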

Thank you so much. Finally solved it by reshaping the tensor.