The input array must be have a shape == (.., ..,[ ..,] 3)), got (500, 333)

Hi everyone! Could you help me with this error?

ValueError: the input array must be have a shape == (.., ..,[ ..,] 3)), got (500, 333)

For some values of the batch size and the number of images in the dataset the training step works, but for other values I get this error… I don’t understand why this is happening…

Another problem I just remembered: when I increase the batch size, this error occurs:

RuntimeError: CUDA out of memory. Tried to allocate 148.00 MiB (GPU 0; 8.00 GiB total capacity; 5.60 GiB already allocated; 140.97 MiB free; 280.24 MiB cached)

In my understanding, my video card has enough memory to use a batch of this size… I don’t understand why the error occurs…

Best regards,

Matheus Santos

For the first question, if you post some sample code, we can help you more. The error is caused by a shape mismatch.
For the second question: when you increase the batch size, your GPU memory is not enough for it, so you need to decrease the batch size. Sometimes the memory is busy with other processes, and rebooting may solve the problem. But in general you need to decrease the batch size or the size of the network…
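
If it helps, here is a minimal diagnostic sketch (assuming a single CUDA device) of how you could check how much GPU memory PyTorch is actually holding before raising the batch size; other processes also use VRAM, so the usable amount is usually less than the card’s 8 GiB:

import torch

if torch.cuda.is_available():
    device = torch.device('cuda:0')
    # Memory occupied by live tensors (MiB)
    allocated = torch.cuda.memory_allocated(device) / 1024**2
    # Memory held by PyTorch's caching allocator (MiB)
    cached = torch.cuda.memory_cached(device) / 1024**2
    # Total capacity of the device (MiB)
    total = torch.cuda.get_device_properties(device).total_memory / 1024**2
    print('allocated: {:.1f} MiB, cached: {:.1f} MiB, total: {:.1f} MiB'.format(allocated, cached, total))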


Certainly! I will post the code of the data loader and the train step here.

In the data loader, the L channel and the a,b channels must be returned for the training step.

#####################################################
################## DATA LOADER #####################
#####################################################

from torchvision import datasets, transforms
from torch.utils.data import Dataset
from skimage.color import rgb2lab, rgb2gray
from skimage import io
import torch.utils.data as data
import torch
import torch.nn as nn
import torch.nn.functional as F
import numpy as np
from PIL import Image
import os

scale_transform = transforms.Compose([
    transforms.Resize((256,256),2),
    #transforms.RandomCrop(224),
    #transforms.ToTensor()
])

class CustomDataset(Dataset):
    """Custom Dataset."""

    def __init__(self, root_dir, transform=None):
        """
        Args:
            root_dir (string): Directory with all the images.
            transform (callable, optional): Optional transform to be applied
                on a sample.
        """
        self.root_dir = root_dir
        self.transform = transform
        self.file_list=os.listdir(root_dir)
        self.tensor_to_PIL = transforms.ToPILImage()

    def __len__(self):
        return len(self.file_list)

    def __getitem__(self, idx):
        img = Image.open(self.root_dir+'/'+self.file_list[idx]) # Read the image
        
        if self.transform is not None:
            img_rgb_resized = transforms.Resize((64,64),2)(img)
            img_rgb_ori = np.array(img)
            
            img_lab_ori = rgb2lab(img_rgb_ori) # Convert to CIE Lab color space
            img_lab_resized = rgb2lab(img_rgb_resized) # Convert to CIE Lab color space
            
            img_rgb_transposed = img_rgb_ori.transpose(2, 0, 1) # (C, H, W)
            img_lab_transposed = img_lab_ori.transpose(2, 0, 1) # (C, H, W)
            img_lab_resized_transposed = img_lab_resized.transpose(2, 0, 1) # (C, H, W)
            
            img_l = (np.round(img_lab_transposed[0,:,:])).astype(np.int) # L channel
            img_l = self.tensor_to_PIL(img_l) # Convert to a PIL image so the following transform can be applied
            img_l_resized = self.transform(img_l)
            img_l_resized = np.array(img_l_resized) # Convert to numpy array
            img_l_resized = torch.from_numpy(img_l_resized) # Convert to torch tensor
            
            img_ab_resized = (np.round(img_lab_resized_transposed[1:3, :, :])).astype(np.int) # (a,b) channels with int intensity values            
            img_ab_resized = np.array(img_ab_resized) # Convert to numpy array            
            img_ab_resized = torch.from_numpy(img_ab_resized) # Convert to torch tensor

            filename = self.root_dir+'/'+self.file_list[idx]
            
            return img_l_resized, img_ab_resized, filename # img_l_resized -> 1x256x256 and img_ab_resized -> 2x64x64
###################################################
################## TRAIN STEP #####################
###################################################
import os
import torch
import argparse
import torch.nn as nn
import torch.nn.functional as F
from torchvision import transforms
import numpy as np
from training_layers import PriorBoostLayer, NNEncLayer, ClassRebalanceMultLayer, NonGrayMaskLayer
from data_loader import TrainImageFolder, CustomDataset
from model import Color_model
np.set_printoptions(threshold=np.inf)

original_transform = transforms.Compose([
    transforms.Resize((256,256),2),    
])


def main(args):
    
    train_set = CustomDataset(args.image_dir, original_transform)

    data_loader = torch.utils.data.DataLoader(train_set, batch_size = args.batch_size, shuffle = True, num_workers = args.num_workers)

    model = Color_model().cuda()
    criterion = nn.CrossEntropyLoss().cuda()
    params = list(model.parameters())
    optimizer = torch.optim.Adam(params, lr = args.learning_rate)

    encode_ab_layer = NNEncLayer()   

    #######################
    ### Train the model ###
    #######################

    total_step = len(data_loader)
    
    for epoch in range(args.num_epochs):
        for i, (images, img_ab, filename) in enumerate(data_loader):
            #print(filename)
            images = images.unsqueeze(1).float().cuda()
            img_ab = img_ab.float()
                        
            encode_ab, max_encode_ab = encode_ab_layer.forward(img_ab)
            encode_ab = torch.from_numpy(encode_ab).long().cuda()

            targets=torch.Tensor(max_encode_ab).long().cuda()
            print(images.shape)
            outputs = model(images)
            
            loss=criterion(outputs,targets)
            
            model.zero_grad()
            
            loss.backward()
            optimizer.step()

            # Print log info
            if i % args.log_step == 0:
                print('Epoch [{}/{}], Step [{}/{}], Loss: {:.4f}'.format(epoch, args.num_epochs, i, total_step, loss.item()))

            # Save the model checkpoints

            if epoch == 4:
                torch.save(model.state_dict(), os.path.join(args.model_path, 'model-{}-{}.ckpt'.format(epoch + 1, i + 1)))

            
            
            

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--image_dir', type = str, default = 'C:\\Users\\Matheus Santos\\Desktop\\Colorization-Epiphqny\\Dataset\\train\\images', help = 'directory for resized images')
    parser.add_argument('--model_path', type = str, default = 'C:\\Users\\Matheus Santos\\Desktop\\Colorization-Epiphqny\\models', help = 'path for saving trained models')
    parser.add_argument('--crop_size', type = int, default = 224, help = 'size for randomly cropping images')
    parser.add_argument('--log_step', type = int, default = 1, help = 'step size for prining log info')
    parser.add_argument('--save_step', type = int, default = 5, help = 'step size for saving trained models')

    # Model parameters
    parser.add_argument('--num_epochs', type = int, default = 5)
    parser.add_argument('--batch_size', type = int, default = 10)
    parser.add_argument('--num_workers', type = int, default = 8)
    parser.add_argument('--learning_rate', type = float, default = 1e-3)
    args = parser.parse_args()
    print(args)
    main(args)

If you have any doubts about the code, please ask me :smiley:

I think the error comes from the rgb2lab call, right? I think you need to pass the image in RGB format, not Lab.

I need the image in the CIE Lab color space because the input of the CNN is the L channel. That is why I convert from RGB to Lab.

Could this problem have another cause?

@Isaac_Kargar
Do you have any suggestions?

Unfortunately, no. That’s what came to my mind because I have seen it before. It’s kind of hard to find the problem without debugging.

Oh I see, that’s ok!
Thanks for the help :smiley:

@ptrblck and @futscdav, do you have any idea what could be causing this issue?

Which line of code is raising the first error?
Could you post the complete stack trace here, please?

Sure! I will also post the code of the model architecture here.

####################################
############ MODEL ################
####################################
import torch
import torch.nn as nn
from torchvision import models

class ScaleLayer(nn.Module):

    def __init__(self, init_value=1e-3):
        super().__init__()
        self.scale = nn.Parameter(torch.FloatTensor([init_value]))

    def forward(self, input):
        return input * self.scale

def weights_init(model):
    if type(model) in [nn.Conv2d, nn.Linear]:
        nn.init.xavier_normal_(model.weight.data)
        nn.init.constant_(model.bias.data, 0.1)

class Color_model(nn.Module):
    def __init__(self):
        super(Color_model, self).__init__()
        self.features = nn.Sequential(
            # conv1
            nn.Conv2d(in_channels = 1, out_channels = 64, kernel_size = 3, stride = 1, padding = 1),
            nn.ReLU(),
            nn.Conv2d(in_channels = 64, out_channels = 64, kernel_size = 3, stride = 2, padding = 1),
            nn.ReLU(),
            nn.BatchNorm2d(num_features = 64),
            # conv2
            nn.Conv2d(in_channels = 64, out_channels = 128, kernel_size = 3, stride = 1, padding = 1),
            nn.ReLU(),
            nn.Conv2d(in_channels = 128, out_channels = 128, kernel_size = 3, stride = 2, padding = 1),
            nn.ReLU(),
            nn.BatchNorm2d(num_features = 128),
            # conv3
            nn.Conv2d(in_channels = 128, out_channels = 256, kernel_size = 3, stride = 1, padding = 1),
            nn.ReLU(),
            nn.Conv2d(in_channels = 256, out_channels = 256, kernel_size = 3, stride = 1, padding = 1),
            nn.ReLU(),
            nn.Conv2d(in_channels = 256, out_channels = 256, kernel_size = 3, stride = 2, padding = 1),
            nn.ReLU(),
            nn.BatchNorm2d(num_features = 256),
            # conv4
            nn.Conv2d(in_channels = 256, out_channels = 512, kernel_size = 3, stride = 1, padding = 1),
            nn.ReLU(),
            nn.Conv2d(in_channels = 512, out_channels = 512, kernel_size = 3, stride = 1, padding = 1),
            nn.ReLU(),
            nn.Conv2d(in_channels = 512, out_channels = 512, kernel_size = 3, stride = 1, padding = 1),
            nn.ReLU(),
            nn.BatchNorm2d(num_features = 512),
            # conv5
            nn.Conv2d(in_channels = 512, out_channels = 512, kernel_size = 3, stride = 1, padding = 2, dilation = 2),
            nn.ReLU(),
            nn.Conv2d(in_channels = 512, out_channels = 512, kernel_size = 3, stride = 1, padding = 2, dilation = 2),
            nn.ReLU(),
            nn.Conv2d(in_channels = 512, out_channels = 512, kernel_size = 3, stride = 1, padding = 2, dilation = 2),
            nn.ReLU(),
            nn.BatchNorm2d(num_features = 512),
            # conv6
            nn.ReLU(),
            nn.Conv2d(in_channels = 512, out_channels = 512, kernel_size = 3, stride = 1, padding = 2, dilation = 2),
            nn.ReLU(),
            nn.Conv2d(in_channels = 512, out_channels = 512, kernel_size = 3, stride = 1, padding = 2, dilation = 2),
            nn.ReLU(),
            nn.Conv2d(in_channels = 512, out_channels = 512, kernel_size = 3, stride = 1, padding = 2, dilation = 2),
            nn.ReLU(),
            nn.BatchNorm2d(num_features = 512),
            # conv7
            nn.Conv2d(in_channels = 512, out_channels = 512, kernel_size = 3, stride = 1, padding = 1, dilation = 1),
            nn.ReLU(),
            nn.Conv2d(in_channels = 512, out_channels = 512, kernel_size = 3, stride = 1, padding = 1, dilation = 1),
            nn.ReLU(),
            nn.Conv2d(in_channels = 512, out_channels = 512, kernel_size = 3, stride = 1, padding = 1, dilation = 1),
            nn.ReLU(),
            nn.BatchNorm2d(num_features = 512),
            # conv8
            nn.ConvTranspose2d(in_channels = 512, out_channels = 256, kernel_size = 4, stride = 2, padding = 1, dilation = 1),
            nn.ReLU(),
            nn.Conv2d(in_channels = 256, out_channels = 256, kernel_size = 3, stride = 1, padding = 1, dilation = 1),
            nn.ReLU(),
            nn.Conv2d(in_channels = 256, out_channels = 256, kernel_size = 3, stride = 1, padding = 1, dilation = 1),
            nn.ReLU(),
            # conv8_313
            nn.Conv2d(in_channels = 256, out_channels = 313, kernel_size = 1, stride = 1,dilation = 1),
            nn.ReLU(),            
            # decoding
            #nn.Conv2d(in_channels = 313, out_channels = 2, kernel_size = 1, stride = 1)
        )
        self.apply(weights_init)

    def forward(self, gray_image):
        features=self.features(gray_image)
        return features

I ran the training with these parameters:

# Model parameters
    parser.add_argument('--num_epochs', type = int, default = 5)
    parser.add_argument('--batch_size', type = int, default = 16)
    parser.add_argument('--num_workers', type = int, default = 8)
    parser.add_argument('--learning_rate', type = float, default = 1e-3)
    args = parser.parse_args()
    print(args)
    main(args)

The error is this one:

Traceback (most recent call last):
  File "c:/Users/Matheus Santos/Desktop/Colorization-Epiphqny/code/train.py", line 102, in <module>
    main(args)
  File "c:/Users/Matheus Santos/Desktop/Colorization-Epiphqny/code/train.py", line 44, in main
    for i, (images, img_ab, filename) in enumerate(data_loader):
  File "C:\Python37\lib\site-packages\torch\utils\data\dataloader.py", line 801, in __next__
    return self._process_data(data)
  File "C:\Python37\lib\site-packages\torch\utils\data\dataloader.py", line 846, in _process_data
    data.reraise()
  File "C:\Python37\lib\site-packages\torch\_utils.py", line 369, in reraise
    raise self.exc_type(msg)
ValueError: Caught ValueError in DataLoader worker process 2.
Original Traceback (most recent call last):
  File "C:\Python37\lib\site-packages\torch\utils\data\_utils\worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\Python37\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Python37\lib\site-packages\torch\utils\data\_utils\fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "c:\Users\Matheus Santos\Desktop\Colorization-Epiphqny\code\data_loader.py", line 44, in __getitem__
    img_lab_ori = rgb2lab(img_rgb_ori) # Convert to CIE Lab color space
  File "C:\Python37\lib\site-packages\skimage\color\colorconv.py", line 1038, in rgb2lab
    return xyz2lab(rgb2xyz(rgb), illuminant, observer)
  File "C:\Python37\lib\site-packages\skimage\color\colorconv.py", line 681, in rgb2xyz
    arr = _prepare_colorarray(rgb).copy()
  File "C:\Python37\lib\site-packages\skimage\color\colorconv.py", line 152, in _prepare_colorarray
    raise ValueError(msg)
ValueError: the input array must be have a shape == (.., ..,[ ..,] 3)), got (375, 500)

Most likely caused by having grayscale images in the dataset. Add .convert('RGB') after loading your PIL image in the dataloader.
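
A minimal sketch of how that change could look in the CustomDataset.__getitem__ posted above (only the image-loading line changes; everything after it stays the same):

from torch.utils.data import Dataset
from PIL import Image

class CustomDataset(Dataset):
    def __getitem__(self, idx):
        # .convert('RGB') forces a 3-channel image, so np.array(img) is always
        # (H, W, 3) and rgb2lab no longer fails on grayscale files, which would
        # otherwise produce an (H, W) array.
        img = Image.open(self.root_dir + '/' + self.file_list[idx]).convert('RGB')
        # ... the rest of __getitem__ exactly as before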


Makes sense! I will try what you suggested and report back here whether it worked.

But the images that I am using are from ImageNet. Is it possible that there are grayscale images among the RGB ones?

Yes, some images in ImageNet are grayscale.

Hey! I tested your solution and it worked!!
Now the training step is running fine.
Thanks!! :smiley:

Though considering the application, it would be strictly better to remove the grayscale images. It probably won’t have a huge effect.
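
If it helps, here is a minimal sketch (untested against your dataset; find_non_rgb_images is just an illustrative helper name) of how the non-RGB files could be listed beforehand using PIL’s img.mode, so you can remove or move them before training:

import os
from PIL import Image

def find_non_rgb_images(root_dir):
    """Return the filenames in root_dir whose PIL mode is not 'RGB'
    (e.g. grayscale 'L' or palette 'P' images)."""
    non_rgb = []
    for name in os.listdir(root_dir):
        path = os.path.join(root_dir, name)
        try:
            with Image.open(path) as img:
                if img.mode != 'RGB':
                    non_rgb.append(name)
        except OSError:
            # Unreadable or non-image file
            non_rgb.append(name)
    return non_rgb

# Example: inspect the list first, delete or move the files only after checking it.
# bad_files = find_non_rgb_images('path/to/train/images')
# print(len(bad_files), 'non-RGB images found')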


Yeah, I will try that later. For now I am testing with all the images and doing the conversion that you suggested, just to see how the CNN performs. Later, when I finalize the dataset, I will remove the grayscale images.