Dimensionality mismatch help

Hey Guys,

I used a timm pre-trained EfficientNet and trained it on my dataset. My test set has around 281 samples, with both class 0 and class 1 included.

Here are my model params:

<bound method Module.parameters of EfficientNet(
  (conv_stem): Conv2d(3, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
  (bn1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (act1): SiLU(inplace=True)
  (blocks): Sequential(
    (0): Sequential(
      (0): EdgeResidual(
        (conv_exp): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (act1): SiLU(inplace=True)
        (se): Identity()
        (conv_pwl): Conv2d(32, 32, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (bn2): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
     ................... You can see the full model architecture here, as I am reaching the character limit for the post:
https://colab.research.google.com/drive/1rWaRA-jErgxecbD3jiXTeiHpzLafsiwE#scrollTo=xW-RaHSEnYMT
)>

I want to take the samples with the best predictions in terms of accuracy and apply LIME to them. So I want to produce an individual prediction for each file, then take the files with the highest accuracy and run LIME on them.
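For context, the LIME step I have in mind would look roughly like this (a hedged sketch using the lime package; batch_predict, the preprocessing, and the scaling comment are assumptions rather than code from my notebook; image is assumed to be one of the test images opened with PIL and device the CUDA device):

import numpy as np
import torch
import torch.nn.functional as F
from lime import lime_image

def batch_predict(images):
  # LIME passes perturbed copies as a numpy array of shape (N, H, W, 3);
  # may need to divide by 255 here to match the ToTensor() scaling used in training
  batch = torch.from_numpy(images).permute(0, 3, 1, 2).float().to(device)
  with torch.no_grad():
    logits = efficientnetv2_model(batch)
  return F.softmax(logits, dim=1).cpu().numpy()

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(np.array(image.convert("RGB")), batch_predict,
                                         top_labels=1, num_samples=1000)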

But when I try to produce the individual predictions, I get a dimensionality error. Is there a workaround to get past this?

Here is the code I am getting the error on :slight_smile:

from PIL import Image

samples = testloader.dataset.samples
for i in range(len(samples)):
  sample_fname, _ = samples[i]
  print(sample_fname)
  image = Image.open(sample_fname)
  model = efficientnetv2_model
  transform = transforms.Compose([transforms.ToTensor()])
  tensor = transform(image)   # shape: C x H x W (C = 1 for a grayscale image)
  print(tensor.shape)
  logits = model(tensor)

This is the error (my images are black-and-white):

---> 12 logits = model(tensor)
RuntimeError: Given groups=1, weight of size [32, 3, 3, 3], expected input[1, 1, 128, 170] to have 3 channels, but got 1 channels instead
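For context, the mismatch comes from the stem convolution, which expects 3 input channels (its weight is [32, 3, 3, 3]). A minimal sketch (hypothetical, not from my notebook) that reproduces the same error:

import torch
import torch.nn as nn

conv_stem = nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1, bias=False)  # same shape as the stem above
gray = torch.rand(1, 1, 128, 170)  # single-channel input, like ToTensor() on a grayscale image
conv_stem(gray)                    # raises the same "expected ... to have 3 channels" RuntimeError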


Might be wrong, but the fact that they are black and white makes me suspect they are single-channel images, which is what the "got 1 channels" message would seem to imply as well. Since you're using a pretrained model that expects 3 channels, you could try either padding or expanding the tensor. You can try something like img_tensor = img_tensor.expand(-1, 3, -1, -1).

Hey, thanks for the reply. What I don't understand is that the images have always been black and white. Am I missing something?

This is the original training file:

https://github.com/mvadrev/coviScan/blob/main/An_accelerated_COVID19_diagnosis_using_interepretable_deep_learning.ipynb

Looks like the Colab is using ImageFolder to load the training data, and a search on Stack Overflow suggests that ImageFolder automatically converts grayscale images to RGB, which would explain why loading the files yourself only gives a single channel.
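For reference, torchvision's default image loader does roughly the following (a paraphrased sketch of pil_loader, not the exact source):

from PIL import Image

def pil_loader(path):
  # ImageFolder's default loader opens the file and converts it to RGB,
  # which is why batches coming out of the DataLoader already have 3 channels.
  with open(path, "rb") as f:
    img = Image.open(f)
    return img.convert("RGB")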

Gotcha, so you're saying ImageFolder auto-converts grayscale to RGB by default. I will try to use ImageFolder. Thanks.

While that would probably work, you can just add expand as in the following, which will be easier.

from PIL import Image

samples = testloader.dataset.samples
for i in range(len(samples)):
  sample_fname, _ = samples[i]
  print(sample_fname)
  image = Image.open(sample_fname)
  model = efficientnetv2_model
  transform = transforms.Compose([transforms.ToTensor()])
  tensor = transform(image).expand(-1, 3, -1, -1)
  print(tensor.shape)
  logits = model(tensor)

You might just need .expand(3, -1, -1); not sure if that would have a batch dimension or not.

I think this may work, correct me if I am wrong. The idea I had was to simply duplicate the same sample across a batch of 32 so that I keep all the conditions during testing the same as during training. So I should probably have the same tensor repeated over a batch of 32.

Yup. I haven't used transforms.Compose before, but it looks like transforms.ToTensor() will return a tensor of shape C x H x W. So you'd want to use tensor = transform(image).unsqueeze(0).expand(32, 3, -1, -1) for the situation you're describing, if I'm understanding you correctly.
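To illustrate the shapes (a minimal sketch; the 128 x 170 size is just taken from the error message above):

import torch

gray = torch.rand(1, 128, 170)            # ToTensor() on a grayscale image -> C x H x W with C = 1
batched = gray.unsqueeze(0)               # 1 x 1 x 128 x 170 (adds the batch dimension)
expanded = batched.expand(32, 3, -1, -1)  # 32 x 3 x 128 x 170 (repeats the singleton dims without copying)
print(expanded.shape)                     # torch.Size([32, 3, 128, 170])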

Thanks for your help. Here is what I did, but now I am getting a new error:

from PIL import Image

samples = testloader.dataset.samples
for i in range(len(samples)):
  sample_fname, _ = samples[i]
  print(sample_fname)
  image = Image.open(sample_fname)
  model = efficientnetv2_model
  transform = transforms.Compose([transforms.ToTensor()])
  tensor = transform(image).unsqueeze(0).expand(32, 3, -1, -1)
  print(tensor.shape)
  logits = model(tensor)

---> 12 logits = model(tensor)

5 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight, bias)
    442                             _pair(0), self.dilation, self.groups)
    443         return F.conv2d(input, weight, bias, self.stride,
--> 444                         self.padding, self.dilation, self.groups)
    445
    446     def forward(self, input: Tensor) -> Tensor:

RuntimeError: Input type (torch.FloatTensor) and weight type (torch.cuda.FloatTensor) should be the same or input should be a MKLDNN tensor and weight is a dense tensor


So, what I need is to find the accuracy of a single sample instead of a batch, so that I can pick the input samples with the best accuracy for further analysis.

Is there any way we can run the accuracy function on a single sample? One idea I have is to create a folder for each image, copy the same image into the folder 32 times, and then run the accuracy function.

Any other ideas?

It works. I forgot to move the tensor to the GPU :stuck_out_tongue:
Thanks for your help @carperbr, you saved my day.
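For completeness, the fix was just moving the input onto the same device as the model weights (a sketch; device is assumed to be the CUDA device the model was trained on):

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tensor = transform(image).unsqueeze(0).expand(32, 3, -1, -1).to(device)  # input now lives on the GPU
logits = model(tensor)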

OK, there is one more thing I wanted to ask. I am running my accuracy tests for each file and for each class separately, so the labels will be the same for every run, i.e. since I am running the code on the positive folder only, all labels are positive.

But here is the issue when I run the following:

from PIL import Image

samples = testloader.dataset.samples
for i in range(len(samples)):
  sample_fname, _ = samples[i]
  print(sample_fname)
  image = Image.open(sample_fname)
  model = efficientnetv2_model
  transform = transforms.Compose([transforms.ToTensor()])
  tensor = transform(image).unsqueeze(0).expand(32, 3, -1, -1).to(device)
  print(tensor.shape)
  logits = model(tensor)
  y_pred = F.softmax(logits, dim=1)
  top_p, top_class = y_pred.topk(1, dim=1)
  print("The top", top_class, top_p)
  loss = criterion(logits, labels)   # labels is still a plain Python list here
  print("The loss is", loss.item())
  print("The accuracy is", accuracy(logits, labels))

Here is the error I am getting

--> 16   loss = criterion(logits,labels)
     17   print("The loss is", loss.item())
     18   print("The accuracy is", accuracy(logits,labels))

2 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction, label_smoothing)
   2994     if size_average is not None or reduce is not None:
   2995         reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 2996     return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)
   2997 
   2998 

TypeError: cross_entropy_loss(): argument 'target' (position 2) must be Tensor, not list

It looks like you need to convert your labels object into a tensor; from the message, it appears that it is a list currently.
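A minimal sketch of that conversion (assuming labels is a Python list of integer class indices):

# cross_entropy expects a LongTensor of class indices on the same device as the logits
labels = torch.tensor(labels, dtype=torch.long, device=logits.device)
loss = criterion(logits, labels)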

Thanks, I got it working, but I have a question. I am now able to do the following and send each image through one by one. I used your suggestion to unsqueeze the tensor for the image. But how do we do the dimensionality correction for the label?

In my code below, if I pass newLabels instead of labels, the accuracy function works. But I need to pass labels[i], because I need to use only the correct label for that image. Any idea how I can handle this?

# checking all labels; np.zeros of 32 (as below) is not correct
for j, (images, labels) in enumerate(testloader):
  print("Iterating file", j)
  print("The len of labels", len(labels))
  images = images.to(device)
  print(labels)
  labels = labels.to(device)

  for i in range(images.shape[0]):
    sample_fname, _ = testloader.dataset.samples[i]
    # Forward pass - compute outputs on input data using the model
    logits = efficientnetv2_model(images[i].unsqueeze(0).expand(32, 3, -1, -1).to(device))

    y_pred = F.softmax(logits, dim=1)
    top_p, top_class = y_pred.topk(1, dim=1)
    y_pred_list.append(top_class.data.to("cpu").tolist())

    print("labels ********", labels[i])
    newLabels = torch.tensor([1] * 32).to(device)   # hard-coded: every sample in this run is class 1
    loss = criterion(logits, newLabels)
    print("The loss is", loss.item())
    print("The accuracy is", accuracy(logits, newLabels))
  print("####### End of batch #######")

I ended up doing this. I am using np.full to fill all 32 values with the same class. But my file names are not matching: I need to pick the files with the highest accuracy, but it is printing them incorrectly:

# checking all labels; np.zeros of 32 (as below) is not correct
for j, (images, labels) in enumerate(testloader):
  print("Iterating file", j)
  print("The len of labels", len(labels))
  images = images.to(device)
  print("labels", labels.shape)
  labels = labels.to(device)

  for i in range(images.shape[0]):
    sample_fname, _ = testloader.dataset.samples[j]
    print(sample_fname)
    sample_fname, _ = testloader.dataset.samples[i]
    # Forward pass - compute outputs on input data using the model
    logits = efficientnetv2_model(images[i].unsqueeze(0).expand(32, 3, -1, -1).to(device))

    y_pred = F.softmax(logits, dim=1)
    top_p, top_class = y_pred.topk(1, dim=1)
    y_pred_list.append(top_class.data.to("cpu").tolist())

    print("labels ********", labels[i].cpu().detach().numpy())
    # replaces the previous hard-coded torch.tensor([1] * 32)
    newlabels = torch.tensor(np.full(32, labels[i].cpu().detach().numpy())).to(device)
    loss = criterion(logits, newlabels)
    print("The loss is", loss.item())
    print("The accuracy is", accuracy(logits, newlabels))
  print("####### End of batch #######")

** Closing this as the issue is no longer about dimensionality **