Concatenating images

I am working with WSI (whole-slide) medical images, which are very large (5000 by 4000) and contain useless white regions.
Each WSI image has a label.
I divided each image into several patches and kept only the 16 useful patches that contain texture.
Now I want to combine these 16 patches into a single image.
The labels are in a separate file, keyed by the name of each image.

In that case each “large” image containing these 16 patches would have a single label.
Two of these large images would thus have two labels, so the shapes should be [batch_size=2, channels, 512, 512] for the input tensor and [batch_size=2] for the target tensor.
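
As a rough illustration of these shapes, the sketch below arranges 16 patches into one 4 x 4 grid per slide and stacks two slides into a batch. The patch size of 128 x 128, the row-major layout, the helper name patches_to_grid, and the label values are assumptions made for the example, not details from the thread.

```python
import torch

def patches_to_grid(patches: torch.Tensor, rows: int = 4, cols: int = 4) -> torch.Tensor:
    """Arrange [rows*cols, C, h, w] patches into one [C, rows*h, cols*w] image (row-major)."""
    n, c, h, w = patches.shape
    assert n == rows * cols, "number of patches must equal rows * cols"
    grid = patches.view(rows, cols, c, h, w)     # [rows, cols, C, h, w]
    grid = grid.permute(2, 0, 3, 1, 4)           # [C, rows, h, cols, w]
    return grid.reshape(c, rows * h, cols * w)   # [C, rows*h, cols*w]

# Two slides, each with 16 RGB patches of 128 x 128 (random stand-in data).
slides = [torch.rand(16, 3, 128, 128) for _ in range(2)]
inputs = torch.stack([patches_to_grid(p) for p in slides])  # [2, 3, 512, 512]
targets = torch.tensor([0, 3])                              # [2], hypothetical class indices
print(inputs.shape, targets.shape)
```

Each slide keeps exactly one label, so the target tensor has one entry per large image rather than one per patch.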

Thank you.
I don’t know how to use them.
So the Dataset class should be written so that it yields 16 patches per batch.
Then I use the torch.cat() and view() functions in the epoch loop below?

for epoch in range(n):
    for i, data in enumerate(trainloader):
        ...  # training step goes here

You could use the same approach as e.g. transforms.TenCrop: create the patches in Dataset.__getitem__, so that each batch from the DataLoader is a 5-dimensional tensor [batch_size, nb_patches, channels, height, width], and reshape it in the training loop.
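
A minimal sketch of that pattern, assuming 16 patches of 128 x 128 per slide. The class name PatchDataset, the random stand-in data, and the commented-out model call are hypothetical and only illustrate the shapes.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class PatchDataset(Dataset):
    """Hypothetical dataset: one item = all patches of one slide plus its label."""

    def __init__(self, num_slides=4, num_patches=16, patch_size=128):
        # Random tensors stand in for real patches; the labels are made up.
        self.patches = torch.rand(num_slides, num_patches, 3, patch_size, patch_size)
        self.labels = torch.arange(num_slides) % 2

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        # Per sample: patches [16, C, H, W] and a scalar label.
        return self.patches[idx], self.labels[idx]

loader = DataLoader(PatchDataset(), batch_size=2)
for patches, target in loader:
    b, n, c, h, w = patches.shape        # 5-D batch: [B, 16, C, H, W]
    flat = patches.view(-1, c, h, w)     # fuse batch and patch dims: [B*16, C, H, W]
    # output = model(flat)                          # -> [B*16, num_classes]
    # output = output.view(b, n, -1).mean(dim=1)    # average over patches -> [B, num_classes]
    print(patches.shape, flat.shape, target.shape)
```

Averaging the per-patch outputs is only one option, borrowed from the TenCrop usage pattern; the flattened patches could equally be rearranged into a single large image before the forward pass.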

Thank you very much :pray: :rose:

Hi sir. I apologize for asking again.
If our images have, for example, 5 labels such as 10, 20, 35, 5, 40 or a, b, c, d, e, and we cannot use torchvision.datasets.ImageFolder for classification, how can the data be classified?

The data are images, and they are all in one folder.

You could write a custom Dataset as explained here and add the data loading and target creation into the __getitem__ method.
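
A minimal sketch of such a Dataset, assuming the labels have already been read into a dictionary that maps file names to class indices. The class name FolderWithLabelsDataset, the file names, and the tiny generated images are all made up for the demonstration.

```python
import os
import tempfile

import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset

class FolderWithLabelsDataset(Dataset):
    """Images in one folder; labels supplied as {file_name: class_index}."""

    def __init__(self, image_dir, labels, transform=None):
        self.image_dir = image_dir
        self.items = sorted(labels.items())   # list of (file_name, label)
        self.transform = transform

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        file_name, label = self.items[idx]
        image = Image.open(os.path.join(self.image_dir, file_name)).convert("RGB")
        if self.transform is not None:
            image = self.transform(image)
        else:
            # HWC uint8 -> CHW float in [0, 1]
            image = torch.from_numpy(np.array(image)).permute(2, 0, 1).float() / 255
        return image, torch.tensor(label)

# Demo with two tiny generated images in a temporary folder.
tmp_dir = tempfile.mkdtemp()
for name in ("a.png", "b.png"):
    Image.fromarray(np.zeros((8, 8, 3), dtype=np.uint8)).save(os.path.join(tmp_dir, name))

dataset = FolderWithLabelsDataset(tmp_dir, {"a.png": 0, "b.png": 4})
image, label = dataset[1]
print(image.shape, label)
```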

Thanks a lot

Should we map the labels to the numbers 0, 1, 2, 3, 4 for training the neural network, or is that not needed? And can our Dataset return the original labels unchanged?

If you want to use e.g. nn.CrossEntropyLoss or nn.NLLLoss, you should assign the labels to the range [0, nb_classes-1].

Yes, you can implement the target creation logic into the Dataset and make sure they are returned in the right range.
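
For illustration, one simple way to build such a mapping from the example labels 10, 20, 35, 5, 40 is sketched below. The variable names are arbitrary, and sorting the classes is just one possible convention.

```python
# Raw labels as they might appear in the label file (example values from the question).
raw_labels = [10, 20, 35, 5, 40, 10, 5]

classes = sorted(set(raw_labels))                      # [5, 10, 20, 35, 40]
class_to_idx = {c: i for i, c in enumerate(classes)}   # raw label -> index in [0, nb_classes-1]
idx_to_class = {i: c for c, i in class_to_idx.items()} # reverse lookup for reporting

targets = [class_to_idx[label] for label in raw_labels]
print(class_to_idx)  # {5: 0, 10: 1, 20: 2, 35: 3, 40: 4}
print(targets)       # [1, 2, 3, 0, 4, 1, 0]
```

The same dictionary works for string labels such as a, b, c, d, e, and the lookup can be done inside __getitem__ so the Dataset returns indices directly.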

Thank you, excellent.
But at the beginning of training, I get this error when a 512 by 512 color image enters the model. I do not know where my problem is.
RuntimeError: Given groups=1, weight of size [64, 3, 7, 7], expected input[1, 512, 512, 3] to have 3 channels, but got 512 channels instead

The nn.Conv2d layer expects the input in the channels-first (NCHW) format, while you are passing it as channels-last (NHWC).
Use x = x.permute(0, 3, 1, 2) to transform it to the right shape before passing it to the model.
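
A small self-contained check of this fix is sketched below. The convolution is built to match the weight shape [64, 3, 7, 7] from the error message; its stride and padding are assumptions, since the thread does not show the model.

```python
import torch
import torch.nn as nn

# Convolution with weight shape [64, 3, 7, 7]; stride and padding are assumed values.
conv = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3)

x = torch.rand(1, 512, 512, 3)   # channels-last (NHWC), as in the error message
x = x.permute(0, 3, 1, 2)        # channels-first (NCHW): [1, 3, 512, 512]
out = conv(x)
print(x.shape, out.shape)
```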

Oh. Yes
you are very good

Hi sir,
Excuse me, when I run the network without torchvision.transforms there is no problem, but when I run it with torchvision.transforms, I get this error:

TypeError: Caught TypeError in DataLoader worker process 0.
  File "/opt/conda/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 244, in __call__
    return F.resize(img, self.size, self.interpolation)
  File "/opt/conda/lib/python3.7/site-packages/torchvision/transforms/functional.py", line 319, in resize
    raise TypeError('img should be PIL Image. Got {}'.format(type(img)))
TypeError: img should be PIL Image. Got <class 'numpy.ndarray'>

I do not know how to solve this problem.

Part of my Dataset code:

def __getitem__(self, idx):
        file_name = self.df['image_id'].values[idx]
        file_path = f'../input/prostate-cancer-grade-assessment/train_images/{file_name}.tiff'
        images = skimage.io.MultiImage(file_path)[-1]
        images = tile(images)
        images = cv2.hconcat([cv2.vconcat([images[0], images[1], images[2], images[3]]), 
                             cv2.vconcat([images[4], images[5], images[6], images[7]]), 
                             cv2.vconcat([images[8], images[9], images[10], images[11]]), 
                             cv2.vconcat([images[12], images[13], images[14], images[15]])])

        images = cv2.cvtColor(images, cv2.COLOR_BGR2RGB)
        
        if self.transform is not None:
            #images = images.numpy() # Error !
            images = self.transform(images)
         

I think the problem is that images is an array, but it belongs to the tensor class, and I have to convert it to the numpy class. But I do not know how!

Thanks so much for the guide

You would have to provide PIL.Images or PyTorch tensors to the transformation instead of numpy arrays as the error suggests.
Based on your code snippet you could transform the OpenCV image (numpy array) to a PIL.Image via the Image.fromarray method.
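
A minimal sketch of that conversion, using a synthetic uint8 array as a stand-in for the concatenated tiles (the array contents are made up):

```python
import numpy as np
from PIL import Image

# Stand-in for the concatenated tiles: an H x W x 3 uint8 array (all red here).
array = np.zeros((512, 512, 3), dtype=np.uint8)
array[..., 0] = 255

pil_image = Image.fromarray(array)   # numpy array -> PIL.Image in RGB mode
print(type(pil_image), pil_image.size, pil_image.mode)
```

The resulting pil_image can then be passed to self.transform in place of the numpy array.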

Thanks a lot. It worked properly.
Excuse me, the results of my 10 epochs are as follows. Toward the end, the accuracy increases only slowly.

### status fold: 1 ###
   train_loss  valid_loss   Accuracy  kappa_score        time
0    1.452028    1.269394   49.54785     0.664385  419.344125
1    1.319205    1.213655   52.52449     0.680870  400.685343
2    1.105934    1.054436  58.251698     0.750475  399.620671
3    1.040463    1.048039  59.118313     0.750323  401.917301
4    0.998244    1.041330  59.419746     0.766310  401.042848
5    0.973565    1.027888  59.495102     0.764129  400.287688
6    0.945639    1.029338  59.570457     0.763214  404.441619
7    0.924936    1.032565   59.60814     0.766563  399.797308
8    0.903779    1.030024   59.79653     0.768021  400.552388
9    0.877134    1.030672  59.834213     0.769995  404.820972

If I send you my optimizer, lr_scheduler, and criterion, can you guess what the problem is?
Fold 2 is as follows:

1 - avg_train_loss: 0.9630  avg_val_loss:  0.7651 Accurecy:  71.14  kappa_score:  0.8399  
2 - avg_train_loss: 1.1018  avg_val_loss:  1.1973 Accurecy:  55.43  kappa_score:  0.6985
3 - avg_train_loss: 0.9377  avg_val_loss:  0.8634 Accurecy:  65.79  kappa_score:  0.8004
4 - avg_train_loss: 0.8576  avg_val_loss:  0.8488 Accurecy:  66.69 kappa_score:  0.8219
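5 - avg_train_loss: 0.8269  avg_val_loss:  0.8343 Accurecy:  67.37 kappa_score:  0.8201

import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
# GradualWarmupScheduler is not part of PyTorch; it comes from a separate warmup package.

# Hyperparameters (defined before they are used below)
epochs = 10
init_lr = 3e-4
warmup_factor = 10
warmup_epo = 1
batch_size = 12

criterion = nn.CrossEntropyLoss()
# Start at init_lr / warmup_factor; the warmup scheduler scales it up to init_lr.
optimizer = optim.Adam(transfer_model.parameters(), lr=init_lr / warmup_factor)
scheduler_cosine = lr_scheduler.CosineAnnealingLR(optimizer, epochs - warmup_epo)
scheduler = GradualWarmupScheduler(optimizer, multiplier=warmup_factor,
                                   total_epoch=warmup_epo, after_scheduler=scheduler_cosine)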

Guessing is a bit hard in this case, but I would try to increase the initial learning rate by e.g. a factor of 10 and check whether it benefits the training.
Is the accuracy still increasing after 10 epochs (but slowly) or is it stuck after a while?

Increase the initial learning rate? Does that mean setting the initial learning rate to 3e-3?

Results up to epoch 14:

Epoch 1 - avg_train_loss: 0.4012  avg_val_loss: 0.3353 Accurecy:  50.98
Epoch 2 - avg_train_loss: 0.3532  avg_val_loss: 0.3312 Accurecy:  52.79
Epoch 3 - avg_train_loss: 0.3066  avg_val_loss: 0.2902 Accurecy:  58.48
Epoch 4 - avg_train_loss: 0.2915  avg_val_loss: 0.2888 Accurecy:  58.55
Epoch 5 - avg_train_loss: 0.2856  avg_val_loss: 0.2853 Accurecy:  59.31
Epoch 6 - avg_train_loss: 0.2777  avg_val_loss: 0.2810 Accurecy:  60.40
Epoch 7 - avg_train_loss: 0.2685  avg_val_loss: 0.2795 Accurecy:  60.55
Epoch 8 - avg_train_loss: 0.2631  avg_val_loss: 0.2814 Accurecy:  61.15
Epoch 9 - avg_train_loss: 0.2550  avg_val_loss: 0.2835 Accurecy:  60.85
Epoch 10 - avg_train_loss: 0.2473  avg_val_loss: 0.2869 Accurecy:  61.08
Epoch 11 - avg_train_loss: 0.2448  avg_val_loss: 0.2835 Accurecy:  61.23
Epoch 12 - avg_train_loss: 0.2390  avg_val_loss: 0.2847 Accurecy:  61.72
Epoch 13 - avg_train_loss: 0.2357  avg_val_loss: 0.2852 Accurecy:  61.15
Epoch 14 - avg_train_loss: 0.2351  avg_val_loss: 0.2855 Accurecy:  61.42

But when the next fold starts and the learning rate is re-initialized, there is a sudden jump in accuracy, which then decreases again.
For example, in fold 3 (with the learning rate shown for each epoch):

lr:  0.0000300
Epoch 1 - avg_train_loss: 0.2352  avg_val_loss: 0.1702 Accurecy:  78.45  

lr:  0.0003000
Epoch 2 - avg_train_loss: 0.2709  avg_val_loss: 0.2642 Accurecy:  65.67  

lr:  0.0000300
Epoch 3 - avg_train_loss: 0.2273  avg_val_loss: 0.1938 Accurecy:  75.06  

lr:  0.0000256
Epoch 4 - avg_train_loss: 0.2083  avg_val_loss: 0.1891 Accurecy:  76.11  

lr:  0.0000150
Epoch 5 - avg_train_loss: 0.1952  avg_val_loss: 0.1829 Accurecy:  76.71

In my opinion, what happens at the start of each fold (re-initialization of the learning rate) should instead be done inside the epoch loop, for example after every 5 epochs (the total number of epochs in each fold is 15). But I do not know how.
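
One possible way to get such periodic restarts, offered only as a sketch and not as something proposed in this thread, is PyTorch's built-in lr_scheduler.CosineAnnealingWarmRestarts with T_0=5, which resets the learning rate to its initial value every 5 epochs. The placeholder model, the learning rate of 3e-4, and the omission of the warmup phase are assumptions made to keep the example short.

```python
from torch import nn, optim
from torch.optim import lr_scheduler

model = nn.Linear(4, 2)   # placeholder model, stands in for the real network
optimizer = optim.Adam(model.parameters(), lr=3e-4)
# Cosine decay that restarts from the initial learning rate every T_0 = 5 epochs.
scheduler = lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=5, eta_min=0.0)

lrs = []
for epoch in range(15):
    lrs.append(optimizer.param_groups[0]["lr"])
    # ... train and validate for one epoch here ...
    optimizer.step()      # stands in for the real optimizer steps of the epoch
    scheduler.step()      # advance the schedule once per epoch

print(["%.6f" % lr for lr in lrs])
```

With 15 epochs per fold, this gives three cosine cycles of 5 epochs each; whether restarts actually help the validation accuracy would have to be checked experimentally.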

Hello sir, good day.
Excuse me, why does the EfficientNet input not need to be normalized?
Thank you for your help.