This is for WSI (whole-slide image) medical images, which are very large (5000 by 4000) and also contain useless white regions.
Each WSI image has a label.
I divided these images into several patches and kept only the 16 useful patches that have texture.
Now I want to convert these 16 patches into a single image.
The labels are in a separate file, together with the name of each image.
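As a rough illustration of combining 16 patches into one image, here is a minimal NumPy sketch. The helper name `patches_to_grid`, the 128 by 128 patch size, and the row-major layout are assumptions made for this example, not something stated in the post.

```python
import numpy as np

def patches_to_grid(patches, rows=4, cols=4):
    """Arrange rows*cols equally sized HxWxC patches into one image (row-major)."""
    if len(patches) != rows * cols:
        raise ValueError(f"expected {rows * cols} patches, got {len(patches)}")
    # Concatenate each group of `cols` patches side by side, then stack the rows.
    grid_rows = [
        np.concatenate(patches[r * cols:(r + 1) * cols], axis=1)
        for r in range(rows)
    ]
    return np.concatenate(grid_rows, axis=0)

# Dummy patches: patch i is filled with the value i (assumed size 128x128x3).
patches = [np.full((128, 128, 3), i, dtype=np.uint8) for i in range(16)]
image = patches_to_grid(patches)
print(image.shape)  # (512, 512, 3)
```

The label of the original slide can then be attached to this single 512 by 512 image.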
In that case each “large” image containing these 16 patches would have a single label.
Two of these large images would thus have two labels, so the shapes should be [batch_size=2, channels, 512, 512] for the input tensor and [batch_size=2] for the target tensor.
Thank you.
I don’t know how to use them.
So the Dataset class should be written so that it gives 16 pictures per batch, and then I use the torch.cat() and view() functions in the epoch loop below?
for epoch in range(n):
    for i, data in enumerate(trainloader):
        …
You could use the same approach as e.g. transforms.TenCrop: create the patches in Dataset.__getitem__ and return them stacked in one tensor, so that each batch from the DataLoader is a 5-dimensional tensor containing the patches, and reshape it in the training loop.
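A minimal sketch of this TenCrop-style pattern could look as follows. The class name `PatchDataset`, the random data, the patch size, and the tiny model are all placeholders invented for the example; averaging the predictions over the patches is one possible way to get a single prediction per image.

```python
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

class PatchDataset(Dataset):
    """Hypothetical dataset: each sample is a stack of patches with one label."""
    def __init__(self, n_images=4, n_patches=16, patch_size=32):
        # Random tensors stand in for the real patches and labels.
        self.data = torch.randn(n_images, n_patches, 3, patch_size, patch_size)
        self.targets = torch.randint(0, 5, (n_images,))

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        # Shape [n_patches, C, H, W]; the DataLoader adds the batch dimension.
        return self.data[idx], self.targets[idx]

# Placeholder model with 5 output classes.
model = nn.Sequential(nn.Conv2d(3, 4, 3), nn.AdaptiveAvgPool2d(1),
                      nn.Flatten(), nn.Linear(4, 5))
loader = DataLoader(PatchDataset(), batch_size=2)

for patches, target in loader:
    bs, n_patches, c, h, w = patches.shape       # 5-D batch
    flat = patches.view(-1, c, h, w)             # fold patches into the batch dim
    output = model(flat)                         # [bs * n_patches, 5]
    output = output.view(bs, n_patches, -1).mean(1)  # one prediction per image
```

After the reshape, `output` has one row per image and can be passed to the loss together with `target`.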
Thank you very much
Hi sir, sorry to bother you again.
If our images have, for example, 5 labels such as 10, 20, 35, 5, 40 or a, b, c, d, e, and we cannot use torchvision.datasets.ImageFolder for classification, how can the data be classified?
The data are images, all in one folder.
You could write a custom Dataset as explained here and add the data loading and target creation into the __getitem__ method.
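A custom Dataset for images stored in a single folder might be sketched like this. The class name `FolderWithLabels`, the dictionary of file names to labels, and the temporary demo images are assumptions for illustration; in practice the mapping would be read from the separate label file.

```python
import os
import tempfile

import numpy as np
from PIL import Image
from torch.utils.data import Dataset

class FolderWithLabels(Dataset):
    """Hypothetical dataset: images in one folder, labels given as {file name: label}."""
    def __init__(self, root, labels, transform=None):
        self.root = root
        self.items = sorted(labels.items())  # list of (file name, label) pairs
        self.transform = transform

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        name, label = self.items[idx]
        with Image.open(os.path.join(self.root, name)) as f:
            img = f.convert("RGB")           # loads the pixel data
        if self.transform is not None:
            img = self.transform(img)
        return img, label

# Demo with two tiny dummy images written to a temporary folder.
with tempfile.TemporaryDirectory() as root:
    labels = {"a.png": 10, "b.png": 35}
    for name in labels:
        Image.fromarray(np.zeros((8, 8, 3), dtype=np.uint8)).save(os.path.join(root, name))
    dataset = FolderWithLabels(root, labels)
    n = len(dataset)
    img, label = dataset[1]
    size = img.size
```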
Thanks a lot
Should we map the labels to the numbers 0, 1, 2, 3, 4, or is that not needed? And can our Dataset return the original labels as they are for training the neural network?
If you want to use e.g. nn.CrossEntropyLoss or nn.NLLLoss, you should assign the labels to the range [0, nb_classes-1].
Yes, you can implement the target creation logic into the Dataset and make sure they are returned in the right range.
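One simple way to map arbitrary labels into the range [0, nb_classes-1] is a lookup dictionary built from the sorted unique labels. The sample labels and the random logits below are made up for the example.

```python
import torch
import torch.nn as nn

raw_labels = [10, 20, 35, 5, 40, 20, 5]           # example labels from the question
classes = sorted(set(raw_labels))                  # [5, 10, 20, 35, 40]
class_to_idx = {c: i for i, c in enumerate(classes)}
targets = [class_to_idx[c] for c in raw_labels]
print(targets)  # [1, 2, 3, 0, 4, 2, 0]

# The mapped targets can be used directly with nn.CrossEntropyLoss.
target = torch.tensor(targets)                     # dtype torch.int64
logits = torch.randn(len(targets), len(classes))   # stand-in for model outputs
loss = nn.CrossEntropyLoss()(logits, target)
```

The same dictionary works for string labels such as a, b, c, d, e, and the lookup can be done inside __getitem__.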
Thank you, excellent.
But at the beginning of training, I get this error when a 512 by 512 color image enters the model. I do not know where my problem is.
RuntimeError: Given groups=1, weight of size [64, 3, 7, 7], expected input[1, 512, 512, 3] to have 3 channels, but got 512 channels instead
The nn.Conv2d layer expects the input in the channels-first (NCHW) format, while you are passing it as channels-last (NHWC).
Use x = x.permute(0, 3, 1, 2) to transform it to the right shape before passing it to the model.
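The effect of the permute can be checked on a dummy tensor. The convolution below matches the weight shape [64, 3, 7, 7] from the error message; its stride and padding are assumed values chosen for the example.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 512, 512, 3)   # channels-last (NHWC), as in the error message
conv = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3)  # weight [64, 3, 7, 7]

x = x.permute(0, 3, 1, 2)         # -> channels-first (NCHW)
print(x.shape)                    # torch.Size([1, 3, 512, 512])

out = conv(x)
print(out.shape)                  # torch.Size([1, 64, 256, 256])
```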
Oh. Yes
you are very good
Hi sir, excuse me. When I run the network without torchvision.transforms there is no problem, but when I run it with torchvision.transforms, I get this error:
TypeError: Caught TypeError in DataLoader worker process 0.
  File "/opt/conda/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 244, in __call__
    return F.resize(img, self.size, self.interpolation)
  File "/opt/conda/lib/python3.7/site-packages/torchvision/transforms/functional.py", line 319, in resize
    raise TypeError('img should be PIL Image. Got {}'.format(type(img)))
TypeError: img should be PIL Image. Got <class 'numpy.ndarray'>
I do not know how to solve this problem.
Part of my Dataset code:
def __getitem__(self, idx):
    file_name = self.df['image_id'].values[idx]
    file_path = f'../input/prostate-cancer-grade-assessment/train_images/{file_name}.tiff'
    images = skimage.io.MultiImage(file_path)[-1]
    images = tile(images)
    images = cv2.hconcat([cv2.vconcat([images[0], images[1], images[2], images[3]]),
                          cv2.vconcat([images[4], images[5], images[6], images[7]]),
                          cv2.vconcat([images[8], images[9], images[10], images[11]]),
                          cv2.vconcat([images[12], images[13], images[14], images[15]])])
    images = cv2.cvtColor(images, cv2.COLOR_BGR2RGB)
    if self.transform is not None:
        #images = images.numpy() # Error !
        images = self.transform(images)
I think the problem is that images is an array, but of the tensor class, and I have to convert it to the numpy class. But I do not know how!
Thanks so much for the guide
You would have to provide PIL.Images or PyTorch tensors to the transformation instead of numpy arrays, as the error suggests.
Based on your code snippet you could transform the OpenCV image (numpy array) to a PIL.Image via the Image.fromarray method.
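A small sketch of this conversion, using a random uint8 array in place of the concatenated OpenCV image (the array contents and the resize target are assumptions for the example):

```python
import numpy as np
from PIL import Image

# Stand-in for the concatenated image: height x width x channels, dtype uint8.
array = np.random.randint(0, 256, size=(512, 512, 3), dtype=np.uint8)

pil_img = Image.fromarray(array)      # numpy array -> PIL.Image
print(pil_img.size, pil_img.mode)     # (512, 512) RGB

# PIL-based operations (and transforms that expect a PIL image) now accept it.
resized = pil_img.resize((256, 256))
```

In the Dataset above this would mean calling Image.fromarray(images) just before self.transform(images).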
Thanks a lot. It worked properly.
Excuse me, the results of my 10 epochs are as follows. Towards the end, the accuracy increases only slowly.
### status fold: 1 ###
epoch  train_loss  valid_loss  Accuracy   kappa_score  time
0      1.452028    1.269394    49.54785   0.664385     419.344125
1      1.319205    1.213655    52.52449   0.680870     400.685343
2      1.105934    1.054436    58.251698  0.750475     399.620671
3      1.040463    1.048039    59.118313  0.750323     401.917301
4      0.998244    1.041330    59.419746  0.766310     401.042848
5      0.973565    1.027888    59.495102  0.764129     400.287688
6      0.945639    1.029338    59.570457  0.763214     404.441619
7      0.924936    1.032565    59.60814   0.766563     399.797308
8      0.903779    1.030024    59.79653   0.768021     400.552388
9      0.877134    1.030672    59.834213  0.769995     404.820972
If I send you my optimizer, lr_scheduler and criterion, can you guess what the problem is?
Fold 2 is as follows:
1 - avg_train_loss: 0.9630 avg_val_loss: 0.7651 Accurecy: 71.14 kappa_score: 0.8399
2 - avg_train_loss: 1.1018 avg_val_loss: 1.1973 Accurecy: 55.43 kappa_score: 0.6985
3 - avg_train_loss: 0.9377 avg_val_loss: 0.8634 Accurecy: 65.79 kappa_score: 0.8004
4 - avg_train_loss: 0.8576 avg_val_loss: 0.8488 Accurecy: 66.69 kappa_score: 0.8219
5 - avg_train_loss: 0.8269 avg_val_loss: 0.8343 Accurecy: 67.37 kappa_score: 0.8201
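import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
# GradualWarmupScheduler comes from a third-party warmup package (import not shown in the post);
# transfer_model is the network defined elsewhere.

# Hyperparameters (defined before they are used below)
epochs = 10
init_lr = 3e-4
warmup_factor = 10
warmup_epo = 1
batch_size = 12

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(transfer_model.parameters(), lr=init_lr / warmup_factor)
scheduler_cosine = lr_scheduler.CosineAnnealingLR(optimizer, epochs - warmup_epo)
scheduler = GradualWarmupScheduler(optimizer, multiplier=warmup_factor,
                                   total_epoch=warmup_epo, after_scheduler=scheduler_cosine)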
Guessing is a bit hard in this case, but I would try to increase the initial learning rate by e.g. a factor of 10 and check whether it is beneficial for the training.
Is the accuracy still increasing after 10 epochs (but slowly) or is it stuck after a while?
Increase the initial learning rate? Does that mean setting the initial learning rate to 3e-3?
Results up to epoch 14:
Epoch 1 - avg_train_loss: 0.4012 avg_val_loss: 0.3353 Accurecy: 50.98
Epoch 2 - avg_train_loss: 0.3532 avg_val_loss: 0.3312 Accurecy: 52.79
Epoch 3 - avg_train_loss: 0.3066 avg_val_loss: 0.2902 Accurecy: 58.48
Epoch 4 - avg_train_loss: 0.2915 avg_val_loss: 0.2888 Accurecy: 58.55
Epoch 5 - avg_train_loss: 0.2856 avg_val_loss: 0.2853 Accurecy: 59.31
Epoch 6 - avg_train_loss: 0.2777 avg_val_loss: 0.2810 Accurecy: 60.40
Epoch 7 - avg_train_loss: 0.2685 avg_val_loss: 0.2795 Accurecy: 60.55
Epoch 8 - avg_train_loss: 0.2631 avg_val_loss: 0.2814 Accurecy: 61.15
Epoch 9 - avg_train_loss: 0.2550 avg_val_loss: 0.2835 Accurecy: 60.85
Epoch 10 - avg_train_loss: 0.2473 avg_val_loss: 0.2869 Accurecy: 61.08
Epoch 11 - avg_train_loss: 0.2448 avg_val_loss: 0.2835 Accurecy: 61.23
Epoch 12 - avg_train_loss: 0.2390 avg_val_loss: 0.2847 Accurecy: 61.72
Epoch 13 - avg_train_loss: 0.2357 avg_val_loss: 0.2852 Accurecy: 61.15
Epoch 14 - avg_train_loss: 0.2351 avg_val_loss: 0.2855 Accurecy: 61.42
But when the next fold starts and the learning rate is re-initialized, there is a sudden jump in accuracy, which then decreases again.
For example, in fold 3 (the learning rate is printed before each epoch):
lr: 0.0000300
Epoch 1 - avg_train_loss: 0.2352 avg_val_loss: 0.1702 Accurecy: 78.45
lr: 0.0003000
Epoch 2 - avg_train_loss: 0.2709 avg_val_loss: 0.2642 Accurecy: 65.67
lr: 0.0000300
Epoch 3 - avg_train_loss: 0.2273 avg_val_loss: 0.1938 Accurecy: 75.06
lr: 0.0000256
Epoch 4 - avg_train_loss: 0.2083 avg_val_loss: 0.1891 Accurecy: 76.11
lr: 0.0000150
Epoch 5 - avg_train_loss: 0.1952 avg_val_loss: 0.1829 Accurecy: 76.71
In my opinion, what happens at the start of each fold (re-initialization of the learning rate) should also happen inside the epoch loop, for example every 5 epochs (each fold has 15 epochs in total). But I do not know how to do that.
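One possible way to restart the learning rate every 5 epochs inside the epoch loop is PyTorch's built-in CosineAnnealingWarmRestarts scheduler. This is only a sketch of that option and was not suggested in the thread: it does not reproduce the warmup phase of the setup above, and the placeholder model, eta_min, and the empty training step are assumptions.

```python
import torch
from torch import nn, optim
from torch.optim import lr_scheduler

model = nn.Linear(4, 2)  # placeholder for the real network
optimizer = optim.Adam(model.parameters(), lr=3e-4)
# T_0=5: the cosine schedule restarts from the initial lr every 5 epochs.
scheduler = lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=5, eta_min=1e-6)

lrs = []
for epoch in range(15):
    lrs.append(optimizer.param_groups[0]["lr"])
    # ... train and validate for one epoch here ...
    optimizer.step()     # stands in for the real optimizer steps of the epoch
    scheduler.step()     # advance the schedule once per epoch

for epoch, lr in enumerate(lrs):
    print(f"epoch {epoch}: lr = {lr:.7f}")
```

With these settings the learning rate returns to 3e-4 at epochs 0, 5 and 10 and decays along a cosine curve in between.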
Hello sir, excuse me.
Why does the EfficientNet input not need to be normalized?
Thank you for your help.