Cannot use save_image to save an image after data augmentation

I’m a beginner with PyTorch. I want to increase the number of images in my dataset (data augmentation).
In this case, I have one image of a cat, and I want to use albumentations to generate more images and save them into another folder, as follows:

import cv2
import torch
import albumentations as A
import numpy as np
import matplotlib.pyplot as plt
from torchvision.utils import save_image

I define an augmentation as follows:

LOADER_TRANSFORMED = A.Compose([
    A.HorizontalFlip(p=0.6),
    A.RandomBrightnessContrast(brightness_limit=1, contrast_limit=1, p=0.4)
])

I use OpenCV to read an image and convert it to the RGB colorspace.

IMAGE = cv2.imread('/content/drive/MyDrive/Test/cat.jpg') # default BGR
IMAGE = cv2.cvtColor(IMAGE, cv2.COLOR_BGR2RGB) # BGR -> RGB

I augment the image as follows:

TRANSFORMED = LOADER_TRANSFORMED(image=IMAGE) # dict
TRANSFORMED_IMAGE = TRANSFORMED["image"] # numpy.ndarray
TRANSFORMED_IMAGE_2 = LOADER_TRANSFORMED(image=IMAGE)["image"]

Next, I want to save the augmented image using save_image as follows:

Step 1: convert the NumPy array to a tensor

TRAN_IMG_2 = torch.from_numpy(TRANSFORMED_IMAGE_2)

Step 2: save the image with save_image

save_image(TRAN_IMG_2, '/content/drive/MyDrive/Test/Aug_cat')

I get the following error:

RuntimeError: result type Float can't be cast to the desired output type Byte

I’m not sure where the error occurs, because save_image requires a tensor as input and I am already passing one.

@Jaturong Maybe you need to specify the format of the image (i.e. include a file extension), like this:

save_image(TRAN_IMG_2, '/content/drive/MyDrive/Test/Aug_cat.jpg')

@andrea.tantucci I have tried your suggestion, but I still get the same error.

Could you post the dtype of TRAN_IMG_2 as well as its min and max values and shape, please?
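Something like this should print the relevant information (assuming TRAN_IMG_2 is the tensor you pass to save_image):

print(type(TRAN_IMG_2))                    # Python type of the object
print(TRAN_IMG_2.dtype)                    # tensor dtype, e.g. torch.uint8 or torch.float32
print(TRAN_IMG_2.shape)                    # tensor shape
print(TRAN_IMG_2.min(), TRAN_IMG_2.max())  # value range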

@ptrblck The type of TRAN_IMG_2 is <class 'torch.Tensor'>.
Its min and max values are tensor(93, dtype=torch.uint8) and tensor(255, dtype=torch.uint8), respectively.
The shape of the tensor is torch.Size([666, 1000, 3]).

Thank you for this information!
The docs seem to lack the information that normalized floating-point tensors are expected, since internally the inputs will be “unnormalized” and cast to uint8, as seen here.
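
Roughly speaking, save_image does something like the following internally (a paraphrase of the torchvision source, not the exact code; the helper name save_image_sketch is just for illustration). The in-place add_(0.5) on a uint8 tensor is what raises the Float-to-Byte cast error:

from PIL import Image
import torch

def save_image_sketch(tensor: torch.Tensor, fp: str) -> None:
    # rough paraphrase of what torchvision.utils.save_image does internally;
    # it expects a float CHW tensor normalized to [0, 1]
    ndarr = tensor.mul(255).add_(0.5).clamp_(0, 255).permute(1, 2, 0).to("cpu", torch.uint8).numpy()
    Image.fromarray(ndarr).save(fp)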

Additionally, the channels-first memory layout is expected. This code should work:

x = torch.randint(93, 256, (666, 1000, 3)).to(torch.uint8)
print(x.shape)
# torch.Size([666, 1000, 3])
print(x.min(), x.max())
# tensor(93, dtype=torch.uint8) tensor(255, dtype=torch.uint8)

save_image(x, "tmp.jpeg")
# RuntimeError: result type Float can't be cast to the desired output type Byte

# normalize
y = x.float() / 255.
print(y.min(), y.max())
# tensor(0.3647) tensor(1.)

# wrong memory layout
save_image(y, "tmp.jpeg")
# TypeError: Cannot handle this data type: (1, 1, 666), |u1

y = y.permute(2, 0, 1)
save_image(y, "tmp.jpeg")

@ptrblck Thank you very much. I can save the image using the code above.
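
For anyone hitting the same issue, the complete fix applied to the original snippet looks roughly like this (an untested sketch reusing the paths and transform from the question; the loop count and output file names are arbitrary):

import cv2
import torch
import albumentations as A
from torchvision.utils import save_image

LOADER_TRANSFORMED = A.Compose([
    A.HorizontalFlip(p=0.6),
    A.RandomBrightnessContrast(brightness_limit=1, contrast_limit=1, p=0.4)
])

IMAGE = cv2.imread('/content/drive/MyDrive/Test/cat.jpg')          # BGR uint8
IMAGE = cv2.cvtColor(IMAGE, cv2.COLOR_BGR2RGB)                     # BGR -> RGB

for i in range(5):                                                 # save 5 augmented copies (arbitrary count)
    aug = LOADER_TRANSFORMED(image=IMAGE)["image"]                 # HWC uint8 numpy array
    t = torch.from_numpy(aug).float() / 255.                       # normalize to [0, 1]
    t = t.permute(2, 0, 1)                                         # HWC -> CHW
    save_image(t, f'/content/drive/MyDrive/Test/Aug_cat_{i}.jpg')  # include the file extension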