Numerically stable pre- and deprocessing: image slowly turning black

I’m building an application that requires repeated preprocessing and deprocessing. Small changes are made to the image between the pre- and deprocess steps, but even without those small changes the following problem occurs: the image slowly clips to black.

I’ve tried to make the problem as simple as possible:

from PIL import Image
from torchvision import transforms
import torch
import numpy as np

std_norm = torch.tensor([0.229, 0.224, 0.225])
mean_norm = torch.tensor([0.485, 0.456, 0.406])

preprocess = transforms.Compose([transforms.ToTensor(), transforms.Normalize(mean_norm, std_norm)])

def deprocess(img):
	# Invert Normalize: multiply by std and add the mean back.
	img = img * std_norm.reshape((1, 3, 1, 1)) + mean_norm.reshape((1, 3, 1, 1))
	return img

img = Image.open("your_image_here.png").convert('RGB')

for i in range(1000):
	img = preprocess(img)
	img = img.unsqueeze(0)
	img = deprocess(img).squeeze(0).permute(1, 2, 0).numpy()
	# Re-quantize to uint8 after every iteration; this is where the image degrades.
	img = Image.fromarray((255. * img).astype(np.uint8))

	img.save("{0}.png".format(i))

The reason I suspect the preprocessing and deprocessing is that if I remove the normalization from the transform composition (and simply return img unchanged in deprocess), the image retains its quality. Is there a way of getting more numerically stable results?
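For reference, a minimal check along these lines (my own sketch; "your_image_here.png" is the same placeholder as above) shows that individual pixel values already change after a single round trip through uint8:

import numpy as np
import torch
from PIL import Image
from torchvision import transforms

std_norm = torch.tensor([0.229, 0.224, 0.225])
mean_norm = torch.tensor([0.485, 0.456, 0.406])
preprocess = transforms.Compose([transforms.ToTensor(), transforms.Normalize(mean_norm, std_norm)])

img = Image.open("your_image_here.png").convert('RGB')
before = np.asarray(img).astype(np.int16)

# One round trip: preprocess, deprocess, re-quantize to uint8.
t = preprocess(img).unsqueeze(0)
t = t * std_norm.reshape((1, 3, 1, 1)) + mean_norm.reshape((1, 3, 1, 1))
after = (255. * t.squeeze(0).permute(1, 2, 0).numpy()).astype(np.uint8).astype(np.int16)

print("max per-pixel change after one round trip:", np.abs(after - before).max())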

Hello,

I think converting the image to uint8 after each preprocess + deprocess iteration destroys some useful information that cannot be recovered in the next iteration. Can you try something like this:

from PIL import Image
from torchvision import transforms
import torch
import numpy as np

std_norm = torch.tensor([0.229, 0.224, 0.225])
mean_norm = torch.tensor([0.485, 0.456, 0.406])

# ToTensor is applied once, outside the loop, so Normalize is the only transform here.
preprocess = transforms.Compose([transforms.Normalize(mean_norm, std_norm)])

def deprocess(img):
	img = img * std_norm.reshape((1, 3, 1, 1)) + mean_norm.reshape((1, 3, 1, 1))
	return img

img = Image.open("your_image_here.png").convert('RGB')

# Convert to a float tensor once and keep working in floating point.
img_data = transforms.ToTensor()(img)

for i in range(1000):
	img_data = preprocess(img_data)
	img_data = deprocess(img_data.unsqueeze(0)).squeeze(0)
	# Convert to uint8 only for the copy that gets saved; img_data itself stays float.
	out = (255. * img_data.permute(1, 2, 0)).numpy().astype(np.uint8)
	Image.fromarray(out).save("{0}.png".format(i))
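The key point is that the tensor stays in floating point between iterations; the lossy uint8 conversion only happens on the copy that is saved, so its error never feeds back into the loop.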

Another idea would be to, at least, round the result when converting it to int. When you do tensor.to(int) or numpy.astype(int), the value is truncated (floored for positive values), which can actually be the source of your problem: the normalize/denormalize round trip leaves each value a tiny bit above or below the original, and whenever it lands below, flooring drops that pixel by a full gray level. Those drops accumulate over the iterations until the image clips to black. So I suggest this alternative:

from PIL import Image
from torchvision import transforms
import torch
import numpy as np

std_norm = torch.tensor([0.229, 0.224, 0.225])
mean_norm = torch.tensor([0.485, 0.456, 0.406])

preprocess = transforms.Compose([transforms.ToTensor(), transforms.Normalize(mean_norm, std_norm)])

def deprocess(img):
	img = img * std_norm.reshape((1, 3, 1, 1)) + mean_norm.reshape((1, 3, 1, 1))
	return img

img = Image.open("your_image_here.png").convert('RGB')

for i in range(1000):
	img = preprocess(img)
	img = img.unsqueeze(0)
	img = deprocess(img).squeeze(0).permute(1, 2, 0).numpy()
	# Round to the nearest integer instead of truncating, so the tiny
	# float errors no longer bias every pixel downward.
	img = Image.fromarray(np.round(255. * img).astype(np.uint8))

	img.save("{0}.png".format(i))
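To see the difference concretely, here is a small self-contained sketch (using the red-channel mean and std from above; the per-channel test is my own illustration) that counts how many of the 256 gray levels survive one normalize/denormalize round trip in float32 under truncation versus rounding:

import numpy as np

# For one channel, push every gray level through normalize/denormalize
# in float32 and compare truncation against rounding.
mean, std = np.float32(0.485), np.float32(0.229)
levels = np.arange(256, dtype=np.float32)
z = ((levels / 255. - mean) / std) * std + mean  # round trip
floored = (255. * z).astype(np.uint8)
rounded = np.round(255. * z).astype(np.uint8)
print("levels changed by truncation:", int((floored != levels).sum()))
print("levels changed by rounding: ", int((rounded != levels).sum()))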

Hope that helps!
Thomas
