When I use the optimizer, there is no gradient because of the uint8 conversion, but I have to use uint8:

def apply_relighting_tensor(tensor, alpha, beta):
    tensor = tensor * 255.0
    new_tensor = tensor.to(torch.uint8)   # cast to uint8 -- this is where the gradient is lost
    new_tensor = new_tensor * alpha + beta / 255.0
    new_tensor = torch.abs(new_tensor)
    new_tensor = new_tensor.to(torch.float32)
    new_tensor = new_tensor / 255.0
    return new_tensor
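A quick check with a toy tensor shows that the autograd graph is cut at the uint8 cast (integer tensors cannot carry requires_grad):

import torch

x = torch.rand(2, 3, requires_grad=True)   # toy input in [0, 1]
y = (x * 255.0).to(torch.uint8)            # same cast as in the function above
print(y.requires_grad, y.grad_fn)          # False None -> nothing upstream of this gets a gradient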

I know the output values look correct, but there is no gradient.
When I use floating-point numbers instead, the generated image is not at all what I want. I don’t know what to do.

def apply_relighting_tensor(tensor, alpha, beta):
    tensor_float = tensor * 255.0
    new_tensor = tensor_float * alpha + beta

    new_tensor = torch.clamp(new_tensor, 0, 255)
    new_tensor = new_tensor / 255.0
    return new_tensor

Since you weren’t able to train the model at all with the uint8 approach, is it possible that the image-quality issue comes from some other part of your model or training process rather than from this particular function?


I tried what you suggested, but the problem still comes down to the function I mentioned before. If that function is fixed, the rest of the code will work.

In theory I could implement this with OpenCV (cv2), but then there would be no gradient during training. This step sits inside the training graph, so I have to work with tensors. However, plain float tensors do not give the result I need; the values have to be converted to uint8, and that is what gives me a headache.
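For reference, the non-differentiable OpenCV version would look roughly like this (assuming cv2.convertScaleAbs, i.e. saturate_uint8(|img * alpha + beta|), is the operation meant here):

import cv2
import numpy as np

# Assumed OpenCV equivalent of the relighting step: convertScaleAbs
# computes saturate_cast<uint8>(|img * alpha + beta|) -- no gradient for training.
img = (np.random.rand(64, 64, 3) * 255).astype(np.uint8)  # toy image
relit = cv2.convertScaleAbs(img, alpha=2.5, beta=10)      # uint8 result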

The main issue is that you are using torch.clamp(new_tensor, 0, 255) and this is probably why you are getting very different outputs for int vs float operations. One workaround is using rounding operations.

import torch
import numpy as np

# your int method
def apply_relighting_tensor_int(tensor, alpha, beta):
    tensor = tensor * 255.0
    new_tensor = tensor.to(torch.uint8)
    new_tensor = new_tensor * alpha + beta / 255.
    new_tensor = torch.abs(new_tensor)
    new_tensor = new_tensor.to(torch.float32)
    new_tensor = new_tensor / 255.0
    return new_tensor

# new float method
def apply_relighting_tensor_float(tensor, alpha, beta):
    tensor_float = tensor * 255.0
    new_tensor = torch.round(tensor_float * alpha + beta, decimals=1)
    new_tensor = new_tensor / 255.0
    return torch.round(new_tensor, decimals=1)


if __name__ == "__main__":
    image_tensor = torch.rand(1, 3, 64, 64)
    # random alpha and beta
    alpha = 2.5
    beta = 0.1

    # apply functions
    int_tensor = apply_relighting_tensor_int(image_tensor, alpha=alpha, beta=beta)
    float_tensor = apply_relighting_tensor_float(image_tensor, alpha=alpha, beta=beta)

    # check stats
    print(f"INT VS FLOAT MAX: {int_tensor.max()} vs {float_tensor.max()}")
    print(f"INT VS FLOAT MIN: {int_tensor.min()} vs {float_tensor.min()}")
    print(f"INT VS FLOAT MEAN: {int_tensor.mean()} vs {float_tensor.mean()}")
    print(f"INT VS FLOAT STD: {int_tensor.std()} vs {float_tensor.std()}")

Though it doesn’t give exactly the uint8 outputs, the stats show that the float method is almost the same as the int method.

INT VS FLOAT MAX: 2.4901974201202393 vs 2.5
INT VS FLOAT MIN: 1.537870048196055e-06 vs 0.0
INT VS FLOAT MEAN: 1.2464064359664917 vs 1.2522379159927368
INT VS FLOAT STD: 0.7196686863899231 vs 0.7208324670791626

This apply_relighting_tensor_float method shouldn’t have any issues with the gradient flow.


Thank you very much for your reply. The round operation may affect the gradient.
Since round is not differentiable (its gradient is zero almost everywhere), using it inside a neural network or the optimization process causes two problems, as the quick check below shows:

Gradient blocking: the gradient along that path becomes zero, which cuts off the gradient flow and may prevent the model from learning effectively.
Optimization difficulties: non-differentiable operations such as round make optimization harder because the gradient information is lost or inaccurate.
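A toy check confirms the zero gradient; the second half shows the standard straight-through-estimator trick (mentioned here only for reference), which keeps round in the forward pass but lets the gradient pass through as if rounding were the identity:

import torch

x = torch.rand(4, requires_grad=True)
torch.round(x * 255.0).sum().backward()
print(x.grad)                                # all zeros: round is flat almost everywhere

# Straight-through estimator (a common workaround, not from this thread):
# forward uses the rounded value, backward treats the rounding as identity.
x.grad = None
y = x * 255.0
y_ste = y + (torch.round(y) - y).detach()
y_ste.sum().backward()
print(x.grad)                                # non-zero again (gradient of the x * 255 scaling)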

The relit image is then fed into a detector, so I can’t afford to break the gradient. I’ll try your suggestion first; if it still doesn’t work, I’ll come back and let you know.

This result was produced with the apply_relighting_tensor_float method; it does not display correctly.


The following result was produced with uint8 and displays correctly. I have tried many approaches but still can’t reproduce it with floats.

Please help me, this is important!