If I normalize the data from (-1, 1) to (0, 1) before loss computation, is there any effect on backpropagation?

Yes, I would assume so: changing the value range before the loss calculation also changes the loss itself, which then yields different gradients:

```
import torch
import torch.nn as nn

model = nn.Linear(10, 10)
x = torch.randn(1, 10)
target = torch.randn(1, 10)
criterion = nn.MSELoss()
# default
out = model(x)
loss = criterion(out, target)
loss.backward()
print(model.weight.grad.abs().sum())
> tensor(37.9178)
# scaled approach
scale = True
model.zero_grad()
out = model(x)
if scale:
    out = out - out.min()
    out = out / out.max()
print(out.min(), out.max())
> tensor(0., grad_fn=<MinBackward1>) tensor(1., grad_fn=<MaxBackward1>)
loss = criterion(out, target)
loss.backward()
print(model.weight.grad.abs().sum())
> tensor(11.1220)
```
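To make the effect explicit outside of a full model, here is a minimal sketch (with hypothetical values) comparing the gradient of the raw prediction with the gradient after min/max rescaling. Both `min()` and `max()` are differentiable in PyTorch, so the normalization becomes part of the computation graph and changes the gradients flowing back through it:

```python
import torch
import torch.nn.functional as F

# Hypothetical prediction and target, just to illustrate the effect
pred = torch.tensor([0.0, 1.0, 3.0], requires_grad=True)
target = torch.tensor([0.0, 0.5, 1.0])

# Gradient without rescaling
loss = F.mse_loss(pred, target)
loss.backward()
grad_unscaled = pred.grad.clone()

# Gradient with min/max rescaling to [0, 1] before the loss;
# min() and max() are part of the autograd graph, so the chain
# rule includes the normalization step
pred.grad = None
scaled = (pred - pred.min()) / (pred.max() - pred.min())
loss = F.mse_loss(scaled, target)
loss.backward()
grad_scaled = pred.grad.clone()

print(grad_unscaled)
print(grad_scaled)  # different values: the scaling changed the gradients
```

So the normalization is not a no-op for training: it rescales (and redistributes, via the `min`/`max` terms) the gradient signal reaching the parameters.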
