# Reconstruction of a vector by autoencoder (effect of input size and range)

Hello,

I have two questions about autoencoder-based signal reconstruction (a vector here, using a fully-connected autoencoder). I would be really thankful if anyone could help me with this.

1. Does the reconstruction error depend on the vector size? (For example, reconstructing a dataset with 3-dimensional signals like [x1, x2, x3] versus 4-dimensional inputs like [x1, x2, x3, x4].)

2. Can we say that the reconstruction error depends on the magnitude of the inputs? Would a set of inputs with higher values (like [10, 10, 10]) show a higher reconstruction error than inputs in a lower range (like [1, 1, 1])? If yes, is it recommended to normalize the data beforehand?

Thank you so much

1. It depends on how the loss is calculated. If you are using e.g. `nn.MSELoss` in the default setup (`reduction='mean'`), the loss value should not depend on the input feature dimension, since the squared error is averaged over all elements:
```python
import torch
import torch.nn as nn

x = torch.randn(1, 3)
y = torch.randn(1, 3)
criterion = nn.MSELoss()
loss_small = criterion(x, y)

x = torch.randn(1, 3000)
y = torch.randn(1, 3000)
loss_large = criterion(x, y)

print(loss_small)
> tensor(1.5440)

print(loss_large)
> tensor(2.0731)
```

However, you can of course use `reduction='sum'`, which would change it.
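As a quick sketch of that effect (same random-tensor setup as above, assumed purely for illustration), with `reduction='sum'` the loss grows with the number of elements, since it equals the mean loss multiplied by the element count:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
criterion_sum = nn.MSELoss(reduction='sum')

# small input: the squared errors of 3 elements are summed
x = torch.randn(1, 3)
y = torch.randn(1, 3)
loss_small = criterion_sum(x, y)

# large input: the squared errors of 3000 elements are summed,
# so the loss is roughly 1000x larger for comparable per-element error
x = torch.randn(1, 3000)
y = torch.randn(1, 3000)
loss_large = criterion_sum(x, y)

print(loss_small.item(), loss_large.item())
```

So with the `'sum'` reduction, comparing losses across different input sizes is no longer apples-to-apples.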

2. It depends again on your use case, and the loss value will depend on the magnitude:
```python
import torch
import torch.nn as nn

criterion = nn.MSELoss()

y = torch.tensor([[1.]])
rel_err = 1e-1
x = y - y * rel_err
loss_small = criterion(x, y)

y = torch.tensor([[100.]])
x = y - y * rel_err
loss_large = criterion(x, y)

print(loss_small)
> tensor(0.0100)

print(loss_large)
> tensor(100.)
```

As you can see, the loss is much higher in the second use case even though the relative error is the same. I don't know how you are interpreting the loss, but it doesn't necessarily mean that the first use case is "better" than the second one.
That being said, normalizing the inputs often helps during training, so you might want to normalize the inputs anyway and could even "unnormalize" the outputs, if necessary.
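A minimal sketch of that normalize/unnormalize round trip (the min/max scheme and the placeholder model output are assumptions; per-feature standardization would work just as well):

```python
import torch

x = torch.tensor([[10., 100., 1000.]])

# normalize the inputs to [0, 1] using statistics from the training set
x_min, x_max = x.min(), x.max()
x_norm = (x - x_min) / (x_max - x_min)

# ... feed x_norm through the autoencoder here ...
out_norm = x_norm.clone()  # placeholder standing in for the model output

# "unnormalize" the outputs back to the original scale
out = out_norm * (x_max - x_min) + x_min
print(out)  # back on the original scale of x
```

Note that the same `x_min`/`x_max` (computed on the training set) should be reused at inference time, otherwise the reconstruction scale will drift.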


Thank you so much @ptrblck. You are right, I should think more carefully about my loss function.