Hello everyone,
I am currently experimenting with some dummy data for deep learning tasks. While checking the normalization of my data, I noticed that the result looks wrong.
Here is how my data is defined:
import pandas as pd

num_rows = 100
df = pd.DataFrame({
    "time": range(num_rows),
    "temperature": [100 * (-1) ** n for n in range(num_rows)],  # alternates 100, -100, ...
    "group": [1] * num_rows,  # a single group
})
Clearly, the temperature has a mean of 0 and a (population) standard deviation, i.e. scale, of 100. However, if I run the following:
from pytorch_forecasting.data import GroupNormalizer

gn = GroupNormalizer(groups=["group"])
gn.fit(df["temperature"], df)
print(gn.norm_)
I end up with the following results:
       center       scale
group
1         0.0  100.504758
So I am wondering where the difference comes from.
Is the error coming from numerical approximations? (It seems too large for that, given the number of rows.)
Is it coming from the way GroupNormalizer computes the scale?
Am I missing something?
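One thing worth checking is which standard-deviation convention is being compared. The sketch below uses only the Python standard library on the same series and contrasts the population formula (divide by N) with the sample formula (divide by N - 1). Whether GroupNormalizer uses the sample formula internally is an assumption here, not something shown in this post. For reference, pandas `Series.std()` defaults to `ddof=1`, while `numpy.std` defaults to `ddof=0`.

```python
import math
import statistics

num_rows = 100
# Same alternating series as in the post: 100, -100, 100, ...
temperature = [100 * (-1) ** n for n in range(num_rows)]

# Population standard deviation: divides by N (ddof=0).
pop_std = statistics.pstdev(temperature)

# Sample standard deviation: divides by N - 1 (ddof=1, Bessel's correction).
sample_std = statistics.stdev(temperature)

# The two conventions differ by a factor of sqrt(N / (N - 1)).
ratio = math.sqrt(num_rows / (num_rows - 1))

print(f"population std: {pop_std:.6f}")
print(f"sample std:     {sample_std:.6f}")
print(f"pop * ratio:    {pop_std * ratio:.6f}")
```

The sample value comes out at about 100.5038, which is close to the reported 100.504758 but not identical, so the convention would account for most of the gap and a small remainder would still need another explanation.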
Thanks for reading!