I have a model trained on a merged dataset. I know the mean and std of each of the original datasets. Now I want to know how I should normalize new data for this model.
For the mean, I think we can calculate the average of the means?
But for the std, since it could depend on the covariance between them, I don't have any idea. The two datasets are different datasets of labeled images.
If the relative sizes of the datasets are w_1 and w_2 (i.e. w_1 + w_2 == 1), the means are m_1, m_2, and the stds are s_1, s_2 (with a tiny bit of inaccuracy if they are unbiased stds, which should not matter if you just merge two datasets, though), you have

$$ m = w_1 m_1 + w_2 m_2 $$

$$ s = \sqrt{w_1 (s_1^2 + m_1^2) + w_2 (s_2^2 + m_2^2) - m^2} $$

(This computes the uncentered second moments from the stds, takes the weighted average to get that of the entire set, and then computes the std of the entire set from that.)
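
For what it's worth, here is a minimal Python sketch of the computation described above, not a definitive implementation. The function name `merged_mean_std` and its argument names are made up for illustration, and it assumes the stds are population (biased) stds and that the weights sum to 1. For image data with per-channel statistics, you would apply it to each channel separately.

```python
import math


def merged_mean_std(w1, m1, s1, w2, m2, s2):
    """Combine the mean/std of two datasets into the mean/std of their union.

    w1, w2 are the relative sizes (assumed to satisfy w1 + w2 == 1),
    m1, m2 the means and s1, s2 the (population) stds of the two sets.
    """
    # Mean of the merged set: size-weighted average of the means.
    m = w1 * m1 + w2 * m2
    # Uncentered second moment E[x^2] of each set is s_i^2 + m_i^2;
    # the merged second moment is their size-weighted average.
    second_moment = w1 * (s1 ** 2 + m1 ** 2) + w2 * (s2 ** 2 + m2 ** 2)
    # Variance = E[x^2] - (E[x])^2; clamp at 0 to guard against rounding.
    variance = max(second_moment - m ** 2, 0.0)
    return m, math.sqrt(variance)


# Usage with hypothetical sample counts n1 and n2:
n1, n2 = 6000, 4000
w1, w2 = n1 / (n1 + n2), n2 / (n1 + n2)
mean, std = merged_mean_std(w1, 0.45, 0.22, w2, 0.52, 0.25)
```

Note that the weights come from the number of samples in each dataset, so a plain average of the means is only right when both datasets have the same size.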
@tom thank you very much Thomas
I am a bit confused about how you computed the merged std. What do you mean by s_12 or m_22?
Are these the joint std and mean of sets 1 and 2? Or of set 2 with itself?