Normalizing high-dimensional sparse numerical data

Hi,

I have about 20M features and 200k data points. My tensor is sparse, and I want to normalize it to zero mean and unit standard deviation.
My data is not image data; it is a numeric data set, and the values are in the range (0, 20,000) overall. If I normalize to zero mean and unit std, I am afraid the zeros will become non-zero (negative) numbers, so the sparsity of the tensor will be largely lost.
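To make the concern concrete, here is a minimal sketch in plain Python on a single made-up feature column (the values are hypothetical, not from my data). Under z-scoring, every zero is mapped to -mean/std, which is non-zero whenever the column mean is non-zero:

```python
import statistics

# One hypothetical feature column: mostly zeros plus a few large values.
column = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 150.0, 20000.0]

mean = statistics.fmean(column)
std = statistics.pstdev(column)  # population standard deviation

# Standard z-score: subtract the mean, divide by the std.
z_scored = [(x - mean) / std for x in column]

# Count entries that would have to be stored explicitly.
nonzeros_before = sum(1 for x in column if x != 0.0)
nonzeros_after = sum(1 for x in z_scored if x != 0.0)

print(nonzeros_before, nonzeros_after)
```

In this toy column, only 2 of 10 entries are non-zero before z-scoring and all 10 are non-zero afterwards, which is the densification I am worried about at 20M features.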
If I instead scale the data to (0, 1), the zero elements remain zero and the larger values are scaled separately for each feature. However, I am not sure how (0, 1) scaling compares with z-score (zero mean, unit std) normalization in its effect.
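For reference, this is roughly what I mean by per-feature (0, 1) scaling on a sparse tensor. It is only a sketch under assumptions: the data is non-negative, it is stored as a 2-D COO tensor (samples x features), and each feature is divided by its maximum (which equals min-max scaling only when the feature minimum is 0). The toy shapes and values are made up, and `scatter_reduce_` needs a reasonably recent PyTorch (1.12 or later, as far as I know):

```python
import torch

# Hypothetical toy data: 4 samples x 5 features as a sparse COO tensor.
indices = torch.tensor([[0, 1, 1, 3, 3],   # row (sample) index
                        [2, 0, 2, 0, 4]])  # column (feature) index
values = torch.tensor([10.0, 4.0, 20000.0, 8.0, 150.0])
x = torch.sparse_coo_tensor(indices, values, size=(4, 5)).coalesce()

cols = x.indices()[1]  # feature index of each stored value
vals = x.values()      # stored (non-zero) values

# Per-feature maximum computed from the stored values only.
# Starting from zeros is fine because the data is assumed non-negative.
col_max = torch.zeros(x.shape[1], dtype=vals.dtype)
col_max.scatter_reduce_(0, cols, vals, reduce="amax", include_self=True)

# Divide each stored value by its feature's maximum; the clamp guards
# against division by zero if a zero happens to be stored explicitly.
scaled_vals = vals / col_max[cols].clamp_min(1e-12)

# Same indices, rescaled values: implicit zeros are never touched.
x_scaled = torch.sparse_coo_tensor(x.indices(), scaled_vals, size=x.shape).coalesce()

print(x_scaled)
```

Only the stored values are touched here, so the number of non-zeros is unchanged and nothing of size 200k x 20M is ever materialized densely.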
Please help me understand the pros and cons of the two methods, and how to apply them to a sparse tensor in PyTorch.