Confused about `num_features` and `channels` in BatchNorm1d

DataMiner · September 12, 2017, 3:41pm

I’m using a convolutional network for classification. My dataset is a typical 2D matrix, say 100 samples x 10 features. One row represents a sample (e.g. a certain person’s information), and one column represents an attribute/feature (e.g. gender, name, weight, etc.).

When using nn.BatchNorm1d, there’s a num_features argument. The docs also say that the input of nn.BatchNorm1d is (N, C) or (N, C, L). I guess C here means channel.
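For reference, here is a minimal sketch of the two input shapes mentioned above. In PyTorch, num_features is the size of the C dimension; the tensor sizes (100, 10, 5) are illustrative assumptions, not values from a real dataset.

```python
import torch
import torch.nn as nn

# num_features must match the size of dimension C (index 1) of the input.
bn = nn.BatchNorm1d(num_features=10)

# (N, C) input: 100 samples, C = 10.
x2d = torch.randn(100, 10)
y2d = bn(x2d)

# (N, C, L) input: same C = 10, plus a length dimension L = 5 (assumed size).
x3d = torch.randn(100, 10, 5)
y3d = bn(x3d)

print(y2d.shape, y3d.shape)
print(bn.weight.shape)  # one learnable scale per channel
```

The same layer accepts both shapes because it keeps one mean and variance per entry of C, whatever the other dimensions are.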

Then here comes the problem:

Unlike image data, there’s no natural channel in my dataset. So I can either treat my dataset as 100 samples with channels=10 and num_features=1, or with channels=1 and num_features=10. Which one is proper for BatchNorm1d, and why?
Furthermore, take one sample vector with 10 features as an example: after passing it through a Conv1d(in_channels=1, out_channels=3) layer, I’ll get 3 vectors with 10 features each. Should I then treat them as channels=3 and num_features=10?
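The Conv1d case can be checked with a short shape experiment. This is only a sketch: kernel_size=3 and padding=1 are assumptions chosen so that the output length stays at 10, since the post does not say which kernel is used.

```python
import torch
import torch.nn as nn

# 100 samples, 1 input channel, 10 values per sample: (N, C, L) = (100, 1, 10).
x = torch.randn(100, 1, 10)

# kernel_size and padding are assumed here; they keep L at 10.
conv = nn.Conv1d(in_channels=1, out_channels=3, kernel_size=3, padding=1)
h = conv(x)  # shape (100, 3, 10)

# BatchNorm1d after the convolution takes num_features = number of channels.
bn = nn.BatchNorm1d(num_features=3)
out = bn(h)

# In training mode each channel is normalized over the N and L dimensions.
per_channel_mean = out.mean(dim=(0, 2))
print(h.shape, out.shape, per_channel_mean)
```

In this layout BatchNorm1d(3) keeps three sets of statistics, one per output channel, and pools over both the batch and the 10 positions.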

I think the problem essentially comes down to this: what’s the difference between channels and features for inputs like images? RGB are usually called channels, but I think they can also be treated as features of one pixel. I’m not sure if I’m right.

smth · September 12, 2017, 4:15pm

It’s more natural to treat your dataset as 10 channels and 1 feature. This is because over the next few layers, convolutions will actually try to learn correlated transforms between data projections.

For BatchNorm, normalizing channel-wise is much more natural, because you don’t want to normalize from one feature to another (each feature can have different ranges, etc.).
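A small sketch of what channel-wise normalization does for tabular data. The two columns (a weight in kg and an income figure) and their scales are made-up illustrative values; the point is only that each column gets its own statistics.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical columns with very different ranges.
weight_kg = 70 + 15 * torch.randn(100, 1)
income = 50000 + 20000 * torch.randn(100, 1)
x = torch.cat([weight_kg, income], dim=1)  # (N, C) = (100, 2)

# One channel per attribute, so each column is normalized separately.
bn = nn.BatchNorm1d(num_features=2)
y = bn(x)  # a fresh module is in training mode and uses batch statistics

col_mean = y.mean(dim=0)
col_std = y.std(dim=0, unbiased=False)
print(col_mean, col_std)  # each column ends up near mean 0, std 1
```

If the two columns were instead pooled into a single channel, the income values would dominate the shared mean and variance, which is the situation the reply warns against.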

DataMiner · September 13, 2017, 1:21am

Thanks for the reply, @smth. So, to sum up:

For an m-dimensional 1-D vector input:
1. If the input contains m attributes (e.g. a person’s weight, name, gender, etc.), it should be treated as m channels and one feature, with channel-wise batch norm.
2. If the input contains 1 attribute but has m values (e.g. a voice signal with m time steps), then it is better to treat it as one channel and m features.

For an m x n 2-D matrix input:
1. If the input contains m x n attributes (say, a 1-D “person information” vector like the one mentioned above, rearranged into a 2-D matrix), we should likewise treat the input as m x n channels and one feature.
2. If the input contains 1 attribute (e.g. the “red” channel of a 2-D image), it should be treated as 1 channel and m x n features.
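One possible reading of the four cases above, written as tensor shapes. This is a sketch under assumptions: the sizes N, m, n are hypothetical, flattening the m x n attribute grid is just one way to get "m x n channels", and nn.BatchNorm2d is swapped in for the image-like case.

```python
import torch
import torch.nn as nn

N, m, n = 100, 10, 8  # hypothetical sizes

# 1-D input, m distinct attributes: (N, m), one statistic per attribute.
attrs = torch.randn(N, m)
bn_attrs = nn.BatchNorm1d(m)
out_attrs = bn_attrs(attrs)

# 1-D input, one attribute at m time steps: (N, 1, m), statistics shared over time.
signal = torch.randn(N, 1, m)
bn_signal = nn.BatchNorm1d(1)
out_signal = bn_signal(signal)

# 2-D input, m x n distinct attributes: flatten to (N, m * n) channels.
grid_attrs = torch.randn(N, m, n).reshape(N, m * n)
bn_grid = nn.BatchNorm1d(m * n)
out_grid = bn_grid(grid_attrs)

# 2-D input, one attribute on an m x n grid: (N, 1, m, n), statistics shared over the grid.
image = torch.randn(N, 1, m, n)
bn_image = nn.BatchNorm2d(1)
out_image = bn_image(image)

print(out_attrs.shape, out_signal.shape, out_grid.shape, out_image.shape)
```

In every case the argument passed to the batch-norm layer is the number of channels, that is, the number of quantities that get their own mean and variance.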

Feel free to correct me if I’m wrong.