The memory consumption of a model during training is mostly dominated by the size and number of feature maps (activations), not by the network weights (parameters). Replacing one convolution with two may reduce the parameter count, but it roughly doubles the number of feature maps that have to be kept around for the backward pass. This is expected behavior and doesn't require a fix.
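A minimal sketch of the trade-off, assuming PyTorch and a hypothetical input of 64 channels at 56x56: swapping a single 5x5 convolution for two stacked 3x3 convolutions lowers the parameter count but produces an extra intermediate feature map, roughly doubling activation memory for that stage. The shapes and channel counts here are made up for illustration.

```python
import torch
import torch.nn as nn

x = torch.randn(1, 64, 56, 56)  # assumed input: batch 1, 64 channels, 56x56

single = nn.Conv2d(64, 64, kernel_size=5, padding=2)
double = nn.Sequential(
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
    nn.Conv2d(64, 64, kernel_size=3, padding=1),
)

def n_params(m):
    return sum(p.numel() for p in m.parameters())

def activation_elems(m, x):
    # Count the elements of every feature map produced by a conv layer.
    elems = []
    hooks = [
        layer.register_forward_hook(lambda _m, _i, out: elems.append(out.numel()))
        for layer in m.modules() if isinstance(layer, nn.Conv2d)
    ]
    m(x)
    for h in hooks:
        h.remove()
    return sum(elems)

print("params:      single 5x5 =", n_params(single), " two 3x3 =", n_params(double))
print("activations: single 5x5 =", activation_elems(single, x),
      " two 3x3 =", activation_elems(double, x))
```

With these numbers the two 3x3 convs use about 74k parameters versus 102k for the 5x5, but store twice as many activation elements (401k vs. 201k), which is where the extra training memory comes from.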