Your general approach is right, and I also assumed that the BatchNorm layers might be a problem in this case.
If you have only a few samples in each forward pass, you could use InstanceNorm or GroupNorm instead, which should work better for small batch sizes.
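
If it helps, here is a minimal sketch of how such a swap could look. The helper name `replace_batchnorm_with_groupnorm`, the `max_groups=8` default, and the toy model are my own assumptions for illustration, not something from your code:

```python
import torch
import torch.nn as nn


def replace_batchnorm_with_groupnorm(module: nn.Module, max_groups: int = 8) -> nn.Module:
    """Recursively replace every nn.BatchNorm2d with an nn.GroupNorm, in place."""
    # Iterate over a copy, since children are reassigned inside the loop.
    for name, child in list(module.named_children()):
        if isinstance(child, nn.BatchNorm2d):
            channels = child.num_features
            # num_groups must divide num_channels, so take the largest
            # divisor of `channels` that does not exceed `max_groups`.
            groups = next(
                g for g in range(min(max_groups, channels), 0, -1)
                if channels % g == 0
            )
            setattr(module, name, nn.GroupNorm(groups, channels))
        else:
            replace_batchnorm_with_groupnorm(child, max_groups)
    return module


# Toy model, only for demonstration.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
)
model = replace_batchnorm_with_groupnorm(model)

# GroupNorm statistics are computed per sample, so a tiny batch is fine.
out = model(torch.randn(2, 3, 8, 8))
```

Note that the new GroupNorm layers start with freshly initialized affine parameters (the BatchNorm weights are not copied), so you would have to train or fine-tune afterwards. Setting `num_groups` equal to the number of channels normalizes each channel on its own, which is the InstanceNorm behavior.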
Alternatively, you could try changing the momentum of BatchNorm, but I’m not sure whether that will help much.
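
A rough sketch of changing the momentum on an existing model, assuming a toy model and an arbitrary value of `0.01` (both are placeholders I picked, so you would need to tune the value for your use case):

```python
import torch.nn as nn

# Toy model, only for demonstration.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
)

for m in model.modules():
    if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
        # PyTorch updates the running stats as
        #   running = (1 - momentum) * running + momentum * batch_stat
        # so a value below the default of 0.1 makes each noisy
        # small-batch estimate count less.
        m.momentum = 0.01
```

A smaller momentum only smooths the running estimates used in `eval()` mode. During training, the layer still normalizes with the statistics of the current (small) batch.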