I built a fairly simple feed-forward network consisting of a few fc/batchnorm/activation layers.
I then saved the model; the *.pt (or *.pth) file is about 40K according to du -sh model.pt.
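As a rough sanity check on that file size, here is a back-of-the-envelope sketch of how many float32 values such a network stores. The layer widths are hypothetical (my actual NeuralNet is not shown here), and I assume a batchnorm after every fc layer for simplicity. It ignores the small integer counter that each batchnorm layer also keeps and any overhead of the file format itself.

```python
# Hypothetical layer widths, chosen only for illustration; the real network may differ.
layer_sizes = [32, 64, 64, 10]

def count_values(sizes):
    """Count float values held by a stack of fc + batchnorm layers."""
    total = 0
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        total += fan_in * fan_out + fan_out  # fc weight matrix and bias
        total += 4 * fan_out  # batchnorm weight, bias, running mean, running variance
    return total

n_values = count_values(layer_sizes)
size_kib = n_values * 4 / 1024  # float32 uses 4 bytes per value
print(f"{n_values} values, about {size_kib:.1f} KiB")
```

With widths in this range the estimate lands in the tens of kilobytes, which is the same order of magnitude as the 40K file.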
In deployment, I load the model just as recommended:
model = NN_model.NeuralNet()
This is where I am confused: whether the model runs on CPU or CUDA, memory usage is far more than the 40K file would suggest. It is about 1.5G of CPU memory according to free -h, and ~900MB of GPU memory according to nvidia-smi.
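Since free -h reports system-wide memory, a per-process number might be easier to interpret. Below is a minimal sketch using only the Python standard library, assuming a Unix-like system (the resource module is not available on Windows). The helper name peak_rss_mib is my own, and the measured step is left as a placeholder.

```python
import resource
import sys

def peak_rss_mib():
    """Return this process's peak resident set size in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor

before = peak_rss_mib()
# Place the step to be measured here, e.g. the torch import or the model loading code.
after = peak_rss_mib()
print(f"peak RSS grew by {after - before:.1f} MiB")
```

Taking the reading once around the import and once around the model loading should show which of the two steps accounts for most of the CPU-side memory.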