Glow bundle on small embedded systems: weights.bin too big

I’m implementing a de-noiser using deep learning.

Here are the steps I followed:

  1. I have a PyTorch model with 467k parameters (float32), which comes to 467k × 4 bytes ≈ 1.9MB
  2. I converted it into ONNX format: size still 1.9MB (steps 1–2 are sketched in the code after this list)
  3. I followed the AOT instructions and compiled the ONNX model into a bundle for my Arm Cortex-M33 target.
  4. I checked the results of the inference on Arm and they match those of PyTorch.
  5. I had a look at the model.weights.bin created in the bundle: it is 20MB, which is a problem for me.
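
For reference, here is roughly what steps 1–2 look like. The placeholder network, input shape, and file name below are not my real de-noiser, just enough to show the export call and the size arithmetic:

```python
import torch
import torch.nn as nn

# Placeholder standing in for my actual de-noiser; the size arithmetic
# below is the same for any float32 model.
class DenoiserNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(256, 512), nn.ReLU(), nn.Linear(512, 256)
        )

    def forward(self, x):
        return self.net(x)

model = DenoiserNet().eval()

# param count * 4 bytes per float32 parameter gives the expected weight size
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params} params -> {n_params * 4 / 1e6:.2f} MB as float32")

# Export to ONNX with a dummy input of the expected shape
dummy = torch.randn(1, 256)
torch.onnx.export(model, dummy, "denoiser.onnx", opset_version=11)
```

The exported .onnx file is only slightly larger than the raw weights, which is why the 20MB weights.bin in the bundle stands out.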

Unfortunately the client design is not going to have that kind of memory.

Is there any way to shrink the weights down? (ideally to a size close to the ONNX model)

Sorry for the delayed response. One way to understand what is going on here is to dump the output graph via --dump-graph-DAG=file.dot, render it with Graphviz's dot, and look over the Constant nodes in the graph to see whether any were split or duplicated. This can happen when, for example, a constant has multiple users and constant folding occurred. It may be necessary to add an option to the graph optimizer that prevents duplication of weights, but I'm not sure.
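
If it helps, here is a rough sketch of that kind of check, assuming the Constant node names appear as plain text in the dumped .dot file; the exact label format may differ between Glow versions, so adjust the pattern to whatever your dump actually contains:

```python
import re
from collections import Counter

# Count how often each Constant-like label appears in the dumped graph DAG.
# A name showing up more than once may indicate a split or duplicated weight.
with open("file.dot") as f:
    dot_text = f.read()

constants = Counter(re.findall(r"Constant[\w.]*", dot_text))
for name, count in constants.most_common():
    if count > 1:
        print(f"{name}: appears {count} times (possibly duplicated)")
```

Comparing the total size of the distinct constants against the 20MB weights.bin should show whether duplication accounts for the gap.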