Correctly changing precision in DLRM

Hello,

I would like to change the precision used in DLRM inference. I observed that the default datatype in the original code is "float32", set in the create_emb() method. Is changing the dtype there to any of the supported values enough to correctly change the precision?

I believe the generated embedding table would be directly used in the apply_emb() function.

Thanks for the help.
Rishabh Jain

For some changes, like fp32 to fp16 or bf16, changing the dtype is sufficient, provided the ops involved support those dtypes. For actual quantization (e.g. to int8) you need to quantize your model, which takes more effort; see Quantization — PyTorch 2.2 documentation for more details.
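To illustrate the distinction, here is a hedged sketch (not the actual DLRM code, and using numpy rather than torch for self-containment): a plain dtype cast suffices for fp16, while int8 requires real quantization with a scale factor. The table shape and the symmetric per-tensor quantization scheme are illustrative assumptions.

```python
import numpy as np

# Hypothetical embedding table, analogous to what create_emb() would
# initialize in fp32 (shape chosen only for illustration).
n_rows, emb_dim = 4, 8
table_fp32 = np.random.uniform(-1.0, 1.0, size=(n_rows, emb_dim)).astype(np.float32)

# fp32 -> fp16: a plain dtype cast is enough, assuming the downstream
# ops (e.g. the lookups in apply_emb()) support that dtype.
table_fp16 = table_fp32.astype(np.float16)
assert table_fp16.nbytes == table_fp32.nbytes // 2  # half the memory

# fp32 -> int8: a plain cast would destroy the values; quantization maps
# the float range onto integers via a scale (symmetric per-tensor here).
scale = np.abs(table_fp32).max() / 127.0
table_int8 = np.round(table_fp32 / scale).astype(np.int8)

# Dequantizing recovers the values only approximately; the rounding
# error is bounded by half the scale.
dequantized = table_int8.astype(np.float32) * scale
print(np.abs(dequantized - table_fp32).max() < scale)  # True
```

This is why fp16/bf16 is "just change the dtype" while int8 needs the quantization workflow from the linked docs: the integer representation carries extra metadata (scale, zero-point) and the ops must be quantization-aware.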