I wrote a UNet-based model that inpaints the corrupted pixels in an input image. It is written in Python using the PyTorch framework. It is a relatively large network, so the inference time is 200 ms/image on CPU and 80 ms/image on GPU. Now I want to deploy this model on an Intel FPGA in embedded products driven by an ARM core. The reasons for doing this are:
- To reduce the inference time
- To save computing power on the end user's device
I am still investigating how to get this done. I want to avoid rewriting the model from the ground up in HDL on the FPGA. Has anyone done something like this before?