Replace the default operator runtime implementation with custom code to use DSP hardware

I am trying to run a TFLite model on an ARM-based microcontroller.
I was able to generate a .o file and the corresponding header files specific to the model.

My microcontroller also has a DSP engine that can do faster matrix multiplication. It is custom hardware that Glow does not currently support.

How can I make a specific operator of my choice compile to my custom code instead of the default implementation?

For example, for every fully_connected layer in a model, I want to run my custom code instead of the default fully_connected implementation, so that I can use the custom DSP hardware in my microcontroller.
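
To make the goal concrete, here is a rough sketch of the kind of function I would like to have called for each fully_connected layer. Everything in it is hypothetical and not part of Glow or TFLite: the names `my_fully_connected_f32` and `dsp_matmul_f32`, the float32 data type, and the row-major weight layout of `in_dim x out_dim` are all my assumptions. The DSP call is stubbed with a plain C loop so the sketch compiles anywhere; on the real device it would hand the multiplication to the DSP engine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the vendor DSP matrix multiply.
 * Computes out[m x n] = a[m x k] * b[k x n], all row-major.
 * Written as a plain loop here; the real version would drive the DSP. */
static void dsp_matmul_f32(const float *a, const float *b, float *out,
                           size_t m, size_t k, size_t n) {
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (size_t p = 0; p < k; ++p) {
                acc += a[i * k + p] * b[p * n + j];
            }
            out[i * n + j] = acc;
        }
    }
}

/* Hypothetical replacement kernel for a fully connected layer:
 * out[batch x out_dim] = in[batch x in_dim] * weights[in_dim x out_dim] + bias.
 * The signature and the weight layout are assumptions for illustration. */
void my_fully_connected_f32(const float *in, const float *weights,
                            const float *bias, float *out,
                            size_t batch, size_t in_dim, size_t out_dim) {
    assert(in && weights && bias && out);

    /* Offload the matrix product to the (stubbed) DSP routine. */
    dsp_matmul_f32(in, weights, out, batch, in_dim, out_dim);

    /* Add the bias vector to every row of the result on the CPU. */
    for (size_t i = 0; i < batch; ++i) {
        for (size_t j = 0; j < out_dim; ++j) {
            out[i * out_dim + j] += bias[j];
        }
    }
}
```

What I am missing is the step that makes the generated code call a function like this instead of the default fully_connected implementation.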