I want to profile the layer-wise time taken during inference on CPU. Is there a way to measure the time between a chosen start point and end point in the forward pass? Take MobileNet, for example, which has many Inverted Residual Blocks: I need the inference latency of each block.
Thanks in advance for any pointers or help.
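
For reference, here is a minimal sketch of the kind of per-block measurement being asked about. It assumes PyTorch, which the post does not actually name. It uses forward pre-hooks and forward hooks to record wall-clock time around each block. `attach_block_timers` and `TinyBlock` are hypothetical names made up for this illustration. `TinyBlock` is only a small stand-in for an inverted residual block, so the snippet runs without downloading a model.

```python
import time
from collections import defaultdict

import torch
import torch.nn as nn


def attach_block_timers(model, block_types):
    """Hypothetical helper: time the forward pass of every submodule of the given types."""
    timings = defaultdict(list)  # module name -> seconds per forward call
    starts = {}                  # module name -> start timestamp of the current call
    handles = []                 # hook handles, kept so the hooks can be removed later

    def make_hooks(name):
        def pre_hook(module, inputs):
            # Runs just before module.forward
            starts[name] = time.perf_counter()

        def post_hook(module, inputs, output):
            # Runs just after module.forward
            timings[name].append(time.perf_counter() - starts[name])

        return pre_hook, post_hook

    for name, module in model.named_modules():
        if isinstance(module, block_types):
            pre, post = make_hooks(name)
            handles.append(module.register_forward_pre_hook(pre))
            handles.append(module.register_forward_hook(post))
    return timings, handles


class TinyBlock(nn.Module):
    """Stand-in block (depthwise conv + pointwise conv + skip), not the real MobileNet block."""

    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, groups=channels, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU6(inplace=True),
            nn.Conv2d(channels, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return x + self.body(x)


model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), TinyBlock(8), TinyBlock(8)).eval()
timings, handles = attach_block_timers(model, (TinyBlock,))

x = torch.randn(1, 3, 32, 32)
with torch.no_grad():
    for _ in range(3):   # warm-up runs, discarded below
        model(x)
    timings.clear()
    for _ in range(10):  # measured runs
        model(x)

for name, samples in timings.items():
    mean_ms = 1e3 * sum(samples) / len(samples)
    print(f"block {name}: {mean_ms:.3f} ms (mean of {len(samples)} runs)")

for handle in handles:
    handle.remove()      # detach the hooks once profiling is done
```

With torchvision's MobileNetV2, the same helper should work by passing `torchvision.models.mobilenetv2.InvertedResidual` as `block_types`. Timing with `time.perf_counter` this way is only meaningful on CPU, where operators run synchronously. The hooks also add a small overhead to every call, so the numbers are better read as relative costs than as exact latencies. `torch.profiler` is another option when an operator-level breakdown is needed.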