Relationship between FLOPs and processing time

I applied tensor decomposition to the convolution layers of several models: CPM (Convolutional Pose Machines) and CPN (Cascaded Pyramid Network).

The decomposition reduced both the number of parameters and the FLOPs.

However, the test (inference) time of each model did not decrease.

Is there any relationship between FLOPs and processing time?
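
To make the parameter and FLOP reduction concrete, here is a minimal sketch of the counting I have in mind. It assumes, purely for illustration, a Tucker-2 style split of one k x k convolution into a 1x1, a k x k, and a 1x1 convolution with stride 1 and unchanged output size. The layer sizes, ranks, and function names (`conv_cost`, `tucker2_cost`) are made up for the example and are not taken from CPM or CPN.

```python
def conv_cost(c_in, c_out, k, h_out, w_out):
    """Return (parameters, multiply-accumulates) of a k x k convolution, bias ignored."""
    params = c_in * c_out * k * k
    macs = params * h_out * w_out  # each weight is used once per output position
    return params, macs


def tucker2_cost(c_in, c_out, k, h_out, w_out, r_in, r_out):
    """Cost of the assumed 1x1 -> k x k -> 1x1 replacement with ranks r_in, r_out."""
    parts = [
        conv_cost(c_in, r_in, 1, h_out, w_out),   # 1x1: reduce input channels
        conv_cost(r_in, r_out, k, h_out, w_out),  # k x k core on the reduced channels
        conv_cost(r_out, c_out, 1, h_out, w_out), # 1x1: restore output channels
    ]
    return sum(p for p, _ in parts), sum(m for _, m in parts)


# Hypothetical layer: 128 -> 128 channels, 3x3 kernel, 46x46 output, ranks 32/32.
orig_params, orig_macs = conv_cost(128, 128, 3, 46, 46)
dec_params, dec_macs = tucker2_cost(128, 128, 3, 46, 46, 32, 32)

print("original:   1 layer,  params =", orig_params, " MACs =", orig_macs)
print("decomposed: 3 layers, params =", dec_params, " MACs =", dec_macs)
```

In this sketch both counts shrink, but one layer becomes three, which is the kind of change a FLOP count alone does not capture.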