Is it possible to calculate MACs and throughput per layer during inference on a GPU?

I am trying to find a tool for PyTorch that supports estimating the MACs and throughput per layer on a GPU. I am interested only in the convolution layers. Is there any tool other than thop?

There is a wide variety of user-developed tools for what you're describing; try searching for "pytorch model flops github".

If you’re interested only in convolution layers, it would also be straightforward to, for example, create your own convolution layer that wraps the built-in convolution layer and instruments its own forward method to compute FLOPs/throughput.
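A minimal sketch of that wrapper idea, in case it helps. The class name `InstrumentedConv2d` and its attributes (`macs`, `elapsed_s`, `macs_per_second`) are made up for illustration and are not part of PyTorch. It assumes a 2D convolution, ignores the bias add, and counts one MAC per multiply-accumulate. Timing uses `torch.cuda.synchronize()` around the call so the asynchronous kernel is actually finished before the clock is read; it falls back to plain wall time on CPU.

```python
import time

import torch
import torch.nn as nn


class InstrumentedConv2d(nn.Module):
    """Hypothetical wrapper (not a PyTorch API): runs a built-in nn.Conv2d
    and records MACs and elapsed time for the most recent forward pass."""

    def __init__(self, conv: nn.Conv2d):
        super().__init__()
        self.conv = conv
        self.macs = 0          # MACs of the most recent forward pass
        self.elapsed_s = 0.0   # seconds spent in the most recent forward pass

    def forward(self, x):
        on_gpu = x.is_cuda
        if on_gpu:
            torch.cuda.synchronize()  # finish pending work before timing
        start = time.perf_counter()
        out = self.conv(x)
        if on_gpu:
            torch.cuda.synchronize()  # wait for the conv kernel to complete
        self.elapsed_s = time.perf_counter() - start

        # Each output element needs (C_in / groups) * kH * kW MACs (bias ignored).
        kh, kw = self.conv.kernel_size
        macs_per_output = (self.conv.in_channels // self.conv.groups) * kh * kw
        self.macs = out.numel() * macs_per_output
        return out

    @property
    def macs_per_second(self):
        return self.macs / self.elapsed_s if self.elapsed_s > 0 else float("nan")


# Usage: wrap one conv layer and run a single batch through it.
device = "cuda" if torch.cuda.is_available() else "cpu"
layer = InstrumentedConv2d(nn.Conv2d(3, 8, kernel_size=3, padding=1)).to(device)
with torch.no_grad():
    layer(torch.randn(2, 3, 16, 16, device=device))
print(layer.macs, layer.macs_per_second)
```

Two caveats with this approach: the first call on a GPU typically includes one-time initialization overhead, so it is probably worth discarding a few warm-up iterations before reading `macs_per_second`; and synchronizing around every layer slows the model down overall, so the per-layer numbers are best treated as estimates rather than exact figures.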