GPU utilization fluctuates between 0% and 100%

Generally, I would recommend profiling your use case first, so that you know where the actual bottleneck is before applying any optimizations. The performance guide and this post might also be helpful.
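
As a rough first pass at the profiling suggested above, you could time how long each iteration waits for the next batch versus how long the actual step takes. This is only a minimal sketch: `profile_loop`, `slow_loader`, and `fast_step` are hypothetical names I made up for illustration, and the sleeps stand in for your real data loading and training step. GPU work is usually launched asynchronously, so I am assuming you pass a `sync` callable that blocks until the device is idle (in PyTorch that would be `torch.cuda.synchronize`); without it the step time would be under-reported.

```python
import time


def profile_loop(batches, step, sync=None):
    """Split wall-clock time into 'waiting for data' and 'running the step'."""
    data_time = 0.0
    step_time = 0.0
    count = 0
    it = iter(batches)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(it)  # time spent here is input-pipeline time
        except StopIteration:
            break
        t1 = time.perf_counter()
        step(batch)
        if sync is not None:
            sync()  # wait for asynchronous device work before stopping the timer
        t2 = time.perf_counter()
        data_time += t1 - t0
        step_time += t2 - t1
        count += 1
    return {"batches": count, "data_s": data_time, "step_s": step_time}


def slow_loader(num_batches=5, delay=0.05):
    # Hypothetical stand-in for a slow input pipeline.
    for i in range(num_batches):
        time.sleep(delay)
        yield i


def fast_step(batch):
    # Hypothetical stand-in for the forward/backward pass.
    time.sleep(0.001)


if __name__ == "__main__":
    print(profile_loop(slow_loader(), fast_step))
```

If `data_s` dominates, the GPU is most likely idling while it waits for input, which would match utilization jumping between 0% and 100%. If `step_s` dominates, the input pipeline is probably not the problem and a proper profiler is the better next step.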