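cpu_count returns the number of logical CPUs (hardware threads), which is usually twice the number of physical cores when hyper-threading is enabled. Anyway, as far as I can tell, if you set a number bigger than what is really available, the effect is the same as setting the max.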
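To make the point above concrete, here is a minimal sketch using only the standard library. It assumes the `cpu_count` in question is Python's `os.cpu_count`; the helper names `usable_cpus` and `clamp_workers` are made up for illustration. Instead of relying on an oversized value behaving like the max, it clamps the request explicitly.

```python
import os


def usable_cpus() -> int:
    """Logical CPUs this process may actually run on (at least 1)."""
    # os.sched_getaffinity is only available on some platforms (e.g. Linux);
    # it respects CPU pinning and container limits set via affinity.
    if hasattr(os, "sched_getaffinity"):
        return max(1, len(os.sched_getaffinity(0)))
    # os.cpu_count() may return None if the count cannot be determined.
    return os.cpu_count() or 1


def clamp_workers(requested: int) -> int:
    """Hypothetical helper: keep a requested worker count within [0, usable_cpus()]."""
    return max(0, min(requested, usable_cpus()))


if __name__ == "__main__":
    print("logical CPUs reported:", os.cpu_count())
    print("usable by this process:", usable_cpus())
    print("request of 64 clamped to:", clamp_workers(64))
```

Note that the logical count includes hyper-threads, so the number of physical cores may well be lower than what is printed.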
Though it seems there isn’t an agreement…and it also seems it depends on the number of GPUs! What a nightmare. Is there any heuristic or rough number that always makes things better but doesn’t overload the machine?
I think at this point I don’t care about being optimal, just making it run faster than in the main thread without running the risk of overdoing it.
Yes, this kind of situation never has an exact answer. Actually, if you follow the thread, you can see that everyone got a different result using the same configuration, and some of them even got errors!
So the best approach is to first make sure your model runs fine with the default value, which is 0. Then, if you have the resources or time, you can play with different configurations to find what works best for you. I have the same problem too.
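The "start at 0, then try a few configurations" approach could be sketched roughly like this. It is only an illustration: `time_worker_counts`, `pick_fastest`, and the `make_loader` factory are hypothetical names, and the candidate values are arbitrary. The sketch assumes you can build your data loader from a worker count, and it times a fixed number of batches for each candidate.

```python
import time
from typing import Callable, Dict, Iterable, Sequence


def time_worker_counts(
    make_loader: Callable[[int], Iterable],
    candidates: Sequence[int] = (0, 2, 4),
    max_batches: int = 50,
) -> Dict[int, float]:
    """Time iteration over at most `max_batches` batches for each worker count."""
    timings: Dict[int, float] = {}
    for n in candidates:
        loader = make_loader(n)  # build a fresh loader for this worker count
        start = time.perf_counter()
        for i, _batch in enumerate(loader):
            if i + 1 >= max_batches:
                break
        timings[n] = time.perf_counter() - start
    return timings


def pick_fastest(timings: Dict[int, float]) -> int:
    """Return the worker count with the smallest measured time."""
    return min(timings, key=timings.get)
```

Assuming the setting being discussed is the `num_workers` argument of PyTorch's `DataLoader`, the factory might be `lambda n: DataLoader(dataset, batch_size=32, num_workers=n)`, where `dataset` is your own dataset. Since results in the thread vary from machine to machine, treat the timings as specific to your setup, and run each candidate more than once if you can.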