I'm trying to train a network across multiple CPUs, but I'm not sure whether the Adam optimizer can be used instead of SGD. Should the optimizer be shared across the processes?