The Opacus example, train batch size vs sampling rate

Hello,

Thanks for your question.

This post made me aware that our CIFAR-10 example needs to be updated to match our other examples, where the sample rate is not taken as an input. We do not require the sample rate to be an explicit argument; we prefer to infer it from the data loader. For example, see here (“It’s automatically inferred from the data loader”):

So I will go ahead and update the CIFAR-10 example to match the other examples, where batch_size is provided directly as an argument. Also, the main idea behind sample_rate * len(train_dataset) was this: under Poisson sampling, it gives the expected batch size.
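To make that concrete, here is a minimal sketch of the intended pattern, assuming the Opacus PrivacyEngine.make_private API; the dataset, model, and hyperparameter values below are hypothetical stand-ins for the real CIFAR-10 pipeline:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

# Hypothetical stand-ins for the real CIFAR-10 dataset and model.
train_dataset = TensorDataset(
    torch.randn(50_000, 3 * 32 * 32),
    torch.randint(0, 10, (50_000,)),
)
model = torch.nn.Linear(3 * 32 * 32, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# batch_size is the only sampling-related argument you pass.
batch_size = 512
train_loader = DataLoader(train_dataset, batch_size=batch_size)

privacy_engine = PrivacyEngine()
model, optimizer, train_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,  # sample rate is inferred from this loader
    noise_multiplier=1.0,
    max_grad_norm=1.0,
)

# What make_private infers: sample_rate = batch_size / len(train_dataset).
# Conversely, sample_rate * len(train_dataset) is the expected batch size
# under Poisson sampling, which is what the old example was computing.
sample_rate = batch_size / len(train_dataset)  # 512 / 50000 = 0.01024
```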

Regarding the other parameters and convergence: yes, they are relevant (the learning rate, for example).

And regarding your second post: yes, you will be able to use a pre-defined batch size once I send the fix for the CIFAR-10 example.

Regarding your questions:

If I want to have a predefined batch_size, what is the appropriate sampling rate, and what privacy budget does that imply?
Furthermore, if I also have a predefined privacy budget, what is the correct way to get the right sampling rate?

You basically do not need the sampling rate any more, only the batch size (see here).
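And for the second question, where the privacy budget is fixed up front, one option is Opacus's make_private_with_epsilon, which solves for the noise multiplier given a target (epsilon, delta), the planned number of epochs, and the batch size implied by the loader. A sketch under the same hypothetical setup as above; the epsilon, delta, and epochs values are placeholders:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

# Hypothetical toy setup; replace with the real CIFAR-10 pipeline.
train_dataset = TensorDataset(
    torch.randn(50_000, 3 * 32 * 32),
    torch.randint(0, 10, (50_000,)),
)
train_loader = DataLoader(train_dataset, batch_size=512)
model = torch.nn.Linear(3 * 32 * 32, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

privacy_engine = PrivacyEngine()
model, optimizer, train_loader = privacy_engine.make_private_with_epsilon(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,  # sample rate inferred from this loader
    target_epsilon=8.0,        # placeholder privacy budget
    target_delta=1e-5,         # placeholder; often chosen ~ 1/len(dataset)
    epochs=20,                 # placeholder planned training length
    max_grad_norm=1.0,
)

# The engine picks the noise multiplier so that training for `epochs`
# epochs at the inferred sample rate stays within (epsilon, delta).
print(f"noise_multiplier = {optimizer.noise_multiplier}")
```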