Opacus framework for Multilabel multiclass classification problem

Hello,

I am stuck with something I was hoping to get some help with. I am trying to train a differentially private multilabel multiclass model on the NIH Chest X-Ray dataset using the Opacus framework. I have tried different settings for the hyperparameters, i.e., max_grad_norm and noise_multiplier, but my model consistently suffers from a considerable accuracy loss irrespective of the hyperparameter settings. As a sanity test, I set the noise_multiplier to 0 and max_grad_norm to high values within the range of 10 to 1000000. With these settings, the model should ideally behave like a regular non-differentially private model, but that’s not the case. The AUROC score I get with these settings is ~0.5, whereas when I train the model without the privacy_engine() wrapper, like a regular CNN model, the AUROC score is ~0.8.

I am confused as to why there’s such a large accuracy drop when I configure the differentially private model so that it is effectively not private at all (e.g., noise_multiplier = 0, max_grad_norm = 1000).

I’d greatly appreciate it if anyone who has worked on this could help me with it.

P.S. Here’s the link to the baseline code I am using. To create the differentially private model, I simply call the privacy_engine() wrapper, keeping everything else as is.
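For reference, this is roughly how I attach the engine for the sanity test (a simplified sketch, assuming the Opacus 1.x make_private API; model, optimizer, and train_loader are placeholders for the objects in the baseline code):

from opacus import PrivacyEngine

# Sanity-check configuration: no noise and a very loose clipping bound,
# so DP-SGD should behave almost like plain SGD.
privacy_engine = PrivacyEngine()
model, optimizer, train_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    noise_multiplier=0.0,   # no added noise
    max_grad_norm=1000.0,   # effectively no clipping
)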

Hey

I can see that DenseNet* models use BatchNorm2d layers, which means Opacus won’t be able to work with them directly.
Presumably, you’re calling ModuleValidator.fix(densenet121) before passing the module to the PrivacyEngine. This replaces the BatchNorm layers with GroupNorm, which can by itself affect performance - you can check whether that’s the case by training the fixed model without attaching the PrivacyEngine (sketch below).
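For example, something along these lines (just a sketch; densenet121 and the training loop stand in for your actual code):

from opacus.validators import ModuleValidator

# Swap BatchNorm for GroupNorm, but do NOT attach the PrivacyEngine.
# If AUROC already drops to ~0.5 here, the normalization change
# (not clipping or noise) is what hurts the model.
fixed_model = ModuleValidator.fix(densenet121)
# ... then train fixed_model exactly like the non-private baseline and compare AUROC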
There’s not much one can do here - BatchNorm is fundamentally incompatible with DP-SGD, since it computes statistics across the whole batch and therefore couples the per-sample gradients (see here for an explanation). One thing you can try is using InstanceNorm as a replacement:

m = ModuleValidator.fix(densenet121, replace_bn_with_in=True)
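You can then confirm that the fixed module passes validation before handing it to the engine, e.g.:

errors = ModuleValidator.validate(m, strict=False)
print(errors)  # an empty list means no remaining incompatible layers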