Hello everyone,
I am trying to apply quantization-aware training (QAT) to RetinaNet. As a first step I want to quantize only some parts of the network, and only after that the whole net. To save time I am using Detectron2, but I suppose this issue is really related to PyTorch itself.
First of all, I tried to quantize RetinaNetHead (see the original one here: class RetinaNetHead in the original RetinaNet implementation in Detectron2).
My implementation of RetinaNetHead is based on the original one, modified as described in the quantization tutorial (a minimal version of that pattern is sketched right below):
1. QuantStub and DeQuantStub modules
2. the corresponding forward
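The basic pattern from the tutorial looks roughly like this (a minimal, illustrative sketch; TinyHead and the layer sizes are placeholders, not part of my actual code):

import torch.nn as nn
from torch.quantization import QuantStub, DeQuantStub

class TinyHead(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = QuantStub()      # marks where fp32 activations become quantized
        self.conv = nn.Conv2d(256, 256, kernel_size=3, padding=1)
        self.relu = nn.ReLU()
        self.dequant = DeQuantStub()  # marks where quantized activations go back to fp32

    def forward(self, x):
        return self.dequant(self.relu(self.conv(self.quant(x))))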
q_retinanet.py:
import math
from typing import List

import torch
from torch import nn
from torch.quantization import QuantStub, DeQuantStub

from detectron2.layers import ShapeSpec
from detectron2.modeling.anchor_generator import build_anchor_generator


class Q_RetinaNetHead(nn.Module):
    """
    The head used in RetinaNet for object classification and box regression.
    It has two subnets for the two tasks, with a common structure but separate parameters.
    """

    def __init__(self, cfg, input_shape: List[ShapeSpec]):
        super().__init__()
        # fmt: off
        in_channels = input_shape[0].channels
        num_classes = cfg.MODEL.RETINANET.NUM_CLASSES
        num_convs   = cfg.MODEL.RETINANET.NUM_CONVS
        prior_prob  = cfg.MODEL.RETINANET.PRIOR_PROB
        num_anchors = build_anchor_generator(cfg, input_shape).num_cell_anchors
        # fmt: on
        assert (
            len(set(num_anchors)) == 1
        ), "Using different number of anchors between levels is not currently supported!"
        num_anchors = num_anchors[0]

        cls_subnet = []
        # cls_subnet.append(QuantStub())
        bbox_subnet = []
        for _ in range(num_convs):
            cls_subnet.append(
                nn.Conv2d(in_channels, in_channels, kernel_size=3, stride=1, padding=1)
            )
            cls_subnet.append(nn.ReLU())
            bbox_subnet.append(
                nn.Conv2d(in_channels, in_channels, kernel_size=3, stride=1, padding=1)
            )
            bbox_subnet.append(nn.ReLU())
        # cls_subnet.append(DeQuantStub())

        self.quant = QuantStub()  # added line
        self.cls_subnet = nn.Sequential(*cls_subnet)
        # self.cls_dequant = DeQuantStub()  # added line
        self.bbox_subnet = nn.Sequential(*bbox_subnet)
        self.cls_score = nn.Conv2d(
            in_channels, num_anchors * num_classes, kernel_size=3, stride=1, padding=1
        )
        self.bbox_pred = nn.Conv2d(in_channels, num_anchors * 4, kernel_size=3, stride=1, padding=1)
        self.dequant = DeQuantStub()  # added line

        # Initialization
        for modules in [self.cls_subnet, self.bbox_subnet, self.cls_score, self.bbox_pred]:
            for layer in modules.modules():
                if isinstance(layer, nn.Conv2d):
                    torch.nn.init.normal_(layer.weight, mean=0, std=0.01)
                    torch.nn.init.constant_(layer.bias, 0)

        # Use prior in model initialization to improve stability
        bias_value = -(math.log((1 - prior_prob) / prior_prob))
        torch.nn.init.constant_(self.cls_score.bias, bias_value)

    def forward(self, features):
        """
        Arguments:
            features (list[Tensor]): FPN feature map tensors in high to low resolution.
                Each tensor in the list corresponds to a different feature level.
        Returns:
            logits (list[Tensor]): #lvl tensors, each has shape (N, AxK, Hi, Wi).
                The tensor predicts the classification probability
                at each spatial position for each of the A anchors and K object
                classes.
            bbox_reg (list[Tensor]): #lvl tensors, each has shape (N, Ax4, Hi, Wi).
                The tensor predicts 4-vector (dx,dy,dw,dh) box
                regression values for every anchor. These values are the
                relative offset between the anchor and the ground truth box.
        """
        logits = []
        bbox_reg = []
        for feature in features:
            # added lines: wrap each subnet in self.quant() / self.dequant()
            logits.append(self.dequant(self.cls_score(self.cls_subnet(self.quant(feature)))))
            bbox_reg.append(self.dequant(self.bbox_pred(self.bbox_subnet(self.quant(feature)))))
        return logits, bbox_reg
3. Fuse modules and configuration
train_net.py:
trainer.model.head.train()
trainer.model.head.qconfig = torch.quantization.get_default_qconfig('fbgemm')
modules_to_fuse = [
    ['cls_subnet.0', 'cls_subnet.1'], ['cls_subnet.2', 'cls_subnet.3'],
    ['cls_subnet.4', 'cls_subnet.5'], ['cls_subnet.6', 'cls_subnet.7'],
    ['bbox_subnet.0', 'bbox_subnet.1'], ['bbox_subnet.2', 'bbox_subnet.3'],
    ['bbox_subnet.4', 'bbox_subnet.5'], ['bbox_subnet.6', 'bbox_subnet.7'],
]
torch.quantization.fuse_modules(trainer.model.head, modules_to_fuse, inplace=True)
torch.quantization.prepare_qat(trainer.model.head, inplace=True)
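# Optional sanity check (not part of my original script): after prepare_qat, the fused
# Conv+ReLU blocks should appear as QAT modules carrying a weight_fake_quant observer.
for name, module in trainer.model.head.named_modules():
    if hasattr(module, 'weight_fake_quant'):
        print(name, type(module).__name__)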
do_train(cfg, trainer)
trainer.model.head.eval()
print("Convert->")
torch.quantization.convert(trainer.model.head, inplace=True)
The training process completes successfully, but the last line, torch.quantization.convert, gives me an error:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
although I tried everything, even:
cuda = torch.device('cuda:0')
trainer.model.to(cuda)
I also checked all the tensors, and they are all on cuda (please see the Q_RetinaNetHead file after training).
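Roughly, the check looked like this (a sketch, not the exact script I ran):

for name, p in trainer.model.head.named_parameters():
    print(name, p.device)  # expected: cuda:0 for every parameter
for name, b in trainer.model.head.named_buffers():
    print(name, b.device)  # expected: cuda:0 for every buffer (observer statistics etc.)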
The entire RetinaNet architecture before training can be seen in the Q_RetinaNet file.
My questions are:
- How can I get rid of this error?
- Am I right that the QAT process was done successfully and that convert is essentially just an export of the already trained model? (The workflow as I understand it is sketched below.)
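For reference, this is the overall QAT flow as I understand it from the PyTorch tutorial (a sketch with placeholder names, not my actual code):

model_fp32.eval()                                  # fusion is done in eval mode
model_fused = torch.quantization.fuse_modules(model_fp32, [['conv', 'relu']])
model_fused.train()                                # QAT itself runs in train mode
model_fused.qconfig = torch.quantization.get_default_qat_qconfig('fbgemm')
model_prepared = torch.quantization.prepare_qat(model_fused)
# ... training loop: fake-quant observers collect scales / zero-points ...
model_prepared.eval()
model_int8 = torch.quantization.convert(model_prepared)  # swap in the real int8 modules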
Best regards,
yayapa
The files Q_RetinaNet and Q_RetinaNetHead can be found here as PDFs.