FwFM Quantization

pintonos · August 26, 2020, 8:11am

I want to perform dynamic quantization on a FwFM model (paper). Unfortunately, I have not managed to make it work.

I get the following error:

AttributeError: 'function' object has no attribute 't'

It corresponds to this line in my forward function:

outer_fwfm = torch.einsum('klij,kl->klij', outer_fm,
                          (self.field_cov.weight.t() + self.field_cov.weight) * 0.5)

Corresponding model layer before quantization:

(field_cov): Linear(in_features=39, out_features=39, bias=False)

After quantization:

(field_cov): DynamicQuantizedLinear(in_features=39, out_features=39, dtype=torch.qint8, qscheme=torch.per_tensor_affine)

Quantization line:

quantized_model = torch.quantization.quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)

Am I missing something here? Thanks for your help!
GitHub Code

ptrblck · August 27, 2020, 8:00am

It seems that DynamicQuantizedLinear replaces the weight attribute with a method:

lin = torch.nn.quantized.dynamic.Linear(1, 1)
print(lin.weight())

So you might need to call self.field_cov.weight().t() + self.field_cov.weight().

Note that, while this might work functionality-wise, I’m not familiar enough with your use case or the dynamic quantization to claim it’s the right approach to use when quantization is applied.
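To illustrate the point about weight becoming a method, here is a minimal sketch. The TinyModel class is a hypothetical stand-in for the real FwFM model; only the field_cov layer and the quantize_dynamic call are taken from the posts above. It assumes a CPU build of PyTorch with a quantized engine available.

```python
import torch

# Hypothetical stand-in for the FwFM model: only the field_cov layer
# from the original post is reproduced here.
class TinyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.field_cov = torch.nn.Linear(39, 39, bias=False)

    def forward(self, x):
        return self.field_cov(x)

model = TinyModel()
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# On the float model, weight is a tensor (an nn.Parameter).
float_weight = model.field_cov.weight

# On the dynamically quantized layer, weight is a method that returns
# a quantized tensor, so it has to be called.
q_weight = quantized_model.field_cov.weight()

# dequantize() converts the quantized tensor back to float32.
deq_weight = q_weight.dequantize()

print(type(quantized_model.field_cov).__name__)
print(q_weight.dtype, deq_weight.dtype)
```

Accessing .t() on quantized_model.field_cov.weight without the call parentheses is what produces the AttributeError from the first post, since the attribute lookup happens on the method object rather than on a tensor.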

pintonos · August 27, 2020, 8:47am

Thanks!

Code now looks like this:

if self.dynamic_quantization or self.static_quantization or self.quantization_aware:
    q_func = QFunctional()
    q_add = q_func.add(self.field_cov.weight().t(), self.field_cov.weight())
    q_add_mul = q_func.mul_scalar(q_add, 0.5)
    outer_fwfm = torch.einsum('klij,kl->klij', outer_fm, q_add_mul)

Error Traceback:

...
return _VF.einsum(equation, operands)
RuntimeError: Could not run 'aten::mul.Tensor' with arguments from the 'QuantizedCPU' backend. 'aten::mul.Tensor' is only available for these backends: [CPU, CUDA, MkldnnCPU, SparseCPU, SparseCUDA, Named, Autograd, Profiler, Tracer, Batched].

Can torch.einsum(...) be quantized? Is there a workaround, since it consists of multiplication and addition?

ptrblck · August 28, 2020, 8:53am

Unfortunately, I’m not experienced enough using the quantization package, so we would need to wait for an expert.

Vasiliy_Kuznetsov · August 28, 2020, 4:44pm

Hi @pintonos, currently we don’t have a quantized kernel for einsum. We would be happy to review a PR if someone is interested in implementing one. In the meantime, a workaround could be to dequantize → floating point einsum → quantize.
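A rough sketch of that workaround for the dynamic quantization case might look like the following. The helper names symmetric_field_weights and fwfm_outer, and the tensor shapes, are assumptions made for illustration and are not taken from the original model code. The sketch dequantizes the layer weight, then runs the einsum in floating point.

```python
import torch

def symmetric_field_weights(field_cov):
    """Return (W^T + W) / 2 as a float tensor.

    Hypothetical helper: accepts either a float nn.Linear or its
    dynamically quantized replacement.
    """
    w = field_cov.weight
    if not isinstance(w, torch.Tensor):
        w = w()                  # quantized module: weight is a method
    if w.is_quantized:
        w = w.dequantize()       # back to float32 for the float einsum
    return (w.t() + w) * 0.5

def fwfm_outer(outer_fm, field_cov):
    """Scale each (k, l) field pair of outer_fm by the symmetrised weight."""
    return torch.einsum('klij,kl->klij', outer_fm,
                        symmetric_field_weights(field_cov))

# Assumed shapes: 39 x 39 field pairs, batch of 2, embedding size 4.
wrapper = torch.nn.Sequential(torch.nn.Linear(39, 39, bias=False))
q_wrapper = torch.quantization.quantize_dynamic(
    wrapper, {torch.nn.Linear}, dtype=torch.qint8
)
q_field_cov = q_wrapper[0]

outer_fm = torch.randn(39, 39, 2, 4)
sym = symmetric_field_weights(q_field_cov)
outer_fwfm = fwfm_outer(outer_fm, q_field_cov)
print(outer_fwfm.shape)
```

With dynamic quantization the activations stay in floating point, so no re-quantization step is needed after the einsum here. For a statically quantized model, the float result would presumably have to be quantized again, for example with torch.quantize_per_tensor and a scale and zero point chosen for that activation.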