Is there an alternative to do batched matrix multiplication on Quantized Tensors?

Currently we only support quint8 for activations and qint8 for weights, I think.
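
For illustration, here is a minimal sketch of that dtype convention using `torch.quantize_per_tensor` (the scale/zero_point values are arbitrary placeholders); calling `torch.bmm` directly on such quantized tensors is what fails, since there is no quantized bmm kernel:

```python
import torch

# Activation-style quantization: quint8 (unsigned, zero_point typically > 0)
act = torch.quantize_per_tensor(torch.randn(4, 8, 16), scale=0.05, zero_point=128, dtype=torch.quint8)

# Weight-style quantization: qint8 (signed, zero_point usually 0)
wgt = torch.quantize_per_tensor(torch.randn(4, 16, 32), scale=0.02, zero_point=0, dtype=torch.qint8)

# torch.bmm(act, wgt)  # would raise an error: no quantized kernel for aten::bmm
```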

Currently we do not have plans to support bmm. One workaround is to put a DeQuantStub and QuantStub around the bmm op so that it is skipped during quantization and runs in floating point, as sketched below.
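
A minimal sketch of that workaround for eager-mode quantization (module and tensor names are hypothetical): dequantize the inputs right before `torch.bmm`, run it in fp32, then re-quantize the result so the rest of the quantized model can consume it.

```python
import torch
import torch.nn as nn
from torch.quantization import QuantStub, DeQuantStub

class BmmBlock(nn.Module):
    """Wrap torch.bmm with DeQuantStub/QuantStub so it stays in fp32."""

    def __init__(self):
        super().__init__()
        self.dequant_a = DeQuantStub()  # quantized -> fp32 before bmm
        self.dequant_b = DeQuantStub()
        self.quant = QuantStub()        # fp32 -> quantized after bmm

    def forward(self, a, b):
        # a and b are assumed to arrive quantized from earlier quantized layers.
        a = self.dequant_a(a)
        b = self.dequant_b(b)
        out = torch.bmm(a, b)           # plain fp32 bmm, not quantized
        out = self.quant(out)           # re-enter the quantized region
        return out
```

After the usual `prepare`/`convert` flow, the stubs turn into actual quantize/dequantize ops, so only the bmm itself is left in floating point while the surrounding layers stay quantized.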