RuntimeError: Could not run 'aten::mm' with arguments from the 'QuantizedCPU' backend. 'aten::mm' is only available for these backends

RuntimeError: Could not run 'aten::mm' with arguments from the 'QuantizedCPU' backend. 'aten::mm' is only available for these backends: [CPU, CUDA, SparseCPU, SparseCUDA, BackendSelect, Named, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, Autocast, Batched, VmapMode].

h0 = torch.matmul(input, self.W[0])

I was trying to apply quantization to a graph convolutional neural network, and during evaluation this operation was not permitted. What options do I have to make torch.matmul a quantizable operation?
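For context, the error can be reproduced with a minimal sketch (shapes and scale/zero-point values are illustrative assumptions, not taken from the original model): passing a quantized tensor to torch.matmul dispatches to the QuantizedCPU backend, which has no aten::mm kernel.

```python
import torch

# Hypothetical weight matrix standing in for self.W[0]
W = torch.randn(3, 5)

# Quantize an illustrative input tensor to qint8
q_input = torch.quantize_per_tensor(
    torch.randn(4, 3), scale=0.1, zero_point=0, dtype=torch.qint8
)

try:
    # Fails: no quantized kernel is registered for aten::mm
    torch.matmul(q_input, W)
except RuntimeError as e:
    print(type(e).__name__)
```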

Hi @naveen_raj, we currently do not have a quantized kernel for aten::mm. There are a couple of options here:

  1. You could leave this op in fp32 (this is going to be the fastest in terms of dev time).
  2. If someone is up for helping write the quantized kernel for aten::mm, we would accept a PR.
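Option 1 can be sketched as follows: dequantize the input back to fp32 right before the matmul so that the fp32 aten::mm kernel is used. The module and parameter names here are hypothetical stand-ins for the poster's layer, not the actual model code.

```python
import torch
import torch.nn as nn

class GraphConvFP32MatMul(nn.Module):
    """Toy layer: the matmul always runs in fp32, even if the
    rest of the model produces quantized tensors."""

    def __init__(self, in_features, out_features):
        super().__init__()
        # Hypothetical stand-in for self.W[0] in the original code
        self.W = nn.Parameter(torch.randn(in_features, out_features))

    def forward(self, input):
        # aten::mm has no QuantizedCPU kernel, so drop quantized
        # inputs back to fp32 before the matmul.
        if input.is_quantized:
            input = input.dequantize()
        h0 = torch.matmul(input, self.W)
        return h0
```

The trade-off is that the activations cross the quantized/fp32 boundary at this op, which costs a dequantize (and possibly a re-quantize downstream) but avoids the missing-kernel error entirely.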