How to Disable CUDA Fused Multiply-Add (FMA) in PyTorch

Hi, I have used PyTorch from version 0.2 to 1.3 and love it.
Recently I upgraded to CUDA 10.2 and PyTorch 1.3.
I want to disable FMA in PyTorch, but I can't find any information in the documentation.
How can I disable FMA?



Could you share why you want to disable it?


I have an issue.

The result of my from-scratch CNN (CPU-only) and the result computed with FMA are different.

So I want to turn off FMA.


In general, the floating-point computation implemented on the GPU is different from the one on the CPU (even beyond FMA). So you won't be able to get the same result bit for bit. But both return a correct value as specified by the IEEE 754 floating-point standard.
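The FMA part of this is easy to see on the CPU alone. An FMA computes x*y + z with a single rounding at the end, while a separate multiply and add rounds twice; a minimal NumPy sketch (the input value is chosen specifically to expose the difference):

```python
import numpy as np

# fma(x, y, z) rounds x*y + z once; multiply-then-add rounds twice.
x = np.float32(1.0 + 2.0**-12)

# Two roundings: x*x is rounded to float32 before the subtraction.
two_roundings = np.float32(x * x) - np.float32(1.0)

# One rounding: the float64 product of two float32 values is exact, so
# rounding the final result back to float32 emulates a single-rounding FMA.
one_rounding = np.float32(np.float64(x) * np.float64(x) - 1.0)

print(float(two_roundings))  # 2**-11
print(float(one_rounding))   # 2**-11 + 2**-24
```

Both answers are correctly computed under IEEE 754 rules; they differ only because the intermediate product is rounded in one case and kept exact in the other.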

I am also aware of the FMA rounding differences between GPU and CPU.
With CUDA compute capability < 2.0, my from-scratch CNN in Python (with NumPy) matched the PyTorch output exactly.
With compute capability >= 2.0, I tried to match the output using the fma function in gcc and an IEEE 754-2008 library, but I couldn't.
The precision of the convolution output differs between PyTorch and my C model / NumPy implementation.
For example, the maximum distance is 0.0000167 (PyTorch vs. C model and NumPy).
So I want to disable FMA, because this is a critical issue for me.
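For scale, a difference around 1e-5 is also what plain single-precision accumulation-order effects produce in a convolution-sized dot product, independent of FMA. A rough sketch (the sizes and random values here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 3x3 kernel over 64 input channels: 576 multiply-adds,
# roughly one output element of a small convolution.
x = rng.standard_normal(3 * 3 * 64).astype(np.float32)
w = rng.standard_normal(3 * 3 * 64).astype(np.float32)

acc32 = np.dot(x, w)                                        # float32 accumulation
ref64 = np.dot(x.astype(np.float64), w.astype(np.float64))  # float64 reference

print(abs(float(acc32) - ref64))  # typically on the order of 1e-6 to 1e-5
```

So a max distance of 0.0000167 on its own doesn't isolate FMA as the cause; it is within the range that any reordered float32 reduction can produce.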

I have a question.
When PyTorch computes a convolution on CUDA, can you tell me which data type is used?

  • fma((double)X, (double)Y, (double)B), i.e. X * Y + B in double precision
  • fma((float)X, (float)Y, (float)B), i.e. X * Y + B in single precision


The data type that is used is the one of the input Tensor, so it depends on the type of your input.
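In other words, the precision follows the tensor's dtype; a quick check (the layer sizes here are arbitrary):

```python
import torch

conv = torch.nn.Conv2d(3, 4, kernel_size=3)  # arbitrary small layer
x = torch.randn(1, 3, 8, 8)                  # float32 by default

print(conv(x).dtype)                         # torch.float32

# Casting both the input and the module to double runs the op in float64:
print(conv.double()(x.double()).dtype)       # torch.float64
```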

I am still unsure how you identified FMA as the only cause of your issue.
The algorithms used on the GPU (especially if you use cuDNN) are wildly different from the CPU ones, and the lack of floating-point associativity will break any hope of getting the same result, no?
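The associativity point needs no GPU at all: in float32, regrouping the very same additions changes the result, so any algorithm that sums in a different order (as GPU reductions do) gives a different answer even with FMA disabled. The values below are chosen to make the effect exact:

```python
import numpy as np

big = np.float32(1e8)  # ulp(1e8) in float32 is 8, so adding 1.0 is rounded away
one = np.float32(1.0)

left = big
for _ in range(8):
    left = left + one          # (((big + 1) + 1) + ...): every 1 is absorbed

right = big + np.float32(8.0)  # big + (1 + 1 + ... + 1): the ones survive

print(float(left), float(right))  # 100000000.0 100000008.0
```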

Hi, I have the same question. Did you find a way to disable FMA in PyTorch?