I have a question about the implementation of complex layers. Maybe it's a duplicate, but I haven't found a similar one.
In a linear layer, the forward implementation is x @ w.T + b, computed with torch.addmm.
With complex data, weights, and biases, I expected it to expand to:
(x.real @ w.real.T + b.real) - (x.imag @ w.imag.T + b.imag) + 1j * (x.real @ w.imag.T + b.imag + x.imag @ w.real.T + b.real)
But with nn.Linear(input_shape, output_shape, dtype=torch.complex64) it is:
(x.real @ w.real.T + b.real) - (x.imag @ w.imag.T + 0) + 1j * (x.real @ w.imag.T + b.imag + x.imag @ w.real.T + 0)
Why is the bias treated as 0 in the two subterms?
Is this intended, and what is the purpose or benefit of it?
In most self-made complex PyTorch code I found online, the first version is used.
It is the same in other layers such as nn.Conv2d.
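Here is a minimal sketch of how the two can be compared (shapes are arbitrary; this is just an empirical check against the formula above, not the internal implementation):

```python
import torch

torch.manual_seed(0)
lin = torch.nn.Linear(4, 3, dtype=torch.complex64)
x = torch.randn(2, 4, dtype=torch.complex64)
w, b = lin.weight, lin.bias

# What nn.Linear actually returns for complex input:
out = lin(x)

# Expanded by hand, with the bias added only once per component:
manual = (x.real @ w.real.T - x.imag @ w.imag.T + b.real) \
    + 1j * (x.real @ w.imag.T + x.imag @ w.real.T + b.imag)

print(torch.allclose(out, manual, atol=1e-5))  # True
```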
This just isn’t correct. Consider the case where everything, in
particular the bias, b, is purely real. Your result, as written, will
contain an incorrect imaginary term, 1j * b.real.
The bias, b, is a purely additive term, so the real component
of the result contains just b.real and the imaginary component
of the result contains just b.imag. I don’t understand why you
would expect b.real and b.imag to both show up in both the
real and imaginary components of the result.
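A quick numerical illustration (arbitrary shapes), just to make that concrete:

```python
import torch

# The bias is purely additive: b.real lands in the real component and b.imag
# in the imaginary component, and neither shows up anywhere else.
x = torch.randn(2, 4, dtype=torch.complex64)
w = torch.randn(3, 4, dtype=torch.complex64)
b = torch.randn(3, dtype=torch.complex64)

y = x @ w.T + b
print(torch.allclose(y.real, (x @ w.T).real + b.real))  # True
print(torch.allclose(y.imag, (x @ w.T).imag + b.imag))  # True
```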
I find this hard to believe. But if it is true, then most self-made
code found online is wrong.
Thanks for your clarification. I see my error now. I must have found the only two or three GitHub repos where it's wrong, and made an error in my reasoning. Of course the bias is just added after the multiplication.
For anyone interested (and for my rehabilitation), the top Google results for complex PyTorch repos on GitHub do it roughly like this:
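(A paraphrased sketch from memory, not a quote from any particular repository; the class name is mine.)

```python
import torch
import torch.nn as nn

# Common pattern: two real Linear layers applied to the real and imaginary
# parts separately (illustrative paraphrase, names are made up).
class NaiveComplexLinear(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.fc_r = nn.Linear(in_features, out_features)
        self.fc_i = nn.Linear(in_features, out_features)

    def forward(self, x):
        # fc_r and fc_i each add their own bias, so b_r and b_i both end up
        # in the real *and* the imaginary component -- which is exactly where
        # my extra bias terms came from.
        return (self.fc_r(x.real) - self.fc_i(x.imag)) \
            + 1j * (self.fc_r(x.imag) + self.fc_i(x.real))
```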