I have a fundamental question here. My objective function is F[G(b)], where F is a function of a matrix G, with F[G] = trace(inv(G)), and G(b) is a function that takes a vector b as input. I am using the autograd function to compute the numerical gradient of F with respect to b, i.e. \Delta F / \Delta b.

In my configuration, the vector b is double-typed, and the objective function F returns a complex number, but the imaginary part is so small that it can be ignored (it is usually around 1e-20, which I think is round-off error). Under this setting, the numerical gradient of the vector b comes out complex. I would like to know whether this is a possible situation, or whether it is caused by some setting of the autograd function.

Kevin

Hi Kevin!

Could you post a short, self-contained, runnable script that illustrates your issue?

Where in the process does the real b get turned into a complex F?

You mention that F is essentially real (very small imaginary part).
Mathematically speaking, should F be purely real?

Best.

K. Frank

The script I’m running is a MATLAB script that calls a Python function, so it is a bit complicated to post in full. But the Python function I’m running is here:

```python
import numpy as np
import torch

def F_Grad(Ch, CH, CW, F, b):
    Ch_Ten = torch.from_numpy(Ch).to(torch.complex128)
    # b must be a leaf that requires grad, or backward() has nothing to populate
    b_Ten = torch.from_numpy(b).to(torch.complex128).requires_grad_(True)
    B_Ten = torch.diagflat(b_Ten)
    F_Ten = torch.from_numpy(F)
    F_H_Ten = torch.transpose(torch.conj(F_Ten), 0, 1)
    A_Ten = torch.matmul(B_Ten, F_Ten)
    A_H_Ten = torch.transpose(torch.conj(A_Ten), 0, 1)
    CH_Ten = torch.from_numpy(CH).to(torch.complex128)
    CW_Ten = torch.from_numpy(CW).to(torch.complex128)
    Ch_Tilde = torch.diagflat(torch.diagonal(torch.matmul(A_H_Ten, torch.matmul(CH_Ten, A_Ten)), 0))
    G_Temp1 = torch.matmul(A_Ten, torch.matmul(Ch_Tilde, A_H_Ten))
    # original referenced an undefined G_Temp2; assuming G_Temp1 was intended
    G = torch.matmul(Ch_Tilde, torch.matmul(A_H_Ten, G_Temp1))
    GA = torch.matmul(G, A_Ten)

    F_OBJ1_Temp1 = torch.sub(torch.matmul(F_Ten, GA), F_Ten, alpha=1)
    F_OBJ1_H_Temp1 = torch.transpose(torch.conj(F_OBJ1_Temp1), 0, 1)
    F_OBJ1_Final = torch.trace(torch.matmul(F_OBJ1_Temp1, torch.matmul(Ch_Ten, F_OBJ1_H_Temp1)))

    F_OBJ2_Temp1 = torch.matmul(F_Ten, G)
    F_OBJ2_H_Temp1 = torch.transpose(torch.conj(F_OBJ2_Temp1), 0, 1)
    F_OBJ2_Final = torch.trace(torch.matmul(F_OBJ2_Temp1, torch.matmul(CW_Ten, F_OBJ2_H_Temp1)))
    F_OBJ_tensor = F_OBJ1_Final + F_OBJ2_Final
    F_OBJ_tensor.backward()
    F_OBJ = F_OBJ_tensor.detach().numpy()
    return F_OBJ, b_Ten.grad.numpy()
```

There are five input variables: Ch is a diagonal real matrix, CH is a symmetric complex matrix, CW is a diagonal real matrix, F is a DFT matrix, and b is a real vector.

Mathematically, the objective function should be purely real, because its physical meaning is a mean-square error.
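Here is a stripped-down sketch (my own construction, not my actual setup) showing the same symptom: a real vector cast to complex128 becomes a complex leaf, and its gradient comes out complex even when the loss is mathematically real.

```python
import torch

torch.manual_seed(0)
n = 4
b = torch.randn(n, dtype=torch.float64)
# casting the real b to complex makes the autograd leaf a complex tensor
b_c = b.to(torch.complex128).requires_grad_(True)

A = torch.randn(n, n, dtype=torch.complex128)
C = A + A.conj().transpose(0, 1)  # Hermitian, so b_c^H C b_c is mathematically real

# take the real part so backward() can create the implicit scalar gradient
loss = (b_c.conj() @ C @ b_c).real
loss.backward()

print(b_c.grad.dtype)             # the gradient is complex128
print(b_c.grad.imag.abs().max())  # and its imaginary part is nonzero in general
```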

Hi Kevin!

Okay, that was opaque …

Does “DFT” mean “discrete Fourier transform?” Is F also complex?

If the objective function is mathematically purely real, I would try to
reorganize the calculation so that all the intermediate results are purely
real, as well. Doing so would prevent “mathematically zero” imaginary parts
from creeping in through round-off error.

To do so, you might have to express complex objects such as CH
explicitly in terms of their real and imaginary parts (e.g., CH_Real
and CH_Imag) so that everything would be `torch.float64` and
nothing would be `torch.complex128`.
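As a minimal sketch of what I mean (the names here are my own, not from your script), a complex matrix product M = A @ B can be carried out entirely in `torch.float64` by tracking real and imaginary parts separately:

```python
import torch

torch.manual_seed(0)
n = 3
A = torch.randn(n, n, dtype=torch.complex128)
B = torch.randn(n, n, dtype=torch.complex128)

# explicit real/imaginary parts, all torch.float64
A_Real, A_Imag = A.real.clone(), A.imag.clone()
B_Real, B_Imag = B.real.clone(), B.imag.clone()

# (A_Real + i*A_Imag) @ (B_Real + i*B_Imag), multiplied out by hand
M_Real = A_Real @ B_Real - A_Imag @ B_Imag
M_Imag = A_Real @ B_Imag + A_Imag @ B_Real
```

If the final objective is mathematically real, its expression in these split variables should reduce to `_Real` terms only, so no spurious imaginary part can creep in.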

Best.

K. Frank

Yes, F is a discrete Fourier transform matrix. Thanks for your suggestion, by the way. I would also like to know, at the conceptual level: is it true that as long as the objective function’s result is strictly real, the gradient of the function with respect to a real vector should be real as well?

Best

Kevin