I can’t think of any compelling reason – both should work.
One thing to note is that in the original version, the non-symmetric X matrix contains
ignored elements – those in the lower triangle – while your version contains redundant
elements (that aren’t ignored). Arguably the former could be better.
In your case, you add two elements together – one from the upper triangle and one from
the lower triangle. So if you (or the optimization step) add a large value to the upper
element and subtracts the same value from the lower element, it would have no effect.
This means that you have a direction (a “mode”) along which those elements could drift off
to ±inf, potentially causing problems. (The ignored values in the original approach
could also potentially drift off, but that’s less likely to happen and less likely to cause
problems if they do.)
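
To make that drift direction concrete, here is a minimal NumPy sketch. It assumes the original version builds the symmetric matrix from the upper triangle only, and that your version computes X + X.T; the helper names sym_from_upper and sym_from_sum are made up for illustration and are not taken from either implementation.

```python
import numpy as np

def sym_from_upper(X):
    # Assumed "original" style: keep the upper triangle (with diagonal)
    # and mirror the strict upper triangle into the lower half.
    return np.triu(X) + np.triu(X, k=1).T

def sym_from_sum(X):
    # Assumed "your" style: each off-diagonal entry is the sum of an
    # upper element and its lower partner.
    return X + X.T

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 3))

# Push one upper element up and its lower partner down by the same amount.
D = np.zeros((3, 3))
D[0, 1] = 1e6
D[1, 0] = -1e6

# Perturb only a lower-triangle element.
L = np.zeros((3, 3))
L[1, 0] = 1e6

# The sum version cannot see the opposite-sign shift.
sum_unchanged = np.allclose(sym_from_sum(X + D), sym_from_sum(X))
# The upper-triangle version does see it, through the upper element.
upper_changed = not np.allclose(sym_from_upper(X + D), sym_from_upper(X))
# The upper-triangle version ignores the lower element entirely.
lower_ignored = np.allclose(sym_from_upper(X + L), sym_from_upper(X))

print(sum_unchanged, upper_changed, lower_ignored)
```

In the sum version, the perturbation D changes X but not the resulting symmetric matrix, so nothing in the loss pushes back against it. In the upper-triangle version, the only undetectable changes are those confined to the ignored lower elements.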