I don’t have an answer to your specific question, but let me offer some
speculation and context.
First, you should view this as “undefined behavior,” that is, anything could
happen. It is the user’s responsibility not to use duplicated indices in
cases such as this one (e.g., indexed assignment) where the duplicated
indices would have to be “resolved” somehow.
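To make this concrete, here is a minimal sketch of both the ill-defined duplicated-index write and one well-defined way to “resolve” the duplicates, namely index_put_() with accumulate = True, which sums the duplicated contributions. (The specific values shown in the comments are illustrative, not guaranteed.)

```python
import torch

t = torch.zeros(3)
idx = torch.tensor([1, 1])           # duplicated index
src = torch.tensor([10.0, 20.0])

t[idx] = src                         # which src value "wins" at t[1] is not guaranteed
print(t)                             # e.g., tensor([ 0., 20.,  0.]) -- don't rely on it

# a well-defined resolution: accumulate (sum) the duplicated contributions
t2 = torch.zeros(3)
t2.index_put_((idx,), src, accumulate=True)
print(t2)                            # tensor([ 0., 30.,  0.])
```

(Even then, I believe accumulate = True on the gpu uses atomic adds, so the order of the floating-point summation, and hence the last bits of the result, can still vary from run to run.)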
Note that pytorch apparently does not provide documentation for advanced
indexing, as lamented in this github issue.
The best I could find was this warning in the documentation for the related
case of scatter_():
> When indices are not unique, the behavior is non-deterministic (one of the values from src will be picked arbitrarily) and the gradient will be incorrect (it will be propagated to all locations in the source that correspond to the same index)!
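Here is a minimal sketch of that scatter_() warning in action. (I use the out-of-place scatter() so that autograd can run; which of the two source values lands in out[0] is arbitrary, but the gradient goes to both.)

```python
import torch

src = torch.tensor([1.0, 2.0], requires_grad=True)
index = torch.tensor([0, 0])                  # non-unique indices
out = torch.zeros(3).scatter(0, index, src)   # out[0] gets one of src's values, arbitrarily
out.sum().backward()

print(out)       # e.g., tensor([2., 0., 0.]) -- but tensor([1., 0., 0.]) is also possible
print(src.grad)  # tensor([1., 1.]) -- the gradient is propagated to *both* source locations
```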
This is consistent with my experience, where you do get a valid value, but
without any guarantee as to which one. However, if the indexing (or, for that
matter, scatter_()) algorithm makes use of parallelism (multiple cpu
or gpu pipelines), then I could imagine full-bore undefined behavior where,
for example, there’s a race condition in writing to the target location and
you end up with a garbage value where some of the bytes (or words) of
the target value come from one source location and some from another.
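If you want to poke at this yourself, a crude probe is to repeat the very same duplicated-index write and count the distinct outcomes. A sketch (the device selection and sizes here are arbitrary choices of mine, and a stable result proves nothing, since none of this is guaranteed):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
idx = torch.randint(0, 10, (100_000,), device=device)   # lots of duplicated indices
src = torch.arange(100_000, dtype=torch.float32, device=device)

results = set()
for _ in range(10):
    t = torch.zeros(10, device=device)
    t[idx] = src                    # identical write each time; only kernel scheduling varies
    results.add(tuple(t.tolist()))

print(f"distinct results over 10 runs: {len(results)}")
```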
Whether this is “merely” non-deterministic (as stated in the
documentation) or fully undefined, you should view doing this as user
error. Even if you “always get
a=b,” you might no longer get that
same result if you change the sizes of your tensors or move from the cpu
to the gpu or move to a different model of gpu or upgrade to a new version
of pytorch.
I can’t point you to the code where this actually happens, but even if I could,
it wouldn’t matter, because pytorch is free to change that code as long as
the new version still works for the unique-indices case, even if it gives you
a different result when the indices are not unique.