Hi Lucas!
A bit of an update: So far, I have only been able to reproduce your issue
using your fishy tensor on my GPU. I’ve tried various schemes to reproduce
it using random data (and lots of trials) without success.
I would say that this is legitimately a bug (but maybe a minor edge case).
I think filing a GitHub issue would be appropriate. If you do, be sure
to attach your fishy tensor so that it can be reproduced.
Based on my (failed) experiments, I am less convinced by my original
theory of unlikely, but straightforward round-off error (although that
theory could still be true).
If it is round-off error, I would say that there is a (minor) bug in
simplex.check(). Perhaps simplex.check() should widen its tolerance
based on the length of the rows of the probs tensor.
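As a rough sketch of what widening the tolerance could look like: the function name simplex_check_scaled, the base_tol default, and the row-length-times-machine-epsilon scaling are all my own assumptions for illustration, not anything that exists in pytorch.

```python
import torch


def simplex_check_scaled(probs: torch.Tensor, base_tol: float = 1e-6) -> torch.Tensor:
    """Hypothetical simplex check whose tolerance grows with the row length.

    Returns one boolean per row (the last dimension is treated as the row).
    """
    n = probs.shape[-1]
    # Machine epsilon for the tensor's floating-point dtype.
    eps = torch.finfo(probs.dtype).eps
    # Assumption: let the allowed deviation grow linearly with the number
    # of terms summed, but never drop below the fixed base tolerance.
    tol = max(base_tol, n * eps)
    nonneg = torch.all(probs >= 0, dim=-1)
    sums_to_one = (probs.sum(dim=-1) - 1).abs() < tol
    return nonneg & sums_to_one


if __name__ == "__main__":
    probs = torch.softmax(torch.randn(4, 1000), dim=-1)
    print(simplex_check_scaled(probs))
```

Linear scaling is a conservative, worst-case style choice. Something like a square-root scaling might be closer to typical round-off behavior, but I haven't tested either against your fishy tensor.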
But if straightforward round-off error doesn’t explain the issue, this would
hint at a possible CUDA bug – not that I have any idea what it could be.
Is there any chance that you could capture another distinct fishy tensor?
If it’s not just round-off error, having more examples that reproduce it would
likely be a big help in tracking down what is going on.
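If it helps, here is a minimal sketch of one way to capture such a tensor when it shows up. The helper name capture_if_fishy and the file path are hypothetical, and I am assuming you can call it on your probs tensor right before it is used.

```python
import torch
from torch.distributions import constraints


def capture_if_fishy(probs: torch.Tensor, path: str) -> bool:
    """Save probs to path if any row fails simplex.check().

    Returns True if the tensor was saved, False otherwise.
    """
    ok = constraints.simplex.check(probs)
    if bool(ok.all()):
        return False
    # Copying to the cpu keeps the stored values unchanged; the tensor
    # can be moved back to the gpu after loading to try to reproduce.
    torch.save(probs.detach().cpu(), path)
    return True


if __name__ == "__main__":
    # Deliberately invalid row (sums to 1.1) just to exercise the helper.
    bad = torch.tensor([[0.5, 0.6]])
    print(capture_if_fishy(bad, "fishy_probs.pt"))
```

The saved file can be read back with torch.load() and attached to the issue.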
@ptrblck: This seems to me to be a weird one – would you want to take
a look?
Best.
K. Frank