I think you have the answer to your question here: What is action.reinforce(r) doing actually?
I prefer your code to the one in the example: it is easier to see what each line does when the loss is written out explicitly than when it is hidden behind action.reinforce(reward), which I find rather unintuitive.
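
To illustrate what I mean by a loss that is written out, here is a minimal sketch of the usual REINFORCE objective in plain Python. It is not the code from the example or from your post; the helper names (`discounted_returns`, `reinforce_loss`), the discount `gamma`, and the toy numbers are all made up for illustration.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Return the discounted return G_t for every step of one episode."""
    returns = []
    running = 0.0
    # Walk the episode backwards so each step accumulates its future rewards.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def reinforce_loss(log_probs, returns):
    """Explicit REINFORCE loss: -sum_t log pi(a_t | s_t) * G_t."""
    return -sum(lp * g for lp, g in zip(log_probs, returns))


# Toy episode (hypothetical values): probabilities the policy gave to the
# actions it actually sampled, and the reward received after each action.
action_probs = [0.5, 0.25]
log_probs = [math.log(p) for p in action_probs]
returns = discounted_returns([1.0, 1.0], gamma=0.5)
loss = reinforce_loss(log_probs, returns)
```

In PyTorch I would expect the same expression to work with tensors, for instance with log-probabilities obtained from `torch.distributions.Categorical(probs).log_prob(action)`, followed by `loss.backward()`. The point is that the minus sign and the weighting of each log-probability by its return are visible in the code instead of being implied by a call on the action.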