Clarification - Using backward() on non-scalars

Thanks to @jsm and @jdhao. That cleared up my confusion.

But I still think the name grad_variables is a bit misleading. Something like grad_weighting would be more intuitive.
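
To show what I mean by "weighting", here is a small sketch. It assumes a recent PyTorch, where as far as I know the argument is called gradient on Tensor.backward() (and grad_tensors on torch.autograd.backward()) rather than grad_variables. The tensor values and the name weights are made up for illustration. Passing the tensor to backward() gives the same x.grad as multiplying the output by it, summing to a scalar, and calling backward() with no argument.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x ** 2  # non-scalar output; dy_i/dx_i = 2 * x_i

# Hypothetical weights, one per element of y (must match y's shape).
weights = torch.tensor([1.0, 0.1, 0.01])

# Option 1: pass the weights to backward() on the non-scalar output.
y.backward(gradient=weights)
grad_via_argument = x.grad.clone()

# Option 2: build the weighted sum explicitly, then call backward() on the scalar.
x.grad = None  # reset the accumulated gradient before the second pass
y2 = x ** 2
(weights * y2).sum().backward()
grad_via_weighted_sum = x.grad.clone()

print(grad_via_argument)
print(grad_via_weighted_sum)
```

Both prints should show the same tensor, namely 2 * x * weights, which is why a name that suggests weighting reads more naturally to me.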