Sorry for this kind of question; maybe my linear algebra is just too weak.
I am trying to implement a new optimizer strategy, and the author gives this formula.
Here gk is the gradient tensor and pk is a tensor with the same shape as gk. For example, if the gradient tensor has shape (c, m, n), then its transpose has shape (n, m, c).
How can I multiply the two tensors to get a scalar result? In that paper, the author also states that pk is different from 0 and that the product is smaller than 0.
I am not sure I understand this part of the paper.
Typically, you can treat the parameters as a 1D array (vectorize the conv and fully connected layer parameters), and the gradients will correspondingly also be a 1D array.
Then you can compute
p^T.g as an ordinary inner product.
Would this work?
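A minimal sketch of that suggestion in NumPy (shapes and the choice p_k = -g_k are illustrative, not from the paper):

```python
import numpy as np

# Illustrative gradient tensor of shape (c, m, n) and a direction
# tensor of the same shape. p_k = -g_k is just an example direction
# that is guaranteed to satisfy the descent condition.
g_k = np.random.randn(3, 4, 5)
p_k = -g_k

# p_k^T g_k as a scalar: vectorize both tensors and take the
# ordinary inner product.
inner = np.dot(p_k.ravel(), g_k.ravel())

# The paper's condition as stated in the question: p_k != 0 and
# the product p_k^T g_k < 0.
is_descent = np.any(p_k != 0) and inner < 0
print(inner, is_descent)
```

With p_k = -g_k the inner product is minus the squared norm of g_k, so the condition holds whenever the gradient is nonzero.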
You mean I should flatten the gradient tensor?
Or, for gk of shape (c, m, n) and pk transposed to shape (c, n, m), multiply them to get a result tensor of shape (c, n, n) (which equals zero if all its elements are zero)?
Yes, it seems that way to me.
But I am not sure about the context of this operation; you might know it better.
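For what it's worth, the two readings are consistent: the flattened inner product equals the sum of the traces of the per-channel (n, n) blocks, because trace(A^T B) is the sum of the elementwise products of A and B. A sketch in NumPy (shapes illustrative):

```python
import numpy as np

g_k = np.random.randn(3, 4, 5)   # (c, m, n)
p_k = np.random.randn(3, 4, 5)   # same shape as g_k

# Per-channel matrix product: transpose the last two axes, giving
# (c, n, m) @ (c, m, n) -> (c, n, n).
per_channel = np.matmul(np.transpose(p_k, (0, 2, 1)), g_k)

# Flatten-then-dot interpretation: a single scalar.
flat = np.dot(p_k.ravel(), g_k.ravel())

# Summing the traces of the (n, n) blocks recovers the same scalar,
# since trace(p_c^T g_c) sums the elementwise products of channel c.
trace_sum = np.trace(per_channel, axis1=1, axis2=2).sum()
print(per_channel.shape, np.allclose(flat, trace_sum))
```

So flattening loses nothing as far as the scalar p^T g is concerned; the batched product just keeps per-channel structure you would then reduce anyway.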
At first I was about to flatten it, but I was not sure whether that matches what the author means; maybe I have to dig deeper into the mathematics.
Anyway, thank you for your help and the very fast answer.
Okay, flattening is the solution; that is how they implement it in this repo.