Is there some performance gain?
It’s more natural to me to have the batch size as the first dimension.
I guess that in an RNN, what gets updated at every time instance is
Data(t+1,:,:)=f(Data(t,:,:))
So their choice (time as the first dimension) does seem to be the more natural one.
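To make the layout argument concrete, here is a minimal sketch in plain Python. It assumes a dense row-major (C-order) array, and the names `step_offsets` and `is_contiguous` and the sizes are made up for illustration. It only shows where the elements of one time step sit in memory for each layout. Whether that turns into a measurable speedup depends on the library and hardware.

```python
def step_offsets(layout, t, seq_len, batch, features):
    """Flat row-major memory offsets of all elements belonging to time step t."""
    offsets = []
    for b in range(batch):
        for f in range(features):
            if layout == "time_major":      # shape (seq_len, batch, features)
                offsets.append((t * batch + b) * features + f)
            elif layout == "batch_major":   # shape (batch, seq_len, features)
                offsets.append((b * seq_len + t) * features + f)
            else:
                raise ValueError(f"unknown layout: {layout}")
    return offsets


def is_contiguous(offsets):
    """True if the offsets form one unbroken run of consecutive addresses."""
    return all(nxt - cur == 1 for cur, nxt in zip(offsets, offsets[1:]))


seq_len, batch, features = 5, 3, 4  # hypothetical sizes, for illustration only

tm = step_offsets("time_major", 2, seq_len, batch, features)
bm = step_offsets("batch_major", 2, seq_len, batch, features)

print(is_contiguous(tm))  # True: Data(t,:,:) is one solid block
print(is_contiguous(bm))  # False: the same step is scattered across the batch
```

With time first, each Data(t,:,:) slice is a single contiguous block, so the per-step update reads and writes one solid chunk. With batch first, the same slice is split into one piece per batch element.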