Broadcasting? Or alternative solutions

Hi,

I really like the library so far. However, I was wondering: is broadcasting on the roadmap, and what would be your current suggestion for workarounds?

For a toy example, say I want to train an OLS linear regression model and compute the net input from a 2D dataset (w_1 * x_1 + w_2 * x_2 + bias) with 6 training instances (just to have a small example for illustrative purposes). The following wouldn't work, since there's no broadcasting when the bias is added (x.mm(weights) + bias):

import torch
from torch.autograd import Variable

x = Variable(torch.Tensor([[1.0, 1.0], 
                           [2.0, 2.0], 
                           [3.0, 3.0], 
                           [4.0, 4.0], 
                           [5.0, 5.0], 
                           [6.0, 6.0]]))

weights = Variable(torch.zeros(2, 1))
bias = Variable(torch.zeros(1))

net_input = x.mm(weights) + bias

A workaround would be to add a column of 1s to the input tensor, I guess:

x = Variable(torch.Tensor([[1.0, 1.0, 1.0], 
                           [1.0, 2.0, 2.0], 
                           [1.0, 3.0, 3.0], 
                           [1.0, 4.0, 4.0], 
                           [1.0, 5.0, 5.0], 
                           [1.0, 6.0, 6.0]]))

weights = Variable(torch.zeros(3, 1))

net_input = x.mm(weights)

What would be your thoughts on that?


Adding broadcasting to most operations is definitely on our roadmap, and will hopefully be ready quite soon. Since it's asked for so often, we'll probably reprioritize it and have it implemented soonish.

For now there are two solutions: one that works already, and one that will be implemented very soon (presented in that order):

You can do broadcasting by manually adding singleton dimensions and expanding along them. This doesn't copy any memory; it only uses some stride tricks:

net_input = x.mm(weights)
net_input += bias.unsqueeze(0).expand_as(net_input)

Another way, which is only slightly more convenient, would be to add a .broadcast function that combines possibly many unsqueezes and expands into a single call. You'd just pass in the desired size.
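
For the example above, usage might look roughly like this (a hypothetical sketch; .broadcast doesn't exist yet, and the exact name and signature could change):

net_input = x.mm(weights)
net_input += bias.broadcast(net_input.size())  # hypothetical: one call instead of unsqueeze + expand_as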


I realize that it might not be a suggestion for a tutorial, where you want to show all the ops, but if you want to do this in your code right now, I'd use the linear function from torch.nn.functional:

import torch.nn.functional as F
output = F.linear(input, weights, bias) # bias is optional
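
Applied to the toy example from the first post, that might look like the sketch below. Note that F.linear expects the weight matrix as (out_features, in_features), so the (2, 1) weights from the first post become a (1, 2) tensor here:

import torch
import torch.nn.functional as F
from torch.autograd import Variable

x = Variable(torch.Tensor([[1.0, 1.0],
                           [2.0, 2.0],
                           [3.0, 3.0],
                           [4.0, 4.0],
                           [5.0, 5.0],
                           [6.0, 6.0]]))
weights = Variable(torch.zeros(1, 2))  # (out_features, in_features)
bias = Variable(torch.zeros(1))

net_input = F.linear(x, weights, bias)  # shape (6, 1); the bias is broadcast internally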

That’s a very helpful answer. Thanks a lot! :slight_smile:

expand on singleton dims is cool, but I still want to know - is .broadcast not available yet?

I decided to add that functionality to .expand, instead of adding a new function. This should work now:

x = torch.randn(1)
y = torch.randn(2, 3, 4, 5, 6)
print(y + x.expand_as(y))
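
With that change, the bias from the first post can be expanded directly, so the net input becomes (a sketch, assuming the updated .expand_as behavior above):

net_input = x.mm(weights)
net_input = net_input + bias.expand_as(net_input)  # bias of size (1,) expands to (6, 1)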

That’s very handy! You can also avoid that entirely and just use the broadcasting ops I provided directly, which are largely compatible with numpy / theano / keras / tensorflow: Tip: using keras compatible tensor dot product and broadcasting ops


@jphoward actually it seems that the align function from your code could be replaced with expand_as, right?

@apaszke, aligning two tensors for broadcasting requires more than what expand_as provides. In particular, it's not at all unusual for each tensor to have a unit axis in a different place, so each tensor needs to be expanded along different axes. expand_as does not do this, and its API doesn't make it possible, since an aligning function has to be able to change both tensors and return them both.
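
For example, a sketch of the two-way case (not taken from the linked code): suppose one tensor is (3, 1) and the other is (1, 4); to add them, both have to be expanded to (3, 4):

import torch

a = torch.randn(3, 1)  # unit axis in dim 1
b = torch.randn(1, 4)  # unit axis in dim 0

# expand_as alone can't align these, since neither tensor already has the
# target shape; both need to be expanded to (3, 4) before the elementwise op.
result = a.expand(3, 4) + b.expand(3, 4)
print(result.size())  # torch.Size([3, 4])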

Let me know if you need more info - I’m not sure I’ve done a great job of explaining!

Ah yeah, right. It actually needs to be a two-way expand :slight_smile: Thanks!