Hi, I am new to PyTorch and to machine learning as well.
I found this tutorial on the LSTM-CRF model very useful: http://pytorch.org/tutorials/beginner/nlp/advanced_tutorial.html
However, the code does not look straightforward to adapt for GPU computing, given that the example uses a batch size of 1. Is there a good way to make the forward (and Viterbi) algorithm handle batched input?
Thanks for the notice. Yeah, I mean the code is readable and very useful to understand what’s going on.
I tried changing the forward algorithm like this:

    import torch
    from torch.autograd import Variable

    # START_TAG and STOP_TAG are tag indices defined elsewhere, as in the tutorial.
    def _forward_score(self, feats):
        def log_sum_exp(vecs):
            # Row-wise log-sum-exp, shifted by the row max for numerical stability.
            max_scores, max_ids = torch.max(vecs, 1, keepdim=True)
            return max_scores.view(-1) + torch.log(torch.sum(torch.exp(vecs - max_scores), 1))

        init_alphas = torch.Tensor(1, self.tag_size).fill_(-10000.)
        init_alphas[0][START_TAG] = 0.
        init_variables = Variable(init_alphas)

        def iter_forward(variables, feature_list):
            if feature_list is None:
                # No features left: add the transition to STOP_TAG and reduce.
                end_variables = variables + self.transitions[STOP_TAG].view(1, -1)
                return log_sum_exp(end_variables)
            head_feat = feature_list[0]
            tail_feats = feature_list[1:] if len(feature_list) > 1 else None
            head_feat_exp = head_feat.view(self.tag_size, 1).expand(self.tag_size, self.tag_size)
            variables_exp = variables.expand(self.tag_size, self.tag_size)
            next_tag_variables_exp = variables_exp + self.transitions + head_feat_exp
            new_forward_variables = log_sum_exp(next_tag_variables_exp).view(1, self.tag_size)
            return iter_forward(new_forward_variables, tail_feats)

        return iter_forward(variables=init_variables, feature_list=feats)

But I still found that this is not a good approach for batch training. Any suggestions?
I have the same question: how can a CRF be trained with minibatches in PyTorch?
I think one way to do it is to compute the forward variables at each time step once for all sequences in the batch. With a batch size of 1, we have a single sequence of length 3: w_11, w_12, w_13. With a batch size of 2, we then have
w_11, w_12, w_13
w_21, w_22, w_23
The code above assumes a batch size of 1 and already computes the scores for all tags at one time step in a single operation. I think we can add a batch dimension to that, but we still need to iterate over the time steps.
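To make the idea concrete, here is a minimal sketch of a batched forward algorithm. It is only an illustration under a few assumptions, not the tutorial's code: the function name `batched_log_partition` and the padding `mask` argument are made up for this example, `transitions[i, j]` is taken to be the score of moving from tag `j` to tag `i` (the tutorial's convention), and it relies on `torch.logsumexp` and `torch.where`, so it needs a more recent PyTorch than the `Variable`-based code above.

```python
import torch


def batched_log_partition(feats, mask, transitions, start_idx, stop_idx):
    """Log partition function for a batch of sequences (hypothetical helper).

    feats:       (batch, seq_len, num_tags) emission scores
    mask:        (batch, seq_len) bool, True for real tokens, False for padding
    transitions: (num_tags, num_tags), transitions[i, j] = score of j -> i
    Returns a tensor of shape (batch,).
    """
    batch_size, seq_len, num_tags = feats.shape

    # As in the tutorial, (almost) all of the initial mass sits on the start tag.
    alpha = feats.new_full((batch_size, num_tags), -10000.0)
    alpha[:, start_idx] = 0.0

    # The loop over time steps remains; only the batch is handled in parallel.
    for t in range(seq_len):
        prev = alpha.unsqueeze(1)            # (batch, 1, prev_tag)
        trans = transitions.unsqueeze(0)     # (1, next_tag, prev_tag)
        emit = feats[:, t].unsqueeze(2)      # (batch, next_tag, 1)
        new_alpha = torch.logsumexp(prev + trans + emit, dim=2)

        # Padded positions keep their previous alpha unchanged.
        step_mask = mask[:, t].unsqueeze(1)  # (batch, 1)
        alpha = torch.where(step_mask, new_alpha, alpha)

    # Add the transition into the stop tag and reduce over the last tag.
    return torch.logsumexp(alpha + transitions[stop_idx].unsqueeze(0), dim=1)


# Example call with random scores: two sequences, the second padded to length 3.
example_feats = torch.randn(2, 3, 5)
example_mask = torch.tensor([[True, True, True], [True, True, False]])
example_transitions = torch.randn(5, 5)
example_scores = batched_log_partition(
    example_feats, example_mask, example_transitions, start_idx=3, stop_idx=4
)
```

Sequences of different lengths are padded to a common length and the mask freezes the forward variables once a sequence has ended, so each row gets the same result it would get if processed alone. The Viterbi step could presumably be batched the same way by replacing `logsumexp` with a `max` over the previous-tag dimension and keeping the argmax indices as backpointers.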