How do torch.cat and torch.split affect the training result?

Hi there,

I am training a network that is a modified version of Faster-RCNN. I tried to concatenate the features extracted from the backbone and then split the concatenated feature again, as follows:

feat_flatten_cat = torch.cat(feat_flatten, 2)  # feat_flatten: list of tensors, each [BN, channel, WH]
feat_flatten = torch.split(feat_flatten_cat, [feat.shape[-1] for feat in feat_flatten], dim=2)

I found that the training results are totally different from the case where I comment out these two lines: with them enabled, training struggles to converge and the accuracy is close to 0. Can anyone explain this?

Appreciate any reply.

By the way, in both cases the contents of ‘feat_flatten’ are equal; I checked this with tensor.equal().
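
For reference, here is a minimal sketch of that round-trip check. The shapes and the name feat_split are made up for illustration and are not taken from my actual network. It shows that the values survive the cat/split round trip, and also prints a few properties that differ from the original list: torch.split returns a tuple, and each piece is a non-contiguous view into the concatenated tensor. I am not claiming any of these differences is the cause; it only shows what tensor.equal() does and does not compare.

```python
import torch

# Hypothetical stand-ins for the backbone features: three levels with
# different flattened spatial sizes (WH), same batch and channel dims.
torch.manual_seed(0)
feat_flatten = [torch.randn(2, 4, wh, requires_grad=True) for wh in (16, 8, 4)]

# Concatenate along the last dimension, then split back by the original sizes.
feat_flatten_cat = torch.cat(feat_flatten, 2)
feat_split = torch.split(
    feat_flatten_cat, [feat.shape[-1] for feat in feat_flatten], dim=2
)

# Values match the originals element for element.
values_equal = all(torch.equal(a, b) for a, b in zip(feat_flatten, feat_split))

# The container and tensors are not the same kind of objects, though:
# torch.split returns a tuple, and each piece is a view into feat_flatten_cat.
is_tuple = isinstance(feat_split, tuple)
shares_storage = feat_split[0].data_ptr() == feat_flatten_cat.data_ptr()
is_contiguous = [f.is_contiguous() for f in feat_split]

print("values equal:", values_equal)
print("split returns tuple:", is_tuple)
print("first piece shares memory with cat result:", shares_storage)
print("pieces contiguous:", is_contiguous)
```

If later code depends on list semantics, contiguous memory, or in-place updates, these are the properties I would look at first.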