Custom Ensemble approach

Thanks for the code.
The error was raised by passing two inputs to MyEnsemble.
After fixing this issue, you’ll run into another error:

   x1 = x1.view(x1.size(0), -1)

AttributeError: 'tuple' object has no attribute 'view'

since your ResNet implementation returns a tuple via return x1, x2, x3.

@ptrblck I changed num_features from 1024 to 2000 as you suggested above, and now I get this new error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<timed exec> in <module>

/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    539             result = self._slow_forward(*input, **kwargs)
    540         else:
--> 541             result = self.forward(*input, **kwargs)
    542         for hook in self._forward_hooks.values():
    543             hook_result = hook(self, input, result)

<ipython-input-25-89a17425554a> in forward(self, x)
     15 
     16     def forward(self, x):
---> 17         x1 = self.model(x.clone())  # clone to make sure x is not changed by inplace methods
     18         x1 = x1.view(x1.size(0), -1)
     19         x2 = self.model1(x)

<ipython-input-24-a25ae27f9f3e> in __call__(self, x)
      8         with torch.no_grad():
      9             for m in self.models:
---> 10                 res.append(m(x))
     11         res = torch.stack(res)
     12         #res = torch.cat(res)

/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    539             result = self._slow_forward(*input, **kwargs)
    540         else:
--> 541             result = self.forward(*input, **kwargs)
    542         for hook in self._forward_hooks.values():
    543             hook_result = hook(self, input, result)

<ipython-input-17-c19db774a403> in forward(self, x)
     15 
     16     def forward(self, x):
---> 17         x1 = self.model(x.clone())  # clone to make sure x is not changed by inplace methods
     18         x1 = x1.view(x1.size(0), -1)
     19         x2 = self.model1(x)

<ipython-input-16-a25ae27f9f3e> in __call__(self, x)
      9             for m in self.models:
     10                 res.append(m(x))
---> 11         res = torch.stack(res)
     12         #res = torch.cat(res)
     13         return torch.mean(res, dim=0)

TypeError: expected Tensor as element 0 in argument 0, but got tuple

I have two questions now:
Question 1: why is num_features = 2000 for resnet34? And if I add resnext50 to this stacking list, what would num_features be for the new ensemble?

Question 2: why did I get the new error shown above, and how do I solve it?

@ptrblck Got it, you are right. I got the error:

expected Tensor as element 0 in argument 0, but got tuple

as you expected. How do I do the ensemble now?

@ptrblck For the ensemble I tried this code too:

class Model:
    def __init__(self, models):
        self.models = models
    
    def __call__(self, x):
        res = []
        x = x.cuda()
        with torch.no_grad():
            for m in self.models:
                res.append(m(x))
        res = torch.stack(res)
        #res = torch.cat(res)
        return torch.mean(res, dim=0)

model = Model([model1,
               model
               ])

but I get the same error: TypeError: expected Tensor as element 0 in argument 0, but got tuple

Seeking your help.

As explained above, your ResNet implementation returns a tuple:

        x1 = self.fc1(x)
        x2 = self.fc2(x)
        x3 = self.fc3(x)
        return x1,x2,x3

so you should either unpack it in your ensemble or return a single value.
I’m not sure what your use case is, but it seems you need these three values.
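
For illustration, a minimal sketch of the unpacking approach (the model below is a hypothetical stand-in, not your actual ResNet):

    import torch
    import torch.nn as nn

    class ThreeHeadNet(nn.Module):  # hypothetical three-output model
        def __init__(self):
            super().__init__()
            self.backbone = nn.Linear(10, 16)
            self.fc1 = nn.Linear(16, 4)
            self.fc2 = nn.Linear(16, 4)
            self.fc3 = nn.Linear(16, 4)

        def forward(self, x):
            x = self.backbone(x)
            return self.fc1(x), self.fc2(x), self.fc3(x)

    model = ThreeHeadNet()
    x = torch.randn(2, 10)
    x1, x2, x3 = model(x)         # unpack the tuple instead of calling .view on it
    x1 = x1.view(x1.size(0), -1)  # works now, since x1 is a single tensor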

@ptrblck Actually, the code I am using is for an ongoing Kaggle competition, this one: https://www.kaggle.com/c/bengaliai-cv19

As it is a multiclass classification problem, I need it to return x1, x2, x3 for
grapheme_root, vowel_diacritic, and consonant_diacritic.

For test set prediction I use code like this:

row_id,target = [],[]
for fname in test_data:
    data = pd.read_parquet(f'/kaggle/input/bengaliai-cv19/{fname}')
    data = Resize(data)
    test_image = GraphemeDataset(data)
    dl = torch.utils.data.DataLoader(test_image,batch_size=128,num_workers=4,shuffle=False)
    with torch.no_grad():
        for x,y in dl:
            x = x.unsqueeze(1).float().cuda()
            p1,p2,p3 = model(x)
            p1 = p1.argmax(-1).view(-1).cpu()
            p2 = p2.argmax(-1).view(-1).cpu()
            p3 = p3.argmax(-1).view(-1).cpu()
            for idx,name in enumerate(y):
                row_id += [f'{name}_grapheme_root',f'{name}_vowel_diacritic',
                           f'{name}_consonant_diacritic']
                target += [p2[idx].item(),p1[idx].item(),p3[idx].item()]
sub_df = pd.DataFrame({'row_id': row_id, 'target': target})
sub_df.to_csv('submission.csv', index=False)
sub_df

I am not sure how to write the ensemble code for this task :frowning:

You could store each output in a separate list (one for each task), then call torch.stack and torch.mean on each “task tensor” separately, and return all three tensors again.
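
A minimal sketch of that idea (the class name is mine, and this assumes each model returns the three task outputs as a tuple; meant for inference only):

    import torch
    import torch.nn as nn

    class TaskEnsemble(nn.Module):  # hypothetical name
        def __init__(self, models):
            super().__init__()
            self.models = nn.ModuleList(models)

        @torch.no_grad()
        def forward(self, x):
            res1, res2, res3 = [], [], []
            for m in self.models:
                # each model returns (grapheme_root, vowel_diacritic, consonant_diacritic) logits
                p1, p2, p3 = m(x)
                res1.append(p1)
                res2.append(p2)
                res3.append(p3)
            # stack and average each "task tensor" separately
            p1 = torch.stack(res1).mean(dim=0)
            p2 = torch.stack(res2).mean(dim=0)
            p3 = torch.stack(res3).mean(dim=0)
            return p1, p2, p3

The test loop you posted should then work unchanged, since p1, p2, p3 = model(x) still unpacks three tensors.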

@ptrblck Can you please share the modified code with me so I can give your solution a try? Thanks.

Could you please tell me how I can modify your ensemble code to make it work for me?
That’s my vgg_19_bn model summary.

[VGG(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace)
(3): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(5): ReLU(inplace)
(6): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(7): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(8): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(9): ReLU(inplace)
(10): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(12): ReLU(inplace)
(13): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(14): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(15): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(16): ReLU(inplace)
(17): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(18): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(19): ReLU(inplace)
(20): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(21): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(22): ReLU(inplace)
(23): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(24): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(25): ReLU(inplace)
(26): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(27): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(28): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(29): ReLU(inplace)
(30): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(31): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(32): ReLU(inplace)
(33): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(34): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(35): ReLU(inplace)
(36): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(37): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(38): ReLU(inplace)
(39): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(40): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(41): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(42): ReLU(inplace)
(43): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(44): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(45): ReLU(inplace)
(46): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(47): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(48): ReLU(inplace)
(49): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(50): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(51): ReLU(inplace)
(52): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
(classifier): Sequential(
(0): Linear(in_features=25088, out_features=4096, bias=True)
(1): ReLU(inplace)
(2): Dropout(p=0.5)
(3): Linear(in_features=4096, out_features=4096, bias=True)
(4): ReLU(inplace)
(5): Dropout(p=0.5)
(6): Linear(in_features=4096, out_features=1000, bias=True)
)
), GlobalPool(
(avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
(maxpool): AdaptiveMaxPool2d(output_size=(1, 1))
(exp_pool): ExpPool()
(linear_pool): LinearPool()
(lse_pool): LogSumExpPool()
), Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)), Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)),
Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)), Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)),
Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)), Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)),
Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)), Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1)),
BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True),
BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True),
BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True),
BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True),
BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True),
BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True),
BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True),
BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True),
AttentionMap(
(channel_attention): CAModule(
(fc1): Linear(in_features=512, out_features=256, bias=True)
(fc2): Linear(in_features=256, out_features=512, bias=True)
(relu): ReLU()
(sigmoid): Sigmoid()
)
(spatial_attention): SAModule(
(conv1): Conv2d(512, 64, kernel_size=(1, 1), stride=(1, 1))
(conv2): Conv2d(512, 64, kernel_size=(1, 1), stride=(1, 1))
(conv3): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1))
)
(pyramid_attention): FPAModule(
(gap_branch): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2dNormRelu(
(conv): Sequential(
(0): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1))
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace)
)
)
)
(mid_branch): Conv2dNormRelu(
(conv): Sequential(
(0): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1))
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace)
)
)
(downsample1): Conv2dNormRelu(
(conv): Sequential(
(0): Conv2d(512, 1, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3))
(1): BatchNorm2d(1, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace)
)
)
(downsample2): Conv2dNormRelu(
(conv): Sequential(
(0): Conv2d(1, 1, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2))
(1): BatchNorm2d(1, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace)
)
)
(downsample3): Conv2dNormRelu(
(conv): Sequential(
(0): Conv2d(1, 1, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(1): BatchNorm2d(1, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace)
)
)
(scale1): Conv2dNormRelu(
(conv): Sequential(
(0): Conv2d(1, 1, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3))
(1): BatchNorm2d(1, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace)
)
)
(scale2): Conv2dNormRelu(
(conv): Sequential(
(0): Conv2d(1, 1, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(1): BatchNorm2d(1, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace)
)
)
(scale3): Conv2dNormRelu(
(conv): Sequential(
(0): Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): BatchNorm2d(1, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace)
)
)
)
)]

What errors are you seeing, and what is not working when you use my code snippet?

My issues:
1. AttributeError: 'tuple' object has no attribute 'view'
2. After solving the above issue, I got another error:
RuntimeError: size mismatch, m1: [8 x 3], m2: [3000 x 8]

class MyEnsemble(nn.Module):
    def __init__(self, model_1, model_2, model_3, nb_classes=8):
        super(MyEnsemble, self).__init__()
        self.model_1 = model_1
        self.model_2 = model_2
        self.model_3 = model_3
        # Remove last linear layer
        self.model_1.classifier = nn.Identity()
        self.model_2.classifier = nn.Identity()
        self.model_3.classifier = nn.Identity()
        self.classifier = nn.Linear(1000+1000+1000, 8)

    def forward(self, x):
        x1 = self.model_1(x.clone())  # clone to make sure x is not changed by inplace methods
        x1 = [[item] for t in x1 for item in t]
        x1 = torch.tensor(x1)
        x1 = x1.view(x1.size(0), -1)
        x2 = self.model_2(x)
        x2 = [[item] for t in x2 for item in t]
        x2 = torch.tensor(x2)
        x2 = x2.view(x2.size(0), -1)
        x3 = self.model_3(x)
        x3 = [[item] for t in x3 for item in t]
        x3 = torch.tensor(x3)
        x3 = x3.view(x3.size(0), -1)
        x = torch.cat((x1, x2, x3), dim=1)
        print(x.shape)
        x = F.relu(x)
        print(x.shape)
        x = self.classifier(x)
        return x

That’s your code after my modifications.

3. Here is my ensemble model summary:
[Classifier(
(backbone): VGG(
(features): Sequential(
(0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace)
(3): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(4): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(5): ReLU(inplace)
(6): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(7): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(8): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(9): ReLU(inplace)
(10): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(11): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(12): ReLU(inplace)
(13): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(14): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(15): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(16): ReLU(inplace)
(17): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(18): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(19): ReLU(inplace)
(20): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(21): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(22): ReLU(inplace)
(23): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(24): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(25): ReLU(inplace)
(26): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(27): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(28): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(29): ReLU(inplace)
(30): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(31): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(32): ReLU(inplace)
(33): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(34): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(35): ReLU(inplace)
(36): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(37): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(38): ReLU(inplace)
(39): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
(40): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(41): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(42): ReLU(inplace)
(43): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(44): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(45): ReLU(inplace)
(46): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(47): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(48): ReLU(inplace)
(49): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(50): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(51): ReLU(inplace)
(52): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
)
(avgpool): AdaptiveAvgPool2d(output_size=(7, 7))
(classifier): Sequential(
(0): Linear(in_features=25088, out_features=4096, bias=True)
(1): ReLU(inplace)
(2): Dropout(p=0.5)
(3): Linear(in_features=4096, out_features=4096, bias=True)
(4): ReLU(inplace)
(5): Dropout(p=0.5)
(6): Linear(in_features=4096, out_features=1000, bias=True)
)
)
(global_pool): GlobalPool(
(avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
(maxpool): AdaptiveMaxPool2d(output_size=(1, 1))
(exp_pool): ExpPool()
(linear_pool): LinearPool()
(lse_pool): LogSumExpPool()
)
(fc_0): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1))
(fc_1): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1))
(fc_2): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1))
(fc_3): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1))
(fc_4): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1))
(fc_5): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1))
(fc_6): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1))
(fc_7): Conv2d(1024, 1, kernel_size=(1, 1), stride=(1, 1))
(bn_0): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(bn_1): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(bn_2): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(bn_3): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(bn_4): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(bn_5): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(bn_6): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(bn_7): BatchNorm2d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(attention_map): AttentionMap(
(channel_attention): CAModule(
(fc1): Linear(in_features=512, out_features=256, bias=True)
(fc2): Linear(in_features=256, out_features=512, bias=True)
(relu): ReLU()
(sigmoid): Sigmoid()
)
(spatial_attention): SAModule(
(conv1): Conv2d(512, 64, kernel_size=(1, 1), stride=(1, 1))
(conv2): Conv2d(512, 64, kernel_size=(1, 1), stride=(1, 1))
(conv3): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1))
)
(pyramid_attention): FPAModule(
(gap_branch): Sequential(
(0): AdaptiveAvgPool2d(output_size=1)
(1): Conv2dNormRelu(
(conv): Sequential(
(0): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1))
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace)
)
)
)
(mid_branch): Conv2dNormRelu(
(conv): Sequential(
(0): Conv2d(512, 512, kernel_size=(1, 1), stride=(1, 1))
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace)
)
)
(downsample1): Conv2dNormRelu(
(conv): Sequential(
(0): Conv2d(512, 1, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3))
(1): BatchNorm2d(1, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace)
)
)
(downsample2): Conv2dNormRelu(
(conv): Sequential(
(0): Conv2d(1, 1, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2))
(1): BatchNorm2d(1, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace)
)
)
(scale1): Conv2dNormRelu(
(conv): Sequential(
(0): Conv2d(1, 1, kernel_size=(7, 7), stride=(1, 1), padding=(3, 3))
(1): BatchNorm2d(1, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace)
)
)
(scale2): Conv2dNormRelu(
(conv): Sequential(
(0): Conv2d(1, 1, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
(1): BatchNorm2d(1, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace)
)
)
(scale3): Conv2dNormRelu(
(conv): Sequential(
(0): Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
(1): BatchNorm2d(1, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(2): ReLU(inplace)
)
)
)
)
(classifier): Identity()
), Linear(in_features=3000, out_features=8, bias=True)]

Could you post an executable code snippet by wrapping it into three backticks ```?
This would make debugging easier.

Also, note that recreating tensors as in:

x1= torch.tensor(x1)

will break the computation graph and model_1 won’t be trained.
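
A tiny illustration of that effect:

    import torch

    a = torch.randn(2, requires_grad=True)
    b = a * 2
    c = torch.tensor(b)     # copies the data and detaches it (PyTorch also warns here)
    print(b.requires_grad)  # True
    print(c.requires_grad)  # False -> no gradient will flow back to a through c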

Is this code snippet meant to ensemble my models during training, or to ensemble them at inference?

Here is my code snippet:

class MyEnsemble(nn.Module):

    def __init__(self, model_1, model_2, model_3, nb_classes=8):
        super(MyEnsemble, self).__init__()
        self.model_1 = model_1
        self.model_2 = model_2
        self.model_3 = model_3
        # Remove last linear layer
        self.model_1.classifier = nn.Identity()
        self.model_2.classifier = nn.Identity()
        self.model_3.classifier = nn.Identity()
        self.classifier = nn.Linear(1000+1000+1000, 8)

    def forward(self, x):
        x1 = self.model_1(x.clone())  # clone to make sure x is not changed by inplace methods
        x1 = [[item] for t in x1 for item in t]
        x1 = torch.tensor(x1)
        x1 = x1.view(x1.size(0), -1)
        x2 = self.model_2(x)
        x2 = [[item] for t in x2 for item in t]
        x2 = torch.tensor(x2)
        x2 = x2.view(x2.size(0), -1)
        x3 = self.model_3(x)
        x3 = [[item] for t in x3 for item in t]
        x3 = torch.tensor(x3)
        x3 = x3.view(x3.size(0), -1)
        x = torch.cat((x1, x2, x3), dim=1)
        print(x.shape)
        x = F.relu(x)
        print(x.shape)
        x = self.classifier(x)
        return x

The posted example code snippet just concatenates the penultimate activations of multiple models and feeds them to a final classifier; it can be used during training and inference.

However, as explained before, your code recreates tensors via x1 = torch.tensor(x1), which will break the computation graph. Thus I assume you would like to use this approach during inference only?
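
For reference, a sketch of the same forward pass without recreating tensors (this assumes each submodel returns a single tensor, which is not yet the case for your tuple-returning models):

    # hypothetical drop-in for MyEnsemble.forward; keeps the computation graph intact
    # (assumes torch and torch.nn.functional as F are imported)
    def forward(self, x):
        x1 = self.model_1(x.clone()).view(x.size(0), -1)
        x2 = self.model_2(x).view(x.size(0), -1)
        x3 = self.model_3(x).view(x.size(0), -1)
        x = torch.cat((x1, x2, x3), dim=1)
        return self.classifier(F.relu(x))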

Yes, I would like to use this approach on inference.

Okay, I will work on

    x1 = torch.tensor(x1)

I tried to follow your idea on how to solve the issue of getting a tuple and how to convert it back to a tensor, but I would be happy to hear your suggestion again.

Depending on what your submodules return, you might just use torch.stack or torch.cat.
Let me know if that would work.
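
For reference, the difference between the two (with hypothetical shapes):

    import torch

    a = torch.randn(4, 8)
    b = torch.randn(4, 8)
    print(torch.stack([a, b]).shape)       # torch.Size([2, 4, 8]) - adds a new dimension
    print(torch.cat([a, b], dim=1).shape)  # torch.Size([4, 16]) - joins along an existing dimension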

That’s what x1 looks like:

([tensor([[-0.7520]], grad_fn=<SqueezeBackward1>), tensor([[-0.1090]], grad_fn=<SqueezeBackward1>),
  tensor([[-0.5347]], grad_fn=<SqueezeBackward1>), tensor([[-0.2297]], grad_fn=<SqueezeBackward1>),
  tensor([[-0.2252]], grad_fn=<SqueezeBackward1>), tensor([[-0.1535]], grad_fn=<SqueezeBackward1>),
  tensor([[-1.0169]], grad_fn=<SqueezeBackward1>), tensor([[-0.5311]], grad_fn=<SqueezeBackward1>)], [])

When trying torch.stack I got the following error:

File "/home/tfradai_gmail_com/anaconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/module.py", line
 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/dataset/DataSets/fastai-v3/app/ensemble.py", line 34, in forward
    x1= torch.stack(x1)
TypeError: expected Tensor as element 0 in argument 0, but got list

When trying x = torch.stack((x1, x2, x3), dim=1) I got the following error:

File "/home/tfradai_gmail_com/anaconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/module.py", line
 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/media/dataset/DataSets/fastai-v3/app/ensemble.py", line 42, in forward
    x= torch.stack((x1,x2,x3),dim=1)
TypeError: expected Tensor as element 0 in argument 0, but got tuple

And the same for torch.cat().

Based on the output, x1 seems to be a tuple containing a list of tensors and an empty list at the end.
I’m not sure what the empty list is used for, but this code should work:

x = ([tensor([[-0.7520]]), tensor([[-0.1090]]), tensor([[-0.5347]]),
      tensor([[-0.2297]]), tensor([[-0.2252]]), tensor([[-0.1535]]),
      tensor([[-1.0169]]), tensor([[-0.5311]])], [])

torch.stack(x[0])
> tensor([[[-0.7520]],

        [[-0.1090]],

        [[-0.5347]],

        [[-0.2297]],

        [[-0.2252]],

        [[-0.1535]],

        [[-1.0169]],

        [[-0.5311]]])

It worked!!

But I got a dimension mismatch error:

RuntimeError: size mismatch, m1: [24 x 1], m2: [3000 x 8]

Which layer is raising this error?
Based on the provided code snippet, x should be the concatenated tensor of (x1, x2, x3), which would be strange, as the error points to x.size(1) == 1.