Error when executing resnet-training

./bin/resnet-training -backend=CPU -resnet50=resnet50 file.txt

Contents of file.txt:
tests/images/imagenet/cat_285.png,285
tests/images/imagenet/dog_207.png,207
tests/images/imagenet/zebra_340.png,340

Here is the error:

Loading resnet50 model.
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0316 18:32:00.168864 11616 Graph.cpp:5880] Layout requirements checking is enabled
Preparing for training.
Differentiating graph.
Compiling backend.
I0316 18:32:00.199743 11616 Partitioner.cpp:482] The model is too small for applying partition.
Model size : 103058848
Backend Name : CPU
Device memory: 2000000000
I0316 18:32:00.702248 11616 Partitioner.cpp:88] The number of partitions is : 2
I0316 18:32:00.702265 11616 PartitionerUtils.cpp:549] Partition 0:
Name : resnet50
BackendKind : CPU
context count : 1
total Memory : 103062848
input size: 103042720
input count : 268
input only from peers count : 267
output size: 4000
constant size: 16128
I0316 18:32:00.702288 11616 PartitionerUtils.cpp:570] LogicalDeviceIDs : 0
I0316 18:32:00.702291 11616 PartitionerUtils.cpp:549] Partition 1:
Name : resnet50_grad
BackendKind : CPU
context count : 1
total Memory : 409959712
input size: 205270848
input count : 429
input only from peers count : 0
output size: 204672736
constant size: 16128
I0316 18:32:00.702301 11616 PartitionerUtils.cpp:570] LogicalDeviceIDs : 0
Training - epoch #0 from total 32
Segmentation fault (core dumped)

When I tried the OpenCL backend:

Loading resnet50 model.
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0316 21:45:21.926693 31914 Graph.cpp:5880] Layout requirements checking is enabled
Preparing for training.
Differentiating graph.
Compiling backend.
I0316 21:45:21.972465 31914 Partitioner.cpp:482] The model is too small for applying partition.
Model size : 103058848
Backend Name : OpenCL
Device memory: 16943087616
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: AvgPoolGrad
name : gpu_0_pool5__1_grad
Input : float<1 x 7 x 7 x 2048>
OriginalOutputForResult : float<1 x 1 x 1 x 2048>
GradOfOriginalOutputNamedResult : float<1 x 1 x 1 x 2048>
Kernels : [7, 7]
Strides : [1, 1]
Pads : [0, 0, 0, 0]
Layout : 0
CountIncludePads : 1
users : 1
GradOfInputNamedInput : float<1 x 7 x 7 x 2048>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res5_2_branch2c__6
Input : bool<1 x 7 x 7 x 2048>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 2048 x 7 x 7>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res5_2_branch2b__5
Input : bool<1 x 7 x 7 x 512>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 512 x 7 x 7>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res5_2_branch2a__5
Input : bool<1 x 7 x 7 x 512>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 512 x 7 x 7>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res5_1_branch2c__6
Input : bool<1 x 7 x 7 x 2048>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 2048 x 7 x 7>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res5_1_branch2b__5
Input : bool<1 x 7 x 7 x 512>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 512 x 7 x 7>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res5_1_branch2a__5
Input : bool<1 x 7 x 7 x 512>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 512 x 7 x 7>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res5_0_branch2c__6
Input : bool<1 x 7 x 7 x 2048>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 2048 x 7 x 7>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res5_0_branch2b__5
Input : bool<1 x 7 x 7 x 512>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 512 x 7 x 7>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res5_0_branch2a__5
Input : bool<1 x 14 x 14 x 512>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 512 x 14 x 14>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res4_5_branch2c__6
Input : bool<1 x 14 x 14 x 1024>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 1024 x 14 x 14>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res4_5_branch2b__5
Input : bool<1 x 14 x 14 x 256>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 256 x 14 x 14>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res4_5_branch2a__5
Input : bool<1 x 14 x 14 x 256>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 256 x 14 x 14>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res4_4_branch2c__6
Input : bool<1 x 14 x 14 x 1024>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 1024 x 14 x 14>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res4_4_branch2b__5
Input : bool<1 x 14 x 14 x 256>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 256 x 14 x 14>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res4_4_branch2a__5
Input : bool<1 x 14 x 14 x 256>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 256 x 14 x 14>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res4_3_branch2c__6
Input : bool<1 x 14 x 14 x 1024>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 1024 x 14 x 14>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res4_3_branch2b__5
Input : bool<1 x 14 x 14 x 256>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 256 x 14 x 14>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res4_3_branch2a__5
Input : bool<1 x 14 x 14 x 256>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 256 x 14 x 14>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res4_2_branch2c__6
Input : bool<1 x 14 x 14 x 1024>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 1024 x 14 x 14>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res4_2_branch2b__5
Input : bool<1 x 14 x 14 x 256>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 256 x 14 x 14>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res4_2_branch2a__5
Input : bool<1 x 14 x 14 x 256>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 256 x 14 x 14>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res4_1_branch2c__6
Input : bool<1 x 14 x 14 x 1024>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 1024 x 14 x 14>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res4_1_branch2b__5
Input : bool<1 x 14 x 14 x 256>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 256 x 14 x 14>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res4_1_branch2a__5
Input : bool<1 x 14 x 14 x 256>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 256 x 14 x 14>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res4_0_branch2c__6
Input : bool<1 x 14 x 14 x 1024>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 1024 x 14 x 14>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res4_0_branch2b__5
Input : bool<1 x 14 x 14 x 256>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 256 x 14 x 14>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res4_0_branch2a__5
Input : bool<1 x 28 x 28 x 256>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 256 x 28 x 28>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res3_3_branch2c__6
Input : bool<1 x 28 x 28 x 512>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 512 x 28 x 28>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res3_3_branch2b__5
Input : bool<1 x 28 x 28 x 128>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 128 x 28 x 28>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res3_3_branch2a__5
Input : bool<1 x 28 x 28 x 128>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 128 x 28 x 28>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res3_2_branch2c__6
Input : bool<1 x 28 x 28 x 512>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 512 x 28 x 28>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res3_2_branch2b__5
Input : bool<1 x 28 x 28 x 128>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 128 x 28 x 28>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res3_2_branch2a__5
Input : bool<1 x 28 x 28 x 128>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 128 x 28 x 28>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res3_1_branch2c__6
Input : bool<1 x 28 x 28 x 512>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 512 x 28 x 28>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res3_1_branch2b__5
Input : bool<1 x 28 x 28 x 128>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 128 x 28 x 28>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res3_1_branch2a__5
Input : bool<1 x 28 x 28 x 128>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 128 x 28 x 28>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res3_0_branch2c__6
Input : bool<1 x 28 x 28 x 512>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 512 x 28 x 28>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res3_0_branch2b__5
Input : bool<1 x 28 x 28 x 128>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 128 x 28 x 28>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res3_0_branch2a__5
Input : bool<1 x 56 x 56 x 128>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 128 x 56 x 56>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res2_2_branch2c__6
Input : bool<1 x 56 x 56 x 256>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 256 x 56 x 56>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res2_2_branch2b__5
Input : bool<1 x 56 x 56 x 64>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 64 x 56 x 56>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res2_2_branch2a__5
Input : bool<1 x 56 x 56 x 64>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 64 x 56 x 56>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res2_1_branch2c__6
Input : bool<1 x 56 x 56 x 256>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 256 x 56 x 56>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res2_1_branch2b__5
Input : bool<1 x 56 x 56 x 64>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 64 x 56 x 56>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res2_1_branch2a__5
Input : bool<1 x 56 x 56 x 64>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 64 x 56 x 56>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res2_0_branch2c__6
Input : bool<1 x 56 x 56 x 256>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 256 x 56 x 56>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res2_0_branch2b__5
Input : bool<1 x 56 x 56 x 64>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 64 x 56 x 56>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_res2_0_branch2a__5
Input : bool<1 x 56 x 56 x 64>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 64 x 56 x 56>
Unsupported node found while compiling Function resnet50_grad for backend OpenCL: Transpose
name : gpu_0_conv1__5
Input : bool<1 x 112 x 112 x 64>
Shuffle : [0, 3, 1, 2]
Layout : NCHW
users : 1
Result : bool<1 x 64 x 112 x 112>
F0316 21:45:22.983700 31914 Error.cpp:121] exitOnError(Error) got an unexpected ErrorValue:
Error code: COMPILE_UNSUPPORTED_NODE_AFTER_OPTIMIZE
Error message: Unsupported node(s) found after optimizing Function resnet50_grad for backend OpenCL

So, a few things:

  • A segfault on the CPU backend is clearly a bug. I haven’t tried reproducing this yet, but have you tried running a debug build to see whether an assert is hit, or do you have a stack trace?
  • On OpenCL, the log shows that AvgPoolGrad isn’t supported by the OpenCL backend, so that would need to be added before this backend can be used here.
  • Also on OpenCL, the log shows a lot of Transpose nodes on bool tensors, which are reported as unsupported as well. That support would also need to be added (it should be very simple, since there is already ElemKind::Int8QTy support).
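
To illustrate why the bool case should be a small change, here is a minimal host-side sketch of the Transpose from the log (Shuffle [0, 3, 1, 2], i.e. NHWC to NCHW). It is plain C++, not Glow's OpenCL kernel or its actual implementation: `transpose4D` and `Bool8` are hypothetical names, and storing a bool as one byte per element is an assumption. The point is only that a transpose moves elements without interpreting them, so the same loop serves int8 and bool data.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption for this sketch: a bool tensor element occupies one byte.
using Bool8 = std::uint8_t;

// Hypothetical helper (not a Glow API): permute the axes of a dense,
// row-major 4-D tensor. shuffle[i] names the source axis that becomes
// destination axis i, matching the "Shuffle" field printed in the log.
template <typename T>
std::vector<T> transpose4D(const std::vector<T> &src,
                           const std::array<std::size_t, 4> &srcDims,
                           const std::array<std::size_t, 4> &shuffle,
                           std::array<std::size_t, 4> &dstDims) {
  assert(src.size() == srcDims[0] * srcDims[1] * srcDims[2] * srcDims[3]);

  // Destination shape is the source shape with its axes permuted.
  for (std::size_t i = 0; i < 4; ++i)
    dstDims[i] = srcDims[shuffle[i]];

  // Row-major strides of the source tensor.
  std::array<std::size_t, 4> srcStride{};
  srcStride[3] = 1;
  for (std::size_t i = 3; i > 0; --i)
    srcStride[i - 1] = srcStride[i] * srcDims[i];

  // Walk the destination in row-major order and gather from the source.
  // Elements are only copied, never interpreted, so T can be int8 or bool.
  std::vector<T> dst(src.size());
  std::size_t out = 0;
  for (std::size_t d0 = 0; d0 < dstDims[0]; ++d0)
    for (std::size_t d1 = 0; d1 < dstDims[1]; ++d1)
      for (std::size_t d2 = 0; d2 < dstDims[2]; ++d2)
        for (std::size_t d3 = 0; d3 < dstDims[3]; ++d3)
          dst[out++] = src[d0 * srcStride[shuffle[0]] +
                           d1 * srcStride[shuffle[1]] +
                           d2 * srcStride[shuffle[2]] +
                           d3 * srcStride[shuffle[3]]];
  return dst;
}
```

Calling `transpose4D<Bool8>` with source dimensions {1, 7, 7, 2048} and shuffle {0, 3, 1, 2} would produce the {1, 2048, 7, 7} shape reported for gpu_0_res5_2_branch2c__6; instantiating the same template with `std::int8_t` needs no code change.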

In general, Glow has been more focused on inference than training lately, so I imagine there isn’t nearly as much coverage on the training side of things. Apologies in advance!