When and why exactly is NNPACK initialized?

Hi, I am running PyTorch’s nightly build inside a Docker container running Linux on an M1 Mac (macOS 12.4). The following code:

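from torchvision.io import read_image
from torchvision.models import resnet18, ResNet18_Weights
from torchvision.models.detection import fasterrcnn_resnet50_fpn, FasterRCNN_ResNet50_FPN_Weights

# Classification: preprocess the image and run ResNet-18
weights = ResNet18_Weights.DEFAULT
img = read_image("./test/data/torch_test/dog.jpeg")
preprocess = weights.transforms()
batch = preprocess(img).unsqueeze(0)
model = resnet18(weights=weights)
model.eval()  # inference mode
prediction = model(batch).squeeze(0).softmax(0)

# Detection: run Faster R-CNN on the same preprocessed batch
weights = FasterRCNN_ResNet50_FPN_Weights.DEFAULT
model = fasterrcnn_resnet50_fpn(weights=weights)
model.eval()  # in training mode the detection model also expects targets
boxes = model(batch)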

Produces the warning “Could not initialize NNPACK! Reason: Unsupported hardware”. That makes sense, since I am computing convolutions on the M1, but the warning is not raised until the final line. What is different about fasterrcnn_resnet50_fpn compared with resnet18, such that torch doesn’t attempt to initialize NNPACK until the detection model is called rather than the classification model? Under the hood, both appear to use ordinary Conv2d layers.

For context, this failure to initialize seems to be causing a massive (>1000x) slowdown in inference for the detection model compared with the classification model. I know the classification model should be faster, but not THAT much faster. I’m trying to understand how the classification model gets by without initializing NNPACK while the detection model needs it.
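
To put a firmer number on the slowdown, each forward pass could be timed separately after a warm-up run, so that any one-time initialization cost is not counted. Below is a minimal sketch of such a timer. The helper name time_call and its warmup/repeats parameters are made up for illustration and are not part of PyTorch; it uses only the Python standard library.

```python
import time
from statistics import median


def time_call(fn, *args, warmup=1, repeats=5):
    """Return the median wall-clock time in seconds of fn(*args).

    Hypothetical helper for illustration; not a PyTorch API.
    """
    # Warm-up runs absorb one-time costs such as lazy backend initialization.
    for _ in range(warmup):
        fn(*args)

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        samples.append(time.perf_counter() - start)

    # The median is less sensitive to a single slow outlier than the mean.
    return median(samples)
```

With the snippet above this would be called as time_call(model, batch) once for each model, ideally inside a torch.no_grad() block. If the gap stays at roughly the same ratio after the warm-up, the slowdown comes from the per-call compute path rather than from the one-time initialization attempt.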