Bigger tensor results in RunImpl error in C++

So I’m passing an image tensor through a traced model in C++. I used a 1080×1080 tensor for the tracing (in case that has anything to do with it), but when I run it in C++ it throws an error as soon as I make the input bigger than roughly 280×280.
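For what it’s worth, `torch.jit.trace` records the graph for the exact shapes it sees, and its `check_inputs` argument can re-check the trace against other sizes. A minimal sketch of how the tracing side could be validated — the module, sizes, and file name here are placeholders, not the actual model:

```python
import torch
import torch.nn.functional as F

class TinyNet(torch.nn.Module):
    """Stand-in for the real model (which is not shown in the post)."""
    def forward(self, x):
        # A shape-agnostic op; it should generalize to any input size.
        return F.avg_pool2d(x, 2)

model = TinyNet().eval()
example = torch.rand(1, 3, 1080, 1080)  # the size used at trace time

# check_inputs re-runs the trace on additional sizes and warns if the
# recorded graph does not generalize to them.
traced = torch.jit.trace(model, example,
                         check_inputs=[(torch.rand(1, 3, 720, 720),)])
traced.save("model.pt")  # path is a placeholder
```

If the trace only works for the shape it was recorded with, `check_inputs` will report the divergence at trace time rather than at inference time in C++.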

The error is

Microsoft C++ exception: std::runtime_error at memory location 0x000000395E2F6710.


The debugger breaks in this block of the JIT interpreter (the paste below is completed with the closing braces that were cut off):

      try {
        loadTensorsFromRegisters(inst.inputs, stack);
        size_t new_pc = pc + 1 + inst.callback(stack);
        for (int i = inst.outputs.size - 1; i >= 0; --i) {
          int reg = get(inst.outputs, i);
          registers[reg] = pop(stack);
          // std::cout << "pop reg[" << reg << "];\n" << registers[reg] << "\n";
        }
        pc = new_pc;
      } catch (Suspend& e) {
        // wait() expects a single input
        AT_ASSERT(inst.inputs.values.size == 1);

        if (get(inst.inputs.free_flags, 0)) {
          // make sure the register is not freed once we are waked up
          registers[get(inst.inputs.values, 0)] = e.future;
        }

        // Make sure adding callback is the last step.
        // Otherwise if e.future has completed,
        // the current thread will continue running before it suspends.
        InterpreterState state(intrusive_from_this());
        e.future->addCallback([state]() {
          c10::global_work_queue().run(InterpreterContinuation(state, Stack(),
              autograd::GradMode::is_enabled()));
        });

        return true;
      }

This works fine:

    at::Tensor tensor_contentimage = torch::rand({ 1, 3, 200, 200 }).to(device_type).toType(at::kFloat);

but this fails:

    torch::rand({ 1, 3, 720, 720 }).to(device_type).toType(at::kFloat);

while a tensor of the same size worked fine in Python!
It seems to be something with the .ToTensor and the intrusive pointer.

Just any tips or advice on solving the problem will be greatly appreciated, thanks!

I suppose I need to use CUDA during the tracing…