How to jit compile with `cupy.cuda.compile_with_cache`

I’m trying to compile the PF-AFN model to TorchScript. The model calls cupy.cuda.compile_with_cache in its forward pass, and I get the following error when I trace it with torch.jit.trace.

NVRTCError: NVRTC_ERROR_COMPILATION (6)

During handling of the above exception, another exception occurred:

CompileException                          Traceback (most recent call last)

cupy/util.pyx in cupy.util.memoize.decorator.ret()

/usr/local/lib/python3.7/dist-packages/cupy/cuda/compiler.py in compile(self, options)
    440         except nvrtc.NVRTCError:
    441             log = nvrtc.getProgramLog(self.ptr)
--> 442             raise CompileException(log, self.src, self.name, options, 'nvrtc')
    443 
    444 

CompileException: /tmp/tmpan1ut480/3b7c153ce98d06488f1cbac8793f6dff_2.cubin.cu(16): error: identifier "tensor" is undefined

1 error detected in the compilation of "/tmp/tmpan1ut480/3b7c153ce98d06488f1cbac8793f6dff_2.cubin.cu".

To Reproduce

Here is a colab notebook that reproduces the error.

And here is a minimal code example:

import cupy
import torch
import torch.nn as nn

# The full CUDA source of kernel_Correlation_rearrange and the cupy_kernel
# helper (which substitutes sizes/strides into the kernel source) come from
# the PF-AFN repository; the kernel source is elided here.
kernel_Correlation_rearrange = " .... "

@cupy.util.memoize(for_each_device=True)
def cupy_launch(strFunction, strKernel):
    # Compile the raw CUDA source and return the named kernel function.
    return cupy.cuda.compile_with_cache(strKernel).get_function(strFunction)

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()

    def forward(self, x_warp_after, x_cond):
        # grid/block/args of the launch are left empty in this minimal example;
        # the traceback above shows compilation failing before any launch
        # arguments matter.
        cupy_launch('kernel_Correlation_rearrange', cupy_kernel('kernel_Correlation_rearrange', {
            'intStride': 1,
            'input': x_warp_after,
            'output': x_cond
        }))(
        )
        return x_warp_after, x_cond

net = Net().cuda()
input1 = torch.randn([1, 256, 8, 6]).cuda()
input2 = torch.randn([1, 256, 8, 6]).cuda()
trace_model = torch.jit.trace(net, [input1, input2])
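
For reference (and in case it helps with the question in the title), this is roughly how cupy.cuda.compile_with_cache is used on its own with a raw CUDA kernel. The kernel, its name scale, and the launch configuration below are just placeholders for illustration, not anything from PF-AFN; it is a minimal sketch of the API, not the actual correlation kernel.

import cupy
import torch

# Trivial standalone kernel (placeholder, unrelated to PF-AFN). extern "C"
# keeps the name unmangled so get_function can look it up.
strKernel = '''
extern "C" __global__ void scale(const int n, float* data, const float factor)
{
    int intIndex = blockIdx.x * blockDim.x + threadIdx.x;
    if (intIndex < n) {
        data[intIndex] *= factor;
    }
}
'''

# Compile the raw source with NVRTC (the result is cached) and fetch the kernel.
scale = cupy.cuda.compile_with_cache(strKernel).get_function('scale')

x = torch.randn(1024, device='cuda')
n = x.numel()

# Launch: scalars go in as numpy/cupy scalar types, tensors via data_ptr().
scale(
    grid=tuple([(n + 511) // 512, 1, 1]),
    block=tuple([512, 1, 1]),
    args=[cupy.int32(n), x.data_ptr(), cupy.float32(2.0)]
)
torch.cuda.synchronize()

This is the same eager-mode pattern that cupy_launch wraps in the repro above.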

Expected behavior

I would expect torch.jit.trace to trace the model without errors. As far as I can tell, the error is triggered by the call to cupy.cuda.compile_with_cache inside forward.
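
My guess at the mechanism, which is only an assumption since I have not stepped through cupy_kernel: the helper appears to build the CUDA source by substituting tensor sizes into the kernel string, and under torch.jit.trace those sizes come back as traced Tensors rather than Python ints, so str() puts text like tensor(256) into the generated .cu file, which would explain error: identifier "tensor" is undefined. A minimal sketch (build_source_fragment is a hypothetical stand-in, not the real cupy_kernel):

import torch

def build_source_fragment(x):
    # Hypothetical stand-in for the string substitution cupy_kernel performs.
    return 'const int n = ' + str(x.size(1)) + ';'

x = torch.randn(1, 256, 8, 6)
print(build_source_fragment(x))       # eager mode: 'const int n = 256;'

def forward(x):
    # While tracing, size() values are traced Tensors, so this prints
    # something like 'const int n = tensor(256);'
    print(build_source_fragment(x))
    return x

torch.jit.trace(forward, (x,))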

Environment

  • PyTorch version: 1.8.1+cu101
  • OS: Ubuntu 18.04.5 LTS (x86_64)
  • How you installed PyTorch: pip
  • Python version: 3.7 (64-bit runtime)
  • CUDA/cuDNN version: 11.0.221
  • GPU models and configuration: GPU 0: Tesla T4

Based on the error message, it seems that cupy is unable to compile the PyTorch methods, and I’m unsure whether this would even be supported.
Do you have any resources claiming that this should work, and some examples demonstrating it?

@ptrblck Thank you for your reply!
You can check the example in the following colab.

Thanks for the code! It shows the error, which might be helpful for debugging, but my previous question was about whether this is expected to be supported at all. Do you have any working examples, demos, blog posts, etc. that explain how cupy can be used here? I’m unfamiliar with it.