Request: Add CUDA sm_120 (Blackwell) support for ConvNeXtV2 / fused kernels

PyTorch has supported sm_120 since the 2.7.0 release, provided the binaries are built with CUDA 12.8, as the architecture list shows:

torch.cuda.get_arch_list()
['sm_70', 'sm_75', 'sm_80', 'sm_86', 'sm_90', 'sm_100', 'sm_120']
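
To verify this on your own machine, a minimal sketch along these lines may help. It compares the compute capability of the GPU against the architectures the installed PyTorch binary was built for. The helper names `arch_supported` and `check_current_device` are hypothetical, and the exact `sm_XY` match is a simplifying assumption: it ignores PTX (`compute_XY`) entries that could be JIT-compiled for newer devices.

```python
def arch_supported(capability, arch_list):
    """Return True if arch_list has an exact sm_XY entry for the
    given (major, minor) compute capability.

    Assumption: only exact matches count; PTX forward compatibility
    via compute_XY entries is deliberately ignored here.
    """
    major, minor = capability
    return f"sm_{major}{minor}" in arch_list


def check_current_device(device_index=0):
    """Print whether the installed PyTorch build targets the given GPU."""
    import torch  # imported lazily so arch_supported works without PyTorch

    if not torch.cuda.is_available():
        print("CUDA is not available in this environment")
        return None
    capability = torch.cuda.get_device_capability(device_index)
    supported = arch_supported(capability, torch.cuda.get_arch_list())
    print(f"capability={capability}, built for this arch: {supported}")
    return supported
```

Calling `check_current_device()` on a Blackwell GPU with a CUDA 12.8 build should report a capability of `(12, 0)`. Note that this only describes PyTorch's own binaries; custom extensions are compiled separately and may target a different set of architectures.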

If you have any issues with custom kernels, please contact the authors of the repository directly.