Hello -
I am trying to fine-tune the “Segment Anything” model released by Facebook. Specifically, I am trying to tune a fork of the model that adds code for fine-tuning on medical images, called MedSAM (found here). When I try to run the code, I get the following error:
Traceback (most recent call last):
  File "/mnt/beegfs/khans24/medsam_finetuning/minimal.py", line 30, in <module>
    embedding = sam_model.image_encoder(input_image)
  File "/home/khans24/beegfs/miniconda/envs/sam/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/beegfs/khans24/segment-anything/segment_anything/modeling/image_encoder.py", line 107, in forward
    x = self.patch_embed(x)
  File "/home/khans24/beegfs/miniconda/envs/sam/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/mnt/beegfs/khans24/segment-anything/segment_anything/modeling/image_encoder.py", line 392, in forward
    x = self.proj(x)
  File "/home/khans24/beegfs/miniconda/envs/sam/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/khans24/beegfs/miniconda/envs/sam/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/home/khans24/beegfs/miniconda/envs/sam/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: GET was unable to find an engine to execute this computation
I’ve tried to troubleshoot this many times by switching between different CUDA and cuDNN versions, but to no avail. Here are the specs of the system I am running it on:
- PyTorch 2.0 (but also tried with 1.13)
- Runtime CUDA 11.0
- Runtime cuDNN 8.2.1
Output of torch.backends.cudnn.version() is 8500 (i.e. cuDNN 8.5.0, even though the module-loaded cuDNN is 8.2.1).
I get positive results from running the following code:
print(torch.cuda.is_available())
print(torch.cuda.device_count())
print(torch.backends.cudnn.enabled)
print(torch.backends.cudnn.version())
Output is True, 4, True, 8500
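One thing I have not yet compared these against is the CUDA version the PyTorch binary itself was compiled with, which can differ from the toolkit loaded via modulefiles. If it helps, that can be checked with:

```python
import torch

# Version of the installed PyTorch wheel
print(torch.__version__)
# CUDA version the wheel was compiled against (None for CPU-only builds);
# this can differ from the module-loaded toolkit on the cluster
print(torch.version.cuda)
```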
I am running this on a high-performance computing cluster; the OS is Red Hat Linux. I use the “modules” package to load the CUDA toolkit from modulefiles, and I have limited ability to install things since I do not have root access.
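For reference, my environment setup before launching the job looks roughly like this (the exact module names are specific to my cluster, so treat them as illustrative):

```shell
# Load the CUDA toolkit and cuDNN from modulefiles
# (module names are cluster-specific)
module load cuda/11.0
module load cudnn/8.2.1
conda activate sam
python minimal.py
```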
Here is also a minimal reproducible example:
import torch
import numpy as np
from skimage import io, transform
from segment_anything import SamPredictor, sam_model_registry
from segment_anything.utils.transforms import ResizeLongestSide
# Set up the model and device
model_type = 'vit_b'
checkpoint = 'load/medsam_20230423_vit_b_0.0.1.pth'
device = 'cuda:0'
sam_model = sam_model_registry[model_type](checkpoint=checkpoint).to(device)
# Generate a random image
image_size = 256
random_image = np.random.randint(0, 256, (image_size, image_size, 3), dtype=np.uint8)
# Resize the random image
sam_transform = ResizeLongestSide(sam_model.image_encoder.img_size)
resized_image = sam_transform.apply_image(random_image)
# Convert the resized image to a PyTorch tensor
resized_image_tensor = torch.as_tensor(resized_image.transpose(2, 0, 1)).to(device)
# Preprocess the image tensor
input_image = sam_model.preprocess(resized_image_tensor[None, :, :, :])
# Compute the image embedding using the sam_model
with torch.no_grad():
    embedding = sam_model.image_encoder(input_image)
print(embedding.shape)
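As a next diagnostic step, I was planning to isolate the failing op itself: the traceback dies in the patch-embedding conv, so running an equivalent conv2d on CPU, and on GPU with cuDNN disabled, should tell me whether the problem is specific to the cuDNN build. A sketch (the conv shape mirrors SAM’s ViT-B patch embed, Conv2d(3, 768, kernel_size=16, stride=16); I use a smaller input here just to keep it quick):

```python
import torch

# Same kind of conv as SAM's patch embedding, which is where the
# RuntimeError is raised
conv = torch.nn.Conv2d(3, 768, kernel_size=16, stride=16)
x = torch.randn(1, 3, 256, 256)

# CPU run: should always work, confirming the op itself is fine
out = conv(x)
print(out.shape)  # torch.Size([1, 768, 16, 16])

if torch.cuda.is_available():
    # Bypass cuDNN entirely; if this succeeds while the default
    # (cuDNN-enabled) path fails, the cuDNN install is the culprit
    torch.backends.cudnn.enabled = False
    out_gpu = conv.cuda()(x.cuda())
    print(out_gpu.shape)
```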
Any help is appreciated; I’ve been banging my head against this for days.
@ptrblck help!
Thanks all