Hi everyone,
I used a model fine-tuned with unsloth, and I got this error. It ran fine for the first few loops, but this error showed up after about 2 minutes.
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
I also got this warning:
…/aten/src/ATen/native/cuda/IndexKernel.cu:92: operator(): block: [0,0,0], thread: [0,0,0] Assertion `-sizes[i] <= index && index < sizes[i] && "index out of bounds"` failed.
The code that triggers this error:
inputs = tokenizer(
    [
        alpaca_prompt.format(
            instruction,      # instruction
            tables[table_i],  # input
            "",               # output - leave this blank for generation!
        )
    ], return_tensors="pt").to("cuda:0")
An indexing operation is failing. Rerun your code via CUDA_LAUNCH_BLOCKING=1 python script.py args
to isolate the failing line of code.
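Another way to get a readable error is to run the failing indexing op on the CPU first: out-of-range indices that trigger an opaque device-side assert on CUDA raise a plain Python IndexError on CPU. A minimal sketch with made-up sizes (the embedding dimensions and token ids below are illustrative, not from the model in question):

```python
import torch

# Hypothetical embedding with a 10-token vocabulary
embedding = torch.nn.Embedding(num_embeddings=10, embedding_dim=4)

# Token id 12 is out of range for a size-10 vocab
bad_ids = torch.tensor([3, 12])

try:
    embedding(bad_ids)  # on CPU this raises a clear IndexError
except IndexError as e:
    print("caught:", e)
```

The same lookup on a GPU would only surface as the asynchronous "device-side assert triggered" error.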
Dear ptrblck,
Thanks for your reply. After I ran this with CUDA_LAUNCH_BLOCKING=1, I still got this:
File ~/anaconda3/envs/FTllama/lib/python3.11/site-packages/transformers/tokenization_utils_base.py:800, in BatchEncoding.to(self, device)
796 # This check catches things like APEX blindly calling "to" on all inputs to a module
797 # Otherwise it passes the casts down and casts the LongTensor containing the token idxs
798 # into a HalfTensor
799 if isinstance(device, str) or is_torch_device(device) or isinstance(device, int):
→ 800 self.data = {k: v.to(device=device) for k, v in self.data.items()}
801 else:
802 logger.warning(f"Attempting to cast a BatchEncoding to type {str(device)}. This is not supported.")
File ~/anaconda3/envs/FTllama/lib/python3.11/site-packages/transformers/tokenization_utils_base.py:800, in <dictcomp>(.0)
796 # This check catches things like APEX blindly calling "to" on all inputs to a module
797 # Otherwise it passes the casts down and casts the LongTensor containing the token idxs
798 # into a HalfTensor
799 if isinstance(device, str) or is_torch_device(device) or isinstance(device, int):
→ 800 self.data = {k: v.to(device=device) for k, v in self.data.items()}
801 else:
802 logger.warning(f"Attempting to cast a BatchEncoding to type {str(device)}. This is not supported.")
RuntimeError: CUDA error: device-side assert triggered
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
The stacktrace is unfortunately still wrong. Did you export this env variable in your terminal? If so, check whether any embedding layers are used, as their input often contains indices that are out of the valid range.
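One way to check this is to compare the token ids against the embedding's vocabulary size before the lookup. A minimal sketch, with hypothetical sizes and a deliberately out-of-range id for illustration:

```python
import torch

# Hypothetical embedding with a 32,000-token vocabulary
embedding = torch.nn.Embedding(num_embeddings=32000, embedding_dim=16)

# Simulated tokenizer output; 32005 >= 32000 would trip the CUDA assert
input_ids = torch.tensor([[1, 15, 32005]])

# Validate on the host instead of letting the device-side assert fire
invalid = (input_ids < 0) | (input_ids >= embedding.num_embeddings)
if invalid.any():
    print("out-of-range ids:", input_ids[invalid].tolist())
```

With a real model you would compare against `model.get_input_embeddings().num_embeddings`; a mismatch usually means the tokenizer's vocabulary is larger than the model's embedding table.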
Dear ptrblck,
Thanks, I noticed that my input is indeed out of the valid range.