i run kohya lora trainining code and encounter CUDA index out of bounds problem,anyone can help me with this issue,thanks
Rerun the code with blocking launches and check the stacktrace to isolate the failing indexing operation. Once you know which op is failing, check the min./max. values of the index tensor as well as the shape of the tenors being indexed.
thank you, after i rerun the code,the training process succeed in the end,btw,how to monitor checking the stacktrace andisolating the failing indexing operation?i have no idea how to breakpoint the beginning of this issue,i just curious if the latent bucketing previous of lora training cause this problem
Run the code via
CUDA_LAUNCH_BLOCKING=1 python script.py args and check the stacktrace shown in your terminal to see which operation failed.