Thursday, November 24, 2011

CUDA_EXCEPTION_1: Lane Illegal Address

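It is usually caused by bad indexing: a thread reads or writes an address outside the memory it was given.
To find the faulting access, turn on memory checking in cuda-gdb with "set cuda memcheck on" before running the program.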

When the data size is not an exact multiple of the block size (BLOCK_SIZE), there are generally two ways to fix it:
Pad the input with extra values up to a round multiple of your block size, or add a bounds check to the kernel so that only threads in the valid index range do the calculations, and the others skip them or return early. Be aware of the implications for block- and warp-level synchronization primitives if you choose the second option: a thread that returns early must not skip a __syncthreads() that the rest of its block still executes.
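A minimal sketch of the second option (the bounds check), assuming a simple one-element-per-thread kernel. The names scale and run_scale, the value of BLOCK_SIZE, and the test size of 1000 are made up for illustration and are not from the forum thread.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define BLOCK_SIZE 256  // threads per block (illustrative value)

// Each thread handles one element. Threads whose index falls past the
// end of the array do no work, so the partial last block is safe.
__global__ void scale(float *data, int n, float factor)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    // The guard wraps the work instead of returning early, so every
    // thread stays alive if a __syncthreads() is added later.
    if (i < n) {
        data[i] *= factor;
    }
}

// Hypothetical host helper: doubles n floats on the GPU and returns the
// number of wrong results, or -1 if a CUDA call failed.
int run_scale(int n)
{
    size_t bytes = (size_t)n * sizeof(float);
    float *h = (float *)malloc(bytes);
    if (!h) return -1;
    for (int i = 0; i < n; ++i) h[i] = (float)i;

    float *d = NULL;
    if (cudaMalloc(&d, bytes) != cudaSuccess) { free(h); return -1; }
    cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);

    // Round the grid size up so the last, partially filled block exists.
    int blocks = (n + BLOCK_SIZE - 1) / BLOCK_SIZE;
    scale<<<blocks, BLOCK_SIZE>>>(d, n, 2.0f);
    cudaError_t err = cudaDeviceSynchronize();

    cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d);

    int bad = (err == cudaSuccess) ? 0 : -1;
    for (int i = 0; bad >= 0 && i < n; ++i) {
        if (h[i] != 2.0f * (float)i) ++bad;
    }
    free(h);
    return bad;
}

int main(void)
{
    // 1000 is not a multiple of 256: 4 blocks launch, 24 threads idle.
    int bad = run_scale(1000);
    printf("mismatches: %d\n", bad);
    return bad == 0 ? 0 : 1;
}
```

Without the "if (i < n)" test, the last 24 threads would write past the end of the allocation, which is the kind of access that can raise this exception. The padding option would instead allocate blocks * BLOCK_SIZE elements and leave the kernel unguarded.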

From avidday, NVIDIA Forum:
http://forums.nvidia.com/index.php?showtopic=179190
