Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C [libcuda.so.1+0x1302b0]
C [libcuda.so.1+0x332420]
C [libtornado-ptx.so+0x64b8] Java_uk_ac_manchester_tornado_drivers_ptx_PTXStream_cuLaunchKernel+0x198
j uk.ac.manchester.tornado.drivers.ptx.PTXStream.cuLaunchKernel([BLjava/lang/String;IIIIIIJ[B[B)[[B+0 tornado.drivers.ptx@2.2.1-dev
...
Next, having located the shared object files (*.so), I ran:
objdump -d /usr/lib/x86_64-linux-gnu/libcuda.so.1
objdump -d /usr/local/bin/Java/tornadovm-2.2.1-dev-ptx/lib/libtornado-ptx.so
and looked at the addresses in the stack dump.
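Disassembling the whole of libcuda.so.1 produces a lot of output, so it can help to narrow objdump to a small window around each offset. The following is a minimal sketch of that idea, not what I actually ran: the helper names (parse_native_frame, objdump_window) and the window sizes are made up for illustration, and only objdump's --start-address and --stop-address options are real.

```python
import re

# Matches native frames such as "C [libcuda.so.1+0x1302b0]" from an hs_err log.
FRAME_RE = re.compile(r"^C\s+\[(?P<lib>[^\]+]+)\+0x(?P<off>[0-9a-fA-F]+)\]")


def parse_native_frame(line):
    """Return (library, offset) for a 'C [lib+0xOFF]' frame, or None otherwise."""
    m = FRAME_RE.match(line.strip())
    if m is None:
        return None
    return m.group("lib"), int(m.group("off"), 16)


def objdump_window(path, offset, before=0x20, after=0x10):
    """Build an objdump command that disassembles only the bytes around offset.

    The window sizes are arbitrary assumptions; widen them if needed.
    """
    start = max(offset - before, 0)
    stop = offset + after
    return [
        "objdump", "-d",
        f"--start-address={start:#x}",
        f"--stop-address={stop:#x}",
        path,
    ]


if __name__ == "__main__":
    frame = "C [libcuda.so.1+0x1302b0]"
    lib, off = parse_native_frame(frame)
    # Print the command rather than running it, so it can be inspected first.
    print(" ".join(objdump_window("/usr/lib/x86_64-linux-gnu/" + lib, off)))
```

Because x86 instructions vary in length, a window that starts in the middle of an instruction may decode a few bogus instructions before objdump resynchronises, so treat the first lines of the output with suspicion.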
First, libtornado-ptx.so. Note that the address (0x64b8) is the return address from a call, that is, the instruction immediately after the call that went Pete Tong.
64b3: e8 b8 e1 ff ff call 4670 <cuLaunchKernel@plt>
64b8: 48 83 c4 30 add $0x30,%rsp
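As a sanity check on reading these two lines, the bytes of the call can be decoded by hand: e8 is a near call whose four operand bytes are a little-endian signed displacement relative to the next instruction. The sketch below is only an illustration of that arithmetic (call_target is a made-up helper, not part of any tool), and it assumes the 5-byte e8 form seen here.

```python
def call_target(call_addr, insn_hex):
    """Decode an x86-64 near call (opcode e8 followed by a little-endian rel32).

    Returns (return_address, target_address).
    """
    raw = bytes.fromhex(insn_hex)
    if len(raw) != 5 or raw[0] != 0xE8:
        raise ValueError("expected a 5-byte e8 rel32 call")
    # The displacement is signed and measured from the *next* instruction.
    rel32 = int.from_bytes(raw[1:], "little", signed=True)
    return_addr = call_addr + len(raw)
    return return_addr, return_addr + rel32


if __name__ == "__main__":
    ret, target = call_target(0x64B3, "e8 b8 e1 ff ff")
    print(hex(ret), hex(target))
```

For the call at 64b3 this gives a return address of 0x64b8 (the offset in the stack dump) and a target of 0x4670 (the cuLaunchKernel@plt stub), which matches what objdump prints.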
So, it's the call to cuLaunchKernel that is interesting. In libcuda.so.1, the next frame up (0x332420) is again a return address:
33241b: e8 00 de df ff call 130220 <exit@plt+0x4e460>
332420: 5a pop %rdx
and the final (topmost) stack frame, which is where the crash actually happened:
1302ab: 4d 85 e4 test %r12,%r12
1302ae: 74 58 je 130308 <exit@plt+0x4e548>
1302b0: 41 8b 04 24 mov (%r12),%eax
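The instruction test %x,%y is a common idiom in null checks: x and y are ANDed (the result is discarded) and the je jumps if the Zero Flag is set. The flag is set when the result of the AND is zero, which for test %r12,%r12 means %r12 itself is zero. Here the je skips past the mov when %r12 is null, so the mov (%r12),%eax at 1302b0 only executes with a non-zero %r12 and then reads the memory it points to.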
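To make the flag logic concrete, here is a tiny model of the three instructions above. It is a sketch for illustration only: the function names are invented, and it models nothing beyond the Zero Flag and the je decision.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


def zero_flag_after_test(x, y):
    """ZF as left by `test x, y`: 1 when (x AND y) is zero, otherwise 0."""
    return 1 if (x & y & MASK64) == 0 else 0


def je_taken(zf):
    """`je` jumps exactly when the Zero Flag is set."""
    return zf == 1


def null_check_skips_load(r12):
    """Model of: test %r12,%r12 ; je <skip> ; mov (%r12),%eax.

    Returns True when the je jumps over the load, False when the load runs.
    """
    return je_taken(zero_flag_after_test(r12, r12))


if __name__ == "__main__":
    print(null_check_skips_load(0))               # null pointer: load is skipped
    print(null_check_skips_load(0x7F00DEAD0000))  # non-null: load is executed
```

With both operands the same register, the AND is zero only when the register is zero, which is why the pair works as a null check. With two different operands the flag can also be set when neither is zero, for example 0b1010 and 0b0101.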
So, it looks like we've essentially got the machine-code equivalent of a NullPointerException: the mov dereferences a pointer in %r12 that it cannot read. Still looking at what's null... [Solved: I had to use a model that is compatible with GPULlama3.java.]