Friday, 13 September 2013

Why are nvprof and Nsight not profiling one of the kernels?

I have a CFD program in CUDA. When I run it with a block dimension of
16 * 16 and profile it, profiling works perfectly and shows a kernel,
"NLMMNT", taking most of the GPU time. But when I run the same program
with a block dimension of 32 * 32, it runs up to 5 times faster than
before and its results are correct, yet the profiler no longer shows any
profiling output for NLMMNT. The Nsight log also does not show the
profiling of NLMMNT as complete. I can't figure out what the reason may
be. I tried running the application for hours, but NLMMNT's profile info
is still absent from the profiler's output.
The Nsight log can be seen in this screenshot:
http://s23.postimg.org/hbaect7uz/profoutput.png
By the way, I am facing the same problem with nvprof in Nsight Eclipse
Edition as well.
I have also cross-checked that the kernel is launched and finishes
successfully by checking the cudaError_t status before and after launching
the kernel. I see the same problem on both of my setups:
- Nsight Visual Studio Edition (Visual Studio 2010), CUDA toolkit 5.0,
  GeForce GT 520MX
- nvprof on Ubuntu Studio 12.10, Nsight Eclipse Edition, nvprof v 4.0,
  CUDA toolkit 5.5, GeForce GTX 480
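
For reference, here is a minimal sketch of the kind of cudaError_t check described above. It is not the actual program: the kernel `nlmmntStandIn`, its arguments, and the grid size are hypothetical stand-ins, since the real NLMMNT signature is not shown here. It checks `cudaGetLastError()` right after the launch and the return value of `cudaDeviceSynchronize()`, and also prints the per-kernel thread limit reported by `cudaFuncGetAttributes` next to the 32 * 32 = 1024 threads requested, which may be worth comparing when changing the block dimension.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Abort with a readable message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                              \
    do {                                                              \
        cudaError_t err_ = (call);                                    \
        if (err_ != cudaSuccess) {                                    \
            std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,   \
                         cudaGetErrorString(err_));                   \
            std::exit(EXIT_FAILURE);                                  \
        }                                                             \
    } while (0)

// Hypothetical stand-in for NLMMNT (the real kernel is not shown).
__global__ void nlmmntStandIn(float *data, int nx, int ny)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < nx && y < ny) {
        data[y * nx + x] += 1.0f;
    }
}

int main()
{
    const int nx = 1024, ny = 1024;  // assumed problem size
    float *d_data = nullptr;
    CUDA_CHECK(cudaMalloc(&d_data, nx * ny * sizeof(float)));
    CUDA_CHECK(cudaMemset(d_data, 0, nx * ny * sizeof(float)));

    dim3 block(32, 32);  // the configuration that goes missing in the profiler
    dim3 grid((nx + block.x - 1) / block.x, (ny + block.y - 1) / block.y);

    // Per-kernel limit; it depends on the kernel's register and
    // shared memory usage and can be lower than the device limit.
    cudaFuncAttributes attr;
    CUDA_CHECK(cudaFuncGetAttributes(&attr, nlmmntStandIn));
    std::printf("max threads per block for this kernel: %d, requested: %u\n",
                attr.maxThreadsPerBlock, block.x * block.y);

    nlmmntStandIn<<<grid, block>>>(d_data, nx, ny);
    CUDA_CHECK(cudaGetLastError());       // launch and configuration errors
    CUDA_CHECK(cudaDeviceSynchronize());  // errors raised during execution

    CUDA_CHECK(cudaFree(d_data));
    return 0;
}
```

A kernel launch itself returns no status, so the check has to happen after the `<<<...>>>` call. An error status read only before the launch, or one already consumed by an earlier `cudaGetLastError()` call, would say nothing about this particular launch.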
