| 1 | = Code Profiling = |
| 2 | To use [[http://en.wikipedia.org/wiki/Gprof|gprof]], compile your code with the ''-pg'' option. For example, to compile '''ex32.f90''' with Intel MKL: |
| 3 | {{{ |
| 4 | user@host> ifort -pg ex32.f90 stokeslet2d.f90 -L$MKLROOT/lib/intel64/ \ |
| 5 | -I$MKLROOT/mkl/include \ |
| 6 | -Wl,--start-group $MKLROOT/lib/intel64/libmkl_intel_lp64.a \ |
| 7 | $MKLROOT/lib/intel64/libmkl_sequential.a $MKLROOT/lib/intel64/libmkl_core.a \ |
| 8 | -Wl,--end-group -lpthread |
| 9 | }}} |
| 10 | Then run the executable to generate the profiling information: |
| 11 | {{{ |
| 12 | user@host> ./a.out 5 |
| 13 | Total # of Particles= 2400 |
| 14 | setting matrix 0.518922030925751 (sec) |
| 15 | Sovle linear ststem 15.6426219940186 (sec) |
| 16 | |b - Ax|= 4.422628714702772E-013 |
| 17 | Compute internal velocity 3.97039604187012 (sec) |
| 18 | }}} |
| 19 | When it finishes, '''gmon.out''' is created. |
| 20 | {{{ |
| 21 | user@host> ls |
| 22 | a.out ex32.f90 ex32.o gmon.out gmres.f90~ particle.dat res.dat stokeslet2d_dist.c stokeslet2d.f90 stokeslet2d.h stokeslet2d.o |
| 23 | ex32.c ex32.f90~ ex43.c gmres.c gmres.h README.txt stokeslet2d.c stokeslet2d_dist.h stokeslet2d.f90~ stokeslet2d.mod |
| 24 | }}} |
| 25 | To view the profiling results, run '''gprof''' on the executable: |
| 26 | {{{ |
| 27 | user@host> gprof a.out |
| 28 | Flat profile: |
| 29 | |
| 30 | Each sample counts as 0.01 seconds. |
| 31 |   %   cumulative   self              self     total |
| 32 |  time   seconds   seconds    calls   s/call   s/call  name |
| 33 |  65.36     13.17    13.17                             LN12_M2_LOOPgas_1 |
| 34 |   7.84     14.75     1.58 36240000     0.00     0.00  stokeslet2d_mp_term2_ |
| 35 |   4.71     15.70     0.95        1     0.95     2.77  stokeslet2d_mp_slet2d_velocity_ |
| 36 |   4.22     16.55     0.85                             mkl_blas_def_dgemm_copyan |
| 37 |   3.82     17.32     0.77                             log.A |
| 38 |   2.90     17.91     0.59 36240000     0.00     0.00  stokeslet2d_mp_term1_ |
| 39 |   1.96     18.30     0.40        2     0.20     0.37  stokeslet2d_mp_slet2d_mkmatrix_ |
| 40 | }}} |
| 41 | Most of the run time (about 65%) is spent in ''LN12_M2_LOOPgas_1'', which is a routine in the Intel math library. |
| 42 | See the [[http://www.cs.utah.edu/dept/old/texinfo/as/gprof_toc.html|gprof manual]] for details. |
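
For longer profiles it can help to post-process the flat profile with a small script, for instance to rank routines by self time. The following is only a sketch and not part of gprof: the function name `parse_flat_profile` is hypothetical, and the parser assumes that symbol names contain no spaces and that rows without call counts (such as the library routines above) have exactly four fields. The sample rows are copied from the output shown above.

```python
# Sketch: parse the rows of a gprof flat profile into dictionaries.
# Assumption: a row is either
#   percent cumulative self name                      (no call counts), or
#   percent cumulative self calls self/call total/call name

SAMPLE = """\
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 65.36     13.17    13.17                             LN12_M2_LOOPgas_1
  7.84     14.75     1.58 36240000     0.00     0.00  stokeslet2d_mp_term2_
  4.71     15.70     0.95        1     0.95     2.77  stokeslet2d_mp_slet2d_velocity_
  4.22     16.55     0.85                             mkl_blas_def_dgemm_copyan
  3.82     17.32     0.77                             log.A
  2.90     17.91     0.59 36240000     0.00     0.00  stokeslet2d_mp_term1_
  1.96     18.30     0.40        2     0.20     0.37  stokeslet2d_mp_slet2d_mkmatrix_
"""


def parse_flat_profile(text):
    """Return one dict per routine found in a gprof flat profile."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # blank line or short header line
        try:
            percent, cumulative, self_sec = (float(f) for f in fields[:3])
        except ValueError:
            continue  # header or explanatory text, not a data row
        if len(fields) >= 7 and fields[3].isdigit():
            calls = int(fields[3])
            name = " ".join(fields[6:])
        else:
            calls = None  # gprof printed no call count for this routine
            name = " ".join(fields[3:])
        rows.append({
            "percent": percent,
            "cumulative": cumulative,
            "self": self_sec,
            "calls": calls,
            "name": name,
        })
    return rows


if __name__ == "__main__":
    # Print routines ordered by self time, largest first.
    for row in sorted(parse_flat_profile(SAMPLE), key=lambda r: r["self"], reverse=True):
        print(f'{row["percent"]:6.2f}%  {row["self"]:8.2f} s  {row["name"]}')
```

To apply this to a real run, redirect the profile to a file (for example `gprof a.out > profile.txt`) and pass the file contents to the function in place of `SAMPLE`.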