[[PageOutline]]
= Code Debugging =
Ideally, code should be debugged on your desktop computer before it is moved to a cluster environment. There are many debugging techniques, which you can learn about online; this page introduces a few of them.

== print ==
Insert 'print' statements into the source code.

In C/C++
{{{#!c
/* check */
#ifdef DEBUG
  if (info == 0) printf("successfully done\n");
#endif
}}}

In Fortran
{{{#!fortran
#ifdef DEBUG
  if (info == 0) then
    print *,"successfully done"
  endif
#endif
}}}

Compile with the ''-DDEBUG'' option
{{{
icc -g -pg -DDEBUG -c stokeslet2d.c
}}}

Makefile
{{{#!make
#
# CCS WORKSHOP
# Stokes Flow in a Cavity
#
# Makefile
#
#
TARGET = ex32s ex32m
#
ALL: $(TARGET)
#
CC = icc
#CFLAGS = -O3
CFLAGS = -g -pg -DDEBUG
#
#
#
SRC_EX32c = ex32.c stokeslet2d.c gmres.c
#
#
MKL_SQ_LIBS = -L$(MKLROOT)/lib/intel64/ \
	-I$(MKLROOT)/mkl/include \
	-Wl,--start-group \
	$(MKLROOT)/lib/intel64/libmkl_intel_lp64.a \
	$(MKLROOT)/lib/intel64/libmkl_sequential.a \
	$(MKLROOT)/lib/intel64/libmkl_core.a \
	-Wl,--end-group \
	-lpthread
#
MKL_MT_LIBS = -L$(MKLROOT)/lib/intel64/ \
	-I$(MKLROOT)/mkl/include \
	-Wl,--start-group \
	$(MKLROOT)/lib/intel64/libmkl_intel_lp64.a \
	$(MKLROOT)/lib/intel64/libmkl_intel_thread.a \
	$(MKLROOT)/lib/intel64/libmkl_core.a \
	-Wl,--end-group \
	-liomp5 \
	-lpthread
#
#
#
OBJ_EX32c = $(SRC_EX32c:.c=.o)
#
#
ex32s : $(OBJ_EX32c)
	$(CC) $(CFLAGS) -o $@ $(OBJ_EX32c) $(MKL_SQ_LIBS)

ex32m : $(OBJ_EX32c)
	$(CC) $(CFLAGS) -o $@ $(OBJ_EX32c) $(MKL_MT_LIBS)
#
#
%.o : %.c
	$(CC) $(CFLAGS) -c $<
#
clean:
	rm -f *.o $(TARGET)
}}}

== GDB ==
GDB is the standard debugger. [http://www.gnu.org/software/gdb/documentation/]

To debug with '''GDB''', submit an interactive job. [[https://wiki.hpc.tulane.edu/trac/wiki/cypress/using#SubmittingInteractiveJobs|See here]]

Compile with the '''-g''' option
{{{
icc -g -pg -DDEBUG -c stokeslet2d.c
}}}

Run '''gdb'''
{{{
user@host>gdb ./ex32s
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-56.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from /ccs-autofs/u01/fuji/LabWork/FlowInCavity/ex32s...done.
(gdb)
}}}

Show the source with the command '''list line#'''
{{{
(gdb) list 44
39	    printf("Usage:%s [Depth of Cavity]\n",argv[0]);
40	    exit(-1);
41	  }
42
43	  /* get inputed depth */
44	  dp = atof(argv[1]);
45
46	  /* # of particles in depth */
47	  numpdepth = (int)(dp / EPSILON + 0.5);
48
(gdb)
}}}

Set a breakpoint with the command '''b line#'''
{{{
(gdb) b 47
Breakpoint 1 at 0x4044c2: file ex32.c, line 47.
(gdb)
}}}

Start the program with '''run [command line option]'''
{{{
(gdb) run 5
Starting program: /ccs-autofs/u01/fuji/LabWork/FlowInCavity/ex32s 5
[Thread debugging using libthread_db enabled]

Breakpoint 1, main (argc=2, argv=0x7fffffffd5c8) at ex32.c:47
47	  numpdepth = (int)(dp / EPSILON + 0.5);
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.80.el6_3.3.x86_64
(gdb)
}}}

Print values. At the breakpoint, line 47 has not been executed yet, so `numpdepth` is still unset.
{{{
(gdb) p dp
$1 = 5
(gdb) p numpdepth
$2 = 0
(gdb)
}}}

Execute the next line
{{{
(gdb) next
50	  numpwidth = (int)(1.0 / EPSILON + 0.5);
(gdb) p numpdepth
$3 = 1000
(gdb)
}}}

Exit
{{{
(gdb) quit
A debugging session is active.

	Inferior 1 [process 6222] will be killed.

Quit anyway? (y or n) y
}}}

== Valgrind ==
[http://valgrind.org/]

''Valgrind'' tools can detect many memory management and threading bugs, and profile your programs in detail.
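The subsections that follow run Valgrind's Memcheck on programs with deliberate bugs. For comparison, here is a minimal sketch of the same kinds of operations written so that Memcheck should have nothing to report: the string lives on the heap rather than in a local array, every array element is written before it is read, and every allocation is released. The function names `make_greeting` and `sum_initialized` are hypothetical and do not come from the workshop code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of the greeting.
 * The caller owns the buffer and must free() it. */
char *make_greeting(void)
{
    const char *text = "hello world cup\n";
    char *copy = malloc(strlen(text) + 1);  /* heap storage outlives this call */
    if (copy != NULL)
        strcpy(copy, text);
    return copy;
}

/* Hypothetical helper: fills an n-element array with 0..n-1 and sums it.
 * Every element is written before it is read, and the array is freed. */
double sum_initialized(size_t n)
{
    assert(n > 0);
    double *p = calloc(n, sizeof *p);       /* zero-initialized allocation */
    if (p == NULL)
        return 0.0;

    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        p[i] = (double)i;
        total += p[i];
    }
    free(p);                                /* no block left in use at exit */
    return total;
}
```

If these functions are called from a `main` of your own (freeing the string returned by `make_greeting`), compiled with `-O0 -g`, and run under `valgrind --leak-check=full`, the expectation is an error summary with no invalid reads, no uninitialised-value warnings, and no "definitely lost" blocks.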
=== Detect Invalid Access ===
Example code (this code has a bug):
{{{#!c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char * foo() {
  char a[200];
  strcpy(a, "hello world cup\n");
  return a;  /* BUG: returns the address of a local array */
}

int main() {
  char * a = foo();
  char c = a[0];
  printf("a[0] = %c\n", c);
  printf("a = %s\n", a);
  return 0;
}
}}}

Start an interactive session:
{{{#!bash
[fuji@cypress2 ~]$ idev -c 1 --gres=mic:0
Requesting 1 node(s)  task(s) to workshop queue of workshop partition
1 task(s)/node, 1 cpu(s)/task, mic:0 MIC device(s)/node
Time: 0 (hr) 60 (min).
Submitted batch job 52605
JOBID=52605 begin on cypress01-089
--> Creating interactive terminal session (login) on node cypress01-089.
--> You have 0 (hr) 60 (min).
Last login: Wed Aug 19 21:05:45 2015 from cypress2.cm.cluster
[fuji@cypress01-089 ~]$
}}}

Compile and run. The compiler warns about the bug, but the program appears to work:
{{{#!bash
[fuji@cypress01-089 ~]$ module load intel-psxe/2015-update1
[fuji@cypress01-089 ~]$ icc off_stack.c
off_stack.c(8): warning #1251: returning pointer to local variable
    return a;
           ^

[fuji@cypress01-089 ~]$ ./a.out
a[0] = h
a = hello world cup
}}}

Recompile with '-O0 -g' and run under Valgrind, which reports the invalid read and the source line where it occurs:
{{{#!bash
[fuji@cypress01-089 ~]$ icc -O0 -g off_stack.c
off_stack.c(8): warning #1251: returning pointer to local variable
    return a;
           ^

[fuji@cypress01-089 ~]$ valgrind ./a.out
==33367== Memcheck, a memory error detector
==33367== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==33367== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==33367== Command: ./a.out
==33367==
==33367== Invalid read of size 1
==33367==    at 0x4005C5: main (off_stack.c:13)
==33367==  Address 0x7feffdd50 is just below the stack ptr.
To suppress, use: --workaround-gcc296-bugs=yes
==33367==
a[0] = h
a =
==33367==
==33367== HEAP SUMMARY:
==33367==     in use at exit: 0 bytes in 0 blocks
==33367==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==33367==
==33367== All heap blocks were freed -- no leaks are possible
==33367==
==33367== For counts of detected and suppressed errors, rerun with: -v
==33367== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 6 from 6)
[fuji@cypress01-089 ~]$
}}}

=== Detect Uninitialized Data Access ===
Example code (this code has a bug):
{{{#!c
#include <stdio.h>
#include <stdlib.h>

int main() {
  double * p = malloc(sizeof(double) * 10);  /* BUG: contents are never initialized */
  if (p[0] < 1) {
    printf("p[0] < 1\n");
  } else {
    printf("p[0] >= 1\n");
  }
  return 0;
}
}}}

Run directly, the program prints a result that depends on whatever happens to be in memory. Under Valgrind, the branch on the uninitialized value is reported:
{{{#!bash
[fuji@cypress01-089 Valgrind]$ icc uninit.c
[fuji@cypress01-089 Valgrind]$ ./a.out
p[0] < 1
[fuji@cypress01-089 Valgrind]$ icc -O0 -g uninit.c
[fuji@cypress01-089 Valgrind]$ valgrind ./a.out
==34643== Memcheck, a memory error detector
==34643== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==34643== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==34643== Command: ./a.out
==34643==
==34643== Conditional jump or move depends on uninitialised value(s)
==34643==    at 0x4005A9: main (uninit.c:6)
==34643==
==34643== Conditional jump or move depends on uninitialised value(s)
==34643==    at 0x4005AB: main (uninit.c:6)
==34643==
p[0] < 1
==34643==
==34643== HEAP SUMMARY:
==34643==     in use at exit: 80 bytes in 1 blocks
==34643==   total heap usage: 1 allocs, 0 frees, 80 bytes allocated
==34643==
==34643== LEAK SUMMARY:
==34643==    definitely lost: 80 bytes in 1 blocks
==34643==    indirectly lost: 0 bytes in 0 blocks
==34643==      possibly lost: 0 bytes in 0 blocks
==34643==    still reachable: 0 bytes in 0 blocks
==34643==         suppressed: 0 bytes in 0 blocks
==34643== Rerun with --leak-check=full to see details of leaked memory
==34643==
==34643== For counts of detected and suppressed errors, rerun with: -v
==34643== Use --track-origins=yes to see where uninitialised values come from
==34643== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 6 from 6)
}}}

=== Detect Memory Leaks ===
Example code (this code has a bug):
{{{#!c++
#include <iostream>
#include <cstring>

char * foo() {
  char *a = new char[200];  // BUG: the caller never calls delete[]
  std::strcpy(a, "hello workshop");
  return a;
}

int main() {
  char * a = foo();
  char * b = foo();
  std::cout << "a = " << a << std::endl;
  std::cout << "b = " << b << std::endl;
  return 0;
}
}}}

Run Valgrind with '--leak-check=full' to see where each leaked block was allocated:
{{{#!bash
[fuji@cypress1 TestCodes]$ icpc -g mleak.cpp
[fuji@cypress1 TestCodes]$ valgrind --leak-check=full ./a.out
==10272== Memcheck, a memory error detector
==10272== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==10272== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==10272== Command: ./a.out
==10272==
a = hello workshop
b = hello workshop
==10272==
==10272== HEAP SUMMARY:
==10272==     in use at exit: 400 bytes in 2 blocks
==10272==   total heap usage: 2 allocs, 0 frees, 400 bytes allocated
==10272==
==10272== 200 bytes in 1 blocks are definitely lost in loss record 1 of 2
==10272==    at 0x4C28192: operator new[](unsigned long) (vg_replace_malloc.c:363)
==10272==    by 0x4009D8: foo() (mleak.cpp:5)
==10272==    by 0x400A0F: main (mleak.cpp:11)
==10272==
==10272== 200 bytes in 1 blocks are definitely lost in loss record 2 of 2
==10272==    at 0x4C28192: operator new[](unsigned long) (vg_replace_malloc.c:363)
==10272==    by 0x4009D8: foo() (mleak.cpp:5)
==10272==    by 0x400A20: main (mleak.cpp:12)
==10272==
==10272== LEAK SUMMARY:
==10272==    definitely lost: 400 bytes in 2 blocks
==10272==    indirectly lost: 0 bytes in 0 blocks
==10272==      possibly lost: 0 bytes in 0 blocks
==10272==    still reachable: 0 bytes in 0 blocks
==10272==         suppressed: 0 bytes in 0 blocks
==10272==
==10272== For counts of detected and suppressed errors, rerun with: -v
==10272== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 6 from 6)
}}}

== Intel® Inspector XE ==
Memory and Thread Debugger:
 * Debug memory errors such as leaks and allocation errors, and threading errors such as data races and deadlocks.

==== Setting Environment and Compiling your code ====
Load the module to set up the Intel compilers and tools.
{{{#!bash
[fuji@cypress1 ~]$ module load intel-psxe/2015-update1
}}}

Compile with the '-g' option, which tells the compiler to generate full debugging information in the object file.
{{{#!bash
[fuji@cypress1 ~]$ icc -g -o mytest mytest.c
}}}

==== Run and Collect Information ====
Start an interactive job:
{{{#!bash
[fuji@cypress1 ~]$ idev
}}}

To collect information, run the code under the collector, for example:
{{{#!bash
[fuji@cypress1 ~]$ inspxe-cl -collect=mi2 -app-working-dir=$PWD -result-dir=$PWD/results $PWD/mytest
}}}

'''-collect=''' options

Memory error analysis types
||= mi1 =|| Detect memory leaks ||
||= mi2 =|| Detect memory leaks and memory access problems ||
||= mi3 =|| Find locations of memory leaks and memory access problems ||

Threading error analysis types
||= ti1 =|| Detect deadlocks ||
||= ti2 =|| Detect deadlocks and data races ||
||= ti3 =|| Find locations of deadlocks and data races ||

To show results, for example:
{{{#!bash
[fuji@cypress1 ~]$ inspxe-cl -R problems -r $PWD/results
}}}

See [https://software.intel.com/en-us/node/528226 here] for details.

[[Inspector Brief Tutorial]]

== Intel® Advisor XE ==
Threading design and prototyping tool for software architects:
 * Analyze, design, tune and check your threading design before implementation
 * Explore and test threading options without disrupting normal development
 * Predict threading errors & performance scaling on systems with more cores

=== Survey ===
Survey the application to determine hotspots. Typically an optimized (non-debug) version of the application is used when surveying an application.

Run and collect information.
{{{#!bash
$ icc -g -O3 mycode.c
$ advixe-cl --collect survey --project-dir ./advi ./a.out
}}}

Show the report
{{{#!bash
$ advixe-cl --report survey --project-dir ./advi ./a.out
}}}

=== Add Annotations ===
Add annotations to the application source code, and rebuild the application. Please see the Getting Started Tutorial for more information.

For C/C++
{{{#!c
#include "advisor-annotate.h"
.....
ANNOTATE_SITE_BEGIN(sitename1);
for ( .... ) {
  ANNOTATE_TASK_BEGIN(taskname1);
  ...
  ANNOTATE_TASK_END();
}
ANNOTATE_SITE_END();
}}}

For Fortran
{{{#!fortran
use advisor_annotate
.....
call annotate_site_begin(sitename1)
do .....
  call annotate_task_begin(taskname1)
  ....
  call annotate_task_end()
enddo
call annotate_site_end()
}}}

=== Suitability ===
Collect suitability data. Note that annotations must be present in the source code for this collection to be successful. Typically an optimized (non-debug) version of the application is used when collecting suitability data.
{{{#!bash
$ icc -g -O3 mycode.c -I $ADVISOR_XE_2015_DIR/include
$ advixe-cl --collect suitability --project-dir ./advi ./a.out
}}}

{{{#!bash
$ advixe-cl --report suitability --project-dir ./advi ./a.out
}}}

=== Correctness ===
Collect correctness data. Note that annotations must be present in the source code for this collection to be successful. Typically an application with debug symbols is used when collecting correctness data.
{{{#!bash
$ icc -g -O0 mycode.c -I $ADVISOR_XE_2015_DIR/include
$ advixe-cl --collect correctness --project-dir ./advi ./a.out
}}}

{{{#!bash
$ advixe-cl --report correctness --project-dir ./advi ./a.out
}}}

Display a list of the annotations present.
{{{#!bash
$ advixe-cl --report annotations --project-dir ./advi ./a.out
}}}

Update the application using the chosen parallel coding constructs. Rebuild the application and test.

[[Advisor Brief Tutorial]]

== Intel® VTune™ Amplifier 2015 ==
 * Intuitive CPU & GPU performance tuning, multi-core scalability, bandwidth and more
 * Quick performance insight with advanced data visualization
 * Automate regression tests and collect data remotely

Compile with the '-g' option, which tells the compiler to generate full debugging information in the object file.
{{{#!bash
[fuji@cypress1 ~]$ icc -g -o mytest mytest.c
}}}

==== Run and Collect Information ====
Start an interactive job:
{{{#!bash
[fuji@cypress1 ~]$ idev
}}}

To collect information, run the code under the collector, for example:
{{{#!bash
[fuji@cypress1 ~]$ amplxe-cl -collect hotspots ./mytest
}}}
This will create a directory like '''r000hs'''.

'''-collect''' options
||= concurrency =|| Concurrency analysis ||
||= hotspots =|| Hotspots analysis ||
||= lightweight-hotspots =|| Lightweight Hotspots analysis ||
||= locksandwaits =|| Locks and Waits analysis ||

To show results, for example:
{{{#!bash
[fuji@cypress1 ~]$ amplxe-cl -report hotspots -r r000hs
}}}

'''-report''' options
||= summary =|| Display data for the overall performance of the target. ||
||= hotspots =|| Display functions with the highest CPU time. ||
||= wait-time =|| Display Wait time. ||
||= perf =|| Display performance data for each module of the target. ||
||= perf-detail =|| Display performance data for each function of the target. ||
||= callstacks =|| Display CPU or Wait time for call stacks. ||
||= top-down =|| Display a call tree for your target application and provide CPU and Wait time for each function. ||
||= gprof-cc =|| Display CPU or wait time in the gprof-like format. ||

[[VTune Brief Tutorial]]
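
The commands in this section profile a program called `mytest`, whose source is not shown on this page. As a hypothetical stand-in, the sketch below gives two functions that compute the same value at very different cost. If both are called from the `main` of a test program with a large `n`, a Hotspots analysis would be expected to attribute nearly all of the CPU time to the quadratic version. The names `pair_sum_slow` and `pair_sum_fast` are illustrative only and are not part of any workshop code.

```c
#include <stddef.h>

/* Hypothetical hotspot: sums i*j over all pairs 0 <= i, j < n.
 * The nested loop costs O(n^2) multiplications. */
double pair_sum_slow(size_t n)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            total += (double)i * (double)j;
    return total;
}

/* Same value in O(n): the double sum factors into
 * (0 + 1 + ... + (n-1)) squared. */
double pair_sum_fast(size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += (double)i;
    return s * s;
}
```

Compiling with '-g' matters here for the same reason as above: without debugging information the report can name functions but cannot map the time back to source lines.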