Code Debugging
Ideally, code should be debugged on your desktop computer before being moved to a cluster environment. Many debugging techniques exist; this page covers three of them: inserting print statements, GDB, and Valgrind.
Insert 'print' into the source code.
In C/C++:

    /* check */
    #ifdef DEBUG
    if (info == 0) printf("successfully done\n");
    #endif
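In Fortran (the macro name is spelled DEBUG here so that it matches the -DDEBUG compile option, since preprocessor macro names are generally case-sensitive):

    #ifdef DEBUG
          if (info == 0) then
             print *,"successfully done"
          endif
    #endif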
Compile with the -DDEBUG option, which defines the DEBUG macro and so enables the guarded print statements:

    icc -g -pg -DDEBUG -c stokeslet2d.c
Makefile
    #
    # CCS WORKSHOP
    # Stokes Flow in a Cavity
    #
    # Makefile
    #
    #
    TARGET = ex32s ex32m
    #
    ALL: $(TARGET)
    #
    CC = icc
    #CFLAGS = -O3
    CFLAGS = -g -pg -DDEBUG
    #
    #
    #
    SRC_EX32c = ex32.c stokeslet2d.c gmres.c
    #
    #
    MKL_SQ_LIBS = -L$(MKLROOT)/lib/intel64/ \
                  -I$(MKLROOT)/mkl/include \
                  -Wl,--start-group \
                  $(MKLROOT)/lib/intel64/libmkl_intel_lp64.a \
                  $(MKLROOT)/lib/intel64/libmkl_sequential.a \
                  $(MKLROOT)/lib/intel64/libmkl_core.a \
                  -Wl,--end-group \
                  -lpthread
    #
    MKL_MT_LIBS = -L$(MKLROOT)/lib/intel64/ \
                  -I$(MKLROOT)/mkl/include \
                  -Wl,--start-group \
                  $(MKLROOT)/lib/intel64/libmkl_intel_lp64.a \
                  $(MKLROOT)/lib/intel64/libmkl_intel_thread.a \
                  $(MKLROOT)/lib/intel64/libmkl_core.a \
                  -Wl,--end-group \
                  -liomp5 \
                  -lpthread
    #
    #
    #
    OBJ_EX32c = $(SRC_EX32c:.c=.o)
    #
    #
    ex32s : $(OBJ_EX32c)
    	$(CC) $(CFLAGS) -o $@ $(OBJ_EX32c) $(MKL_SQ_LIBS)

    ex32m : $(OBJ_EX32c)
    	$(CC) $(CFLAGS) -o $@ $(OBJ_EX32c) $(MKL_MT_LIBS)
    #
    #
    %.o : %.c
    	$(CC) $(CFLAGS) -c $<
    #
    clean:
    	rm -f *.o $(TARGET)
GDB
http://www.gnu.org/software/gdb/
To debug with GDB on the cluster, first start an interactive job (for example with idev, as in the Valgrind section of this page).
Compile with the -g option so that the executable contains debugging symbols:

    icc -g -pg -DDEBUG -c stokeslet2d.c
Run gdb:

    user@host>gdb ./ex32s
    GNU gdb (GDB) Red Hat Enterprise Linux (7.2-56.el6)
    Copyright (C) 2010 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "x86_64-redhat-linux-gnu".
    For bug reporting instructions, please see:
    <http://www.gnu.org/software/gdb/bugs/>...
    Reading symbols from /ccs-autofs/u01/fuji/LabWork/FlowInCavity/ex32s...done.
    (gdb)
Show the source around a given line with the command list line#:

    (gdb) list 44
    39              printf("Usage:%s [Depth of Cavity]\n",argv[0]);
    40              exit(-1);
    41          }
    42
    43          /* get inputed depth */
    44          dp = atof(argv[1]);
    45
    46          /* # of particles in depth */
    47          numpdepth = (int)(dp / EPSILON + 0.5);
    48
    (gdb)
Set a breakpoint with the command b line#:

    (gdb) b 47
    Breakpoint 1 at 0x4044c2: file ex32.c, line 47.
    (gdb)
Start the program with run [command line options]. Execution stops at the breakpoint, before line 47 is executed:

    (gdb) run 5
    Starting program: /ccs-autofs/u01/fuji/LabWork/FlowInCavity/ex32s 5
    [Thread debugging using libthread_db enabled]

    Breakpoint 1, main (argc=2, argv=0x7fffffffd5c8) at ex32.c:47
    47          numpdepth = (int)(dp / EPSILON + 0.5);
    Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.80.el6_3.3.x86_64
    (gdb)
Print the values of variables with the command p variable:

    (gdb) p dp
    $1 = 5
    (gdb) p numpdepth
    $2 = 0
    (gdb)
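Execute the current line and stop at the next one with the command next. After line 47 has run, numpdepth holds its computed value:

    (gdb) next
    50          numpwidth = (int)(1.0 / EPSILON + 0.5);
    (gdb) p numpdepth
    $3 = 1000
    (gdb)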
Exit:

    (gdb) quit
    A debugging session is active.

            Inferior 1 [process 6222] will be killed.

    Quit anyway? (y or n) y
Valgrind
Valgrind tools can detect many memory management and threading bugs, and profile your programs in detail.
Detect Invalid Access
Example code, off_stack.c (this code has a bug: foo() returns a pointer to a local array, which no longer exists after the function returns):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    char * foo() {
        char a[200];
        strcpy(a, "hello world cup\n");
        return a;
    }

    int main() {
        char * a = foo();
        char c = a[0];
        printf("a[0] = %c\n", c);
        printf("a = %s\n", a);
        return 0;
    }
Start an interactive session:

    [fuji@cypress2 ~]$ idev -c 1 --gres=mic:0
    Requesting 1 node(s)  task(s) to workshop queue of workshop partition
    1 task(s)/node, 1 cpu(s)/task, mic:0 MIC device(s)/node
    Time: 0 (hr) 60 (min).
    Submitted batch job 52605
    JOBID=52605 begin on cypress01-089
    --> Creating interactive terminal session (login) on node cypress01-089.
    --> You have 0 (hr) 60 (min).
    Last login: Wed Aug 19 21:05:45 2015 from cypress2.cm.cluster
    [fuji@cypress01-089 ~]$
Compile and run:

    [fuji@cypress01-089 ~]$ module load intel-psxe/2015-update1
    [fuji@cypress01-089 ~]$ icc off_stack.c
    off_stack.c(8): warning #1251: returning pointer to local variable
          return a;
                 ^

    [fuji@cypress01-089 ~]$ ./a.out
    a[0] = h
    a = hello world cup

The program appears to run correctly, but the result cannot be relied on. Recompile with -O0 -g and run it under Valgrind to expose the invalid read:

    [fuji@cypress01-089 ~]$ icc -O0 -g off_stack.c
    off_stack.c(8): warning #1251: returning pointer to local variable
          return a;
                 ^

    [fuji@cypress01-089 ~]$ valgrind ./a.out
    ==33367== Memcheck, a memory error detector
    ==33367== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
    ==33367== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
    ==33367== Command: ./a.out
    ==33367==
    ==33367== Invalid read of size 1
    ==33367==    at 0x4005C5: main (off_stack.c:13)
    ==33367==  Address 0x7feffdd50 is just below the stack ptr.  To suppress, use: --workaround-gcc296-bugs=yes
    ==33367==
    a[0] = h
    a =
    ==33367==
    ==33367== HEAP SUMMARY:
    ==33367==     in use at exit: 0 bytes in 0 blocks
    ==33367==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
    ==33367==
    ==33367== All heap blocks were freed -- no leaks are possible
    ==33367==
    ==33367== For counts of detected and suppressed errors, rerun with: -v
    ==33367== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 6 from 6)
    [fuji@cypress01-089 ~]$
Detect Uninitialized Data Access
Example code, uninit.c (this code has a bug: p[0] is read before it has been initialized):

    #include <stdio.h>
    #include <stdlib.h>

    int main() {
        double * p = malloc(sizeof(double) * 10);
        if (p[0] < 1) {
            printf("p[0] < 1\n");
        } else {
            printf("p[0] >= 1\n");
        }
        return 0;
    }
    [fuji@cypress01-089 Valgrind]$ icc uninit.c
    [fuji@cypress01-089 Valgrind]$ ./a.out
    p[0] < 1
    [fuji@cypress01-089 Valgrind]$ icc -O0 -g uninit.c
    [fuji@cypress01-089 Valgrind]$ valgrind ./a.out
    ==34643== Memcheck, a memory error detector
    ==34643== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
    ==34643== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
    ==34643== Command: ./a.out
    ==34643==
    ==34643== Conditional jump or move depends on uninitialised value(s)
    ==34643==    at 0x4005A9: main (uninit.c:6)
    ==34643==
    ==34643== Conditional jump or move depends on uninitialised value(s)
    ==34643==    at 0x4005AB: main (uninit.c:6)
    ==34643==
    p[0] < 1
    ==34643==
    ==34643== HEAP SUMMARY:
    ==34643==     in use at exit: 80 bytes in 1 blocks
    ==34643==   total heap usage: 1 allocs, 0 frees, 80 bytes allocated
    ==34643==
    ==34643== LEAK SUMMARY:
    ==34643==    definitely lost: 80 bytes in 1 blocks
    ==34643==    indirectly lost: 0 bytes in 0 blocks
    ==34643==      possibly lost: 0 bytes in 0 blocks
    ==34643==    still reachable: 0 bytes in 0 blocks
    ==34643==         suppressed: 0 bytes in 0 blocks
    ==34643== Rerun with --leak-check=full to see details of leaked memory
    ==34643==
    ==34643== For counts of detected and suppressed errors, rerun with: -v
    ==34643== Use --track-origins=yes to see where uninitialised values come from
    ==34643== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 6 from 6)