Version 4 (modified by fuji, 9 years ago)

Code Debugging

Ideally, code should be debugged on your desktop computer before it is moved to a cluster environment. This page covers three common techniques: inserting print statements, stepping through the program with GDB, and checking memory accesses with Valgrind.

print

Insert print statements into the source code, and guard them with a preprocessor macro so that they are compiled only in debug builds.

in C/C++

 /* check */
#ifdef DEBUG
  if (info == 0) printf("successfully done\n"); 
#endif

in Fortran (the file must be preprocessed, for example by using the .F90 extension or the ifort -fpp option)

#ifdef DEBUG
    if (info == 0) then
       print *,"successfully done"
    endif
#endif

Compile with the -DDEBUG option, which defines the DEBUG macro and so enables the guarded print statements:

icc -g -pg -DDEBUG -c stokeslet2d.c

Makefile

#
# CCS WORKSHOP
# Stokes Flow in a Cavity
#
# Makefile
#
#
TARGET = ex32s ex32m
#
ALL: $(TARGET)
#
CC = icc
 
#CFLAGS = -O3
CFLAGS = -g -pg -DDEBUG
#
#
#
SRC_EX32c = ex32.c stokeslet2d.c gmres.c
#
#
MKL_SQ_LIBS = -L$(MKLROOT)/lib/intel64/ \
        -I$(MKLROOT)/mkl/include \
        -Wl,--start-group \
        $(MKLROOT)/lib/intel64/libmkl_intel_lp64.a \
        $(MKLROOT)/lib/intel64/libmkl_sequential.a \
        $(MKLROOT)/lib/intel64/libmkl_core.a \
        -Wl,--end-group \
        -lpthread
#
MKL_MT_LIBS = -L$(MKLROOT)/lib/intel64/ \
        -I$(MKLROOT)/mkl/include \
        -Wl,--start-group \
        $(MKLROOT)/lib/intel64/libmkl_intel_lp64.a \
        $(MKLROOT)/lib/intel64/libmkl_intel_thread.a \
        $(MKLROOT)/lib/intel64/libmkl_core.a \
        -Wl,--end-group \
        -liomp5 \
        -lpthread
#
#
#
OBJ_EX32c = $(SRC_EX32c:.c=.o)
#
#
ex32s : $(OBJ_EX32c)
        $(CC) $(CFLAGS) -o $@ $(OBJ_EX32c) $(MKL_SQ_LIBS)
 
ex32m : $(OBJ_EX32c)
        $(CC) $(CFLAGS) -o $@ $(OBJ_EX32c) $(MKL_MT_LIBS)
 
#
#
%.o : %.c
        $(CC) $(CFLAGS) -c $<
 
#
clean:  
        rm -f *.o $(TARGET)

GDB

http://www.gnu.org/software/gdb/

To debug with GDB on the cluster, first start an interactive job on a compute node (for example with idev, as in the Valgrind section of this page).

Compile with the -g option so that the executable contains debugging symbols:

 icc -g -pg -DDEBUG -c stokeslet2d.c

run gdb

user@host>gdb ./ex32s 
GNU gdb (GDB) Red Hat Enterprise Linux (7.2-56.el6)
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /ccs-autofs/u01/fuji/LabWork/FlowInCavity/ex32s...done.
(gdb) 

show the source around a line with the command: list line#

(gdb) list 44
39	    printf("Usage:%s [Depth of Cavity]\n",argv[0]);
40	    exit(-1);
41	  }
42	
43	  /* get inputed depth */
44	  dp = atof(argv[1]);
45	
46	  /* # of particles in depth */
47	  numpdepth = (int)(dp / EPSILON + 0.5);
48	  
(gdb)

set a breakpoint at a line with the command: b line#

(gdb) b 47
Breakpoint 1 at 0x4044c2: file ex32.c, line 47.
(gdb) 

start the program with the command: run [command-line arguments]

(gdb) run 5
Starting program: /ccs-autofs/u01/fuji/LabWork/FlowInCavity/ex32s 5
[Thread debugging using libthread_db enabled]

Breakpoint 1, main (argc=2, argv=0x7fffffffd5c8) at ex32.c:47
47	  numpdepth = (int)(dp / EPSILON + 0.5);
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.80.el6_3.3.x86_64
(gdb) 

print the value of a variable with the command: p variable

(gdb) p dp
$1 = 5
(gdb) p numpdepth
$2 = 0
(gdb) 

execute the current line and stop at the next one with the command: next

(gdb) next
50	  numpwidth = (int)(1.0 / EPSILON + 0.5);
(gdb) p numpdepth
$3 = 1000
(gdb) 

exit gdb with the command: quit

(gdb) quit
A debugging session is active.

	Inferior 1 [process 6222] will be killed.

Quit anyway? (y or n) y

Valgrind

http://valgrind.org/

Valgrind tools can detect many memory management and threading bugs, and profile your programs in detail.

Detect Invalid Access

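Example code, off_stack.c (this code has a bug: foo() returns a pointer to a local array, which no longer exists once the function has returned):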

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char * foo() {
  char a[200];
  strcpy(a, "hello world cup\n");
  return a;
}

int main() {
  char * a = foo();
  char c = a[0];
  printf("a[0] = %c\n", c);
  printf("a = %s\n", a);
  return 0;
}

Start an interactive session:

[fuji@cypress2 ~]$ idev -c 1 --gres=mic:0
Requesting 1 node(s)  task(s) to workshop queue of workshop partition
1 task(s)/node, 1 cpu(s)/task, mic:0 MIC device(s)/node
Time: 0 (hr) 60 (min).
Submitted batch job 52605
JOBID=52605 begin on cypress01-089
--> Creating interactive terminal session (login) on node cypress01-089.
--> You have 0 (hr) 60 (min).
Last login: Wed Aug 19 21:05:45 2015 from cypress2.cm.cluster
[fuji@cypress01-089 ~]$

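Compile and run. The compiler warns about the bug, yet the program appears to work when run directly; under Valgrind the invalid read is reported: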

[fuji@cypress01-089 ~]$ module load intel-psxe/2015-update1
[fuji@cypress01-089 ~]$ icc off_stack.c
off_stack.c(8): warning #1251: returning pointer to local variable
    return a;
           ^

[fuji@cypress01-089 ~]$ ./a.out
a[0] = h
a = hello world cup
[fuji@cypress01-089 ~]$ icc -O0 -g off_stack.c
off_stack.c(8): warning #1251: returning pointer to local variable
    return a;
           ^

[fuji@cypress01-089 ~]$ valgrind ./a.out
==33367== Memcheck, a memory error detector
==33367== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==33367== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==33367== Command: ./a.out
==33367==
==33367== Invalid read of size 1
==33367==    at 0x4005C5: main (off_stack.c:13)
==33367==  Address 0x7feffdd50 is just below the stack ptr.  To suppress, use: --workaround-gcc296-bugs=yes
==33367==
a[0] = h
a =
==33367==
==33367== HEAP SUMMARY:
==33367==     in use at exit: 0 bytes in 0 blocks
==33367==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==33367==
==33367== All heap blocks were freed -- no leaks are possible
==33367==
==33367== For counts of detected and suppressed errors, rerun with: -v
==33367== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 6 from 6)
[fuji@cypress01-089 ~]$

Detect Uninitialized Data Access

Example code, uninit.c (this code has a bug: p[0] is read before it has been initialized):

#include <stdio.h>
#include <stdlib.h>

int main() {
  double * p = malloc(sizeof(double) * 10);
  if (p[0] < 1) {
    printf("p[0] < 1\n");
  } else {
    printf("p[0] >= 1\n");
  }
  return 0;
}

Compile and run:

[fuji@cypress01-089 Valgrind]$ icc uninit.c
[fuji@cypress01-089 Valgrind]$ ./a.out
p[0] < 1
[fuji@cypress01-089 Valgrind]$ icc -O0 -g uninit.c
[fuji@cypress01-089 Valgrind]$ valgrind ./a.out
==34643== Memcheck, a memory error detector
==34643== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==34643== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==34643== Command: ./a.out
==34643==
==34643== Conditional jump or move depends on uninitialised value(s)
==34643==    at 0x4005A9: main (uninit.c:6)
==34643==
==34643== Conditional jump or move depends on uninitialised value(s)
==34643==    at 0x4005AB: main (uninit.c:6)
==34643==
p[0] < 1
==34643==
==34643== HEAP SUMMARY:
==34643==     in use at exit: 80 bytes in 1 blocks
==34643==   total heap usage: 1 allocs, 0 frees, 80 bytes allocated
==34643==
==34643== LEAK SUMMARY:
==34643==    definitely lost: 80 bytes in 1 blocks
==34643==    indirectly lost: 0 bytes in 0 blocks
==34643==      possibly lost: 0 bytes in 0 blocks
==34643==    still reachable: 0 bytes in 0 blocks
==34643==         suppressed: 0 bytes in 0 blocks
==34643== Rerun with --leak-check=full to see details of leaked memory
==34643==
==34643== For counts of detected and suppressed errors, rerun with: -v
==34643== Use --track-origins=yes to see where uninitialised values come from
==34643== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 6 from 6)