337 | | |
338 | | |
339 | | |
340 | | --------------------------------------------------------- |
341 | | = Code Debugging / Profiling = |
342 | | |
343 | | Users are encouraged to debug their code on a local machine first. |
344 | | |
345 | | GDB is the standard debugger. |
346 | | |
347 | | [http://www.gnu.org/software/gdb/documentation/] |
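For a quick look at where a program stops, GDB can also be run non-interactively. The following is a minimal sketch, assuming an executable built with `-g`; the function name `gdb_backtrace` is a hypothetical helper and not part of GDB.

```bash
# Hypothetical helper: run a program under GDB in batch mode and print a
# backtrace after it stops (for example on a segmentation fault).
# Usage: gdb_backtrace ./mytest [program arguments...]
gdb_backtrace() {
    # -batch : process the commands and exit, no interactive prompt
    # -ex    : execute one GDB command ("run", then "bt" for the backtrace)
    # --args : everything after it is the program and its arguments
    gdb -batch -ex run -ex bt --args "$@"
}
```

For example, `gdb_backtrace ./mytest` runs the program and prints the call stack if it stops on a signal. If the program exits normally, GDB reports that there is no stack.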
348 | | |
349 | | For profiling, gprof is available. |
350 | | |
351 | | [https://sourceware.org/binutils/docs/gprof/] |
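The usual gprof workflow is: build with profiling instrumentation, run the program once, then read the profile. The sketch below assumes GCC and an executable named `mytest`; `gprof_report` is a hypothetical helper name.

```bash
# Build with profiling instrumentation first, for example:
#   gcc -pg -g -o mytest mytest.c
#
# Hypothetical helper: run the instrumented program, then print the profile.
# Usage: gprof_report ./mytest [program arguments...]
gprof_report() {
    # A normal exit writes gmon.out in the current directory.
    # The program's own output is discarded so that only the profile is printed.
    "$@" >/dev/null || return 1
    # Print the flat profile followed by the call graph.
    gprof "$1" gmon.out
}
```

For example, `gprof_report ./mytest > profile.txt` saves the report to a file.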
352 | | |
353 | | 'valgrind' can detect memory management and threading bugs. |
354 | | |
355 | | [http://valgrind.org/docs/manual/index.html] |
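Memory checking and thread checking use different Valgrind tools. The sketch below wraps the two typical invocations; the function names `vg_memcheck` and `vg_helgrind` are hypothetical, while the options are standard Valgrind options.

```bash
# Hypothetical helpers wrapping two Valgrind tools.
# Usage: vg_memcheck ./mytest [args...]   or   vg_helgrind ./mytest [args...]

# Memcheck (the default tool): report leaks with full stack traces and show
# where uninitialised values originate.
vg_memcheck() {
    valgrind --leak-check=full --track-origins=yes "$@"
}

# Helgrind: detect data races and lock-ordering problems in threaded code.
vg_helgrind() {
    valgrind --tool=helgrind "$@"
}
```

Compiling with `-g` lets Valgrind report source file names and line numbers in its stack traces.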
356 | | |
357 | | All of the tools above are available on Cypress. |
358 | | |
359 | | Several Intel tools are also installed on Cypress. |
360 | | |
361 | | == Intel® Inspector XE == |
362 | | Memory and Thread Debugger: |
363 | | * Debug memory errors like leaks and allocation errors and threading errors like data races and deadlocks. |
364 | | |
365 | | ==== Setting Environment and Compiling your code ==== |
366 | | Load the module to set up the Intel compilers and tools. |
367 | | {{{#!bash |
368 | | [fuji@cypress1 ~]$ module load intel-psxe/2015-update1 |
369 | | }}} |
370 | | Compile your code with the '-g' option, which tells the compiler to generate full debugging information in the object file. |
371 | | {{{#!bash |
372 | | [fuji@cypress1 ~]$ icc -g -o mytest mytest.c |
373 | | }}} |
374 | | |
375 | | ==== Run and Collect Information ==== |
376 | | Start an interactive job, |
377 | | {{{#!bash |
378 | | [fuji@cypress1 ~]$ idev |
379 | | }}} |
380 | | To collect information, run the code, for example, |
381 | | {{{#!bash |
382 | | [fuji@cypress1 ~]$ inspxe-cl -collect=mi2 -app-working-dir=$PWD -result-dir=$PWD/results $PWD/mytest |
383 | | }}} |
384 | | |
385 | | '''-collect=''' options |
386 | | |
387 | | Memory error analysis types |
388 | | ||= mi1 =|| Detect memory leaks || |
389 | | ||= mi2 =|| Detect memory leaks and memory access problems || |
390 | | ||= mi3 =|| Find locations of memory leaks and memory access problems || |
391 | | |
392 | | Threading error analysis types |
393 | | ||= ti1 =|| Detect deadlocks || |
394 | | ||= ti2 =|| Detect deadlocks and data races || |
395 | | ||= ti3 =|| Find locations of deadlocks and data races || |
396 | | |
397 | | To show results, for example, |
398 | | {{{#!bash |
399 | | [fuji@cypress1 ~]$ inspxe-cl -R problems -r $PWD/results |
400 | | }}} |
401 | | See [https://software.intel.com/en-us/node/528226 here] for details. |
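The collect and report steps above can be combined into a small wrapper. This is only a sketch: `inspect_run`, the `INSPXE` variable, and the `results-<type>` directory naming are assumptions made for illustration, while the `inspxe-cl` options are the ones used on this page.

```bash
# Hypothetical wrapper around the two inspxe-cl commands used on this page.
# INSPXE can be overridden to point at a different binary; it defaults to inspxe-cl.
INSPXE=${INSPXE:-inspxe-cl}

# Usage: inspect_run <analysis-type> <path-to-executable>
inspect_run() {
    analysis=$1
    exe=$2
    case $analysis in
        mi1|mi2|mi3|ti1|ti2|ti3) ;;   # analysis types from the tables on this page
        *) echo "unknown analysis type: $analysis" >&2; return 2 ;;
    esac
    # One result directory per analysis type (a naming choice made for this sketch).
    results=$PWD/results-$analysis
    # Collect, then print the list of detected problems.
    "$INSPXE" -collect="$analysis" -app-working-dir="$PWD" -result-dir="$results" "$exe"
    "$INSPXE" -R problems -r "$results"
}
```

For example, `inspect_run ti2 $PWD/mytest` checks for deadlocks and data races and then prints the problem list. Keeping a separate result directory per analysis type lets a memory analysis and a threading analysis of the same program sit side by side.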
402 | | |
403 | | [[Inspector Brief Tutorial]] |
404 | | |
405 | | == Intel® Advisor XE == |
406 | | Threading design and prototyping tool for software architects: |
407 | | * Analyze, design, tune and check your threading design before implementation |
408 | | * Explore and test threading options without disrupting normal development |
409 | | * Predict threading errors & performance scaling on systems with more cores |
410 | | |
411 | | === Survey === |
412 | | Survey the application to determine hotspots. Typically an optimized |
413 | | (non-debug) version of the application is used when surveying an application. |
414 | | |
415 | | Compile the code, then run it and collect survey data. |
416 | | {{{#!bash |
417 | | $ icc -g -O3 mycode.c |
418 | | $ advixe-cl --collect survey --project-dir ./advi ./a.out |
419 | | }}} |
420 | | |
421 | | Show the report. |
422 | | {{{#!bash |
423 | | $ advixe-cl --report survey --project-dir ./advi ./a.out |
424 | | }}} |
425 | | |
426 | | === Add Annotations === |
427 | | Add annotations to the application source code, and rebuild the application. |
428 | | Please see the Getting Started Tutorial for more information. |
429 | | |
430 | | For C/C++ |
431 | | {{{#!c |
432 | | #include "advisor-annotate.h" |
433 | | ..... |
434 | | ANNOTATE_SITE_BEGIN(sitename1); |
435 | | for ( .... |
436 | | { |
437 | | ANNOTATE_TASK_BEGIN(taskname1); |
438 | | ... |
439 | | ANNOTATE_TASK_END(); |
440 | | } |
441 | | ANNOTATE_SITE_END(); |
442 | | }}} |
443 | | |
444 | | For Fortran |
445 | | {{{#!fortran |
446 | | use advisor_annotate |
447 | | ..... |
448 | | call annotate_site_begin("sitename1") |
449 | | do ..... |
450 | | call annotate_task_begin("taskname1") |
451 | | .... |
452 | | call annotate_task_end() |
453 | | enddo |
454 | | call annotate_site_end() |
455 | | }}} |
456 | | |
457 | | === Suitability === |
458 | | Collect suitability data. Note that annotations must be present in the source |
459 | | code for this collection to be successful. Typically an optimized (non-debug) version |
460 | | of the application is used when collecting suitability data. |
461 | | |
462 | | {{{#!bash |
463 | | $ icc -g -O3 mycode.c -I $ADVISOR_XE_2015_DIR/include |
464 | | $ advixe-cl --collect suitability --project-dir ./advi ./a.out |
465 | | }}} |
466 | | |
467 | | {{{#!bash |
468 | | $ advixe-cl --report suitability --project-dir ./advi ./a.out |
469 | | }}} |
470 | | |
471 | | |
472 | | === Correctness === |
473 | | Collect correctness data. Note that annotations must be present in the source |
474 | | code for this collection to be successful. Typically an application with debug symbols |
475 | | is used when collecting correctness data. |
476 | | |
477 | | {{{#!bash |
478 | | $ icc -g -O0 mycode.c -I $ADVISOR_XE_2015_DIR/include |
479 | | $ advixe-cl --collect correctness --project-dir ./advi ./a.out |
480 | | }}} |
481 | | |
482 | | {{{#!bash |
483 | | $ advixe-cl --report correctness --project-dir ./advi ./a.out |
484 | | }}} |
485 | | |
486 | | Display a list of annotations present. |
487 | | {{{#!bash |
488 | | $ advixe-cl --report annotations --project-dir ./advi ./a.out |
489 | | }}} |
490 | | Update the application using the chosen parallel coding constructs. Rebuild the application and test. |
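Each Advisor step above follows the same collect-then-report pattern, so it can be sketched as one helper. The names `advisor_step` and `ADVIXE` are hypothetical; the `advixe-cl` options and the `./advi` project directory are the ones used in the examples above.

```bash
# Hypothetical helper for the collect/report pairs shown on this page.
# ADVIXE can be overridden; it defaults to advixe-cl.
ADVIXE=${ADVIXE:-advixe-cl}

# Usage: advisor_step <survey|suitability|correctness> <executable> [project-dir]
advisor_step() {
    analysis=$1
    exe=$2
    project=${3:-./advi}              # same project directory as in the examples
    case $analysis in
        survey|suitability|correctness) ;;
        *) echo "unknown analysis: $analysis" >&2; return 2 ;;
    esac
    # Collect the data; only print the report if collection succeeded.
    "$ADVIXE" --collect "$analysis" --project-dir "$project" "$exe" || return 1
    "$ADVIXE" --report "$analysis" --project-dir "$project" "$exe"
}
```

For example, `advisor_step survey ./a.out` on the optimized build, then `advisor_step correctness ./a.out` on a build compiled with `-g -O0`. The helper does not rebuild anything, so the executable must already be compiled with the flags appropriate for each step.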
491 | | |
492 | | [[Advisor Brief Tutorial]] |
493 | | |
494 | | == Intel® VTune™ Amplifier 2015 == |
495 | | * Intuitive CPU & GPU performance tuning, multi-core scalability, bandwidth and more |
496 | | * Quick performance insight with advanced data visualization |
497 | | * Automate regression tests and collect data remotely |
498 | | |
499 | | Compile your code with the '-g' option, which tells the compiler to generate full debugging information in the object file. |
500 | | {{{#!bash |
501 | | [fuji@cypress1 ~]$ icc -g -o mytest mytest.c |
502 | | }}} |
503 | | |
504 | | ==== Run and Collect Information ==== |
505 | | Start an interactive job, |
506 | | {{{#!bash |
507 | | [fuji@cypress1 ~]$ idev |
508 | | }}} |
509 | | To collect information, run the code, for example, |
510 | | {{{#!bash |
511 | | [fuji@cypress1 ~]$ amplxe-cl -collect hotspots ./mytest |
512 | | }}} |
513 | | This will create a directory like '''r000hs'''. |
514 | | |
515 | | '''-collect''' options |
516 | | |
517 | | ||= concurrency =|| Concurrency analysis || |
518 | | ||= hotspots =|| Hotspots analysis || |
519 | | ||= lightweight-hotspots =|| Lightweight Hotspots analysis || |
520 | | ||= locksandwaits =|| Locks and Waits analysis || |
521 | | |
522 | | To show results, for example, |
523 | | {{{#!bash |
524 | | [fuji@cypress1 ~]$ amplxe-cl -report hotspots -r r000hs |
525 | | }}} |
526 | | |
527 | | '''-report''' options |
528 | | |
529 | | ||= summary =|| Display data for the overall performance of the target. || |
530 | | ||= hotspots =|| Display functions with the highest CPU time. || |
531 | | ||= wait-time =|| Display wait time. || |
532 | | ||= perf =|| Display performance data for each module of the target. || |
533 | | ||= perf-detail =|| Display performance data for each function of the target. || |
534 | | ||= callstacks =|| Display CPU or Wait time for call stacks. || |
535 | | ||= top-down =|| Display a call tree for your target application and provide CPU and Wait time for each function. || |
536 | | ||= gprof-cc =|| Display CPU or wait time in a gprof-like format. || |
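Several of the report types in the table above can be printed from one result directory in a single pass. The sketch below is an illustration only: `vtune_reports` and the `AMPLXE` variable are hypothetical names, and the default of `summary` plus `hotspots` is an arbitrary choice.

```bash
# Hypothetical helper: print several amplxe-cl reports for one result directory.
# AMPLXE can be overridden; it defaults to amplxe-cl.
AMPLXE=${AMPLXE:-amplxe-cl}

# Usage: vtune_reports <result-dir> [report-type...]
vtune_reports() {
    if [ $# -lt 1 ]; then
        echo "usage: vtune_reports <result-dir> [report-type...]" >&2
        return 2
    fi
    result=$1
    shift
    # Default to two report types from the table when none are given.
    [ $# -gt 0 ] || set -- summary hotspots
    for report in "$@"; do
        echo "== $report =="
        "$AMPLXE" -report "$report" -r "$result"
    done
}
```

For example, `vtune_reports r000hs summary top-down` prints the summary followed by the call tree for the result directory `r000hs`.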
537 | | |
538 | | [[VTune Brief Tutorial]] |