| 337 | | |
| 338 | | |
| 339 | | |
| 340 | | --------------------------------------------------------- |
| 341 | | = Code Debugging / Profiling = |
| 342 | | |
| 343 | | Users are encouraged to debug their code on a local machine first. |
| 344 | | |
| 345 | | GDB is the standard debugger. |
| 346 | | |
| 347 | | [http://www.gnu.org/software/gdb/documentation/] |
| 348 | | |
| 349 | | For profiling, gprof is the standard tool. |
| 350 | | |
| 351 | | [https://sourceware.org/binutils/docs/gprof/] |
| 352 | | |
| 353 | | 'valgrind' can detect memory management and threading bugs. |
| 354 | | |
| 355 | | [http://valgrind.org/docs/manual/index.html] |
| 356 | | |
| 357 | | All of the tools above are available on Cypress. |
| 358 | | |
| 359 | | Several Intel tools are also installed on Cypress. |
| 360 | | |
| 361 | | == Intel® Inspector XE == |
| 362 | | Memory and Thread Debugger: |
| 363 | | * Debug memory errors like leaks and allocation errors and threading errors like data races and deadlocks. |
| 364 | | |
| 365 | | ==== Setting Environment and Compiling your code ==== |
| 366 | | Load the module to set up the Intel compilers and tools. |
| 367 | | {{{#!bash |
| 368 | | [fuji@cypress1 ~]$ module load intel-psxe/2015-update1 |
| 369 | | }}} |
| 370 | | Compile the code with the '-g' option, which tells the compiler to generate full debugging information in the object file. |
| 371 | | {{{#!bash |
| 372 | | [fuji@cypress1 ~]$ icc -g -o mytest mytest.c |
| 373 | | }}} |
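| | | |
| | | As a minimal sketch, a test program for the steps in this section could look like the following. The file name mytest.c matches the compile command above; the function name leaky_sum and the deliberate leak are assumptions made for illustration, so that a memory error analysis has something to report. |
| | | {{{#!c |
| | | #include <stdio.h> |
| | | #include <stdlib.h> |
| | | |
| | | /* Hypothetical helper: sums 0..n-1 through a heap buffer that is never freed. */ |
| | | long leaky_sum(int n) |
| | | { |
| | |     long total = 0; |
| | |     int i; |
| | |     int *buf = malloc((size_t)n * sizeof *buf); |
| | |     if (buf == NULL) |
| | |         return -1; |
| | |     for (i = 0; i < n; i++) { |
| | |         buf[i] = i; |
| | |         total += buf[i]; |
| | |     } |
| | |     /* Deliberately missing free(buf): this is the leak to look for. */ |
| | |     return total; |
| | | } |
| | | |
| | | int main(void) |
| | | { |
| | |     printf("sum = %ld\n", leaky_sum(1000)); |
| | |     return 0; |
| | | } |
| | | }}} |
| | | Adding free(buf) before the return removes the leak, which gives a simple before/after comparison. |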
| 374 | | |
| 375 | | ==== Run and Collect Information ==== |
| 376 | | Start an interactive job, |
| 377 | | {{{#!bash |
| 378 | | [fuji@cypress1 ~]$ idev |
| 379 | | }}} |
| 380 | | To collect information, run the code, for example, |
| 381 | | {{{#!bash |
| 382 | | [fuji@cypress1 ~]$ inspxe-cl -collect=mi2 -app-working-dir=$PWD -result-dir=$PWD/results $PWD/mytest |
| 383 | | }}} |
| 384 | | |
| 385 | | '''-collect=''' options |
| 386 | | |
| 387 | | Memory error analysis types |
| 388 | | ||= mi1 =|| Detect memory leaks || |
| 389 | | ||= mi2 =|| Detect memory leaks and memory access problems || |
| 390 | | ||= mi3 =|| Find locations of memory leaks and memory access problems || |
| 391 | | |
| 392 | | Threading error analysis types |
| 393 | | ||= ti1 =|| Detect deadlocks || |
| 394 | | ||= ti2 =|| Detect deadlocks and data races || |
| 395 | | ||= ti3 =|| Find locations of deadlocks and data races || |
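| | | |
| | | To try the threading analysis types, a small program with a deliberate data race is useful. The following is only a sketch: the names increment and racy_count are hypothetical, and it assumes POSIX threads are available (the program has to be linked against the pthread library). Two threads update a shared counter without any lock. |
| | | {{{#!c |
| | | #include <pthread.h> |
| | | #include <stdio.h> |
| | | |
| | | #define ITERATIONS 100000 |
| | | |
| | | static long counter = 0;   /* shared, deliberately unprotected */ |
| | | |
| | | static void *increment(void *arg) |
| | | { |
| | |     int i; |
| | |     (void)arg; |
| | |     for (i = 0; i < ITERATIONS; i++) |
| | |         counter++;   /* unsynchronized read-modify-write: the data race */ |
| | |     return NULL; |
| | | } |
| | | |
| | | /* Runs two threads racing on counter and returns the final value. */ |
| | | long racy_count(void) |
| | | { |
| | |     pthread_t t1, t2; |
| | |     counter = 0; |
| | |     if (pthread_create(&t1, NULL, increment, NULL) != 0) |
| | |         return -1; |
| | |     if (pthread_create(&t2, NULL, increment, NULL) != 0) { |
| | |         pthread_join(t1, NULL); |
| | |         return -1; |
| | |     } |
| | |     pthread_join(t1, NULL); |
| | |     pthread_join(t2, NULL); |
| | |     return counter; |
| | | } |
| | | |
| | | int main(void) |
| | | { |
| | |     printf("counter = %ld, race-free value would be %d\n", racy_count(), 2 * ITERATIONS); |
| | |     return 0; |
| | | } |
| | | }}} |
| | | Protecting the increment with a pthread mutex removes the race and makes the printed value reproducible. |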
| 396 | | |
| 397 | | To show results, for example, |
| 398 | | {{{#!bash |
| 399 | | [fuji@cypress1 ~]$ inspxe-cl -R problems -r $PWD/results |
| 400 | | }}} |
| 401 | | See [https://software.intel.com/en-us/node/528226 here] for details. |
| 402 | | |
| 403 | | [[Inspector Brief Tutorial]] |
| 404 | | |
| 405 | | == Intel® Advisor XE == |
| 406 | | Threading design and prototyping tool for software architects: |
| 407 | | * Analyze, design, tune and check your threading design before implementation |
| 408 | | * Explore and test threading options without disrupting normal development |
| 409 | | * Predict threading errors & performance scaling on systems with more cores |
| 410 | | |
| 411 | | === Survey === |
| 412 | | Survey the application to determine hotspots. Typically an optimized |
| 413 | | (non-debug) version of the application is used when surveying an application. |
| 414 | | |
| 415 | | Run and collect information. |
| 416 | | {{{#!bash |
| 417 | | $ icc -g -O3 mycode.c |
| 418 | | $ advixe-cl --collect survey --project-dir ./advi ./a.out |
| 419 | | }}} |
| 420 | | |
| 421 | | Show the report. |
| 422 | | {{{#!bash |
| 423 | | $ advixe-cl --report survey --project-dir ./advi ./a.out |
| 424 | | }}} |
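| | | |
| | | A small mycode.c to try the survey commands above on might look like this. It is a sketch only: the function names count_pairs and sum_to are hypothetical, and the nested loop is made deliberately expensive so that one function is expected to dominate the run time. |
| | | {{{#!c |
| | | #include <stdio.h> |
| | | |
| | | /* Hypothetical hot spot: O(n*n) loop counting pairs with (i + j) divisible by 3. */ |
| | | long count_pairs(int n) |
| | | { |
| | |     long count = 0; |
| | |     int i, j; |
| | |     for (i = 0; i < n; i++) |
| | |         for (j = 0; j < n; j++) |
| | |             if ((i + j) % 3 == 0) |
| | |                 count++; |
| | |     return count; |
| | | } |
| | | |
| | | /* Cheap by comparison: a single O(n) pass. */ |
| | | long sum_to(int n) |
| | | { |
| | |     long total = 0; |
| | |     int i; |
| | |     for (i = 0; i < n; i++) |
| | |         total += i; |
| | |     return total; |
| | | } |
| | | |
| | | int main(void) |
| | | { |
| | |     printf("pairs = %ld\n", count_pairs(20000)); |
| | |     printf("sum   = %ld\n", sum_to(20000)); |
| | |     return 0; |
| | | } |
| | | }}} |
| | | The outer loop of count_pairs is the kind of loop one would then mark as a site in the annotation step. |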
| 425 | | |
| 426 | | === Add Annotations === |
| 427 | | Add annotations to the application source code, and rebuild the application. |
| 428 | | Please see the Getting Started Tutorial for more information. |
| 429 | | |
| 430 | | For C/C++ |
| 431 | | {{{#!c |
| 432 | | #include "advisor-annotate.h" |
| 433 | | ..... |
| 434 | | ANNOTATE_SITE_BEGIN(sitename1); |
| 435 | | for ( .... |
| 436 | | { |
| 437 | | ANNOTATE_TASK_BEGIN(taskname1); |
| 438 | | ... |
| 439 | | ANNOTATE_TASK_END(); |
| 440 | | } |
| 441 | | ANNOTATE_SITE_END(); |
| 442 | | }}} |
| 443 | | |
| 444 | | For Fortran |
| 445 | | {{{#!fortran |
| 446 | | use advisor_annotate |
| 447 | | ..... |
| 448 | | call annotate_site_begin("sitename1")   ! site and task names are passed as character strings |
| 449 | | do ..... |
| 450 | | call annotate_task_begin("taskname1") |
| 451 | | .... |
| 452 | | call annotate_task_end() |
| 453 | | enddo |
| 454 | | call annotate_site_end() |
| 455 | | }}} |
| 456 | | |
| 457 | | === Suitability === |
| 458 | | Collect suitability data. Note that annotations must be present in the source |
| 459 | | code for this collection to be successful. Typically an optimized (non-debug) version |
| 460 | | of the application is used when collecting suitability data. |
| 461 | | |
| 462 | | {{{#!bash |
| 463 | | $ icc -g -O3 mycode.c -I $ADVISOR_XE_2015_DIR/include |
| 464 | | $ advixe-cl --collect suitability --project-dir ./advi ./a.out |
| 465 | | }}} |
| 466 | | |
| 467 | | {{{#!bash |
| 468 | | $ advixe-cl --report suitability --project-dir ./advi ./a.out |
| 469 | | }}} |
| 470 | | |
| 471 | | |
| 472 | | === Correctness === |
| 473 | | Collect correctness data. Note that annotations must be present in the source |
| 474 | | code for this collection to be successful. Typically an application with debug symbols |
| 475 | | is used when collecting correctness data. |
| 476 | | |
| 477 | | {{{#!bash |
| 478 | | $ icc -g -O0 mycode.c -I $ADVISOR_XE_2015_DIR/include |
| 479 | | $ advixe-cl --collect correctness --project-dir ./advi ./a.out |
| 480 | | }}} |
| 481 | | |
| 482 | | {{{#!bash |
| 483 | | $ advixe-cl --report correctness --project-dir ./advi ./a.out |
| 484 | | }}} |
| 485 | | |
| 486 | | Display a list of annotations present. |
| 487 | | {{{#!bash |
| 488 | | $ advixe-cl --report annotations --project-dir ./advi ./a.out |
| 489 | | }}} |
| 490 | | Update the application using the chosen parallel coding constructs. Rebuild the application and test. |
| 491 | | |
| 492 | | [[Advisor Brief Tutorial]] |
| 493 | | |
| 494 | | == Intel® VTune™ Amplifier 2015 == |
| 495 | | * Intuitive CPU & GPU performance tuning, multi-core scalability, bandwidth and more |
| 496 | | * Quick performance insight with advanced data visualization |
| 497 | | * Automate regression tests and collect data remotely |
| 498 | | |
| 499 | | Compile the code with the '-g' option, which tells the compiler to generate full debugging information in the object file. |
| 500 | | {{{#!bash |
| 501 | | [fuji@cypress1 ~]$ icc -g -o mytest mytest.c |
| 502 | | }}} |
| 503 | | |
| 504 | | ==== Run and Collect Information ==== |
| 505 | | Start an interactive job, |
| 506 | | {{{#!bash |
| 507 | | [fuji@cypress1 ~]$ idev |
| 508 | | }}} |
| 509 | | To collect information, run the code, for example, |
| 510 | | {{{#!bash |
| 511 | | [fuji@cypress1 ~]$ amplxe-cl -collect hotspots ./mytest |
| 512 | | }}} |
| 513 | | This will create a directory like '''r000hs'''. |
| 514 | | |
| 515 | | '''-collect''' options |
| 516 | | |
| 517 | | ||= concurrency =|| Concurrency analysis || |
| 518 | | ||= hotspots =|| Hotspots analysis || |
| 519 | | ||= lightweight-hotspots =|| Lightweight Hotspots analysis || |
| 520 | | ||= locksandwaits =|| Locks and Waits analysis || |
| 521 | | |
| 522 | | To show results, for example, |
| 523 | | {{{#!bash |
| 524 | | [fuji@cypress1 ~]$ amplxe-cl -report hotspots -r r000hs |
| 525 | | }}} |
| 526 | | |
| 527 | | '''-report''' options |
| 528 | | |
| 529 | | ||= summary =|| Display data for the overall performance of the target. || |
| 530 | | ||= hotspots =|| Display functions with the highest CPU time. || |
| 531 | | ||= wait-time =|| Display Wait time. || |
| 532 | | ||= perf =|| Display performance data for each module of the target. || |
| 533 | | ||= perf-detail =|| Display performance data for each function of the target. || |
| 534 | | ||= callstacks =|| Display CPU or Wait time for call stacks. || |
| 535 | | ||= top-down =|| Display a call tree for your target application and provide CPU and Wait time for each function. || |
| 536 | | ||= gprof-cc =|| Display CPU or wait time in the gprof-like format. || |
| 537 | | |
| 538 | | [[VTune Brief Tutorial]] |