| 318 | |
| 319 | == Intel® Inspector XE == |
| 320 | Memory and Thread Debugger: |
| 321 | * Debug memory errors (leaks, allocation errors) and threading errors (data races, deadlocks). |
| 322 | |
| 323 | ==== Setting Environment and Compiling your code ==== |
| 324 | Load the module to set up the Intel compilers and tools. |
| 325 | {{{#!bash |
| 326 | [fuji@cypress1 ~]$ module load intel-psxe/2015-update1 |
| 327 | }}} |
| 328 | Compile your code with the '-g' option, which tells the compiler to generate full debugging information in the object file. |
| 329 | {{{#!bash |
| 330 | [fuji@cypress1 ~]$ icc -g -o mytest mytest.c |
| 331 | }}} |
| 332 | |
| 333 | ==== Run and Collect Information ==== |
| 334 | Start an interactive job, |
| 335 | {{{#!bash |
| 336 | [fuji@cypress1 ~]$ idev |
| 337 | }}} |
| 338 | To collect information, run the code under `inspxe-cl`, for example: |
| 339 | {{{#!bash |
| 340 | [fuji@cypress1 ~]$ inspxe-cl -collect=mi2 -app-working-dir=$PWD -result-dir=$PWD/results $PWD/mytest |
| 341 | }}} |
| 342 | |
| 343 | '''-collect=''' options |
| 344 | |
| 345 | Memory error analysis types |
| 346 | ||= mi1 =|| Detect memory leaks || |
| 347 | ||= mi2 =|| Detect memory leaks and memory access problems || |
| 348 | ||= mi3 =|| Find locations of memory leaks and memory access problems || |
| 349 | |
| 350 | Threading error analysis types |
| 351 | ||= ti1 =|| Detect deadlocks || |
| 352 | ||= ti2 =|| Detect deadlocks and data races || |
| 353 | ||= ti3 =|| Find locations of deadlocks and data races || |
| 354 | |
| 355 | To show results, for example, |
| 356 | {{{#!bash |
| 357 | [fuji@cypress1 ~]$ inspxe-cl -R problems -r $PWD/results |
| 358 | }}} |
| 359 | See [https://software.intel.com/en-us/node/528226 here] for details. |
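The collect and report steps above can be combined into a small dry-run helper. This is only a sketch: the function names (`valid_analysis`, `collect_cmd`, `report_cmd`) and the `results_<type>` directory naming are hypothetical, and the script prints the `inspxe-cl` command lines rather than running them, so it can be tried on a machine without the Intel tools loaded.

```shell
#!/bin/bash
# Sketch: validate an Inspector analysis type and print the matching
# collect and report commands. All function names are hypothetical.

# Succeeds only for the analysis types listed in the tables above.
valid_analysis() {
    case "$1" in
        mi1|mi2|mi3|ti1|ti2|ti3) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the collect command: $1 = analysis type, $2 = executable, $3 = result dir.
collect_cmd() {
    echo "inspxe-cl -collect=$1 -app-working-dir=$PWD -result-dir=$3 $2"
}

# Print the report command: $1 = result dir.
report_cmd() {
    echo "inspxe-cl -R problems -r $1"
}

# Defaults mirror the example above; the result directory name is an assumption.
analysis=${1:-mi2}
app=${2:-$PWD/mytest}
results=${3:-$PWD/results_$analysis}

if valid_analysis "$analysis"; then
    collect_cmd "$analysis" "$app" "$results"
    report_cmd "$results"
else
    echo "unknown analysis type: $analysis" >&2
fi
```

Using one result directory per analysis type keeps, for example, an `mi2` run and a `ti2` run of the same executable from sharing a directory. Once the printed commands look right, they can be pasted into the interactive session.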
| 360 | |
| 361 | [[Inspector Brief Tutorial]] |
| 362 | |
| 363 | == Intel® Advisor XE == |
| 364 | Threading design and prototyping tool for software architects: |
| 365 | * Analyze, design, tune and check your threading design before implementation |
| 366 | * Explore and test threading options without disrupting normal development |
| 367 | * Predict threading errors & performance scaling on systems with more cores |
| 368 | |
| 369 | === Survey === |
| 370 | Survey the application to determine hotspots. Typically an optimized |
| 371 | (non-debug) version of the application is used when surveying an application. |
| 372 | |
| 373 | Compile, run, and collect information. |
| 374 | {{{#!bash |
| 375 | $ icc -g -O3 mycode.c |
| 376 | $ advixe-cl --collect survey --project-dir ./advi ./a.out |
| 377 | }}} |
| 378 | |
| 379 | Show report |
| 380 | {{{#!bash |
| 381 | $ advixe-cl --report survey --project-dir ./advi ./a.out |
| 382 | }}} |
| 383 | |
| 384 | === Add Annotations === |
| 385 | Add annotations to the application source code, and rebuild the application. |
| 386 | Please see the Getting Started Tutorial for more information. |
| 387 | |
| 388 | For C/C++ |
| 389 | {{{#!c |
| 390 | #include "advisor-annotate.h" |
| 391 | ..... |
| 392 | ANNOTATE_SITE_BEGIN(sitename1); |
| 393 | for ( .... |
| 394 | { |
| 395 | ANNOTATE_TASK_BEGIN(taskname1); |
| 396 | ... |
| 397 | ANNOTATE_TASK_END(); |
| 398 | } |
| 399 | ANNOTATE_SITE_END(); |
| 400 | }}} |
| 401 | |
| 402 | For Fortran |
| 403 | {{{#!fortran |
| 404 | use advisor_annotate |
| 405 | ..... |
| 406 | call annotate_site_begin("sitename1") |
| 407 | do ..... |
| 408 | call annotate_task_begin("taskname1") |
| 409 | .... |
| 410 | call annotate_task_end() |
| 411 | enddo |
| 412 | call annotate_site_end() |
| 413 | }}} |
| 414 | |
| 415 | === Suitability === |
| 416 | Collect suitability data. Note that annotations must be present in the source |
| 417 | code for this collection to be successful. Typically an optimized (non-debug) version |
| 418 | of the application is used when collecting suitability data. |
| 419 | |
| 420 | {{{#!bash |
| 421 | $ icc -g -O3 mycode.c -I $ADVISOR_XE_2015_DIR/include |
| 422 | $ advixe-cl --collect suitability --project-dir ./advi ./a.out |
| 423 | }}} |
| 424 | |
| 425 | {{{#!bash |
| 426 | $ advixe-cl --report suitability --project-dir ./advi ./a.out |
| 427 | }}} |
| 428 | |
| 429 | |
| 430 | === Correctness === |
| 431 | Collect correctness data. Note that annotations must be present in the source |
| 432 | code for this collection to be successful. Typically an application with debug symbols |
| 433 | is used when collecting correctness data. |
| 434 | |
| 435 | {{{#!bash |
| 436 | $ icc -g -O0 mycode.c -I $ADVISOR_XE_2015_DIR/include |
| 437 | $ advixe-cl --collect correctness --project-dir ./advi ./a.out |
| 438 | }}} |
| 439 | |
| 440 | {{{#!bash |
| 441 | $ advixe-cl --report correctness --project-dir ./advi ./a.out |
| 442 | }}} |
| 443 | |
| 444 | Display a list of annotations present. |
| 445 | {{{#!bash |
| 446 | $ advixe-cl --report annotations --project-dir ./advi ./a.out |
| 447 | }}} |
| 448 | Update the application using the chosen parallel coding constructs. Rebuild the application and test. |
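The three Advisor stages differ mainly in their compile flags, which the following sketch summarizes. `advisor_stage_cmds` is a hypothetical helper, not an Intel tool. It only prints the commands for a stage. The include path on the correctness build is an assumption, made because the annotated source includes `advisor-annotate.h`.

```shell
#!/bin/bash
# Sketch: print the compile, collect, and report commands for one Advisor stage.
# advisor_stage_cmds is a hypothetical helper; nothing is executed.

advisor_stage_cmds() {
    local stage=$1 src=$2 project=${3:-./advi}
    local flags
    case "$stage" in
        # Survey and suitability use an optimized build.
        survey)      flags="-g -O3" ;;
        suitability) flags="-g -O3 -I \$ADVISOR_XE_2015_DIR/include" ;;
        # Correctness uses an unoptimized build with debug symbols.
        # The -I path here is an assumption (annotated source needs the header).
        correctness) flags="-g -O0 -I \$ADVISOR_XE_2015_DIR/include" ;;
        *) echo "unknown stage: $stage" >&2; return 1 ;;
    esac
    echo "icc $flags $src"
    echo "advixe-cl --collect $stage --project-dir $project ./a.out"
    echo "advixe-cl --report $stage --project-dir $project ./a.out"
}

# Print the commands for each stage in workflow order.
for stage in survey suitability correctness; do
    advisor_stage_cmds "$stage" mycode.c
done
```

The environment variable is printed unexpanded so the output can be pasted into a shell where the Intel module is loaded.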
| 449 | |
| 450 | [[Advisor Brief Tutorial]] |
| 451 | |
| 452 | == Intel® VTune™ Amplifier 2015 == |
| 453 | * Intuitive CPU & GPU performance tuning, multi-core scalability, bandwidth and more |
| 454 | * Quick performance insight with advanced data visualization |
| 455 | * Automate regression tests and collect data remotely |
| 456 | |
| 457 | Compile your code with the '-g' option, which tells the compiler to generate full debugging information in the object file. |
| 458 | {{{#!bash |
| 459 | [fuji@cypress1 ~]$ icc -g -o mytest mytest.c |
| 460 | }}} |
| 461 | |
| 462 | ==== Run and Collect Information ==== |
| 463 | Start an interactive job, |
| 464 | {{{#!bash |
| 465 | [fuji@cypress1 ~]$ idev |
| 466 | }}} |
| 467 | To collect information, run the code under `amplxe-cl`, for example: |
| 468 | {{{#!bash |
| 469 | [fuji@cypress1 ~]$ amplxe-cl -collect hotspots ./mytest |
| 470 | }}} |
| 471 | This will create a directory like '''r000hs'''. |
| 472 | |
| 473 | '''-collect ''' options |
| 474 | |
| 475 | ||= concurrency =|| Concurrency analysis || |
| 476 | ||= hotspots =|| Hotspots analysis || |
| 477 | ||= lightweight-hotspots =|| Lightweight Hotspots analysis || |
| 478 | ||= locksandwaits =|| Locks and Waits analysis || |
| 479 | |
| 480 | To show results, for example, |
| 481 | {{{#!bash |
| 482 | [fuji@cypress1 ~]$ amplxe-cl -report hotspots -r r000hs |
| 483 | }}} |
| 484 | |
| 485 | '''-report ''' options |
| 486 | |
| 487 | ||= summary =|| Display data for the overall performance of the target. || |
| 488 | ||= hotspots =|| Display functions with the highest CPU time. || |
| 489 | ||= wait-time =|| Display Wait time. || |
| 490 | ||= perf =|| Display performance data for each module of the target. || |
| 491 | ||= perf-detail =|| Display performance data for each function of the target. || |
| 492 | ||= callstacks =|| Display CPU or Wait time for call stacks. || |
| 493 | ||= top-down =|| Display a call tree for your target application and provide CPU and Wait time for each function. || |
| 494 | ||= gprof-cc =|| Display CPU or wait time in the gprof-like format. || |
| 495 | |
| 496 | [[VTune Brief Tutorial]] |