
== Intel® Inspector XE ==
Memory and Thread Debugger:
 * Debug memory errors such as leaks and allocation errors, and threading errors such as data races and deadlocks.

==== Setting Environment and Compiling your code ====
Load the module to set up the Intel compilers and tools.
{{{#!bash
[fuji@cypress1 ~]$ module load intel-psxe/2015-update1
}}}
Compile your code with the '-g' option, which tells the compiler to generate full debugging information in the object file.
{{{#!bash
[fuji@cypress1 ~]$ icc -g -o mytest mytest.c
}}}
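
For a first run it can help to analyze a program whose defect is known in advance. The following is a minimal sketch of code that could go into a `mytest.c`; the function name `fill_buffers` and the leak itself are hypothetical and exist only so that a memory error analysis has something to report.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates `count` buffers of `size` ints, sums their
 * contents, and frees all but the last one. The last buffer is leaked on
 * purpose so that a memory error analysis has a leak to report. */
long fill_buffers(int count, int size)
{
    long total = 0;
    for (int i = 0; i < count; i++) {
        int *buf = malloc((size_t)size * sizeof *buf);
        if (buf == NULL) {
            return -1;               /* allocation failed */
        }
        for (int j = 0; j < size; j++) {
            buf[j] = i + j;
            total += buf[j];
        }
        if (i < count - 1) {
            free(buf);               /* normal path: release the buffer */
        }
        /* i == count - 1: deliberately not freed (the intentional leak) */
    }
    return total;
}
```

Calling `fill_buffers(3, 4)` from `main` and building with '-g' as shown above should give the analysis exactly one unreleased allocation to point at.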

==== Run and Collect Information ====
Start an interactive job:
{{{#!bash
[fuji@cypress1 ~]$ idev
}}}
To collect information, run the code under inspxe-cl, for example:
{{{#!bash
[fuji@cypress1 ~]$ inspxe-cl -collect=mi2 -app-working-dir=$PWD -result-dir=$PWD/results $PWD/mytest
}}}

'''-collect=''' options

Memory error analysis types
||= mi1 =|| Detect memory leaks ||
||= mi2 =|| Detect memory leaks and memory access problems ||
||= mi3 =|| Find locations of memory leaks and memory access problems ||

Threading error analysis types
||= ti1 =|| Detect deadlocks ||
||= ti2 =|| Detect deadlocks and data races ||
||= ti3 =|| Find locations of deadlocks and data races ||

To show the results, for example:
{{{#!bash
[fuji@cypress1 ~]$ inspxe-cl -R problems -r $PWD/results
}}}
See [https://software.intel.com/en-us/node/528226 here] for details.

[[Inspector Brief Tutorial]]

== Intel® Advisor XE ==
Threading design and prototyping tool for software architects:
 * Analyze, design, tune and check your threading design before implementation
 * Explore and test threading options without disrupting normal development
 * Predict threading errors and performance scaling on systems with more cores

=== Survey ===
Survey the application to determine hotspots. Typically an optimized
(non-debug) version of the application is used when surveying an application.

Run and collect information:
{{{#!bash
$ icc -g -O3 mycode.c
$ advixe-cl --collect survey --project-dir ./advi ./a.out
}}}

Show the report:
{{{#!bash
$ advixe-cl --report survey --project-dir ./advi ./a.out
}}}

=== Add Annotations ===
Add annotations to the application source code, and rebuild the application.
Please see the Getting Started Tutorial for more information.

For C/C++
{{{#!c
#include "advisor-annotate.h"
.....
ANNOTATE_SITE_BEGIN(sitename1);
for ( ....
{
    ANNOTATE_TASK_BEGIN(taskname1);
    ...
    ANNOTATE_TASK_END();
}
ANNOTATE_SITE_END();
}}}
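
Put together, an annotated loop might look like the sketch below. The guard macro `USE_ADVISOR_ANNOTATIONS`, the function `sum_squares`, and the site and task names are assumptions made for illustration; the no-op fallback definitions exist only so that the file still compiles where `advisor-annotate.h` is not on the include path.

```c
#ifdef USE_ADVISOR_ANNOTATIONS        /* hypothetical build switch */
#include "advisor-annotate.h"
#else
/* No-op stand-ins so the code builds without the Advisor header. */
#define ANNOTATE_SITE_BEGIN(name)
#define ANNOTATE_SITE_END()
#define ANNOTATE_TASK_BEGIN(name)
#define ANNOTATE_TASK_END()
#endif

/* Sum of squares 0^2 + 1^2 + ... + (n-1)^2. The loop is the candidate
 * parallel site and each iteration is one task. */
long sum_squares(int n)
{
    long total = 0;
    ANNOTATE_SITE_BEGIN(sum_site);
    for (int i = 0; i < n; i++) {
        ANNOTATE_TASK_BEGIN(square_task);
        total += (long)i * i;         /* shared accumulator across tasks */
        ANNOTATE_TASK_END();
    }
    ANNOTATE_SITE_END();
    return total;
}
```

The accumulator `total` is written by every task, which is the kind of sharing a later correctness check is meant to surface before the loop is actually parallelized.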

For Fortran
{{{#!fortran
use advisor_annotate
.....
call annotate_site_begin(sitename1)
do .....
    call annotate_task_begin(taskname1)
    ....
    call annotate_task_end()
enddo
call annotate_site_end()
}}}

=== Suitability ===
Collect suitability data. Note that annotations must be present in the source
code for this collection to be successful. Typically an optimized (non-debug) version
of the application is used when collecting suitability data.

{{{#!bash
$ icc -g -O3 mycode.c -I $ADVISOR_XE_2015_DIR/include
$ advixe-cl --collect suitability --project-dir ./advi ./a.out
}}}

Show the report:
{{{#!bash
$ advixe-cl --report suitability --project-dir ./advi ./a.out
}}}

=== Correctness ===
Collect correctness data. Note that annotations must be present in the source
code for this collection to be successful. Typically an unoptimized build with debug symbols
is used when collecting correctness data. Because the annotated source includes the annotation header, the include path is needed here as well.

{{{#!bash
$ icc -g -O0 mycode.c -I $ADVISOR_XE_2015_DIR/include
$ advixe-cl --collect correctness --project-dir ./advi ./a.out
}}}

Show the report:
{{{#!bash
$ advixe-cl --report correctness --project-dir ./advi ./a.out
}}}

Display a list of the annotations present:
{{{#!bash
$ advixe-cl --report annotations --project-dir ./advi ./a.out
}}}
Update the application using the chosen parallel coding constructs. Rebuild the application and test.

[[Advisor Brief Tutorial]]

== Intel® VTune™ Amplifier 2015 ==
 * Intuitive CPU & GPU performance tuning, multi-core scalability, bandwidth and more
 * Quick performance insight with advanced data visualization
 * Automate regression tests and collect data remotely

Compile your code with the '-g' option, which tells the compiler to generate full debugging information in the object file.
{{{#!bash
[fuji@cypress1 ~]$ icc -g -o mytest mytest.c
}}}

==== Run and Collect Information ====
Start an interactive job:
{{{#!bash
[fuji@cypress1 ~]$ idev
}}}
To collect information, run the code under amplxe-cl, for example:
{{{#!bash
[fuji@cypress1 ~]$ amplxe-cl -collect hotspots ./mytest
}}}
This will create a result directory with a name like '''r000hs'''.
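
A profile is easiest to read when you already know where the time ought to go. The sketch below is a hypothetical workload (`slow_count_primes` and `is_prime` are made-up names, not part of any Intel tool) in which nearly all CPU time should be spent in the prime-counting code, so a hotspots report has an obvious top entry.

```c
/* Trial-division primality test: intentionally naive so that it accounts
 * for most of the run time. */
static int is_prime(int n)
{
    if (n < 2) {
        return 0;
    }
    for (int d = 2; (long)d * d <= n; d++) {
        if (n % d == 0) {
            return 0;
        }
    }
    return 1;
}

/* Counts the primes in [0, limit). */
int slow_count_primes(int limit)
{
    int count = 0;
    for (int n = 0; n < limit; n++) {
        count += is_prime(n);
    }
    return count;
}
```

Calling `slow_count_primes` from `main` with a limit large enough to run for several seconds gives the collector enough samples to work with. With optimization enabled the compiler may inline `is_prime`, in which case the time would be attributed to `slow_count_primes` instead.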

'''-collect''' options

||= concurrency =|| Concurrency analysis ||
||= hotspots =|| Hotspots analysis ||
||= lightweight-hotspots =|| Lightweight Hotspots analysis ||
||= locksandwaits =|| Locks and Waits analysis ||

To show the results, for example:
{{{#!bash
[fuji@cypress1 ~]$ amplxe-cl -report hotspots -r r000hs
}}}

'''-report''' options

||= summary =|| Display data for the overall performance of the target. ||
||= hotspots =|| Display functions with the highest CPU time. ||
||= wait-time =|| Display wait time. ||
||= perf =|| Display performance data for each module of the target. ||
||= perf-detail =|| Display performance data for each function of the target. ||
||= callstacks =|| Display CPU or wait time for call stacks. ||
||= top-down =|| Display a call tree for your target application and provide CPU and wait time for each function. ||
||= gprof-cc =|| Display CPU or wait time in a gprof-like format. ||

[[VTune Brief Tutorial]]