Intel: Finding Parallelism Issues


Lab 3: Finding Parallelism Issues

___________________________________________________________________

 

Developer Product Division


Disclaimer

The information contained in this document is provided for informational purposes only and represents the current view of Intel Corporation ("Intel") and its contributors ("Contributors") on, as of the date of publication. Intel and the Contributors make no commitment to update the information contained in this document, and Intel reserves the right to make changes at any time, without notice.

DISCLAIMER. THIS DOCUMENT, IS PROVIDED "AS IS." NEITHER INTEL, NOR THE CONTRIBUTORS MAKE ANY REPRESENTATIONS OF ANY KIND WITH RESPECT TO PRODUCTS REFERENCED HEREIN, WHETHER SUCH PRODUCTS ARE THOSE OF INTEL, THE CONTRIBUTORS, OR THIRD PARTIES. INTEL, AND ITS CONTRIBUTORS EXPRESSLY DISCLAIM ANY AND ALL WARRANTIES, IMPLIED OR EXPRESS, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR ANY PARTICULAR PURPOSE, NON-INFRINGEMENT, AND ANY WARRANTY ARISING OUT OF THE INFORMATION CONTAINED HEREIN, INCLUDING WITHOUT LIMITATION, ANY PRODUCTS, SPECIFICATIONS, OR OTHER MATERIALS REFERENCED HEREIN. INTEL, AND ITS CONTRIBUTORS DO NOT WARRANT THAT THIS DOCUMENT IS FREE FROM ERRORS, OR THAT ANY PRODUCTS OR OTHER TECHNOLOGY DEVELOPED IN CONFORMANCE WITH THIS DOCUMENT WILL PERFORM IN THE INTENDED MANNER, OR WILL BE FREE FROM INFRINGEMENT OF THIRD PARTY PROPRIETARY RIGHTS, AND INTEL, AND ITS CONTRIBUTORS DISCLAIM ALL LIABILITY THEREFOR. INTEL, AND ITS CONTRIBUTORS DO NOT WARRANT THAT ANY PRODUCT REFERENCED HEREIN OR ANY PRODUCT OR TECHNOLOGY DEVELOPED IN RELIANCE UPON THIS DOCUMENT, IN WHOLE OR IN PART, WILL BE SUFFICIENT, ACCURATE, RELIABLE, COMPLETE, FREE FROM DEFECTS OR SAFE FOR ITS INTENDED PURPOSE, AND HEREBY DISCLAIM ALL LIABILITIES THEREFOR. ANY PERSON MAKING, USING OR SELLING SUCH PRODUCT OR TECHNOLOGY DOES SO AT HIS OR HER OWN RISK.

Licenses may be required. Intel, its contributors and others may have patents or pending patent applications, trademarks, copyrights or other intellectual proprietary rights covering subject matter contained or described in this document. No license, express, implied, by estoppels or otherwise, to any intellectual property rights of Intel or any other party is granted herein. It is your responsibility to seek licenses for such intellectual property rights from Intel and others where appropriate. Limited License Grant. Intel hereby grants you a limited copyright license to copy this document for your use and internal distribution only. You may not distribute this document externally, in whole or in part, to any other person or entity. LIMITED LIABILITY. IN NO EVENT SHALL INTEL, OR ITS CONTRIBUTORS HAVE ANY LIABILITY TO YOU OR TO ANY OTHER THIRD PARTY, FOR ANY LOST PROFITS, LOST DATA, LOSS OF USE OR COSTS OF PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES, OR FOR ANY DIRECT, INDIRECT, SPECIAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF YOUR USE OF THIS DOCUMENT OR RELIANCE UPON THE INFORMATION CONTAINED HEREIN, UNDER ANY CAUSE OF ACTION OR THEORY OF LIABILITY, AND IRRESPECTIVE OF WHETHER INTEL, OR ANY CONTRIBUTOR HAS ADVANCE NOTICE OF THE POSSIBILITY OF SUCH DAMAGES. THESE LIMITATIONS SHALL APPLY NOTWITHSTANDING THE FAILURE OF THE ESSENTIAL PURPOSE OF ANY LIMITED REMEDY.

Intel and Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

*Other names and brands may be claimed as the property of others.

Copyright © 2010, Intel Corporation. All Rights Reserved.

Table of Contents

Lab 3: Finding Parallelism Issues

Developer Product Division

Disclaimer

Lab 3: Finding Parallelism Issues

Activity 1 – Collect Locks-And-Waits Data

Activity 2 – Find Causes of Poor Parallelism

 

 

 

Lab 3: Finding Parallelism Issues

Time Required

Thirty minutes

Objective

In this lab session, you will use Intel® VTune™ Amplifier XE to determine the synchronization objects or APIs that are limiting parallelism in an application.

After successfully completing this lab's activities, you will be able to:

  • Collect parallelism performance data for an application
  • Determine the most significant causes of blocked time in an application

 

Activity 1 – Collect Locks-And-Waits Data

Time Required

Ten minutes

Objective

  • Run the application while collecting blocked time data
  

 

  1. Right-click on analyzing_locks in the Solution Explorer window and select "Set As Startup Project"
  2. Click on the "New Analysis" button
  3. Select "Algorithm Analysis->Locks And Waits" in the analysis type pane
  4. Click "Start". The tachyon application will run. As it runs, it draws an image of several silver balls on the screen. As before, make a note of the execution time displayed in the application window.
  5. After the application completes, Intel® VTune™ Amplifier XE will spend some time analyzing the data. When it finishes, the summary pane for Locks And Waits appears, along with the analysis explanation pane. Read the explanation and then clear the pane.

    At this point the application has run to completion and Intel® VTune™ Amplifier XE is ready to display the analyzed results.
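To make "blocked time" concrete, the following is a minimal sketch of the kind of measurement a locks-and-waits collection attributes to a sync object: how long threads spend waiting to acquire it. It is not how Intel® VTune™ Amplifier XE is implemented. The names TimedMutex, ContentionReport and run_contended are hypothetical, and std::mutex stands in for a Windows critical section so that the sketch stays portable.

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: a mutex wrapper that accumulates the time callers
// spend blocked inside lock(). This only illustrates the idea of per-object
// wait time; it is not the profiler's mechanism.
class TimedMutex {
public:
    void lock() {
        auto start = std::chrono::steady_clock::now();
        mutex_.lock();
        // The mutex is held here, so updating the counters is safe.
        waited_ += std::chrono::steady_clock::now() - start;
        ++acquisitions_;
    }
    void unlock() { mutex_.unlock(); }

    // Read these only after all worker threads have been joined.
    double wait_seconds() const {
        return std::chrono::duration<double>(waited_).count();
    }
    long acquisitions() const { return acquisitions_; }

private:
    std::mutex mutex_;
    std::chrono::steady_clock::duration waited_{};
    long acquisitions_ = 0;
};

// Hypothetical report type for the sketch.
struct ContentionReport {
    long counter;         // total increments performed under the lock
    double wait_seconds;  // summed time threads spent waiting for the lock
    long acquisitions;    // number of times the lock was taken
};

// Starts `threads` workers that each take the lock `iterations` times.
ContentionReport run_contended(int threads, int iterations) {
    TimedMutex m;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) {
                std::lock_guard<TimedMutex> guard(m);
                ++counter;  // protected by m
            }
        });
    }
    for (auto& th : pool) th.join();
    return {counter, m.wait_seconds(), m.acquisitions()};
}
```

Calling run_contended(4, 1000) returns a report whose counter and acquisitions are both 4000. The wait_seconds value varies from run to run and grows as more threads compete for the same lock, which is the pattern to look for in the collected results.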

 

 

Review Questions

 

 

 

 

Activity 2 – Find Causes of Poor Parallelism

Time Required

Ten minutes

Objective

  • Use the Intel® VTune™ Amplifier XE to find a cause of poor parallelism

Code Description

  • Tachyon is a 2-D raytracer/rendering program that displays an image

 

  1. Click the "Bottom-Up" tab. At or near the top is a Sync Object labeled "critical section 0xnnnnnnnn". This critical section is in the user code and is limiting parallelism, as shown by its large wait time and by the amount of "Poor" time (time during which fewer CPUs were in use than were available).
  2. Click the arrow to the left of the term "Critical Section" to expand the list of callers to that Critical Section. Notice that the Critical Section is referenced by the function named draw_task.
  3. Double-click the function name "draw_task". The source and assembly views are now displayed. Note that a Critical Section is used in the draw_task function (you may need to scroll down the source code to see it).

    This Critical Section is not actually needed: the "for" loop is already thread safe. It was accidentally put in as an extra precaution, and it is a source of unneeded serial, "poor CPU usage" time.
  4. To improve the speed and parallelism of the application, comment out the calls to "EnterCriticalSection" and "LeaveCriticalSection" inside draw_task, then rebuild. Run the app and note the execution time in the title bar when it completes.
  5. Rerun the Concurrency analysis as you did in Lab 2 and see whether the app is now faster. Also check that the offending Critical Section is no longer on the list of sync objects.
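The pattern behind this activity can be sketched in isolation. The code below is not the draw_task source from tachyon, which this lab does not reproduce. It is a hypothetical example in which each thread writes only to its own rows of an image buffer, so a lock around the inner loop adds serialization without adding safety. The functions shade and render are illustrative names, and std::mutex stands in for the Windows critical section entered and left with EnterCriticalSection and LeaveCriticalSection.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the per-pixel work of a renderer.
inline int shade(int x, int y) { return (x * 31 + y * 17) % 256; }

// Fills a width x height image using `threads` workers. When use_lock is
// true, each row is computed while holding a mutex, mimicking an
// unnecessary critical section around already thread-safe work.
std::vector<int> render(int width, int height, int threads, bool use_lock) {
    std::vector<int> image(static_cast<std::size_t>(width) * height, 0);
    std::mutex guard;  // stands in for the CRITICAL_SECTION (assumption)
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&, t] {
            // Rows are dealt out round-robin, so no two threads ever
            // write to the same element of `image`.
            for (int y = t; y < height; y += threads) {
                if (use_lock) guard.lock();    // like EnterCriticalSection
                for (int x = 0; x < width; ++x) {
                    image[static_cast<std::size_t>(y) * width + x] = shade(x, y);
                }
                if (use_lock) guard.unlock();  // like LeaveCriticalSection
            }
        });
    }
    for (auto& th : pool) th.join();
    return image;
}
```

Both render(64, 48, 4, true) and render(64, 48, 4, false) produce the same image, because the rows are disjoint. The locked variant only forces the threads to take turns, which is the kind of wait time the Bottom-Up view attributes to the sync object. Removing a lock is safe only when the protected work really touches no shared state, as the lab states is the case for the "for" loop in draw_task.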

 

Review Questions

 

 

Original article: https://www.cnblogs.com/ustc-cui/p/3753132.html