A method of executing a loop over an integer index range of indices in a parallel manner includes assigning a plurality of index subsets of the integer index range to a corresponding plurality of threads, and defining for each index subset a start point of the index subset, an end point of the index subset, and a boundary point of the index subset positioned between the start point and the end point of the index subset. A portion of the index subset between the start point and the boundary point represents a private range and the portion of the index subset between the boundary point and the end point represents a public range. Loop code is executed by each thread based on the index subset of the integer index range assigned to the thread.
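As an illustration of the layout the abstract describes, the following minimal Python sketch splits an integer index range into one subset per thread and records a start point, boundary point, and end point for each. The names `IndexSubset`, `assign_subsets`, and `private_fraction` are hypothetical, and the half-open intervals, the even contiguous split, and the initial boundary placement are assumptions; the source does not specify how the subsets or the boundary are chosen.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IndexSubset:
    """One thread's share of the loop range (hypothetical layout)."""
    start: int     # first index of the subset (inclusive)
    boundary: int  # splits private [start, boundary) from public [boundary, end)
    end: int       # one past the last index of the subset (exclusive)

    def private_range(self) -> range:
        # Indices the owning thread may process without synchronization.
        return range(self.start, self.boundary)

    def public_range(self) -> range:
        # Indices that are accessible to all threads.
        return range(self.boundary, self.end)


def assign_subsets(lo: int, hi: int, num_threads: int,
                   private_fraction: float = 0.5) -> List[IndexSubset]:
    """Split [lo, hi) into num_threads contiguous subsets, one per thread.

    Assumption: the boundary starts at a fixed fraction of each subset.
    """
    total = hi - lo
    subsets = []
    for t in range(num_threads):
        start = lo + total * t // num_threads
        end = lo + total * (t + 1) // num_threads
        boundary = start + int((end - start) * private_fraction)
        subsets.append(IndexSubset(start, boundary, end))
    return subsets
```

For example, `assign_subsets(0, 100, 4)` yields four subsets whose private and public ranges together cover every index from 0 to 99 exactly once.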
Representative claim
1. A method of executing a loop over an integer index range of indices in a parallel manner, comprising: assigning a plurality of index subsets of the integer index range to a corresponding plurality of threads; defining for each index subset a start point of the index subset, an end point of the index subset, and a boundary point of the index subset positioned between the start point and the end point of the index subset, a portion of the index subset between the start point and the boundary point represents a private range and the portion of the subset between the boundary point and the end point represents a public range, each private range being executable by the thread to which it is assigned without synchronization with other threads of the plurality of threads and each public range being accessible by the plurality of threads; and executing loop code with each thread based on the index subset of the integer index range assigned to the thread.
2. The method of claim 1, wherein the public range of an index subset assigned to a thread represents a range of indices that are configured to be processed by any one of the threads based on synchronization with other ones of the threads.
3. The method of claim 1, wherein each thread is configured to first process indices in the private range of the index subset assigned to the thread, followed by indices in the public range of the index subset.
4. The method of claim 3, wherein each thread is configured to process indices in the public range of the index subset assigned to the thread by moving the boundary point of the index subset toward the end point, and processing indices positioned between an old position of the boundary point and a new position of the boundary point.
5. The method of claim 1, wherein the boundary point of each index subset is configured to be moved only by the thread to which the index subset is assigned.
6. The method of claim 5, wherein the boundary point is configured to be moved toward the start point to decrease the private range and increase the public range, and is configured to be moved toward the end point to increase the private range and decrease the public range.
7. The method of claim 6, wherein each thread is configured to move the boundary point of the index subset assigned to the thread toward the start point when the end point of the index subset has been moved by another thread to the boundary point.
8. The method of claim 6, wherein each thread is configured to move the boundary point of the index subset assigned to the thread toward the start point when the thread detects that it is blocked or about to be blocked, and wherein each thread is configured to decide whether to move the boundary point back toward the end point after the blocking.
9. The method of claim 1, wherein each thread is configured to steal indices from the public ranges of index subsets assigned to other threads after the indices in the index subset assigned to the thread have been processed.
10. The method of claim 9, wherein each thread is configured to steal indices from the public range of an index subset assigned to another thread by moving the end point of the index subset toward the boundary point in synchronization with the other threads, and executing indices positioned between a new position of the end point and an old position of the end point.
11. A computer-readable storage device comprising: a memory device configured for storing computer-executable instructions configured to execute a loop over an integer index range of indices in a parallel manner, the computer-executable instructions comprising: first instructions that assign a plurality of index subsets of the integer index range to a corresponding plurality of threads; second instructions that define for each index subset a start point of the index subset, an end point of the index subset, and a boundary point of the index subset positioned between the start point and the end point of the index subset, a portion of the index subset between the start point and the boundary point represents a private range and the portion of the index subset between the boundary point and the end point represents a public range, each private range being executable by the thread to which it is assigned without synchronization with other threads of the plurality of threads and each public range being accessible by the plurality of threads; and loop code instructions that execute with each thread based on the index subset of the integer index range assigned to the thread.
12. The computer-readable storage device of claim 11, wherein the public range of an index subset assigned to a thread represents a range of indices that are configured to be processed by any one of the threads based on synchronization with other ones of the threads.
13. The computer-readable storage device of claim 11, wherein each thread is configured to first process indices in the private range of the index subset assigned to the thread, followed by indices in the public range of the index subset.
14. The computer-readable storage device of claim 13, wherein each thread is configured to process indices in the public range of the index subset assigned to the thread by moving the boundary point of the index subset toward the end point, and processing indices positioned between an old position of the boundary point and a new position of the boundary point.
15. The computer-readable storage device of claim 11, wherein the boundary point of each index subset is configured to be moved only by the thread to which the index subset is assigned.
16. The computer-readable storage device of claim 15, wherein the boundary point is configured to be moved toward the start point to decrease the private range and increase the public range, and is configured to be moved toward the end point to increase the private range and decrease the public range.
17. The computer-readable storage device of claim 16, wherein each thread is configured to move the boundary point of the index subset assigned to the thread toward the start point when the end point of the index subset has been moved by another thread to the boundary point.
18. The computer-readable storage device of claim 16, wherein each thread is configured to move the boundary point of the index subset assigned to the thread toward the start point when the thread detects that it is blocked or about to be blocked, and wherein each thread is configured to decide whether to move the boundary point back toward the end point after the blocking.
19. A method of executing a loop over an integer index range of indices in a parallel manner, comprising: assigning a plurality of index subsets of the integer index range to a corresponding plurality of threads; defining for each index subset a start point of the index subset, an end point of the index subset, and a boundary point of the index subset positioned between the start point and the end point of the index subset, a portion of the index subset between the start point and the boundary point represents a private range and the portion of the index subset between the boundary point and the end point represents a public range, each thread being configured to steal indices from the public ranges of index subsets assigned to other threads after the indices in the index subset assigned to the thread have been processed; and executing loop code with each thread based on the index subset of the integer index range assigned to the thread.
20. The method of claim 19, wherein each thread is configured to steal indices from the public range of an index subset assigned to another thread by moving the end point of the index subset toward the boundary point in synchronization with the other threads, and executing indices positioned between a new position of the end point and an old position of the end point.
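The behavior recited in claims 3, 4, 5, 9 and 10 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the names `IndexSubset`, `owner_take_public`, `steal`, `worker`, `parallel_for`, and `chunk` are hypothetical, a per-subset `threading.Lock` stands in for whatever synchronization the claims contemplate, intervals are half-open, and the boundary initially sits at the midpoint of each subset. Moving the boundary back toward the start point (claims 6 to 8) is not modeled.

```python
import threading
from typing import Callable, List, Tuple


class IndexSubset:
    """Private range [start, boundary), public range [boundary, end)."""

    def __init__(self, start: int, boundary: int, end: int) -> None:
        self.start, self.boundary, self.end = start, boundary, end
        self.lock = threading.Lock()  # assumption: a lock guards the public range

    def owner_take_public(self, chunk: int) -> Tuple[int, int]:
        """Owner only: move the boundary toward the end point.

        Returns (old boundary, new boundary); the owner processes that span.
        """
        with self.lock:
            old = self.boundary
            self.boundary = min(old + chunk, self.end)
            return old, self.boundary

    def steal(self, chunk: int) -> Tuple[int, int]:
        """Other threads: move the end point toward the boundary.

        Returns (new end, old end); the thief processes that span.
        """
        with self.lock:
            old_end = self.end
            self.end = max(old_end - chunk, self.boundary)
            return self.end, old_end


def worker(me: int, subsets: List[IndexSubset],
           body: Callable[[int], None], chunk: int) -> None:
    mine = subsets[me]
    # 1. Private range: no lock is taken, because thieves never cross the boundary.
    for i in range(mine.start, mine.boundary):
        body(i)
    # 2. Own public range: advance the boundary chunk by chunk.
    while True:
        lo, hi = mine.owner_take_public(chunk)
        if lo == hi:
            break
        for i in range(lo, hi):
            body(i)
    # 3. Steal from the public ranges of the other subsets.
    for victim in subsets:
        if victim is mine:
            continue
        while True:
            lo, hi = victim.steal(chunk)
            if lo == hi:
                break
            for i in range(lo, hi):
                body(i)


def parallel_for(lo: int, hi: int, num_threads: int,
                 body: Callable[[int], None], chunk: int = 4) -> None:
    """Run body(i) once for every i in [lo, hi) using num_threads threads."""
    total = hi - lo
    subsets = []
    for t in range(num_threads):
        start = lo + total * t // num_threads
        end = lo + total * (t + 1) // num_threads
        subsets.append(IndexSubset(start, (start + end) // 2, end))
    threads = [threading.Thread(target=worker, args=(t, subsets, body, chunk))
               for t in range(num_threads)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
```

A call such as `parallel_for(0, 1000, 4, results.append)` invokes the loop body once per index. In this sketch only the owner writes the boundary and only thieves write the end point, both under the lock, so no index in a public range is handed out twice.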
Patents cited by this patent (5)
Hardwick Jonathan C.,GBX, Dynamic load balancing among processors in a parallel computer.