IPC분류정보
국가/구분 |
United States(US) Patent
등록
|
국제특허분류(IPC7판) |
|
출원번호 |
US-0689336
(2003-10-20)
|
등록번호 |
US-7437729
(2008-10-14)
|
우선권정보 |
GB-0309214.5(2003-04-23) |
발명자
/ 주소 |
|
출원인 / 주소 |
|
대리인 / 주소 |
|
인용정보 |
피인용 횟수 :
0 인용 특허 :
22 |
초록
▼
A method for balancing the load of a parallel processing system having a plurality of parallel processing elements arranged in a loop, wherein each processing element has a local number of tasks associated therewith comprising determining within each processing element a total number of tasks presen
A method for balancing the load of a parallel processing system having a plurality of parallel processing elements arranged in a loop, wherein each processing element has a local number of tasks associated therewith comprising determining within each processing element a total number of tasks present within the loop, calculating a local mean number of tasks within each processing element, and calculating a local deviation within each processing element. The method also comprises determining the sum deviations within each processing element for one-half the loop in an anti-clockwise direction and in a clockwise direction, determining clockwise and anti-clockwise transfer parameters within each processing element, and redistributing tasks among the processing elements in response to the clockwise and anti-clockwise transfer parameters.
대표청구항
▼
What is claimed is: 1. A method for balancing the load of a parallel processing system having a plurality of parallel processing elements arranged in a loop, wherein each processing element (PEr) has a local number of tasks associated therewith, wherein r represents the number for a selected proces
What is claimed is: 1. A method for balancing the load of a parallel processing system having a plurality of parallel processing elements arranged in a loop, wherein each processing element (PEr) has a local number of tasks associated therewith, wherein r represents the number for a selected processing element, and wherein each of said processing elements is operable to communicate with a clockwise adjacent processing element and with an anti-clockwise adjacent processing element, the method comprising: determining within each processing element a total number of tasks present within said loop; calculating a local mean number of tasks within each of said plurality of processing elements; calculating a local deviation from said local mean number within each of said plurality of processing elements; determining a sum deviation from said local deviations within each of said processing elements for one-half said loop in an anti-clockwise direction, said one-half of said loop being relative to each of said selected processing elements; determining a sum deviation from said local deviations within each of said processing elements for one-half said loop in a clockwise direction, said one-half of said loop being relative to each of said selected processing elements; determining a clockwise transfer parameter and an anti-clockwise transfer parameter from said sum deviations within each of said processing elements; and redistributing tasks among said plurality of processing elements using said clockwise transfer parameters and said anti-clockwise parameters within each of said plurality of processing elements. 2. The method of claim 1 wherein said determining within each of said processing elements a total number of tasks present within said loop, comprises: transmitting said local number of tasks associated with each of said processing elements to each other of said plurality of processing elements within said loop; receiving within each of said processing elements said number of local tasks associated with said each other of said plurality of processing elements; and summing said number of local tasks associated with each of said processing elements with said number of local tasks associated with each other of said plurality of processing elements. 3. The method of claim 1 wherein said determining said total number of tasks present within said loop includes solving the equation where N represents the number of processing elements in said loop, and vi represents said local number of tasks associated with ith processing element in said loop. 4. The method of claim 1 wherein said calculating a local mean number of tasks within each of said plurality of processing elements includes solving the equation Mr=Trunc((V+Er)/N), where Mr is said local mean for PEr, N is the total number of processing elements in said loop, and Er is a number in the range of 0 to (N-1), V is the total number of tasks, and wherein each processing element has a different Er value. 5. The method of claim 4 wherein said Trunc function is responsive to the value of Er such that said total number of tasks for said loop is equal to the sum of the local mean number of tasks for each of said plurality of processing elements in said loop (i.e., 6. The method of claim 4 wherein said local mean Mr=Trunc((V+Er)/N) for each local PEr within said loop is equal to either X or (X+1), and Er is a number in the range of 0 to (N-1) and X represents the local mean. 7. The method of claim 1 wherein said calculating a local deviation within each of said plurality of processing elements, comprises finding the difference between said local number of tasks and said local mean number for each of said plurality of processing elements. 8. The method of claim 1 wherein said determining a sum deviation within each of said processing elements for one-half of said loop in an anti-clockwise direction comprises: transmitting said local deviation associated with each of said processing elements half way around said loop in an anti-clockwise direction, said one-half of said loop being relative to each of said selected processing elements; receiving said local deviation associated with each other of said processing elements halfway around said loop in a clockwise direction, said one-half of said loop being relative to each of said selected processing elements; and summing said local deviations associated with each other of said processing elements halfway around said loop in a clockwise direction. 9. The method of claim 1 wherein said determining a sum deviation within each of said processing elements in one-half of said loop in a clockwise direction comprises: transmitting said local deviation associated with each of said processing elements half way around said loop in an clockwise direction, said one-half of said loop being relative to each of said selected processing elements; receiving said local deviation associated with each other of said processing elements half way around said loop in a anti-clockwise direction, said one-half of said loop being relative to each of said selected processing elements; and summing said local deviations associated with each other of said processing elements half way around said loop in an anti-clockwise direction. 10. The method of claim 1 wherein said determining a clockwise transfer parameter and an anti-clockwise transfer parameter within each of said processing elements comprises: setting the clockwise transfer parameter equal to (2S+A-C)��4; and setting the anti-clockwise transfer parameter equal to (2S+C-A)��4, where S represents said local deviation of a selected processing element; C represents said sum deviation in a clockwise half of loop relative to said selected processing element, and A represents said sum deviation in an anti-clockwise half of loop relative to said selected processing element. 11. The method of claim 1 wherein said determining a clockwise transfer parameter Tc and an anti-clockwise transfer parameter Ta within each of said processing elements comprises at least one of: setting the clockwise transfer parameter equal to Trunc[(2S+Δ)��4] and setting the anti-clockwise transfer parameter equal to S-Tc and setting the anti-clockwise transfer parameter equal to Trunc[(2S-Δ)��4] and setting the clockwise transfer parameter equal to S-Ta; where Mag=abs(2S), S represents the local deviation of a selected processing element, Δ represents the number of tasks passing though the current processing element, whereby if Δ>Mag then set Δ equal to Mag and if Δr, =Trunc((V+Er)/N), where Mr is said local mean for a processing element PEr, N is the total number of processing elements in said loop, V is the total number of tasks, and Er is a number in the range of 0 to (N-1). 15. The method of claim 14 wherein said Trunc function is responsive to the value of Er such that said total number of tasks for said loop is equal to the sum of the local mean number of tasks for each of said plurality of processing elements in said loop and wherein each processing element has a different Er value assigned. 16. The method of claim 12 wherein said inserting a phantom processing element within said loop further comprises: locating said phantom processing element in a position within said loop that is diametrically opposed to said processing element. 17. The method of claim 12 wherein said computing a local mean value for a selected processing element, said computing a local deviation for said selected processing element said inserting a phantom processing element within said loop, said summing said deviation of said processing elements located within one-half of the loop in an anti-clockwise direction, summing said deviation of said processing elements located within one-half of the loop in a clockwise direction, computing a number of tasks to transfer in a clockwise direction for said selected processing element, computing a number of tasks to transfer in an anti-clockwise direction for said selected processing element, and reassigning tasks relative to the said number of task to transfer in a clockwise direction and said number of tasks to transfer in an anti-clockwise direction are completed simultaneously for each of said plurality of processing elements within said loop. 18. The method of claim 12 wherein said summing said deviation of said processing elements located within one-half of the loop in an anti-clockwise direction relative to said selected processing element comprises: transmitting said local deviation associated with each of said processing elements half way around said loop in an anti-clockwise direction, said one-half of said loop being relative to each of said selected processing elements; receiving said local deviation associated with each other of said processing elements half way around said loop in a clockwise direction, said one-half of said loop being relative to each of said selected processing elements; and summing said local deviations associated with each other of said processing elements half way around said loop in a clockwise direction. 19. The method of claim 12 wherein summing said deviation of said processing elements located within one-half of the loop in a clockwise direction relative to said selected processing element comprises: transmitting said local deviation associated with each of said processing elements half way around said loop in an clockwise direction, said one-half of said loop being relative to each of said selected processing elements; receiving said local deviation associated with each other of said processing elements half way around said loop in a anti-clockwise direction, said one-half of said loop being relative to each of said selected processing elements; and summing said local deviations associated with each other of said processing elements halfway around said loop in an anti-clockwise direction. 20. A computer storage media device carrying a set of instructions which, when executed, perform a method comprising: determining within each processing element a total number of tasks present within said loop; calculating a local mean number of tasks within each of said plurality of processing elements; calculating a local deviation from said local mean number within each of said plurality of processing elements; determining a sum deviation from said local deviations within each of said processing elements for one-half said loop in an anti-clockwise direction, said one-half of said loop being relative to each of said selected processing elements; determining a sum deviation from said local deviations within each of said processing elements for one-half said loop in a clockwise direction, said one-half of said loop being relative to each of said selected processing elements; determining a clockwise transfer parameter and an anti-clockwise transfer parameter from said sum deviations within each of said processing elements; and redistributing tasks among said plurality of processing elements using said clockwise transfer parameters and said anti-clockwise parameters within each of said plurality of processing elements.
※ AI-Helper는 부적절한 답변을 할 수 있습니다.