IPC분류정보
국가/구분 |
United States(US) Patent
등록
|
국제특허분류(IPC7판) |
|
출원번호 |
US-0360286
(1999-07-22)
|
발명자
/ 주소 |
- Tellez, Juan
- Dageville, Benoit
|
출원인 / 주소 |
- Oracle International Corporation
|
대리인 / 주소 |
Hickman Palermo Truong & Becker LLP
|
인용정보 |
피인용 횟수 :
11 인용 특허 :
5 |
초록
▼
A method and apparatus are provided for computing degrees of parallelism for parallel operations in a computer system. The degree of parallelism for a given parallel operation is computed based on a set of factors. The set of factors includes a target degree of parallelism that represents a desired
A method and apparatus are provided for computing degrees of parallelism for parallel operations in a computer system. The degree of parallelism for a given parallel operation is computed based on a set of factors. The set of factors includes a target degree of parallelism that represents a desired total amount of parallelism in the computer system, a current workload and a requested degree of parallelism.
대표청구항
▼
1. A method for computing degrees of parallelism for parallel operations in a computer system, the method comprising the steps of:receiving a request to perform a parallel operation;computing the degree of parallelism for the parallel operation based on a set of factors, wherein the set of factors i
1. A method for computing degrees of parallelism for parallel operations in a computer system, the method comprising the steps of:receiving a request to perform a parallel operation;computing the degree of parallelism for the parallel operation based on a set of factors, wherein the set of factors includesa target degree of parallelism that represents a desired total amount of parallelism in the computer system, anda current workload of the computer system; andassigning the degree of parallelism to the parallel operation. 2. The method of claim 1 wherein:the parallel operation is associated with a requested degree of parallelism; andthe requested degree of parallelism associated with the parallel operation is also a factor in the set of factors used to compute the degree of parallelism. 3. The method of claim 2 further comprising the step of, when the requested degree of parallelism is not pre-determined for the parallel operation, assigning to the parallel operation the target degree of parallelism as the requested degree of parallelism for the parallel operation. 4. The method of claim 1 further comprising the step of assigning the target degree of parallelism to the parallel operation when the current workload of the computer system is zero. 5. The method of claim 2 further comprising the step of assigning the requested degree of parallelism to the parallel operation when the current workload of the computer system is zero. 6. The method of claim 1 wherein the target degree of parallelism is equal to twice the number of processors in the computer system. 7. The method of claim 1 whereinthe set of factors includes a reduction factor;the step of computing the degree of parallelism for the parallel operation further comprises the steps of:computing the reduction factor; andapplying the reduction factor to a requested degree of parallelism. 8. The method of claim 7 wherein the step of computing the reduction factor comprises the steps of:computing a projected load factor; andcomputing a rate of change based on a total number of parallel operations on the computer system. 9. The method of claim 7 wherein the step of computing the reduction factor comprises computing a rate of change based on a user_limit and a user_ratio wherein:the user_limit is an upper limit on the number of parallel operations; andthe user_ratio is the number of parallel operations. 10. The method of claim 9 wherein the user_limit is the upper limit on the number of parallel users and the user_ratio is the number of parallel users. 11. The method of claim 9 wherein the user_ratio is a run queue length. 12. The method of claim 9 wherein the user_ratio is an amount of memory being used in the computer system. 13. The method of claim 7 wherein the step of computing the reduction factor is based on a default degree of parallelism. 14. The method of claim 13 wherein the default degree of parallelism is equal to twice the number of processors in the computer system. 15. The method of claim 7 wherein the step of applying the reduction factor to the requested degree of parallelism is dividing the requested degree of parallelism by the reduction factor. 16. The method of claim 7 further comprises equating the degree of parallelism of the parallel operation to the requested degree of parallelism if the reduction factor is zero. 17. The method of claim 1 wherein the computer system comprises more than one node and wherein each node comprises multiple processors. 18. The method of claim 17 wherein:requested slave processes are distributed among the nodes;the requested slave processes are slave processes equal to the degree of parallelism assigned to the parallel operation;a unit of allocation is a fraction of the number of requested slave processes;the method of distributing the requested slave processes among the nodes comprises the steps of:sorting the nodes by workload in ascending order to obtain a sorted list of nodes;beginning a first sequence by allocating a fin al unit of allocation to a first node on the sorted list of nodes if the first node has a workload that is less than a target workload; andcontinuing to allocate the final unit of allocation to successive nodes on the sorted list of nodes if the workload on each successive node is less than the target workload and if requested slave processes remain undistributed;returning to the first node of the sorted list of nodes upon reaching a node having workload greater than the target workload;beginning a second sequence for allocating the final unit of allocation to the first node on the sorted list of nodes if requested slave processes remain undistributed;continuing to allocate the final unit of allocation to successive nodes on the sorted list of nodes if requested slave processes remain undistributed;repeating the second sequence upon reaching the end of the sorted list of nodes; andcontinuing to repeat the second sequence if requested slave processes remain undistributed. 19. The method of claim 18 further comprises the steps of:calculating the final unit of allocation if a set of conditions are satisfied; andsetting the final unit of allocation equal to the number of processors on the computer system if the set of conditions are not satisfied. 20. The method of claim 19 wherein the set of conditions include:the difference between a least loaded node and a most loaded node is small;the requested slave processes are not completely divisible by the number of nodes;the requested slave processes, when divided by the number of nodes, is greater than or equal to 2; andthe requested slave processes do not all fit in one node. 21. The method of claim 19 wherein the step of calculating the final unit of allocation further comprises the steps of:initializing the unit of allocation to create an initial unit of allocation equal to the number of requested slave processes; andreducing the initial unit of allocation to produce the final unit of allocation by successively dividing the initial unit of allocation by a factor of two until the final unit of allocation is less than the number of processors on the computer system. 22. The method of claim 18 wherein the requested slave processes are not distributed if the first node on the sorted list of nodes has the workload that is greater than the target workload and wherein the target workload is the number of slave processes equal to twice the number of processors on the computer system. 23. A computer-readable medium carrying one or more sequences of instructions for computing degrees of parallelism for parallel operations in a computer system, wherein execution of the one or more sequences of instructions by one or more processors causes the one or more processors to perform the steps of:receiving a request to perform a parallel operation;computing the degree of parallelism for the parallel operation based on a set of factors, wherein the set of factors includesa target degree of parallelism that represents a desired total amount of parallelism in the computer system, anda current workload of the computer system; andassigning the degree of parallelism to the parallel operation. 24. The computer readable medium of claim 23 wherein:the parallel operation is associated with a requested degree of parallelism; andthe requested degree of parallelism associated with the parallel operation is also a factor in the set of factors used to compute the degree of parallelism. 25. The computer readable medium of claim 24 further comprising the step of, when the requested degree of parallelism is not pre-determined for the parallel operation, assigning to the parallel operation the target degree of parallelism as the requested degree of parallelism for the parallel operation. 26. The computer readable medium of claim 23 further comprising the step of assigning the target degree of parallelism to the parallel operation when the current workload of the computer system is zero. 27. The computer readable medium of claim 24 further comprising the step of assigning the requested degree of parallelism to the parallel operation when the current workload of the computer system is zero. 28. The computer readable medium of claim 23 wherein the target degree of parallelism is equal to twice the number of processors in the computer system. 29. The computer readable medium of claim 23 whereinthe set of factors includes a reduction factor;the step of computing the degree of parallelism for the parallel operation further comprises the steps of:computing the reduction factor; andapplying the reduction factor to a requested degree of parallelism. 30. The computer readable medium of claim 29 wherein the step of computing the reduction factor comprises the steps of:computing a projected load factor; andcomputing a rate of change based on a total number of parallel operations on the computer system. 31. The computer readable medium of claim 29 wherein the step of computing the reduction factor comprises computing a rate of change based on a user_limit and a user_ratio wherein:the user_limit is an upper limit on the number of parallel operations; andthe user_ratio is the number of parallel operations. 32. The computer readable medium of claim 31 wherein the user_limit is the upper limit on the number of parallel users and the user_ratio is the number of parallel users. 33. The computer readable medium of claim 31 wherein the user_ratio is a run queue length. 34. The computer readable medium of claim 31 wherein the user_ratio is an amount of memory being used in the computer system. 35. The computer readable medium of claim 29 wherein the step of computing the reduction factor is based on a default degree of parallelism. 36. The computer readable medium of claim 35 wherein the default degree of parallelism is equal to twice the number of processors in the computer system. 37. The computer readable medium of claim 29 wherein the step of applying the reduction factor to the requested degree of parallelism is dividing the requested degree of parallelism by the reduction factor. 38. The computer readable medium of claim 29 further comprises equating the degree of parallelism of the parallel operation to the requested degree of parallelism if the reduction factor is zero. 39. The computer readable medium of claim 23 wherein the computer system comprises more than one node and wherein each node comprises multiple processors. 40. The computer readable medium of claim 39 wherein:requested slave processes are distributed among the nodes;the requested slave processes are slave processes equal to the degree of parallelism assigned to the parallel operation;a unit of allocation is a fraction of the number of requested slave processes;the method of distributing the requested slave processes among the nodes comprises the steps of:sorting the nodes by workload in ascending order to obtain a sorted list of nodes;beginning a first sequence by allocating a final unit of allocation to a first node on the sorted list of nodes if the first node has a workload that is less than a target workload; andcontinuing to allocate the final unit of allocation to successive nodes on the sorted list of nodes if the workload on each successive node is less than the target workload and if requested slave processes remain undistributed;returning to the first node of the sorted list of nodes upon reaching a node having workload greater than the target workload;beginning a second sequence for allocating the final unit of allocation to the first node on the sorted list of nodes if requested slave processes remain undistributed;continuing to allocate the final unit of allocation to successive nodes on the sorted list of nodes if requested slave processes remain undistributed;repeating the second sequence upon reaching the end of the sorted list of nodes; andcontinuing to repeat the second sequence if requested slave processes remain undistributed. 41. The computer readable medium of claim 40 further comprises the steps of:calculating the final unit of allocation if a set of conditions are satisfied; andsetting the final unit of allocation equal to the number of processors on the computer system if the set of conditions are not satisfied. 42. The computer readable medium of claim 41 wherein the set of conditions include:the difference between a least loaded node and a most loaded node is small;the requested slave processes are not completely divisible by the number of nodes;the requested slave processes, when divided by the number of nodes, is greater than or equal to 2; andthe requested slave processes do not all fit in one node. 43. The computer readable medium of claim 41 wherein the step of calculating the final unit of allocation further comprises the steps of:initializing the unit of allocation to create an initial unit of allocation equal to the number of requested slave processes; andreducing the initial unit of allocation to produce the final unit of allocation by successively dividing the initial unit of allocation by a factor of two until the final unit of allocation is less than the number of processors on the computer system. 44. The computer readable medium of claim 40 wherein the requested slave processes are not distributed if the first node on the sorted list of nodes has the workload that is greater than the target workload and wherein the target workload is the number of slave processes equal to twice the number of processors on the computer system. 45. A system for computing degrees of parallelism for parallel operations in a computer system, the system comprising a memory having one or more sequences of instructions, wherein execution of the one or more sequences of instructions by one or more processors causes the one or more processors to perform the steps of:receiving a request to perform a parallel operation;computing the degree of parallelism for the parallel operation based on a set of factors, wherein the set of factors includesa target degree of parallelism that represents a desired total amount of parallelism in the computer system, anda current workload of the computer system; andassigning the degree of parallelism to the parallel operation. 46. The system of claim 45, wherein:the parallel operation is associated with a requested degree of parallelism; andthe requested degree of parallelism associated with the parallel operation is also a factor in the set of factors used to compute the degree of parallelism. 47. The system of claim 46, wherein the memory further comprises one or more additional sequences of instructions which, when executed by the one or more processors, cause the one or more processors to perform the step of, when the requested degree of parallelism is not pre-determined for the parallel operation, assigning to the parallel operation the target degree of parallelism as the requested degree of parallelism for the parallel operation. 48. The system of claim 45, wherein the memory further comprises one or more additional sequences of instructions which, when executed by the one or more processors, cause the one or more processors to perform the step of assigning the target degree of parallelism to the parallel operation when the current workload of the computer system is zero. 49. The system of claim 46, wherein the memory further comprises one or more additional sequences of instructions which, when executed by the one or more processors, cause the one or more processors to perform the step of assigning the requested degree of parallelism to the parallel operation when the current workload of the computer system is zero. 50. The system of claim 45, wherein the target degree of parallelism is equal to twice the number of processors in the computer system. 51. The system of claim 45, whereinthe set of factors includes a reduction factor;the step of computing the degree of parallelism for the parallel operation includes:computing the reduction factor; andapplying the re duction factor to a requested degree of parallelism. 52. The system of claim 51, wherein the step of computing the reduction factor includes:computing a projected load factor; andcomputing a rate of change based on a total number of parallel operations on the computer system. 53. The system of claim 51, wherein the step of computing the reduction factor includes computing a rate of change based on a user_limit and a user_ratio wherein:the user_limit is an upper limit on the number of parallel operations; andthe user_ratio is the number of parallel operations. 54. The system of claim 53, wherein the user_limit is the upper limit on the number of parallel users and the user_ratio is the number of parallel users. 55. The system of claim 53, wherein the user_ratio is a run queue length. 56. The system of claim 53, wherein the user_ratio is an amount of memory being used in the computer system. 57. The system of claim 51, wherein the step of computing the reduction factor is based on a default degree of parallelism. 58. The system of claim 57, wherein the default degree of parallelism is equal to twice the number of processors in the computer system. 59. The system of claim 51, wherein the step of applying the reduction factor to the requested degree of parallelism is dividing the requested degree of parallelism by the reduction factor. 60. The system of claim 51, wherein the memory further comprises one or more additional sequences of instructions which, when executed by the one or more processors, cause the one or more processors to perform the step of equating the degree of parallelism of the parallel operation to the requested degree of parallelism if the reduction factor is zero. 61. The system of claim 45, wherein the computer system comprises more than one node and wherein each node comprises multiple processors. 62. The system of claim 61, wherein:requested slave processes are distributed among the nodes;the requested slave processes are slave processes equal to the degree of parallelism assigned to the parallel operation;a unit of allocation is a fraction of the number of requested slave processes;distributing the requested slave processes among the nodes includes:sorting the nodes by workload in ascending order to obtain a sorted list of nodes;beginning a first sequence by allocating a final unit of allocation to a first node on the sorted list of nodes if the first node has a workload that is less than a target workload; andcontinuing to allocate the final unit of allocation to successive nodes on the sorted list of nodes if the workload on each successive node is less than the target workload and if requested slave processes remain undistributed;returning to the first node of the sorted list of nodes upon reaching a node having workload greater than the target workload;beginning a second sequence for allocating the final unit of allocation to the first node on the sorted list of nodes if requested slave processes remain undistributed;continuing to allocate the final unit of allocation to successive nodes on the sorted list of nodes if requested slave processes remain undistributed;repeating the second sequence upon reaching the end of the sorted list of nodes; andcontinuing to repeat the second sequence if requested slave processes remain undistributed. 63. The system of claim 62, wherein the memory further comprises one or more additional sequences of instructions which, when executed by the one or more processors, cause the one or more processors to perform the steps of:calculating the final unit of allocation if a set of conditions are satisfied; andsetting the final unit of allocation equal to the number of processors on the computer system if the set of conditions are not satisfied. 64. The system of claim 63, wherein the set of conditions include:the difference between a least loaded node and a most loaded node is small;the requested slave processes are not completely divisible by the number of nodes;the requested slave processes, when divided by the number of nodes, is greater than or equal to two; andthe requested slave processes do not all fit in one nodes. 65. The system of claim 63, wherein the step of calculating the final unit of allocation includes:initializing the unit of allocation to create an initial unit of allocation equal to the number of requested slave processes; andreducing the initial unit of allocation to produce the final unit of allocation by successively dividing the initial unit of allocation by a factor of two until the final unit of allocation is less than the number of processors on the computer system. 66. The system of claim 62, wherein the requested slave processes are not distributed if the first node on the sorted list of nodes has the workload that is greater than the target workload and wherein the target workload is the number of slave processes equal to twice the number of processors on the computer system.
※ AI-Helper는 부적절한 답변을 할 수 있습니다.