IPC Classification Information
Country / Type |
United States(US) Patent
Granted
|
International Patent Classification (IPC 7th edition) |
|
Application number |
US-0661527
(2012-10-26)
|
Registration number |
US-8504732
(2013-08-06)
|
Inventor / Address |
- Faraj, Daniel A.
- Smith, Brian E.
|
Applicant / Address |
- International Business Machines Corporation
|
Agent / Address |
|
Citation information |
Times cited: 0
Patents cited: 41 |
Abstract
Administering connection identifiers for collective operations in a parallel computer, including prior to calling a collective operation, determining, by a first compute node of a communicator to receive an instruction to execute the collective operation, whether a value stored in a global connection identifier utilization buffer exceeds a predetermined threshold; if the value stored in the global ConnID utilization buffer does not exceed the predetermined threshold: calling the collective operation with a next available ConnID including retrieving, from an element of a ConnID buffer, the next available ConnID and locking the element of the ConnID buffer from access by other compute nodes; and if the value stored in the global ConnID utilization buffer exceeds the predetermined threshold: repeatedly determining whether the value stored in the global ConnID utilization buffer exceeds the predetermined threshold until the value stored in the global ConnID utilization buffer does not exceed the predetermined threshold.
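The control flow described in the abstract can be sketched as a small single-process model. This is only an illustration under stated assumptions, not the patented implementation: the names `ConnIDState`, `take_next_conn_id`, `call_collective`, and the `wait` hook are hypothetical, a `threading.Lock` stands in for whatever atomic mechanism the parallel computer provides, and the point at which the utilization value is incremented is assumed, since the abstract does not say.

```python
import threading


class ConnIDState:
    """Toy stand-in for the shared buffers named in the abstract."""

    def __init__(self, conn_ids, threshold):
        self.conn_ids = list(conn_ids)              # the ConnID buffer
        self.locked = [False] * len(self.conn_ids)  # per-element lock flags
        self.in_use = 0                             # global ConnID utilization buffer
        self.threshold = threshold                  # predetermined threshold
        self.atomic = threading.Lock()              # assumption: models atomicity


def take_next_conn_id(state):
    """Atomically retrieve the next available ConnID and lock its element."""
    with state.atomic:
        for i, is_locked in enumerate(state.locked):
            if not is_locked:
                state.locked[i] = True
                state.in_use += 1  # assumption: the count is updated here
                return state.conn_ids[i]
    raise RuntimeError("no unlocked ConnID element")


def call_collective(state, collective, wait=lambda: None):
    """Re-check the threshold until it is not exceeded, then call the collective."""
    while state.in_use > state.threshold:
        wait()  # hypothetical hook; a real node would simply poll again
    return collective(take_next_conn_id(state))
```

As in the abstract, the threshold check and the atomic retrieve-and-lock are separate steps, so a value equal to the threshold still allows one more call.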
Representative claims
1. A method of administering connection identifiers for collective operations in a parallel computer, the method comprising: prior to calling a collective operation, determining, by a first compute node of a communicator to receive an instruction to execute the collective operation, whether a value stored in a global connection identifier (‘ConnID’) utilization buffer exceeds a predetermined threshold, the value stored in the global ConnID utilization buffer representing a number of connection identifiers in use; if the value stored in the global ConnID utilization buffer does not exceed the predetermined threshold: calling the collective operation with a next available ConnID including, atomically: retrieving, from an element of a ConnID buffer, the next available ConnID and locking the element of the ConnID buffer from access by other compute nodes; and if the value stored in the global ConnID utilization buffer exceeds the predetermined threshold: repeatedly determining whether the value stored in the global ConnID utilization buffer exceeds the predetermined threshold until the value stored in the global ConnID utilization buffer does not exceed the predetermined threshold.
2. The method of claim 1 wherein determining whether the value stored in the global ConnID utilization buffer exceeds the predetermined threshold further comprises: atomically fetching, by a DMA engine of the first compute node, the value stored in the global ConnID utilization buffer and incrementing the stored value; and determining whether the fetched value exceeds the predetermined threshold.
3. The method of claim 2 further comprising: upon completion of the collective operation, atomically: unlocking the element of the ConnID buffer storing the retrieved ConnID; and decrementing the value stored in the global ConnID utilization buffer.
4. The method of claim 1 wherein calling the collective operation with the next available ConnID further comprises: placing, by a DMA engine of the first compute node, in a predefined memory location in all other nodes of the communicator a value representing an instruction to wait for a ConnID; and upon retrieving the next available ConnID, placing, by the DMA engine, in predefined memory location in all other nodes of the communicator, the retrieved ConnID; and upon completion of the collective operation, the method further comprises clearing, from the predefined memory location in all the other nodes of the communicator, the retrieved ConnID.
5. The method of claim 1 wherein calling the collective operation with the next available ConnID further comprises: determining, by the first node, whether a ConnID is stored in a predefined memory location of a master node of the communicator; atomically: retrieving, from the element of the ConnID buffer, the next available ConnID and locking the element of the ConnID buffer from access by other compute nodes only if no ConnID is stored in the predefined memory location of the master node; again determining, after retrieving the next available ConnID, whether a ConnID is stored in the predefined memory location of the master node; if there is a ConnID stored in the predefined memory location of the master node after retrieving the next available ConnID, atomically unlocking the element of the ConnID buffer; if there is no ConnID stored in the predefined memory location of the master node after retrieving the next available ConnID: placing, by a DMA engine of the first node, the retrieved ConnID in the predefined memory location of the master node; and upon completion of the collective operation, the method further comprises clearing the retrieved ConnID from the predefined memory location of the master node.
6. The method of claim 1 wherein retrieving, from an element of a ConnID buffer, a next available ConnID further comprises incrementing a ConnID buffer pointer to a next, unlocked element of the ConnID buffer.
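Claims 2, 3, and 6 above add three mechanics that are easier to follow in code: a fetch-and-increment of the utilization value, a combined unlock-and-decrement on completion, and a buffer pointer that advances to the next unlocked element. The following is a rough sketch under several assumptions, not the claimed implementation: `ConnIDAllocator` and its methods are hypothetical names, a `threading.Lock` models the atomic operations that the claims attribute to a DMA engine, the pointer is assumed to wrap around the buffer, and undoing the increment after a failed attempt is a choice made here because the claims do not specify it.

```python
import threading


class ConnIDAllocator:
    def __init__(self, conn_ids, threshold):
        self.conn_ids = list(conn_ids)              # ConnID buffer
        self.locked = [False] * len(self.conn_ids)  # per-element lock flags
        self.pointer = 0                            # ConnID buffer pointer (claim 6)
        self.utilization = 0                        # global ConnID utilization buffer
        self.threshold = threshold
        self._atomic = threading.Lock()             # assumption: models DMA atomics

    def fetch_and_increment(self):
        """Claim 2: atomically fetch the stored value and increment it."""
        with self._atomic:
            fetched = self.utilization
            self.utilization += 1
            return fetched

    def _undo_increment(self):
        # Assumption: a failed attempt gives its increment back.
        with self._atomic:
            self.utilization -= 1

    def try_acquire(self):
        """Return (element index, ConnID), or None if the caller must retry."""
        if self.fetch_and_increment() > self.threshold:
            self._undo_increment()
            return None
        with self._atomic:
            # Claim 6: step the pointer until it passes an unlocked element.
            for _ in range(len(self.conn_ids)):
                i = self.pointer
                self.pointer = (self.pointer + 1) % len(self.conn_ids)
                if not self.locked[i]:
                    self.locked[i] = True
                    return i, self.conn_ids[i]
        self._undo_increment()  # every element was locked
        return None

    def release(self, index):
        """Claim 3: atomically unlock the element and decrement the count."""
        with self._atomic:
            self.locked[index] = False
            self.utilization -= 1
```

A caller would loop on `try_acquire()` until it returns a pair, run the collective operation with the returned ConnID, and then call `release()` with the element index.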