IPC classification information
Country / Type | United States (US) Patent, Granted
International Patent Classification (IPC 7th edition) |
Application number | US-0643754 (2003-08-18)
Registration number | US-8307194 (2012-11-06)
Inventors / Address |
- Scott, Steven L.
- Faanes, Gregory J.
- Stephenson, Brick
- Moore, Jr., William T.
- Kohn, James R.
Applicant / Address |
Agent / Address | Larkin Hoffman Daly & Lindgren
Citation information | Times cited: 26; Patents cited: 116
Abstract
A method and apparatus to provide specifiable ordering between and among vector and scalar operations within a single streaming processor (SSP) via a local synchronization (Lsync) instruction that operates within a relaxed memory consistency model. Various aspects of that relaxed memory consistency model are described. Further, a combined memory synchronization and barrier synchronization (Msync) for a multistreaming processor (MSP) system is described. Also described is a global synchronization (Gsync) instruction that provides synchronization even outside a single MSP system. Advantageously, the pipeline or queue of pending memory requests does not need to be drained before the synchronization operation, nor is it required to refrain from determining addresses for and inserting subsequent memory accesses into the pipeline.
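The three synchronization instructions named in the abstract differ mainly in how widely they apply. The following is a minimal sketch of that scoping only, not of the hardware mechanism. It assumes Lsync involves only the issuing SSP, Msync involves the SSPs of the issuing SSP's MSP, and Gsync involves the SSPs of every participating MSP. The names SyncScope, SspId, participants, and ssps_per_msp are hypothetical and do not come from the patent, and the number of SSPs per MSP used in the usage note is an arbitrary illustration.

```python
from enum import Enum
from typing import Dict, Set, Tuple

# Hypothetical identifier: (MSP index, SSP index within that MSP).
SspId = Tuple[int, int]


class SyncScope(Enum):
    LSYNC = "Lsync"  # local: within a single SSP
    MSYNC = "Msync"  # among the SSPs of one MSP
    GSYNC = "Gsync"  # across two or more MSPs


def participants(scope: SyncScope, issuer: SspId,
                 ssps_per_msp: Dict[int, int]) -> Set[SspId]:
    """Return the SSPs assumed to take part in a synchronization of this scope.

    ssps_per_msp maps each participating MSP index to its SSP count
    (an assumed description of the system, not taken from the patent).
    """
    msp, _ = issuer
    if scope is SyncScope.LSYNC:
        # Orders vector and scalar operations inside the issuing SSP only.
        return {issuer}
    if scope is SyncScope.MSYNC:
        # Every SSP that belongs to the issuer's MSP.
        return {(msp, s) for s in range(ssps_per_msp[msp])}
    # GSYNC: every SSP of every participating MSP.
    return {(m, s) for m, count in ssps_per_msp.items() for s in range(count)}
```

For example, with two MSPs of four SSPs each, participants(SyncScope.MSYNC, (0, 1), {0: 4, 1: 4}) yields the four SSPs of MSP 0, while the Gsync scope yields all eight.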
Representative claim
1. A computer processing method comprising:
providing a computer system having a shared memory and a multistream processor (MSP), wherein the MSP includes a plurality of single stream processors (SSPs) including a first SSP, each one of the plurality of SSPs having a scalar section and one or more vector sections, wherein each of the plurality of SSPs is operatively coupled to the memory;
defining program order between operations on the first SSP;
defining operation dependence order of vector memory references to the memory with respect to each other and with respect to scalar memory references to the memory;
maintaining a minimal guarantee on the ordering using an active list located in the scalar section, wherein maintaining includes:
  placing each instruction in order in the active list, wherein placing includes initializing each instruction to a speculative status;
  determining if the speculative status instruction is branch speculative or trap speculative;
  if the speculative status instruction is neither branch speculative nor trap speculative, checking to see if all scalar operands for the speculative status instruction are present;
  if all scalar operands for the speculative status instruction are present, moving the speculative status instruction to a “scalar committed” status and issuing a scalar commitment notice from the active list to the one or more vector sections;
  checking to see if all vector operands for the scalar committed status instruction are present;
  if all vector operands for the scalar committed status instruction are present, moving the scalar committed status instruction to a “committed” status;
  checking to see if all instructions previous to the committed status instruction are completed; and
  if all instructions previous to the committed status instruction are completed, moving the committed status instruction to a “graduated” status; and
maintaining memory consistency between multiple vector memory references and between vector and scalar memory references by guaranteeing no vector store reference can be sent to memory prior to a scalar or vector load that occurs earlier in the program order.

2. The method of claim 1, wherein maintaining memory consistency includes synchronizing memory references by executing a predefined Lsync operation within a local SSP and a predefined Msync operation among SSPs.

3. The method of claim 2, wherein the computer system includes two or more MSPs and a Gsync operation for synchronizing across the two or more MSPs, wherein maintaining memory consistency further includes executing the Gsync operation among all participating SSPs of the two or more MSPs.

4. The method of claim 1, wherein maintaining memory consistency includes monitoring a reference sent bit in the active list.

5. An apparatus comprising:
a shared memory;
one or more multistream processors (MSPs), wherein each MSP includes a plurality of single stream processors (SSPs) including a first SSP, each one of the plurality of SSPs having a scalar section and one or more vector sections, wherein each of the SSPs is operatively coupled to the memory;
means for defining program order between operations on the first SSP;
means for defining operation dependence order of vector memory references to the memory with respect to each other and with respect to scalar memory references to the memory;
means for maintaining a minimal guarantee on the ordering on the first SSP using an active list located in the scalar section, wherein means for maintaining includes:
  means for placing each instruction in order in the active list, wherein means for placing includes means for initializing each instruction to a speculative status;
  means for determining if the speculative status instruction is branch speculative or trap speculative;
  means for, if the speculative status instruction is neither branch speculative nor trap speculative, checking to see if all scalar operands for the speculative status instruction are present;
  means for, if all scalar operands for the speculative status instruction are present, moving the speculative status instruction to a “scalar committed” status and issuing a scalar commitment notice from the active list to the one or more vector sections;
  means for checking to see if all vector operands for the scalar committed status instruction are present;
  means for, if all vector operands for the scalar committed status instruction are present, moving the scalar committed status instruction to a “committed” status;
  means for checking to see if all instructions previous to the committed status instruction are completed; and
  means for, if all instructions previous to the committed status instruction are completed, moving the committed status instruction to a “graduated” status; and
means for maintaining memory consistency between the plurality of single stream processors (SSPs) by guaranteeing no vector store reference can be sent to memory prior to a scalar or vector load that occurs earlier in the program order.

6. The apparatus of claim 5, means for maintaining memory consistency includes means for synchronizing memory references by executing a predefined Lsync operation within a local SSP and a predefined Msync operation among SSPs.

7. The apparatus of claim 6, wherein the apparatus includes two or more MSPs and a Gsync operation for synchronizing across the two or more MSPs, wherein means for maintaining memory consistency further includes means for executing the Gsync operation among all participating SSPs of the two or more MSPs.

8. The apparatus of claim 5, wherein means for maintaining memory consistency includes means for monitoring a reference sent bit in the active list.

9. A computer processing method comprising:
providing a memory having a plurality of addressable locations;
providing a multistream processor (MSP), wherein the MSP includes a plurality of single stream processors (SSPs) including a first SSP, each one of the plurality of SSPs having a scalar section and one or more vector sections, wherein each of the plurality of SSPs is operatively coupled to the memory;
defining program order between operations on the first SSP;
defining operation dependence order of vector memory references to the memory with respect to each other and with respect to scalar memory references to the memory;
serializing writes to any given one of the plurality of addressable locations of memory in the order using an active list in the scalar section, wherein serializing includes:
  placing each instruction in order in the active list, wherein placing includes initializing each instruction to a speculative status;
  determining if the speculative status instruction is branch speculative or trap speculative;
  if the speculative status instruction is neither branch speculative nor trap speculative, checking to see if all scalar operands for the speculative status instruction are present;
  if all scalar operands for the speculative status instruction are present, moving the speculative status instruction to a “scalar committed” status and issuing a scalar commitment notice from the active list to the one or more vector sections;
  checking to see if all vector operands for the scalar committed status instruction are present;
  if all vector operands for the scalar committed status instruction are present, moving the scalar committed status instruction to a “committed” status;
  checking to see if all instructions previous to the committed status instruction are completed; and
  if all instructions previous to the committed status instruction are completed, moving the committed status instruction to a “graduated” status;
making a write globally visible when no one of the plurality of SSPs can read the value produced by an earlier write in a sequential order of writes to that location;
preventing an SSP from reading a value written by another MSP before that value becomes globally visible; and
performing memory consistency between the plurality of single stream processors (SSPs) by guaranteeing no vector store reference can be sent to memory prior to a scalar or vector load that occurs earlier in the program order.

10. The method of claim 9, wherein performing memory consistency includes synchronizing memory references by executing a predefined Lsync operation within a local SSP and a predefined Msync operation among SSPs.

11. The method of claim 10, wherein the computer system includes two or more MSPs and a Gsync operation for synchronizing across the two or more MSPs, wherein performing memory consistency further includes executing the Gsync operation among all participating SSPs of the two or more MSPs.

12. The method of claim 9, wherein performing includes monitoring a reference sent bit in the active list.

13. An apparatus comprising:
a memory having a plurality of addressable locations;
one or more multistream processors (MSPs), wherein each MSP includes a plurality of single stream processors (SSPs) including a first SSP, each one of the plurality of SSPs having a scalar section and one or more vector sections, wherein each of the SSPs is operatively coupled to the memory;
means for defining program order between operations on the first SSP;
means for defining operation dependence order of vector memory references to the memory with respect to each other and with respect to scalar memory references to the memory;
means for serializing writes to any given one of the plurality of addressable locations of memory in the order using an active list in the scalar section, wherein means for serializing includes:
  means for placing each instruction in order in the active list, wherein means for placing includes means for initializing each instruction to a speculative status;
  means for determining if the speculative status instruction is branch speculative or trap speculative;
  means for, if the speculative status instruction is neither branch speculative nor trap speculative, checking to see if all scalar operands for the speculative status instruction are present;
  means for, if all scalar operands for the speculative status instruction are present, moving the speculative status instruction to a “scalar committed” status and issuing a scalar commitment notice from the active list to the one or more vector sections;
  means for checking to see if all vector operands for the scalar committed status instruction are present;
  means for, if all vector operands for the scalar committed status instruction are present, moving the scalar committed status instruction to a “committed” status;
  means for checking to see if all instructions previous to the committed status instruction are completed; and
  means for, if all instructions previous to the committed status instruction are completed, moving the committed status instruction to a “graduated” status;
means for making a write globally visible when no one of the plurality of SSPs can read the value produced by an earlier write in a sequential order of writes to that location;
means for preventing an SSP from reading a value written by another MSP before that value becomes globally visible; and
means for performing memory consistency between the plurality of single stream processors (SSPs) by guaranteeing no vector store reference can be sent to memory prior to a scalar or vector load that occurs earlier in the program order.

14. The apparatus of claim 13, wherein means for performing memory consistency includes means for executing a predefined Lsync operation within a local SSP and a predefined Msync operation among SSPs.

15. The apparatus of claim 14, wherein the computer system includes two or more MSPs and a Gsync operation for synchronizing across the two or more MSPs, wherein means for performing memory consistency further includes means for executing the Gsync operation among all the participating SSPs of the two or more MSPs.

16. The apparatus of claim 13, wherein means for performing includes means for monitoring a reference sent bit in the active list.
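
The active-list steps recited in the claims amount to a four-stage status progression: speculative, scalar committed, committed, graduated. The following is a minimal software sketch of that progression, offered only as an illustration of the claim language and not as the patented hardware. The class and field names (Status, Entry, ActiveList, notices, step) are hypothetical, operand readiness and completion are modeled as plain boolean flags set by the caller, and the choice to advance each entry by at most one status per step() call is an assumption made for readability.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Status(Enum):
    SPECULATIVE = 1
    SCALAR_COMMITTED = 2
    COMMITTED = 3
    GRADUATED = 4


@dataclass
class Entry:
    """One instruction in the active list (all field names are hypothetical)."""
    name: str
    branch_speculative: bool = False
    trap_speculative: bool = False
    scalar_operands_ready: bool = False
    vector_operands_ready: bool = False
    completed: bool = False
    status: Status = Status.SPECULATIVE


@dataclass
class ActiveList:
    entries: List[Entry] = field(default_factory=list)
    # Scalar commitment notices issued to the vector section(s), in order.
    notices: List[str] = field(default_factory=list)

    def place(self, entry: Entry) -> None:
        """Place an instruction in program order, initialized to speculative."""
        entry.status = Status.SPECULATIVE
        self.entries.append(entry)

    def step(self) -> None:
        """Advance each entry by at most one status, scanning in program order."""
        all_previous_completed = True
        for e in self.entries:
            if e.status is Status.SPECULATIVE:
                speculative = e.branch_speculative or e.trap_speculative
                if not speculative and e.scalar_operands_ready:
                    e.status = Status.SCALAR_COMMITTED
                    self.notices.append(e.name)  # scalar commitment notice
            elif e.status is Status.SCALAR_COMMITTED:
                if e.vector_operands_ready:
                    e.status = Status.COMMITTED
            elif e.status is Status.COMMITTED:
                if all_previous_completed:
                    e.status = Status.GRADUATED
            # Track completion of everything earlier in program order.
            all_previous_completed = all_previous_completed and e.completed
```

In this sketch an instruction that is still branch speculative stays in the speculative status indefinitely, and a committed instruction cannot graduate while any earlier instruction has not completed, which mirrors the ordering of the claimed checks.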