IPC분류정보
국가/구분 |
United States(US) Patent
등록
|
국제특허분류(IPC7판) |
|
출원번호 |
US-0643586
(2003-08-18)
|
등록번호 |
US-7334110
(2008-02-19)
|
발명자
/ 주소 |
- Faanes,Gregory J.
- Scott,Steven L.
- Lundberg,Eric P.
- Moore, Jr.,William T.
- Johnson,Timothy J.
|
출원인 / 주소 |
|
대리인 / 주소 |
Schwegman, Lundberg & Woessner, P.A.
|
인용정보 |
피인용 횟수 :
42 인용 특허 :
73 |
초록
▼
In a computer system having a scalar processing unit and a vector processing unit, wherein the vector processing unit includes a vector dispatch unit, a system and method of decoupling operation of the scalar processing unit from that of the vector processing unit, the method comprising sending a ve
In a computer system having a scalar processing unit and a vector processing unit, wherein the vector processing unit includes a vector dispatch unit, a system and method of decoupling operation of the scalar processing unit from that of the vector processing unit, the method comprising sending a vector instruction from the scalar processing unit to the vector dispatch unit, wherein sending includes marking the vector instruction as complete if the vector instruction is not a vector memory instruction and if the vector instruction does not require scalar operands, reading a scalar operand, wherein reading includes transferring the scalar operand from the scalar processing unit to the vector dispatch unit, predispatching the vector instruction within the vector dispatch unit if the vector instruction is scalar committed, dispatching the predispatched vector instruction if all required operands are ready, and executing the dispatched vector instruction as a function of the scalar operand.
대표청구항
▼
What is claimed is: 1. In a computer system having a scalar processing unit and a vector processing unit, wherein the vector processing unit includes a vector dispatch unit and wherein the vector dispatch unit includes a predispatch queue and a dispatch queue, a method of decoupling operation of th
What is claimed is: 1. In a computer system having a scalar processing unit and a vector processing unit, wherein the vector processing unit includes a vector dispatch unit and wherein the vector dispatch unit includes a predispatch queue and a dispatch queue, a method of decoupling operation of the scalar processing unit from that of the vector processing unit, the method comprising: dispatching vector instructions from the scalar processing unit to the vector dispatch unit, wherein dispatching includes sending the vector instructions from the scalar processing unit to the vector dispatch unit even if all scalar operands are not ready and even if all scalar instructions issued prior to the vector instructions are not scalar committed; queueing up the vector instructions received from the scalar processing unit in the vector dispatch unit's predispatch queue; reading scalar operands from the scalar processing unit, wherein reading includes transferring the scalar operands from the scalar processing unit to the vector dispatch unit; predispatching the vector instructions from the predispatch queue to the dispatch queue in the order received, wherein predispatching includes determining if all scalar instructions issued prior to the vector instruction at the head of the predispatch queue are scalar committed and transferring the vector instruction from the predispatch queue to the dispatch queue only if all scalar instructions issued prior to the vector instruction at the head of the predispatch queue are scalar committed; dispatching the predispatched vector instruction from the dispatch queue if all required scalar operands are ready; and executing the vector instruction dispatched from the dispatch queue as a function of the scalar operands. 2. The method according to claim 1, wherein executing the dispatched vector instruction includes translating an address associated with the vector instruction and trapping on a translation fault. 3. The method according to claim 1, wherein the method further includes marking the vector instruction as complete, wherein marking the vector instruction as complete includes: if the vector instruction is not a memory instruction and if the vector instruction does not require scalar operands, indicating that the vector instruction is complete when the vector instruction is dispatched from the scalar processing unit; if the vector instruction is not a memory instruction but requires scalar operands, indicating that the vector instruction is complete when the scalar operands are available; and if the vector instruction is a memory instruction, indicating that the vector instruction is complete when the vector address has been translated; and graduating the vector instruction if the vector instruction is marked complete. 4. In a computer system having a scalar processing unit and a vector processing unit, wherein the vector processing unit includes a vector dispatch unit and wherein the vector dispatch unit includes a predispatch queue and a dispatch queue, a method of decoupling operation of the scalar processing unit from that of the vector processing unit, the method comprising: dispatching vector instructions from the scalar processing unit to the vector dispatch unit, wherein dispatching includes sending the vector instructions from the scalar processing unit to the vector dispatch unit even if all scalar operands are not ready and even if all scalar instructions issued prior to the vector instructions are not scalar committed; queueing up the vector instructions received from the scalar processing unit in the vector dispatch unit's predispatch queue; reading scalar operands from the scalar processing unit, wherein reading includes transferring the scalar operands from the scalar processing unit to the vector dispatch unit; predispatching, the vector instructions from the predispatch queue to the dispatch queue in the order received, wherein predispatching includes determining if all scalar instructions issued prior to the vector instruction at the head of the predispatch queue are scalar committed and transferring the vector instruction from the predispatch queue to the dispatch queue only if all scalar instructions issued prior to the vector instruction at the head of the predispatch queue are scalar committed; dispatching the predispatched vector instruction from the dispatch queue if all required scalar operands are ready; generating an address for a vector load; issuing a vector load request to memory; receiving vector data from memory; storing the vector data in a load buffer; transferring the vector data from the load buffer to a vector register; and executing the vector instruction dispatched from the vector dispatch queue on the vector data stored in the vector register. 5. The method according to claim 4, wherein the vector processing unit includes a vector execute unit and a vector load/store unit, wherein issuing a vector load request to memory includes issuing and executing vector memory references in the vector load/store unit when the vector load store unit has received the instruction and memory operands from the scalar processing unit. 6. The method according to claim 4, wherein storing the vector data in a load buffer and transferring the vector data from the load buffer to a vector register are decoupled from each other. 7. The method according to claim 4, wherein storing the vector data in a load buffer includes writing memory load data to the load buffer until all previous memory operations complete without fault. 8. The method according to claim 5, wherein storing the vector data in a load buffer and transferring the vector data from the load buffer to a vector register are decoupled from each other. 9. The method according to claim 5, wherein storing the vector data in a load buffer includes writing memory load data to the load buffer until all previous memory operations complete without fault. 10. In a computer system having a scalar processing unit and a vector processing unit, wherein the vector processing unit includes a vector dispatch unit and wherein the vector dispatch unit includes a predispatch queue and a dispatch queue, a method of decoupling operation of the scalar processing unit from that of the vector processing unit, the method comprising: dispatching vector instructions from the scalar processing unit to the vector dispatch unit, wherein dispatching includes sending the vector instructions from the scalar processing unit to the vector dispatch unit even if all scalar operands are not ready and even if all scalar instructions issued prior to the vector instructions are not scalar committed; queueing up the vector instructions received from the scalar processing unit in the vector dispatch unit's predispatch queue; reading scalar operands from the scalar processing unit, wherein reading includes transferring the scalar operands from the scalar processing unit to the vector dispatch unit; predispatching the vector instructions from the predispatch queue to the dispatch queue in the order received, wherein predispatching includes determining if all scalar instructions issued prior to the vector instruction at the head of the predispatch queue are scalar committed and transferring the vector instruction from the predispatch queue to the dispatch queue only if all scalar instructions issued prior to the vector instruction at the head of the predispatch queue are scalar committed; dispatching the predispatched vector instruction from the dispatch queue if all required scalar operands are ready; generating a first and a second address for a vector load; issuing first and second vector load requests to memory; receiving vector data associated with the first and second addresses from memory; storing vector data associated with the first address in a first vector register; storing vector data associated with the second address in a second vector register; executing a vector instruction on the vector data stored in the first vector register; renaming the second vector register; and executing the vector instruction dispatched from the dispatch queue on the vector data stored in the renamed vector register. 11. The method according to claim 10, wherein the vector processing unit includes a vector execute unit and a vector load/store unit, wherein issuing a vector load request to memory includes issuing and executing vector memory references in the vector load/store unit when the vector load store unit has received the instruction and memory operands from the scalar processing unit. 12. A computer system, comprising: a scalar processing unit; and a vector processing unit, wherein the vector processing unit includes a vector dispatch unit, a vector execute unit and a vector load/store unit, wherein the vector dispatch unit includes a predispatch queue and a dispatch queue; wherein the scalar processing unit dispatches a vector instructions to the vector dispatch unit even if all scalar operands associated with the instructions are not ready; wherein the scalar processing unit reads scalar operands and transfers the read scalar operands from the scalar processing unit to the vector dispatch unit; wherein the vector dispatch unit stores vector instructions received from the scalar processing unit in the vector dispatch unit's predispatch queue, predispatches the vector instructions from the predispatch queue to the dispatch queue in the order received; wherein the vector dispatch unit determines if all scalar instructions issued prior to the vector instruction at the head of the predispatch queue are scalar committed and transfers the vector instruction from the predispatch queue to the dispatch queue only if all scalar instructions issued prior to the vector instruction at the head of the predispatch queue are scalar committed; wherein the vector dispatch unit dispatches the predispatched vector instruction from the vector dispatch unit to one or more of the vector execute unit and the vector load/store unit if all required scalar operands are ready; wherein the vector load/store unit receives an instruction and memory operands from the scalar processing unit, issues and executes a vector memory load reference as a function of the instruction and the memory operands received from the scalar processing unit, and stores data received as a result of the vector memory reference in a load buffer; and wherein the vector execute unit issues the vector memory load instruction and transfers the data received as a result of the vector memory reference from the load buffer to a vector register. 13. The system according to claim 12, wherein the load buffer stores memory load data until it is determined that no previous memory operation will fail and, if no previous memory operations have failed, the load buffer transfers the data to the vector register. 14. In a computer system having a scalar processing unit and a vector processing unit, wherein the vector processing unit includes a vector dispatch unit and wherein the vector dispatch unit includes a predispatch queue and a dispatch queue, a method of decoupling scalar and vector execution, comprising: dispatching scalar instructions to a scalar instruction queue in the scalar processing unit; dispatching a vector instruction that requires scalar operands from the scalar processing unit to the scalar instruction queue and to the predispatch queue, wherein dispatching includes sending the vector instruction from scalar processing unit to the predispatch queue even if all scalar operands are not ready; queueing up two or more vector instructions received from the scalar processing unit in the predispatch queue; executing the vector instruction in the scalar processing unit, wherein executing the vector instruction in the scalar processing unit includes writing a scalar operand to a scalar operand queue; predispatching, within the vector dispatch unit, the vector instructions from the predispatch queue to the dispatch queue in the order received, wherein predispatching includes determining if all scalar instructions issued prior to the vector instruction at the head of the predispatch queue are scalar committed and transferring the vector instruction from the predispatch queue to the dispatch queue only if all scalar instructions issued prior to the vector instruction at the head of the predispatch queue are scalar committed; notifying the vector processing unit that the scalar operand is available in the scalar operand queue; dispatching the predispatched vector instruction from the dispatch queue of the vector dispatch unit, if all required scalar operands are ready; and executing the vector instruction dispatched from the vector dispatch unit in the vector processing unit, wherein executing the vector instruction in the vector processing unit includes reading the scalar operand from the scalar operand queue. 15. The method according to claim 14, wherein the method further includes marking the vector instruction as complete, wherein marking the vector instruction as complete includes: if the vector instruction is not a memory instruction and if the vector instruction does not require scalar operands, indicating that the vector instruction is complete when the vector instruction is dispatched from the scalar processing unit; if the vector instruction is not a memory instruction but requires scalar operands, indicating that the vector instruction is complete when the scalar operands are available; and if the vector instruction is a memory instruction, indicating that the vector instruction is complete when the vector address has been translated; and graduating the vector instruction if the vector instruction is marked complete. 16. In a computer system having a scalar processing unit and a vector processing unit, wherein the vector processing unit includes a vector dispatch unit and wherein the vector dispatch unit includes a predispatch queue and a dispatch queue, a method of decoupling a vector memory reference and a vector execution, comprising: dispatching scalar instructions to a scalar instruction queue in the scalar processing unit; dispatching a vector instruction from the scalar processing unit to the scalar instruction queue and to the predispatch queue in the vector processing unit, wherein dispatching includes sending the vector instruction from the scalar processing unit to the predispatch queue in the vector processing unit even if all scalar operands are not ready; queueing up two or more vector instructions received from the scalar processing unit in the predispatch queue; executing the vector instruction in the scalar processing unit, wherein executing the vector instruction in the scalar processing unit includes generating an address and writing the address to a scalar operand queue; predispatching, within the vector dispatch unit, the vector instructions from the predispatch queue to the dispatch queue in the order received, wherein predispatching includes determining if all scalar instructions issued prior to the vector instruction at the head of the predispatch queue are scalar committed and transferring the vector instruction from the predispatch queue to the dispatch queue only if all scalar instructions issued prior to the vector instruction at the head of the predispatch queue are scalar committed; notifying the vector processing unit that the address is available in the scalar operand queue; dispatching the predispatched vector instruction from the dispatch queue of the vector processing unit if all required scalar operands are ready; and executing the vector instruction dispatched from the vector processing unit in the vector processing unit, wherein executing the vector instruction in the vector processing unit includes reading the address from the scalar operand queue and generating a memory request as a function of the address read from the scalar operand queue. 17. The method according to claim 16, wherein the method further includes marking the vector instruction as complete, wherein marking the vector instruction as complete includes: if the vector instruction is not a memory instruction and if the vector instruction does not require scalar operands, indicating that the vector instruction is complete when the vector instruction is dispatched from the scalar processing unit; if the vector instruction is not a memory instruction but requires scalar operands, indicating that the vector instruction is complete when the scalar operands are available; and if the vector instruction is a memory instruction, indicating that the vector instruction is complete when the vector address has been translated; and graduating the vector instruction if the vector instruction is marked complete. 18. In a computer system having a scalar processing unit and a vector processing unit, wherein the vector processing unit includes a vector dispatch unit and wherein the vector dispatch unit includes a predispatch queue and a dispatch queue, a method of executing a vector instruction, comprising: dispatching scalar instructions to a scalar instruction queue in the scalar processing unit; dispatching a vector instruction from the scalar processing unit to the scalar instruction queue and to the predispatch queue in the vector processing unit, wherein dispatching includes sending the vector instruction from the scalar processing unit to the predispatch queue in the vector processing unit even if all scalar operands are not ready; queueing up two or more vector instructions received from the scalar processing unit in the predispatch queue; executing the vector instruction in the scalar processing unit, wherein executing the vector instruction in the scalar processing unit includes generating an address and writing the address to a scalar operand queue; predispatching, within the vector dispatch unit, the vector instructions from the predispatch queue to the dispatch queue in the order received, wherein predispatching includes determining if all scalar instructions issued prior to the vector instruction at the head of the predispatch queue are scalar committed and transferring the vector instruction from the predispatch queue to the dispatch queue only if all scalar instructions issued prior to the vector instruction at the head of the predispatch queue are scalar committed; notifying the vector processing unit that the address is available in the scalar operand queue; dispatching the predispatched vector instruction from the dispatch queue of the vector processing unit if all required scalar operands are ready; and executing the vector instruction dispatched from the vector processing unit in the vector processing unit, wherein executing the dispatched vector instruction in the vector processing unit includes: reading the address from the scalar operand queue; generating a memory request as a function of the address read from the scalar operand queue; receiving vector data from memory; storing the vector data in a load buffer; transferring the vector data from the load buffer to a vector register; and executing a vector instruction on the vector data stored in the vector register. 19. In a computer system having a scalar processing unit and a vector processing unit, wherein the vector processing unit includes a vector dispatch unit and wherein the vector dispatch unit includes a predispatch queue and a dispatch queue, a method of unrolling a loop, comprising: preparing a first and a second vector instruction, wherein each vector instruction executes an iteration through the loop and wherein each vector instruction requires calculation of a scalar loop value; dispatching the first and second vector instructions from the scalar processing unit to a scalar instruction queue in the scalar processing unit and to the predispatch queue in the vector processing unit, wherein dispatching includes sending the first and second vector instructions from the scalar processing unit to the predispatch queue in the vector processing unit even if all scalar operands are not ready; queueing up the first and second vector instructions received from the scalar processing unit in the predispatch queue; executing a portion of each vector instruction in the scalar processing unit, wherein executing a portion of each vector instruction in the scalar processing unit includes writing a scalar operand representing the scalar loop value calculated for each vector instruction to a scalar operand queue; predispatching, within the vector processing unit, each vector instruction received from the scalar processing unit if all previously received vector instructions are scalar committed; predispatching, within the vector dispatch unit, each vector instruction from the predispatch queue to the dispatch queue in the order received, wherein predispatching includes determining if all scalar instructions issued prior to the vector instruction at the head of the predispatch queue are scalar committed and transferring the vector instruction from the predispatch queue to the dispatch queue only if all scalar instructions issued prior to the vector instruction at the head of the predispatch queue are scalar committed; notifying the vector processing unit that the scalar operand is available in the scalar operand queue; dispatching the predispatched vector instruction from the dispatch queue of the vector processing unit if all required scalar operands are ready; and executing the first and second vector instructions dispatched from the dispatch queue of the vector processing unit, wherein executing the dispatched vector instruction in the vector processing unit includes reading the scalar operands associated with each instruction from the scalar operand queue.
※ AI-Helper는 부적절한 답변을 할 수 있습니다.