IPC classification information
Country / Type |
United States(US) Patent
Granted
|
International Patent Classification (IPC, 7th edition) |
|
Application number |
US-0549418
(2006-10-13)
|
Registration number |
US-7509459
(2009-03-24)
|
Inventor / Address |
|
Applicant / Address |
|
Agent / Address |
Sterne, Kessler, Goldstein & Fox PLLC
|
Citation information |
Times cited: 4
Cited patents: 20 |
Abstract
A microprocessor has a plurality of stream prefetch engines for prefetching a respective data stream from the system memory into the microprocessor cache memory and an instruction decoder that decodes instructions of the microprocessor instruction set. The instruction set includes a stream prefetch instruction that returns an identifier uniquely associating a data stream specified by the instruction with one of the engines. The instruction set also includes an explicit prefetch-triggering load instruction that specifies a stream identifier returned by a previously executed stream prefetch instruction. When the decoder decodes a conventional load instruction it does not prefetch; however, when it decodes an explicit prefetch-triggering load instruction it commences prefetching the specified data stream. In one embodiment, an indicator of the load instruction may explicitly specify non-prefetch-triggering. In another embodiment one stream prefetch engine is implicitly associated and the other engines are explicitly associated by the returned identifier.
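The behavior summarized in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the names `Core`, `StreamEngine`, `stream_prefetch` and `load` are hypothetical, the 64-byte line size is assumed, the policy of prefetching exactly one line per triggering load is a simplification, and returning `None` when no engine is free is an assumption the abstract does not address.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StreamEngine:
    """Hypothetical model of one stream prefetch engine."""
    base: int = 0
    length: int = 0
    busy: bool = False
    next_addr: int = 0
    prefetched: List[int] = field(default_factory=list)


class Core:
    LINE = 64  # assumed cache-line size in bytes

    def __init__(self, num_engines: int = 4):
        self.engines = [StreamEngine() for _ in range(num_engines)]

    def stream_prefetch(self, base: int, length: int) -> Optional[int]:
        """Model of the stream prefetch instruction: bind the stream to a
        free engine and return an identifier naming that engine."""
        for stream_id, engine in enumerate(self.engines):
            if not engine.busy:
                engine.base, engine.length = base, length
                engine.busy = True
                engine.next_addr = base
                engine.prefetched = []
                return stream_id
        return None  # assumption: no free engine means no identifier

    def load(self, addr: int, stream_id: Optional[int] = None) -> int:
        """Model of a load. A conventional load (no stream_id) never
        prefetches; a load carrying a stream_id triggers that engine."""
        if stream_id is not None:
            engine = self.engines[stream_id]
            end = engine.base + engine.length
            if engine.busy and engine.next_addr < end:
                # Simplification: one line prefetched per triggering load.
                engine.prefetched.append(engine.next_addr)
                engine.next_addr += self.LINE
        return addr  # data access itself is not modeled
```

In this model a program would call `stream_prefetch` once per stream, keep the returned identifier, and pass it only on the loads that should drive prefetching; all other loads leave every engine idle.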
Representative claims
I claim:
1. A microprocessor coupled to a system memory, comprising: a plurality of stream prefetch engines, each configured to prefetch a respective data stream from the system memory into a cache memory of the microprocessor; and an instruction decoder, coupled to said plurality of stream prefetch engines, configured to decode instructions of an instruction set of the microprocessor, said instruction set comprising: a stream prefetch instruction, that specifies a data stream and a fetch-ahead distance and returns a stream identifier that uniquely associates said specified data stream with one of said plurality of stream prefetch engines; and a load instruction, that specifies an address of data to be read from the system memory into the microprocessor, and that further specifies a stream identifier returned from a previous execution of one of said stream prefetch instructions; wherein each stream prefetch engine is configured to prefetch a portion of said respective data stream in response to said instruction decoder decoding one of said load instructions whose stream identifier identifies said stream prefetch engine, and wherein each stream prefetch engine is further configured to suspend prefetching of said data stream once a difference between a current prefetch address associated with said respective stream prefetch engine and said address specified by the load instruction is greater than said fetch-ahead distance, and is further configured to resume prefetching of said data stream once said difference is less than said fetch-ahead distance.
2. The microprocessor as recited in claim 1, wherein said instruction set further comprises: a store instruction, that specifies an address of data to be written from the microprocessor into the system memory, and that further specifies a stream identifier for identifying one of said plurality of stream prefetch engines returned from a previous execution of one of said stream prefetch instructions; wherein each one of said plurality of stream prefetch engines is configured to prefetch a portion of said respective data stream in response to said instruction decoder decoding one of said store instruction whose stream identifier identifies said one of said plurality of stream prefetch engines.
3. The microprocessor as recited in claim 1, further comprising: a stream engine allocator, coupled to said instruction decoder and to said plurality of stream prefetch engines, configured to allocate one of said plurality of stream prefetch engines uniquely associated with said identifier returned by said stream prefetch instruction.
4. The microprocessor as recited in claim 1, wherein said stream prefetch instruction further specifies a hysteresis amount, wherein when said one of said stream prefetch engines resumes prefetching said respective data stream, said one of said stream prefetch engines prefetches at least said hysteresis amount of data before suspending prefetching again.
5. The microprocessor as recited in claim 1, wherein said stream prefetch instruction further specifies a locality characteristic, wherein each one of said stream prefetch engines prefetches said respective data stream based on said locality characteristic.
6. The microprocessor as recited in claim 1, further comprising: a first translation look-aside buffer (TLB), coupled to a load/store unit of the microprocessor, configured to cache virtual to physical page address translations of load/store requests generated by said load/store unit; and a second TLB, coupled to said plurality of stream prefetch engines, configured to cache virtual to physical page address translations of prefetch requests generated by said plurality of stream prefetch engines.
7. The microprocessor as recited in claim 1, wherein said stream prefetch instruction further specifies a prefetch priority, wherein the microprocessor further comprises: a bus interface unit (BIU), coupled to said plurality of stream prefetch engines, configured to generate transaction requests on a bus coupling the microprocessor to the system memory to transfer data between the system memory and the microprocessor in response to requests generated by a load/store unit of the microprocessor and by said plurality of stream prefetch engines, wherein said BIU prioritizes said bus transaction requests for prefetching relative to said bus transaction requests for said load instructions based on said prefetch priority.
8. The microprocessor as recited in claim 1, wherein said stream prefetch instruction further specifies an abnormal TLB access policy, wherein the microprocessor further comprises: a memory subsystem, comprising said cache memory and having a translation look-aside buffer (TLB), coupled to said stream prefetch unit, configured to selectively abort prefetching said respective data stream based on said abnormal TLB access policy in response to an abnormal TLB access caused by prefetching of said respective data stream.
9. A method for prefetching a data stream into a microprocessor, comprising: decoding a stream prefetch instruction of an instruction set of the microprocessor, said stream prefetch instruction specifying a data stream and a fetch-ahead distance; returning a stream identifier that uniquely associates said specified data stream with one of a plurality of stream prefetch engines of the microprocessor, in response to said decoding said stream prefetch instruction; decoding a load instruction of the instruction set, after said returning said stream identifier, said load instruction specifying an address of data to be read from the system memory into the microprocessor, and further specifying said returned stream identifier; prefetching a portion of said data stream, by one of said plurality of stream prefetch engines, in response to said decoding said load instruction whose stream identifier identifies said one of said plurality of stream prefetch engines; suspending said prefetching if a difference between a current prefetch address and said address specified by said load instruction is greater than said fetch-ahead distance; and resuming said prefetching if said difference is less than said fetch-ahead distance.
10. The method as recited in claim 9, further comprising: decoding a store instruction of the instruction set, after said returning said stream identifier, said store instruction specifying an address of data to be written to the system memory from the microprocessor, and further specifying said returned stream identifier; and prefetching a portion of said data stream, by one of said plurality of stream prefetch engines, in response to said decoding said store instruction whose stream identifier identifies said one of said plurality of stream prefetch engines.
11. The method as recited in claim 9, further comprising: allocating one of said plurality of stream prefetch engines uniquely associated with said stream returned identifier, in response to said decoding said stream prefetch instruction.
12. The method as recited in claim 9, wherein said stream prefetch instruction further specifies a hysteresis amount, wherein the method further comprises: resuming prefetching at least said hysteresis amount of data before suspending prefetching again.
13. The method as recited in claim 9, wherein said stream prefetch instruction further specifies a locality characteristic, wherein said prefetching comprises prefetching said data stream based on said locality characteristic.
14. The method as recited in claim 9, wherein said stream prefetch instruction further specifies a prefetch priority, wherein the method further comprises: generating a first one or more bus transaction requests on a bus coupling the microprocessor to the system memory to transfer data between the system memory and the microprocessor in response to said load instructions; generating a second one or more bus transaction requests on said bus to prefetch said portion of said data stream from the system memory to the microprocessor; and prioritizing said second one or more bus transaction requests relative to said first one or more bus transaction requests based on said prefetch priority.
15. The method as recited in claim 9, further comprising: caching, in a first translation look-aside buffer (TLB), virtual to physical page address translations of said address specified by said load instructions of said data to be read from the system memory; and caching, in a second TLB, virtual to physical page address translations of prefetch addresses of said data stream specified by said stream prefetch instructions.
16. The method as recited in claim 9, wherein said stream prefetch instruction further specifies an abnormal TLB access policy, wherein the method further comprises: selectively aborting said prefetching based on said abnormal TLB access policy in response to an abnormal TLB access caused by said prefetching of said data stream.
17. A computer program product for use with a computing device, the computer program product comprising: a tangible computer usable storage medium, having computer readable program code embodied thereon, for providing a microprocessor coupled to a system memory, said computer readable program code comprising: first computer readable program code for providing a plurality of stream prefetch engines, each configured to prefetch a respective data stream from the system memory into a cache memory of the microprocessor; and second computer readable program code for providing an instruction decoder, coupled to said plurality of stream prefetch engines, configured to decode instructions of an instruction set of the microprocessor, said instruction set comprising: a stream prefetch instruction, that specifies a data stream and a fetch-ahead distance and returns a stream identifier that uniquely associates said specified data stream with one of said plurality of stream prefetch engines; and a load instruction, that specifies an address of data to be read from the system memory into the microprocessor, and that further specifies a stream identifier returned from a previous execution of one of said stream prefetch instructions; wherein each stream prefetch engine is configured to prefetch a portion of said respective data stream in response to said instruction decoder decoding one of said load instruction whose stream identifier identifies said stream prefetch engine, and wherein each stream prefetch engine is further configured to suspend prefetching of said data stream once a difference between a current prefetch address associated with said stream prefetch engine and said address specified by the load instruction is greater than said fetch-ahead distance, and is further configured to resume prefetching of said data stream once said difference is less than said fetch-ahead distance.
18. The computer program product as recited in claim 17, wherein said instruction set further comprises: a store instruction, that specifies an address of data to be written from the microprocessor into the system memory, and that further specifies a stream identifier for identifying one of said plurality of stream prefetch engines returned from a previous execution of one of said stream prefetch instructions; wherein each one of said plurality of stream prefetch engines is configured to prefetch a portion of said respective data stream in response to said instruction decoder decoding one of said store instruction whose stream identifier identifies said one of said plurality of stream prefetch engines.
19. A microprocessor coupled to a system memory, comprising: a plurality of stream prefetch engines, each configured to prefetch a respective data stream from the system memory into a cache memory of the microprocessor; and an instruction decoder, coupled to said plurality of stream prefetch engines, configured to decode instructions of an instruction set of the microprocessor, said instruction set comprising: a stream prefetch instruction, that specifies a data stream and a fetch-ahead distance and returns a stream identifier that uniquely associates said specified data stream with one of said plurality of stream prefetch engines; and a load instruction, that specifies an address of data to be read from the system memory into the microprocessor, and that further specifies a stream identifier for optionally identifying one of said plurality of stream prefetch engines returned from a previous execution of one of said stream prefetch instructions; wherein each stream prefetch engine is configured to selectively prefetch a portion of said respective data stream in response to said instruction decoder decoding one of said load instructions only if said stream identifier identifies said stream prefetch engine, and wherein each stream prefetch engine is further configured to suspend prefetching of said data stream once a difference between a current prefetch address associated with said stream prefetch engine and said address specified by the load instruction is greater than said fetch-ahead distance, and is further configured to resume prefetching of said data stream once said difference is less than said fetch-ahead distance.
20. The microprocessor as recited in claim 19, wherein said instruction set further comprises: a store instruction, that specifies an address of data to be written to the system memory from the microprocessor, and that further specifies a stream identifier for optionally identifying one of said plurality of stream prefetch engines returned from a previous execution of one of said stream prefetch instructions; wherein each one of said plurality of stream prefetch engines is configured to prefetch a portion of said respective data stream in response to said instruction decoder decoding one of said store instruction only if said stream identifier identifies said one of said plurality of stream prefetch engines.
21. A method for prefetching a data stream into a microprocessor, comprising: decoding a stream prefetch instruction of an instruction set of the microprocessor, said stream prefetch instruction specifying a data stream and a fetch-ahead distance; returning a stream identifier that uniquely associates said specified data stream with one of a plurality of stream prefetch engines of the microprocessor, in response to said decoding said stream prefetch instruction; decoding a first instance of a load instruction of the instruction set, said first instance of said load instruction specifying an address of data to be read from a system memory into the microprocessor, and further specifying that said first instance of said load instruction is a non-prefetch-triggering instruction; decoding a second instance of said load instruction of the instruction set, after said returning said stream identifier, said second instance of said load instruction specifying an address of data to be read from the system memory into the microprocessor, and further specifying that said second instance of said load instruction is a prefetch-triggering instruction, and further specifying said returned stream identifier; prefetching a portion of said data stream, by one of said plurality of stream prefetch engines, in response to said decoding said second instance of said load instruction whose stream identifier identifies said one of said plurality of stream prefetch engines; refraining from prefetching, by said plurality of stream prefetch engines, in response to said decoding said first instance of said load instruction; suspending said prefetching if a difference between a current prefetch address and said address specified by said second instance of said load instruction is greater than said fetch-ahead distance; and resuming said prefetching if said difference is less than said fetch-ahead distance.
22. The method as recited in claim 21, further comprising: decoding a first instance of a store instruction of the instruction set, said first instance of said store instruction specifying an address of data to be written to a system memory from the microprocessor, and further specifying that said first instance of said store instruction is a non-prefetch-triggering instruction; decoding a second instance of said store instruction of the instruction set, after said returning said stream identifier, said second instance of said store instruction specifying an address of data to be written to the system memory from the microprocessor, and further specifying that said second instance of said store instruction is a prefetch-triggering instruction, and further specifying said returned stream identifier; prefetching a portion of said data stream, by one of said plurality of stream prefetch engines, in response to said decoding said second instance of said store instruction whose stream identifier identifies said one of said plurality of stream prefetch engines; and refraining from prefetching, by said plurality of stream prefetch engines, in response to said decoding said first instance of said store instruction.
23. A computer program product for use with a computing device, the computer program product comprising: a tangible computer usable storage medium, having computer readable program code embodied thereon, for providing a microprocessor coupled to a system memory, said computer readable program code comprising: first computer readable program code for providing a plurality of stream prefetch engines, each configured to prefetch a respective data stream from the system memory into a cache memory of the microprocessor; and second computer readable program code for providing an instruction decoder, coupled to said plurality of stream prefetch engines, configured to decode instructions of an instruction set of the microprocessor, said instruction set comprising: a stream prefetch instruction, that specifies a data stream and a fetch-ahead distance and returns a stream identifier that uniquely associates said specified data stream with one of said plurality of stream prefetch engines; and a load instruction, that specifies an address of data to be read from the system memory into the microprocessor, and that further specifies a stream identifier for optionally identifying one of said plurality of stream prefetch engines returned from a previous execution of one of said stream prefetch instructions; wherein each stream prefetch engine is configured to selectively prefetch a portion of said respective data stream in response to said instruction decoder decoding one of said load instructions only if said stream identifier identifies said stream prefetch engine, and wherein each stream prefetch engine is further configured to suspend prefetching of said data stream once a difference between a current prefetch address associated with said stream prefetch engine and said address specified by the load instruction is greater than said fetch-ahead distance, and is further configured to resume prefetching of said data stream once said difference is less than said fetch-ahead distance.
24. A microprocessor coupled to a system memory, comprising: a plurality of stream prefetch engines, each configured to prefetch a respective data stream from the system memory into a cache memory of the microprocessor; and an instruction decoder, coupled to said plurality of stream prefetch engines, configured to decode instructions of an instruction set of the microprocessor, said instruction set comprising: a stream prefetch instruction, that specifies a data stream and a fetch-ahead distance and returns a stream identifier that uniquely associates said specified data stream with one of said plurality of stream prefetch engines other than a predetermined one of said plurality of stream prefetch engines; a first load instruction, that specifies an address of data to be read from the system memory into the microprocessor, and that further implicitly identifies said predetermined one of said plurality of stream prefetch engines; and a second load instruction, that specifies an address of data to be read from the system memory into the microprocessor, and that further specifies a stream identifier for explicitly identifying one of said plurality of stream prefetch engines other than said predetermined one of said plurality of stream prefetch engines returned from a previous execution of one of said stream prefetch instructions; wherein each stream prefetch engine is configured to prefetch a portion of said respective data stream in response to said instruction decoder decoding one of said first and second load instructions that implicitly and explicitly, respectively, identifies said respective data stream, and wherein each stream prefetch engine is further configured to suspend prefetching of said data stream once a difference between a current prefetch address associated with said stream prefetch engine and said address specified by said one of said first and second load instructions is greater than said fetch-ahead distance, and is further configured to resume prefetching of said data stream once said difference is less than said fetch-ahead distance.
25. A microprocessor coupled to a system memory, comprising: a stream prefetch unit, comprising: at least one stream prefetch engine, configured to prefetch data from a data stream; and a stream hit detector, configured to detect when a memory address specified by a load instruction hits in said data stream; an instruction decoder, configured to decode instructions of an instruction set of the microprocessor, said instruction set comprising: a stream prefetch instruction, that specifies a data stream and a fetch-ahead distance; and a load instruction, that specifies an address of data to be read from the system memory into the microprocessor and including an indicator for indicating whether to refrain from triggering prefetching if said address hits in said data stream; wherein if said address hits in said data stream, in response to said instruction decoder decoding said load instruction, said stream prefetch engine is configured to prefetch a portion of said data stream from the system memory into the microprocessor if said indicator does not indicate to refrain from triggering prefetching, and to refrain from prefetching if said indicator indicates to refrain from triggering prefetching, and wherein said stream prefetch engine is further configured to suspend prefetching of said data stream once a difference between a current prefetch address associated with said stream prefetch engine and said address specified by said load instruction is greater than said fetch-ahead distance, and is further configured to resume prefetching of said data stream once said difference is less than said fetch-ahead distance.
26. The microprocessor as recited in claim 25, wherein said instruction set further comprises: a store instruction, that specifies an address of data to be written to the system memory from the microprocessor and including an indicator for indicating whether to refrain from triggering prefetching if said address hits in said data stream; wherein if said address hits in said data stream, in response to said instruction decoder decoding said store instruction, said stream prefetch engine is configured to prefetch a portion of said data stream from the system memory into the microprocessor if said indicator does not indicate to refrain from triggering prefetching, and to refrain from prefetching if said indicator indicates to refrain from triggering prefetching.
27. A method for prefetching a data stream into a microprocessor, comprising: decoding a stream prefetch instruction of an instruction set of the microprocessor, said stream prefetch instruction specifying a data stream and a fetch-ahead distance; decoding a load instruction, said load instruction specifying an address of data to be read from a system memory into the microprocessor, and including an indicator for indicating whether to refrain from triggering prefetching if said address hits in said data stream; if said address hits in said data stream, prefetching a portion of said data stream from said system memory into the microprocessor if said indicator does not indicate to refrain from triggering prefetching; if said address hits in said data stream, refraining from prefetching if said indicator indicates to refrain from triggering prefetching; suspending said prefetching if a difference between a current prefetch address and said address specified by said second instance of said load instruction is greater than said fetch-ahead distance; and resuming said prefetching if said difference is less than said fetch-ahead distance.
28. The method as recited in claim 27, further comprising: decoding a store instruction of the instruction set, said store instruction specifying an address of data to be written to the system memory from the microprocessor, and including an indicator for indicating whether to refrain from triggering prefetching if said address hits in said data stream; if said address in said store instruction hits in said data stream, prefetching a portion of said data stream from said system memory into the microprocessor if said indicator in said store instruction does not indicate to refrain from triggering prefetching; and if said address in said store instruction hits in said data stream, refraining from prefetching if said indicator in said store instruction indicates to refrain from triggering prefetching.
29. A computer program product for use with a computing device, the computer program product comprising: a tangible computer usable storage medium, having computer readable program code embodied thereon, for providing a microprocessor coupled to a system memory, said computer readable program code comprising: first computer readable program code for providing a stream prefetch unit, comprising: at least one stream prefetch engine, configured to prefetch data from a data stream; and a stream hit detector, configured to detect when a memory address specified by a load instruction hits in said data stream; and second computer readable program code for providing an instruction decoder, configured to decode instructions of an instruction set of the microprocessor, said instruction set comprising: a stream prefetch instruction, that specifies a data stream and a fetch-ahead distance; and a load instruction, that specifies an address of data to be read from the system memory into the microprocessor and including an indicator for indicating whether to refrain from triggering prefetching if said address hits in said data stream; wherein if said address hits in said data stream, in response to said instruction decoder decoding said load instruction, said stream prefetch engine is configured to prefetch a portion of said data stream from the system memory into the microprocessor if said indicator does not indicate to refrain from triggering prefetching, and to refrain from prefetching if said indicator indicates to refrain from triggering prefetching, and wherein said stream prefetch engine is further configured to suspend prefetching of said data stream once a difference between a current prefetch address associated with said stream prefetch engine and said address specified by said load instructions is greater than said fetch-ahead distance, and is further configured to resume prefetching of said data stream once said difference is less than said fetch-ahead distance.
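The suspend/resume rule that recurs in the independent claims, together with the hysteresis amount of claims 4 and 12, can be sketched as a toy model. The class name `FetchAheadEngine`, the method `on_triggering_load`, and the 64-byte line size are hypothetical and chosen only for illustration. The model issues prefetches synchronously in a loop, whereas real hardware would run them asynchronously, and the claims do not say what happens when the difference exactly equals the fetch-ahead distance, so the handling of that case below is an assumption.

```python
from typing import List


class FetchAheadEngine:
    """Toy model of one stream prefetch engine's fetch-ahead throttling."""

    def __init__(self, base: int, fetch_ahead: int,
                 hysteresis: int = 0, line: int = 64):
        self.prefetch_addr = base       # current prefetch address
        self.fetch_ahead = fetch_ahead  # fetch-ahead distance in bytes
        self.hysteresis = hysteresis    # minimum bytes fetched after a resume
        self.line = line                # assumed cache-line size in bytes
        self.suspended = False

    def on_triggering_load(self, load_addr: int) -> List[int]:
        """Return the line addresses prefetched in response to this load."""
        issued: List[int] = []
        diff = self.prefetch_addr - load_addr
        # Resume only once the difference is less than the distance.
        resuming = self.suspended and diff < self.fetch_ahead
        if self.suspended and not resuming:
            return issued
        self.suspended = False
        must_fetch = self.hysteresis if resuming else 0
        while True:
            diff = self.prefetch_addr - load_addr
            # Suspend once the difference is greater than the distance,
            # but after a resume fetch at least the hysteresis amount.
            # Assumption: a difference equal to the distance keeps a
            # running engine running and a suspended engine suspended.
            if diff > self.fetch_ahead and len(issued) * self.line >= must_fetch:
                self.suspended = True
                break
            issued.append(self.prefetch_addr)
            self.prefetch_addr += self.line
        return issued
```

With a fetch-ahead distance of 128 bytes and no hysteresis, a first load at address 0 runs the engine ahead to 192 and suspends it; a load at 64 leaves the difference at exactly 128, so the engine stays suspended; a load at 128 drops the difference to 64 and prefetching resumes.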