System and method for using a mask register to track progress of gathering and scattering elements between data registers and memory
IPC classification information
Country / Type
United States (US) Patent
Registered
International Patent Classification (IPC, 7th edition)
G06F-009/312
G06F-015/80
G06F-009/30
G06F-009/38
G06F-009/345
G06F-012/02
G06F-012/0875
Application number
US-0541458
(2014-11-14)
Registration number
US-10042814
(2018-08-07)
Inventor / Address
Sprangle, Eric
Rohillah, Anwar
Cavin, Robert
Forsyth, Andrew T.
Abrash, Michael
Applicant / Address
Intel Corporation
Agent / Address
Nicholson De Vos Webster & Elliott LLP
Citation information
Times cited: 0
Cited patents: 43
Abstract
A device, system and method for assigning values to elements in a first register, where each data field in a first register corresponds to a data element to be written into a second register, and where for each data field in the first register, a first value may indicate that the corresponding data element has not been written into the second register and a second value indicates that the corresponding data element has been written into the second register, reading the values of each of the data fields in the first register, and for each data field in the first register having the first value, gathering the corresponding data element and writing the corresponding data element into the second register, and changing the value of the data field in the first register from the first value to the second value. Other embodiments are described and claimed.
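The mask-tracked gather summarized above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented hardware: plain lists stand in for the registers and the memory, and the names `gather_step`, `gather_full`, and the per-step limit `max_elems` are hypothetical. Following claim 6, the value 1 marks an element that still needs to be gathered and 0 marks one that no longer does.

```python
def gather_step(memory, offsets, dest, mask, max_elems=2):
    """One partial gather step (hypothetical helper).

    For up to max_elems lanes whose mask field is 1, read the element at
    the lane's offset, write it into dest, and flip the mask field to 0.
    Returns how many elements were gathered in this step.
    """
    done = 0
    for lane in range(len(mask)):
        if done == max_elems:
            break
        if mask[lane] == 1:
            dest[lane] = memory[offsets[lane]]  # gather the element
            mask[lane] = 0                      # record progress in the mask
            done += 1
    return done


def gather_full(memory, offsets, dest, mask, max_elems=2):
    """Repeat the partial step until no mask field holds 1."""
    steps = 0
    while 1 in mask:
        gather_step(memory, offsets, dest, mask, max_elems)
        steps += 1
    return steps


memory = [10, 11, 12, 13, 14, 15, 16, 17]
offsets = [7, 0, 3, 3]      # stands in for the offset (third) register
dest = [-1, -1, -1, -1]     # stands in for the destination (second) register
mask = [1, 0, 1, 1]         # lane 1 is not to be gathered
steps = gather_full(memory, offsets, dest, mask)
print(dest, mask, steps)    # [17, -1, 13, 13] [0, 0, 0, 0] 2
```

Because the mask is updated as each element arrives, the loop can stop after any step and later resume from the mask alone, without re-reading elements that were already gathered.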
Representative claims
1. A method comprising: assigning values to data fields in a first register, wherein each of the data fields in the first register corresponds to an offset for a data element to be gathered, or not to be gathered, from a memory, and wherein for each of the data fields in the first register, a first value indicates that the data element at the corresponding offset in the memory still needs to be gathered and a second value indicates that the data element at the corresponding offset in the memory no longer needs to be gathered; reading the values of each of the data fields in the first register; and for each of the data fields in the first register having the first value, gathering a data element at the corresponding offset in the memory and changing the value of the data field in the first register from the first value to the second value.
2. The method of claim 1, wherein for each of the data fields in the first register having the first value, a data element is gathered from the corresponding offset in the memory and is written into a corresponding data field in a second register.
3. The method of claim 2, wherein for each of the data fields in the first register having the first value, the corresponding offset is read from a corresponding data field in a third register.
4. The method of claim 1, wherein for each of the data fields in the first register having the first value, a data element is gather prefetched into a cache memory from the corresponding offset in the memory.
5. The method of claim 1, wherein data elements are gathered from corresponding offsets in the memory for each of the data fields in the first register having the first value, according to an iterative gather step to implement a full vector gather function.
6. The method of claim 5, wherein the first value is one and the second value is zero.
7. A computer-implemented method comprising: assigning values to data fields in a first register, wherein each of the data fields in the first register corresponds to an offset for a corresponding data element to be scattered, or not to be scattered, to a memory, and wherein for each of the data fields in the first register, a first value indicates that the corresponding data element still needs to be scattered to the corresponding offset in the memory and a second value indicates that the corresponding data element no longer needs to be scattered to the corresponding offset in the memory; reading the values of each of the data fields in the first register; and for each of the data fields in the first register having the first value, scattering a data element to the corresponding offset in the memory and changing the value of the data field in the first register from the first value to the second value.
8. The method of claim 7, wherein for each of the data fields in the first register having the first value, a corresponding data element from a corresponding data field in a second register is scattered by writing the corresponding data element from the corresponding data field in the second register to the corresponding offset in the memory.
9. The method of claim 8, wherein for each of the data fields in the first register having the first value, the corresponding offset is read from a corresponding data field in a third register.
10. The method of claim 7, wherein data elements are scattered to corresponding offsets in the memory for each of the data fields in the first register having the first value, according to an iterative scatter step to implement a full vector scatter function.
11. The method of claim 10, wherein the first value is one and the second value is zero.
12. The method of claim 11, wherein for each of the data fields in the first register having the first value, a corresponding data element is first prefetched into the corresponding offset in a cache memory from the corresponding offset in a memory.
13. A system comprising: a memory; and a processor coupled with the memory, the processor having: a first register comprising a plurality of data fields, wherein each of the plurality of data fields in the first register corresponds to an offset for a data element to be gathered, or not to be gathered, from the memory, and wherein for values stored in each of the data fields in the first register, a first stored value indicates that the data element at the corresponding offset in the memory still needs to be gathered and a second stored value indicates that the data element at the corresponding offset in the memory no longer needs to be gathered; and one or more execution units to: read the values stored in each of the data fields in the first register; and for each of the data fields in the first register having the first value, gather a data element at the corresponding offset in the memory, and change the value of the data field in the first register from the first value to the second value.
14. The system of claim 13, wherein the first value is one and the second value is zero.
15. The system of claim 13, wherein data elements are gathered from corresponding offsets in the memory for each of the data fields in the first register having the first value, according to an iterative gather step to implement a full vector gather function.
16. The system of claim 15, wherein for each of the data fields in the first register having the first value, a data element is gathered from the corresponding offset in the memory and is written into a corresponding data field in a second register.
17. The system of claim 16, wherein for each of the data fields in the first register having the first value, the corresponding offset is read from a corresponding data field in a third register.
18. A system comprising: a memory; and a processor coupled with the memory, the processor having: a first register comprising a plurality of data fields, wherein each of the plurality of data fields in the first register corresponds to an offset for a data element to be scattered, or not to be scattered, to the memory, and wherein for values stored in each of the data fields in the first register, a first stored value indicates that the corresponding data element still needs to be scattered to the corresponding offset in the memory and a second stored value indicates that the corresponding data element no longer needs to be scattered to the corresponding offset in the memory; and one or more execution units to: read the values stored in each of the data fields in the first register; and for each of the data fields in the first register having the first value, scatter a data element to the corresponding offset in the memory, and change the value of the data field in the first register from the first value to the second value.
19. The system of claim 18, wherein data elements are scattered to corresponding offsets in the memory for each of the data fields in the first register having the first value, according to an iterative scatter step to implement a full vector scatter function.
20. The system of claim 19, wherein for each of the data fields in the first register having the first value, a corresponding data element from a corresponding data field in a second register is scattered by writing the corresponding data element from the corresponding data field in the second register to the corresponding offset in the memory.
21. The system of claim 20, wherein the first value is one and the second value is zero.
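The iterative scatter recited in claims 7 to 11 and 18 to 21 can be sketched in Python along the following lines. This is a rough illustration only: lists stand in for the first (mask), second (source), and third (offset) registers, and the names `scatter_step`, `scatter_full`, and `max_elems` are hypothetical and do not appear in the claims. The first value is taken to be 1 and the second value 0, as in claims 11 and 21.

```python
def scatter_step(memory, offsets, src, mask, max_elems=1):
    """One partial scatter step (hypothetical helper).

    For up to max_elems lanes whose mask field is 1, write the lane's
    source element to its offset in memory and flip the mask field to 0.
    Returns how many elements were scattered in this step.
    """
    done = 0
    for lane in range(len(mask)):
        if done == max_elems:
            break
        if mask[lane] == 1:
            memory[offsets[lane]] = src[lane]  # scatter the element
            mask[lane] = 0                     # element no longer pending
            done += 1
    return done


def scatter_full(memory, offsets, src, mask, max_elems=1):
    """Repeat the partial step until every mask field is 0."""
    steps = 0
    while 1 in mask:
        scatter_step(memory, offsets, src, mask, max_elems)
        steps += 1
    return steps


target = [0, 0, 0, 0, 0, 0]
scatter_offsets = [4, 1, 5, 2]   # stands in for the third register
scatter_src = [40, 41, 42, 43]   # stands in for the second register
scatter_mask = [1, 1, 0, 1]      # lane 2 is not to be scattered
scatter_steps = scatter_full(target, scatter_offsets, scatter_src, scatter_mask)
print(target, scatter_mask, scatter_steps)
```

With one element per step, three pending lanes take three steps; lane 2 is skipped, so `target[5]` keeps its initial value.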
Patents cited by this patent (43)
Norbert Juffa ; Stephan Meier ; Stuart Oberman ; Scott White, Apparatus and method for executing floating-point store instructions in a microprocessor.
Moyer William C. (Dripping Springs TX) Arends John H. (Austin TX) White Christopher E. (Austin TX) Diefendorff Keith E. (Austin TX), Apparatus and method for optimizing performance of a cache memory in a data processing system.
Matsuo Masahito,JPX ; Shimizu Toru,JPX ; Yoshida Toyohiko,JPX, Data processing system capable of executing groups of instructions, including at least one arithmetic instruction, in parallel.
Scales ; III Hunter Ledbetter ; Diefendorff Keith Everett ; Olsson Brett ; Dubey Pradeep Kumar ; Hochsprung Ronald Ray ; Beavers Bradford Byron ; Burgess Bradley G. ; Snyder Michael Dean ; May Cathy , Data processing system for processing vector data and method therefor.
Prabhu, J. Arjun; Priest, Douglas M., Exception handling for SIMD floating point-instructions using a floating point status register to report exceptions.
Schwarz Eric Mark ; Krygowski Christopher A. ; Slegel Timothy John ; McManigal David Frazelle ; Farrell Mark Steven, IEEE compliant floating point unit.
Talcott, Adam R.; Liebholz, Daniel L.; Patel, Sanjay; Larson, Richard H., Mechanism for delivering precise exceptions in an out-of-order processor with speculative execution.
Auslander Marc A. (Millwood NY) Cocke John (Bedford NY) Hao Hsieh T. (Chappaqua NY) Markstein Peter W. (Yorktown Heights NY) Radin George (Piermont NY), Mechanism for implementing one machine cycle executable trap instructions in a primitive instruction set computing syste.
Beard Douglas R. (Eleva WI) Phelps Andrew E. (Eau Claire WI) Woodmansee Michael A. (Eau Claire WI) Blewett Richard G. (Altoona WI) Lohman Jeffrey A. (Eau Claire WI) Silbey Alexander A. (Eau Claire WI, Method and apparatus for chaining vector instructions.
Fossum Tryggve (Northboro MA) Hetherington Ricky C. (Northboro MA) Fite ; Jr. David B. (Northboro MA) Manley Dwight P. (Holliston MA) McKeen Francis X. (Westboro MA) Murray John E. (Acton MA), Method and apparatus using a cache and main memory for both vector processing and scalar processing by prefetching cache.
Divivier Robert James (San Jose CA) Nemirovsky Mario (San Jose CA), Pipelined processor with two tier prefetch buffer structure and method with bypass.
Sprangle, Eric; Rohillah, Anwar; Cavin, Robert; Forsyth, Tom; Abrash, Michael, Processor and system using a mask register to track progress of gathering and prefetching elements from memory.
Shen Gene W. (Mountain View CA) Szeto John (Oakland CA) Shebanow Michael C. (Plano TX), Processor structure and method for tracking floating-point exceptions.
Tremblay, Marc; Chan, Jeffrey Meng Wah; Sudharsanan, Subramania; Yeluri, Sharada; Pan, Biyu, Sending both a load instruction and retrieved data from a load buffer to an annex prior to forwarding the load data to register file.
Tran Thang M. ; Pickett James K. ; Mahalingaiah Rupaka, Speculative register storage for storing speculative results corresponding to register updated by a plurality of concurr.
Sprangle, Eric; Rohillah, Anwar; Cavin, Robert; Forsyth, Tom; Abrash, Michael, System and method for using a mask register to track progress of gathering elements from memory.
Nishikawa Takeshi (Tokyo JPX) Isobe Yoko (Yamanashi JPX), Vector processing device using address data and mask information to generate signal that indicates which addresses are t.