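IPC Classification Information
Country / Type | United States (US) Patent, Granted
International Patent Classification (IPC, 7th edition) |
Application number | US-0187132 (2011-07-20)
Registration number | US-9280342 (2016-03-08)
Inventor / Address |
Applicant / Address | Oracle International Corporation
Agent / Address |
Citation information | Times cited: 1; Patents cited: 4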
Abstract
A processor, method, and medium for using vector operations to compress selected elements of a vector. An input vector is compared to a criteria vector, and then a subset of the plurality of elements of the input vector are selected based on the comparison. A permutation vector is generated based on the locations of the selected elements and then the permutation vector is used to permute the selected elements of the input vector to an output vector. The selected elements of the input vector are stored in contiguous locations in the leftmost elements of the output vector. Then, the output vector is stored to memory and a pointer to the memory location is incremented by the number of selected elements.
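The flow described in the abstract can be sketched in scalar Python. This is only an illustration, not the patented hardware implementation: the function name `compress_store`, the "greater than" predicate, the zero padding of unused lanes, and the list standing in for memory are all assumptions made for the example.

```python
def compress_store(input_vec, criteria_vec, memory, ptr):
    """Emulate one compress-and-store step; returns the advanced pointer."""
    # Compare each input element with the matching criteria element.
    # Assumption: "meets the criteria" means input > criteria.
    selected = [a > b for a, b in zip(input_vec, criteria_vec)]

    # Permutation vector: source positions of the selected elements,
    # listed in their original order.
    perm = [i for i, keep in enumerate(selected) if keep]
    count = len(perm)

    # Permute the selected elements into the leftmost output positions.
    # Assumption: unused lanes are padded with zeros.
    output = [input_vec[i] for i in perm] + [0] * (len(input_vec) - count)

    # Store the whole output vector, then advance the pointer only by the
    # number of selected elements, so the next store overwrites the padding.
    memory[ptr:ptr + len(output)] = output
    return ptr + count


memory = [0] * 12          # hypothetical destination buffer
ptr = 0
ptr = compress_store([5, 1, 7, 2], [3, 3, 3, 3], memory, ptr)
ptr = compress_store([4, 9, 0, 8], [3, 3, 3, 3], memory, ptr)
print(memory[:ptr])        # selected elements, packed contiguously
```

Because the pointer advances by the selection count rather than the vector width, repeated calls build one contiguous run of selected elements; the buffer needs one vector width of headroom past the last valid element.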
Representative Claims
1. A method comprising: fetching instructions and data from a memory; decoding the fetched instructions; and processing decoded vector instructions used to compress vectors in the fetched data by: performing a comparison of each of a plurality of elements of a source vector to given criteria to generate a result vector, wherein the result vector comprises a separate result of the comparison for each of the plurality of elements; performing a compressed select operation on the result vector to generate a permutation vector, wherein the permutation vector identifies one or more elements of the plurality of elements meeting the given criteria and identifies a permuted position for each of the one or more elements; performing a first permutation operation using said permutation vector on the plurality of elements of the source vector to generate an intermediate vector; and selecting the one or more elements of the plurality of elements in the intermediate vector meeting the given criteria to store in an output vector, wherein the output vector comprises only said one or more elements.

2. The method as recited in claim 1, wherein said comparison, storing, generating, and permutation are vector operations.

3. The method as recited in claim 2, wherein selecting said one or more elements comprises generating a mask vector different from the result vector and the permutation vector, wherein the mask vector comprises an indication for each of the one or more elements of the plurality of elements in the intermediate vector meeting the given criteria, wherein each of said one or more elements stores a same indication.

4. The method as recited in claim 3, further comprising: determining by using said mask a number of how many of the plurality of elements of the source vector meet said criteria; and incrementing a pointer by said number, wherein the pointer points to a storage location located immediately after said one or more elements in the output vector stored in contiguous storage locations.

5. The method as recited in claim 3, further comprising performing a second permutation operation using said permutation vector on the result vector to generate the mask vector.

6. The method as recited in claim 5, wherein determining said number comprises performing a population count vector operation on said mask.

7. The method as recited in claim 2, wherein said one or more elements stored in the output vector overwrites one or more elements of a previous intermediate vector that did not meet previous criteria.

8. A processor comprising: an input/output (I/O) buffer configured to fetch instructions and data from a memory; a control unit configured to decode the fetched instructions; and a floating-point and graphics unit (FGU) comprising: a vector unit for processing decoded vector instructions used to compress vectors in the fetched data; and a vector register file, wherein the vector register file is coupled to the vector unit; wherein to process decoded vector instructions, the vector unit is configured to: perform a comparison of each of a plurality of elements of a source vector to given criteria to generate a result vector, wherein the result vector comprises a separate result of the comparison for each of the plurality of elements; perform a compressed select operation on the result vector to generate a permutation vector, wherein the permutation vector identifies one or more elements of the plurality of elements meeting the given criteria and identifies a permuted position for each of the one or more elements; perform a first permutation operation using said permutation vector on the plurality of elements of the source vector to generate an intermediate vector; and select the one or more elements of the plurality of elements in the intermediate vector meeting the given criteria to store in an output vector, wherein the output vector comprises only said one or more elements.

9. The processor as recited in claim 8, wherein said comparison, storing, generating, and permutation are vector operations.

10. The processor as recited in claim 9, wherein to select said one or more elements, the vector unit is further configured to generate a mask vector different from the result vector and the permutation vector, wherein the mask vector comprises an indication for each of the one or more elements of the plurality of elements in the intermediate vector meeting the given criteria, wherein each of said one or more elements stores a same indication.

11. The processor as recited in claim 10, wherein the vector unit is further configured to: determine by using said mask a number of how many of the plurality of elements of the source vector meet said criteria; and increment a pointer by said number, wherein the pointer points to a storage location located immediately after said one or more elements in the output vector stored in contiguous storage locations.

12. The processor as recited in claim 10, wherein the vector unit is further configured to perform a second permutation operation using said permutation vector on the result vector to generate the mask vector.

13. The processor as recited in claim 12, wherein determining said number comprises performing a population count vector operation on said mask.

14. The processor as recited in claim 9, wherein said one or more elements stored in the output vector overwrites one or more elements of a previous intermediate vector that did not meet previous criteria.

15. A non-transitory computer readable storage medium comprising program instructions, wherein when executed the program instructions are operable to: fetch instructions and data from a memory; decode the fetched instructions; and wherein to process decoded vector instructions used to compress vectors in the fetched data: perform a comparison of each of a plurality of elements of a source vector to given criteria to generate a result vector, wherein the result vector comprises a separate result of the comparison for each of the plurality of elements; perform a compressed select operation on the result vector to generate a permutation vector, wherein the permutation vector identifies one or more elements of the plurality of elements meeting the given criteria and identifies a permuted position for each of the one or more elements; perform a first permutation operation using said permutation vector on the plurality of elements of the source vector to generate an intermediate vector; and select the one or more elements of the plurality of elements in the intermediate vector meeting the given criteria to store in an output vector, wherein the output vector comprises only said one or more elements.

16. The non-transitory computer readable storage medium as recited in claim 15, wherein said comparison, storing, generating, and permutation are vector operations.

17. The non-transitory computer readable storage medium as recited in claim 16, wherein to select said one or more elements, the program instructions are further operable to generate a mask vector different from the result vector and the permutation vector, wherein the mask vector comprises an indication for each of the one or more elements of the plurality of elements in the intermediate vector meeting the given criteria, wherein each of said one or more elements stores a same indication.

18. The non-transitory computer readable storage medium as recited in claim 17, wherein the program instructions are further operable to: determine by using said mask a number of how many of the plurality of elements of the source vector meet said criteria; and increment a pointer by said number, wherein the pointer points to a storage location located immediately after said one or more elements in the output vector stored in contiguous storage locations.

19. The non-transitory computer readable storage medium as recited in claim 18, wherein the program instructions are further operable to perform a second permutation operation using said permutation vector on the result vector to generate the mask vector.

20. The non-transitory computer readable storage medium as recited in claim 19, wherein determining said number comprises performing a population count vector operation on said mask.
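
To make the relationship between the result vector, permutation vector, intermediate vector, and mask vector in claims 1 through 6 easier to follow, here is a minimal scalar emulation in Python. It is a sketch under stated assumptions rather than the claimed vector instructions: all function names are hypothetical, the criteria is assumed to be "greater than a threshold", and the permutation vector is assumed to list the non-selected indices after the selected ones, a layout the claims do not specify.

```python
def compare(source, threshold):
    # Result vector: one separate comparison result (1 or 0) per element.
    # Assumption: the criteria is "element > threshold".
    return [1 if x > threshold else 0 for x in source]


def compressed_select(result):
    # Permutation vector: indices of elements meeting the criteria first,
    # which fixes each one's permuted position. Assumption: remaining
    # indices follow in order so the vector is a full permutation.
    hits = [i for i, r in enumerate(result) if r]
    misses = [i for i, r in enumerate(result) if not r]
    return hits + misses


def permute(vec, perm):
    # Output lane k takes the value found at source lane perm[k].
    return [vec[i] for i in perm]


def popcount(mask):
    # Population count: number of set indications in the mask.
    return sum(mask)


def compress(source, threshold):
    result = compare(source, threshold)
    perm = compressed_select(result)
    intermediate = permute(source, perm)   # first permutation (claim 1)
    mask = permute(result, perm)           # second permutation (claim 5)
    count = popcount(mask)                 # population count (claim 6)
    # Keep only the lanes the mask marks as meeting the criteria.
    output = [x for x, m in zip(intermediate, mask) if m]
    return {"result": result, "perm": perm, "intermediate": intermediate,
            "mask": mask, "count": count, "output": output}


steps = compress([3, 8, 1, 9, 4, 7, 0, 6], threshold=5)
for name, value in steps.items():
    print(f"{name:>12}: {value}")
```

Applying the same permutation to the result vector packs its set entries to the left, which is why the mask lines up with the selected elements in the intermediate vector and why its population count gives the amount by which a destination pointer would be incremented (claim 4).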