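IPC classification information
Country / Type | United States (US) Patent, Granted
International Patent Classification (IPC 7th edition) |
Application number | UP-0771931 (2007-06-29)
Registration number | US-7793073 (2010-09-27)
Inventor / Address |
Applicant / Address |
Agent / Address | Schwegman, Lundberg & Woessner, P.A.
Citation information | Times cited: 1; Cited patents: 118
Abstract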
A method and apparatus to correctly compute a vector-gather, vector-operate (e.g., vector add), and vector-scatter sequence, particularly when elements of the vector may be redundantly presented, as with indirectly addressed vector operations. For an add operation, one vector register is loaded with the “add-in” values, and another vector register is loaded with address values of “add to” elements to be gathered from memory into a third vector register. If the vector of address values has a plurality of elements that point to the same memory address, the algorithm should add all the “add in” values from elements corresponding to the elements having the duplicated addresses. An indirectly addressed load performs the “gather” operation to load the “add to” values. A vector add operation then adds corresponding elements from the “add in” vector to the “add to” vector. An indirectly addressed store then performs the “scatter” operation to store the results.
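The hazard the abstract describes can be illustrated with a minimal Python sketch. It models vector registers as plain lists; the function names and sample data are hypothetical and do not come from the patent. A plain gather, add, scatter sequence loses an update whenever two elements of the address vector point to the same location, whereas a scalar loop accumulates every "add-in" value.

```python
def naive_gather_add_scatter(memory, addresses, add_in):
    """Model a vector gather, elementwise add, and scatter with no duplicate handling."""
    gathered = [memory[a] for a in addresses]            # indirectly addressed load ("gather")
    summed = [g + x for g, x in zip(gathered, add_in)]   # vector add
    for a, s in zip(addresses, summed):                  # indirectly addressed store ("scatter")
        memory[a] = s                                    # a later element overwrites an earlier one
    return memory


def scalar_reference(memory, addresses, add_in):
    """Scalar loop: every add-in value is accumulated, duplicates included."""
    for a, x in zip(addresses, add_in):
        memory[a] += x
    return memory


mem = [10, 20, 30, 40]
addrs = [1, 3, 1]   # elements 0 and 2 point to the same location
vals = [5, 7, 9]

naive = naive_gather_add_scatter(list(mem), addrs, vals)
reference = scalar_reference(list(mem), addrs, vals)
print(naive)      # location 1 receives only one of its two add-in values
print(reference)  # location 1 receives both
```

In this model the scatter is applied in element order, so the last duplicate wins; real hardware may order the stores differently, but in either case only one of the colliding sums survives.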
Representative claims
What is claimed is:
1. A method of identifying duplicate values in a vector register, the method comprising: loading addressing values into elements of a first vector register, wherein each of the addressing values is added to a first base address of a first memory area to calculate a corresponding location within the first memory area; generating each respective address value for a sequence of addressed locations within a constrained memory area, wherein the constrained memory area includes 2^N consecutive addresses, wherein the addressed locations within the constrained memory area are addressed using an N-bit value derived from each respective addressing value of the first vector register, and wherein the constrained memory area is separate from and does not overlap the first memory area; storing, into a second vector register, identifying data values that can be used to identify elements in the second vector register; storing the identifying data values in the second vector register to the constrained memory area using the generated sequence of respective address values; reading data values from the constrained memory area using the generated sequence of respective address values; and comparing the identifying data values in the second vector register to the data values read from the constrained memory area to identify duplicate values.
2. A computerized method comprising, in each of a plurality of processors including a first processor and a second processor: loading a first vector register with addressing values and a second vector register with operand values, wherein each of the addressing values is added to a first base address of a first memory area to calculate a corresponding location within the first memory area; identifying element addresses of the first vector register having a value that duplicates a value in another element address by: generating each respective address value for a sequence of addressed locations within a constrained memory area, wherein the constrained memory area includes 2^N consecutive addresses, wherein the addressed locations within the constrained memory area are addressed using an N-bit value derived from each respective addressing value of the first vector register, and wherein the constrained memory area is separate from and does not overlap the first memory area; loading, into a third vector register, identifying data values that can be used to identify elements in the third vector register; storing the identifying data values in the third vector register to the constrained memory area using the generated sequence of respective address values; reading data values from the constrained memory area using the generated sequence of respective address values; and comparing the identifying data values in the third vector register to the data values read from the constrained memory area to identify duplicate values; and selectively adding certain elements of the second vector register based on the element addresses in the first vector register having the duplicated values.
3. The method of claim 2, wherein loading a first vector register with addressing values and a second vector register operand values, identifying element addresses of the first vector register, and selectively adding certain elements of the second vector register are performed in parallel in the plurality of processors.
4. A system comprising: a first vector register having addressing values; a second vector register having operand values; circuitry programmed to determine which, if any, element addresses of the first vector register have a value that duplicates a value in another element address, the circuitry is configured to: generate each respective address value for a sequence of addressed locations within a constrained memory area, wherein the constrained memory area includes 2^N consecutive addresses, wherein the addressed locations within the constrained memory area are addressed using an N-bit value derived from each respective addressing value of the first vector register, and wherein the constrained memory area is separate from and does not overlap the first memory area; load, into a third vector register, identifying data values that can be used to identify elements in the third vector register; store the identifying data values in the third vector register to the constrained memory area using the generated sequence of respective address values; read data values from the constrained memory area using the generated sequence of respective address values; and compare the identifying data values in the third vector register to the data values read from the constrained memory area to identify duplicate values; and circuitry programmed to selectively add certain elements of the second vector register based on the element addresses in the first vector register having the duplicated values.
5. The system of claim 4, further comprising: a plurality of processors, each processor including an instance of each of: the first, second and third vector registers; the circuitry programmed to determine which, if any, element addresses of the first vector register have a value that duplicates a value in another element address; the circuitry programmed to selectively add certain elements of the second vector register based on the element addresses in the first vector register having the duplicated values.
6. A system comprising: a plurality of processors, one or more of which includes: means for loading addressing values into elements of a first vector register, wherein each of the addressing values is added to a first base address of a first memory area to calculate a corresponding location within the first memory area; and means for determining which, if any, element addresses of the first vector register have a value that duplicates a value in another element address of the first vector register, wherein the means for determining is configured to: generate each respective address value for a sequence of addressed locations within a constrained memory area, wherein the constrained memory area includes 2^N consecutive addresses, wherein the addressed locations within the constrained memory area are addressed using an N-bit value derived from each respective addressing value of the first vector register, and wherein the constrained memory area is separate from and does not overlap the first memory area; load, into a second vector register, identifying data values that can be used to identify elements in the second vector register; store the identifying data values in the second vector register to the constrained memory area using the generated sequence of respective address values; read data values from the constrained memory area using the generated sequence of respective address values; and compare the identifying data values in the second vector register to the data values read from the constrained memory area to identify duplicate values.
7. A non-transitory computer-readable medium having instructions stored thereon for causing a suitably programmed information-processing system to execute a method comprising: determining which, if any, element addresses of a first vector register have a value that duplicates a value in another element address; selectively adding certain elements of a second vector of operand values based on the element addresses of the duplicated values in the first vector register; loading, using addressing values from the first vector register, elements from memory into a third vector register; adding values from the third vector register and the second vector register to generate a result vector; and storing the result vector to memory using the addressing values from the first vector register; wherein the determining of duplicates includes: loading addressing values into elements of the first vector register, wherein each of the addressing values is added to a first base address of a first memory area to calculate a corresponding location within the first memory area; generating each respective address value for a sequence of addressed locations within a constrained memory area, wherein the constrained memory area includes 2^N consecutive addresses, wherein the addressed locations within the constrained memory area are addressed using an N-bit value derived from each respective addressing value of the first vector register, and wherein the constrained memory area is separate from and does not overlap the first memory area; loading, into a fourth vector register, identifying data values that can be used to identify elements in the fourth vector register; storing the identifying data values in the fourth vector register to the constrained memory area using the generated sequence of respective address values; reading data values from the constrained memory area using the generated sequence of respective address values; and comparing the identifying data values in the fourth vector register to the data values read from the constrained memory area to identify duplicate values.
8. A method of performing mathematical operations on a vector register, comprising: loading a first vector register with addressing values; loading a second vector register with operand values; determining which, if any, element addresses of the first vector register have a value that duplicates a value in another element address, wherein determining includes: loading addressing values into elements of the first vector register, wherein each of the addressing values is added to a first base address of a first memory area to calculate a corresponding location within the first memory area; generating each respective address value for a sequence of addressed locations within a constrained memory area, wherein the constrained memory area includes 2^N consecutive addresses, wherein the addressed locations within the constrained memory area are addressed using an N-bit value derived from each respective addressing value of the first vector register, wherein the constrained memory area is separate from and does not overlap the first memory area; loading, into a third vector register, identifying data values that can be used to identify elements in the third vector register; storing the identifying data values in the third vector register to the constrained memory area using the generated sequence of respective address values; reading data values from the constrained memory area using the generated sequence of respective address values; and comparing the identifying data values in the third vector register to the data values read from the constrained memory area to identify duplicate values; selectively performing a mathematical operation on certain elements of the second vector of operand values based on the element addresses of the duplicated values in the first vector register; loading, using addressing values from the first vector register, elements from memory into a fourth vector register; performing mathematical operations on elements from the fourth vector register and elements from the second vector register to generate a result vector; and storing the result vector to memory using the addressing values from the first vector register.
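The store, read-back, and compare steps recited in the claims can be sketched in Python as follows. This is an illustration under stated assumptions, not the patented implementation: the claims do not say what the identifying values are or how the N-bit value is derived, so the sketch assumes element indices and the low N bits of each addressing value. The names `find_conflicts` and `scatter_add` are hypothetical. For handling the flagged elements, the sketch swaps in a simple multi-pass deferral rather than the "selectively adding" step of the claims, because deferral also stays correct when two different addresses happen to share the same low N bits.

```python
def find_conflicts(addresses, n_bits):
    """Return one flag per element: True if its identifying value survived the read-back."""
    mask = (1 << n_bits) - 1
    scratch = [None] * (1 << n_bits)           # constrained area with 2^N consecutive slots
    slots = [a & mask for a in addresses]      # assumption: N-bit value = low N bits
    ids = list(range(len(addresses)))          # assumption: identifying values = element indices
    for slot, ident in zip(slots, ids):        # store identifying values via the slot addresses
        scratch[slot] = ident                  # exactly one writer per slot survives
    read_back = [scratch[s] for s in slots]    # read them back via the same addresses
    return [r == i for r, i in zip(read_back, ids)]


def scatter_add(memory, addresses, add_in, n_bits=4):
    """Accumulate add_in into memory at addresses, deferring flagged elements to later passes."""
    pending = list(range(len(addresses)))
    while pending:
        survived = find_conflicts([addresses[i] for i in pending], n_bits)
        winners = [i for i, ok in zip(pending, survived) if ok]
        # Winners occupy distinct slots, hence distinct addresses, so a plain
        # gather, add, scatter over them cannot lose an update.
        gathered = [memory[addresses[i]] for i in winners]
        summed = [g + add_in[i] for g, i in zip(gathered, winners)]
        for i, s in zip(winners, summed):
            memory[addresses[i]] = s
        pending = [i for i, ok in zip(pending, survived) if not ok]
    return memory


flags = find_conflicts([1, 3, 1], n_bits=4)          # elements 0 and 2 collide
alias_flags = find_conflicts([1, 5], n_bits=2)       # distinct addresses, same low 2 bits
result = scatter_add([10, 20, 30, 40], [1, 3, 1], [5, 7, 9])
print(flags, alias_flags, result)
```

Each pass has at least one surviving element per occupied slot, so the loop always terminates. The second call shows that the compare can also flag elements whose addresses differ but map to the same slot, which is why the sketch treats a failed compare as "possibly duplicated" rather than "certainly duplicated".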