IPC분류정보
국가/구분 |
United States(US) Patent
등록
|
국제특허분류(IPC7판) |
|
출원번호 |
US-0703146
(2010-02-09)
|
등록번호 |
US-8539016
(2013-09-17)
|
발명자
/ 주소 |
|
출원인 / 주소 |
|
대리인 / 주소 |
|
인용정보 |
피인용 횟수 :
6 인용 특허 :
290 |
초록
▼
Circuitry speeds up the QR decomposition of a matrix. The circuitry can be provided in a fixed logic device, or can be configured into a programmable integrated circuit device such as a programmable logic device. This implementation performs Gram-Schmidt orthogonalization with no dependencies betwee
Circuitry speeds up the QR decomposition of a matrix. The circuitry can be provided in a fixed logic device, or can be configured into a programmable integrated circuit device such as a programmable logic device. This implementation performs Gram-Schmidt orthogonalization with no dependencies between iterations. QR decomposition of a matrix can be performed by processing entire columns at once as a vector operation. Data dependencies within and between matrix columns are removed, as later functions dependent on an earlier result may be generated from partial results somewhere in the datapath, rather than from an earlier completed result. Different passes through the matrix are timed so that different computations requiring the same functional units arrive at different time slots. After the Q matrix has been calculated, the R matrix may be calculated from the Q matrix by taking its transpose and multiplying the transpose by the original input matrix.
대표청구항
▼
1. Circuitry for performing QR decomposition of an input matrix, said circuitry comprising: a first datapath for performing multiplication and addition operations on columns of said input matrix, said first datapath comprising a plurality of multipliers, a corresponding plurality of adders each havi
1. Circuitry for performing QR decomposition of an input matrix, said circuitry comprising: a first datapath for performing multiplication and addition operations on columns of said input matrix, said first datapath comprising a plurality of multipliers, a corresponding plurality of adders each having an input connected to an output of one of said multipliers, and a summer having inputs connected to said outputs of said multipliers; anda second datapath for performing an inverse square root operation and multiplication operations on output of said summer of said first datapath and on said inverse square root operation; wherein:on a first pass, said first datapath computes respective inner products of one column of said input matrix with each column of said input matrix, and said second datapath computes an inverse norm of said first column, and multiplies a square of said inverse norm by respective inner products of said one column with each other column of said input matrix to form respective norm combinations; andon a second pass, said adders of said first datapath compute a respective difference between each said other column and a product of said one column and a respective one of said norm combinations. 2. The circuitry of claim 1 further comprising: a first input memory for said input matrix; wherein:output of said adders on said second pass are written into said first input memory to modify said input matrix in said first input memory as an orthogonal matrix in QR decomposition of said input matrix. 3. The circuitry of claim 2 further comprising: a second input memory for said input matrix; wherein:after said second pass, said first datapath multiplies said input matrix in said second input memory by a transpose of said orthogonal matrix in said first input memory. 4. The circuitry of claim 3 wherein: each of said first and second input memories comprises a plurality of row memories; andcolumns of a respective matrix in one of said first and second input memories are read by reading corresponding elements from each row memory in said respective one of said first and second input memories. 5. The circuitry of claim 4 wherein rows of said transpose of said orthogonal matrix are read by reading columns of said orthogonal matrix. 6. The circuitry of claim 1 further comprising multiplexer circuitry for selecting inputs for said first datapath on said first and second passes. 7. The circuitry of claim 1 further comprising storage for latching an inner product of said one column with itself at an input to said inverse square root operation. 8. The circuitry of claim 1 further comprising storage for latching said first column at an input to said first datapath. 9. A method of configuring a programmable integrated circuit device as circuitry for performing QR decomposition of an input matrix, said method comprising: configuring logic of said programmable integrated circuit device as a first datapath for performing multiplication and addition operations on columns of said input matrix, said first datapath comprising a plurality of multipliers, a corresponding plurality of adders each having an input connected to an output of one of said multipliers, and a summer having inputs connected to said outputs of said multipliers; andconfiguring logic of said programmable integrated circuit device as a second datapath for performing inverse square root operation and multiplication operation on output of said summer of said first datapath; wherein: on a first pass, said first datapath computes respective inner products of one column of said input matrix with each column of said input matrix, and said second datapath computes an inverse norm of said first column, and multiplies a square of said inverse norm of said first column by respective inner products of said one column with each other column of said input matrix to form respective norm combinations; andon a second pass, said adders of said first datapath compute a respective difference between each said other column and a product of said one column and a respective one of said norm combinations. 10. The method of claim 9 further comprising: configuring a first input memory for said input matrix; andconfiguring output of said adders for writing on said second pass into said first input memory to modify said input matrix in said first input memory as an orthogonal matrix in QR decomposition of said input matrix. 11. The method of claim 10 further comprising: configuring a second input memory for said input matrix; andconfiguring said first datapath to multiply, after said second pass, said input matrix in said second input memory by a transpose of said orthogonal matrix in said first input memory. 12. The method of claim 11 comprising: configuring each of said first and second input memories as a plurality of row memories; wherein:columns of a respective matrix in one of said first and second input memories are read by reading corresponding elements from each row memory in said respective one of said first and second input memories. 13. The method of claim 9 further comprising configuring multiplexer circuitry for selecting inputs for said first datapath on said first and second passes. 14. The method of claim 9 further comprising configuring storage for latching an inner product of said one column with itself at an input to said inverse square root operation. 15. The method of claim 9 further comprising configuring storage for latching said first column at an input to said first datapath. 16. A programmable integrated circuit device comprising: logic configurable as a first datapath for performing multiplication and addition operations on columns of said input matrix, said first datapath comprising a plurality of multipliers, a corresponding plurality of adders each having an input connected to an output of one of said multipliers, and a summer having inputs connected to said outputs of said multipliers; andlogic configurable as a second datapath for performing inverse square root operation and multiplication operation on output of said summer of said first datapath; wherein, when said logic configurable as a first datapath and said logic configurable as a second datapath are configured, respectively, as said first and second datapath:on a first pass, said first datapath computes respective inner products of one column of said input matrix with each column of said input matrix, and said second datapath computes an inverse norm of said first column, and multiplies a square of said inverse norm of said first column by respective inner products of said one column with each other column of said input matrix to form respective norm combinations; andon a second pass, said adders of said first datapath compute a respective difference between each said other column and a product of said one column and a respective one of said norm combinations. 17. The programmable integrated circuit device of claim 16 further comprising: memory configurable as a first input memory for said input matrix; andlogic configurable to write output of said adders on said second pass into said first input memory to modify said input matrix in said first input memory as an orthogonal matrix in QR decomposition of said input matrix. 18. The programmable integrated circuit device of claim 17 further comprising: memory configurable as a second input memory for said input matrix; wherein:said first datapath is configurable to multiply, after said second pass, said input matrix in said second input memory by a transpose of said orthogonal matrix in said first input memory. 19. The programmable integrated circuit device of claim 18 comprising: a plurality of row memories configurable as rows of each of said first and second input memories; wherein:columns of a respective matrix in one of said first and second input memories are read by reading corresponding elements from each row memory in said respective one of said first and second input memories. 20. The programmable integrated circuit device of claim 16 further comprising multiplexer circuitry configurable for selecting inputs for said first datapath on said first and second passes. 21. The programmable integrated circuit device of claim 16 further comprising storage configurable for latching an inner product of said one column with itself at an input to said inverse square root operation. 22. The programmable integrated circuit device of claim 16 further comprising storage configurable for latching said first column at an input to said first datapath. 23. A machine-readable data storage medium encoded with machine-executable instructions for configuring a programmable integrated circuit device as circuitry for performing QR decomposition of an input matrix, said instructions comprising: instructions to configure logic of said programmable integrated circuit device as a first datapath for performing multiplication and addition operations on columns of said input matrix, said first datapath comprising a plurality of multipliers, a corresponding plurality of adders each having an input connected to an output of one of said multipliers, and a summer having inputs connected to said outputs of said multipliers; andinstructions to configure logic of said programmable integrated circuit device as a second datapath for performing inverse square root operation and multiplication operation on output of said summer of said first datapath; wherein:on a first pass, said first datapath computes respective inner products of one column of said input matrix with each column of said input matrix, and said second datapath computes an inverse norm of said first column and multiplies said inverse norm of said first column by respective inner products of said one column with each other column of said input matrix to form respective norm combinations; andon a second pass, said adders of said first datapath compute a respective difference between each said other column and a product of said one column and a respective one of said norm combinations. 24. The machine-readable data storage medium of claim 23 further comprising: instructions to configure a first input memory for said input matrix; andinstructions to configure output of said adders for writing on said second pass into said first input memory to modify said input matrix in said first input memory as an orthogonal matrix in QR decomposition of said input matrix. 25. The machine-readable data storage medium of claim 24 further comprising: instructions to configure a second input memory for said input matrix; andinstructions to configure said first datapath to multiply, after said second pass, said input matrix in said second input memory by a transpose of said orthogonal matrix in said first input memory. 26. The machine-readable data storage medium of claim 25 comprising: instructions to configure each of said first and second input memories as a plurality of row memories; wherein:columns of a respective matrix in one of said first and second input memories are read by reading corresponding elements from each row memory in said respective one of said first and second input memories. 27. The machine-readable data storage medium of claim 23 further comprising instructions to configure multiplexer circuitry for selecting inputs for said first datapath on said first and second passes. 28. The machine-readable data storage medium of claim 23 further comprising instructions to configure storage for latching an inner product of said one column with itself at an input to said inverse square root operation. 29. The machine-readable data storage medium of claim 23 further comprising instructions to configure storage for latching an inverse norm of said one column at an input to said first datapath.
※ AI-Helper는 부적절한 답변을 할 수 있습니다.