Multiprocessor computer architecture incorporating a plurality of memory algorithm processors in the memory subsystem
IPC Classification Information
Country / Type
United States (US) Patent
Granted
International Patent Classification (IPC, 7th edition)
G06F-009/312
Application number
US-0339133
(2003-01-08)
Inventor / Address
Huppenthal, Jon M.
Leskar, Paul A.
Applicant / Address
SRC Computers, Inc.
Agent / Address
Hogan &
Citation information
Times cited: 13
Cited patents: 14
Abstract
A multiprocessor computer architecture incorporating a plurality of programmable hardware memory algorithm processors (“MAP”) in the memory subsystem. The MAP may comprise one or more field programmable gate arrays (“FPGAs”) which function to perform identified algorithms in conjunction with, and tightly coupled to, a microprocessor and each MAP is globally accessible by all of the system processors for the purpose of executing user definable algorithms. A circuit within the MAP signals when the last operand has completed its flow thereby allowing a given process to be interrupted and thereafter restarted. Through the use of read only memory (“ROM”) located adjacent the FPGA, a user program may use a single command to select one of several possible pre-loaded algorithms thereby decreasing system reconfiguration time. A computer system memory structure MAP disclosed herein may function in normal or direct memory access (“DMA”) modes of operation and, in the latter mode, one device may feed results directly to another thereby allowing pipelining or parallelizing execution of a user defined algorithm. The system of the present invention also provides a user programmable performance monitoring capability and utilizes parallelizer software to automatically detect parallel regions of user applications containing algorithms that can be executed in the programmable hardware.
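The chaining behavior described in the abstract, where one device feeds results directly to another, can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the class names `MemorySubsystem` and `Map`, the addresses, and the two example functions are all hypothetical, and a Python callback stands in for an FPGA reacting to a write at its coupled address.

```python
from typing import Callable, Dict, List


class MemorySubsystem:
    """Toy addressable memory array with write-triggered devices (hypothetical model)."""

    def __init__(self) -> None:
        self.cells: Dict[int, int] = {}
        self.watchers: Dict[int, List[Callable[[], None]]] = {}

    def attach(self, address: int, on_write: Callable[[], None]) -> None:
        # Couple a device to an address so that a write there triggers it.
        self.watchers.setdefault(address, []).append(on_write)

    def write(self, address: int, value: int) -> None:
        self.cells[address] = value
        for on_write in self.watchers.get(address, []):
            on_write()

    def read(self, address: int) -> int:
        return self.cells[address]


class Map:
    """Stand-in for one reconfigurable processor; `function` plays the configured algorithm."""

    def __init__(self, memory: MemorySubsystem, in_addr: int, out_addr: int,
                 function: Callable[[int], int]) -> None:
        self.memory = memory
        self.in_addr = in_addr
        self.out_addr = out_addr
        self.function = function
        memory.attach(in_addr, self._run)

    def _run(self) -> None:
        # Retrieve the operand, apply the configured function, write the result onward.
        operand = self.memory.read(self.in_addr)
        self.memory.write(self.out_addr, self.function(operand))


# Assumed addresses and functions, chosen only for illustration.
ADDR_A, ADDR_B, ADDR_C = 0x100, 0x200, 0x300
memory = MemorySubsystem()
first = Map(memory, ADDR_A, ADDR_B, lambda x: x * 3)   # first configured function
second = Map(memory, ADDR_B, ADDR_C, lambda x: x + 1)  # second configured function

memory.write(ADDR_A, 7)        # the processor writes the first data value
result = memory.read(ADDR_C)   # third data value produced by the second device
```

The processor performs a single write; the second device runs because the first one wrote to the address it is coupled to, with no further processor involvement.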
Representative claim
1. A system for processing data using a plurality of reconfigurable processors, the system comprising: a memory subsystem coupled to a data processor and including an addressable memory array; a first reconfigurable processor within the memory subsystem and coupled to a first address in the addressable memory array, wherein responsive to a first data value being written at the first address, the first reconfigurable processor performs a first configured function, generates a second data value, and writes the second data value to a second address in the addressable memory array; a second reconfigurable processor within the memory subsystem and coupled to the second address in the addressable memory array, wherein, responsive to the second data value being written at the second address, the second reconfigurable processor retrieves the second data and performs a second configured function; a control logic block in the memory subsystem in the communication path between the data processor and the addressable memory array for accessing data at specified addresses within the addressable memory array; a data bus and an address bus connecting the control logic block and the addressable memory array; a communication path between the first reconfigurable processor and the address bus; and a control block in the communication path between the first reconfigurable processor and the address bus, wherein the control block comprises a command decoder for decoding commands from the data processor, a pipeline counter for counting clock cycles, an equality comparator for determining whether the output of the pipeline counter corresponds to a predetermined number of clock cycles, and status registers for receiving an output from the equality comparator.
2. The system of claim 1 wherein the second reconfigurable processor generates a third data value.
3. The system of claim 1, further comprising a communication path between the first reconfigurable processor and the data bus.
4. The system of claim 1 wherein the data processor transmits commands over the address bus.
5. The system of claim 1 wherein the data processor periodically checks the status register.
6. A method of data processing using reconfigurable processors, the method comprising: configuring a first reconfigurable processor within a memory subsystem to perform a first function; configuring a second reconfigurable processor within a memory subsystem to perform a second function; writing a first data value to a first memory address location in the memory subsystem; reading the first data value into a first reconfigurable processor within the memory subsystem; performing the first function in the first reconfigurable processor using the first data value to generate a second data value; writing the second data value to a second memory address within the memory subsystem; reading the second data value into a second reconfigurable processor within the memory subsystem; performing the second function in the second reconfigurable processor using the second data value to generate a third data value; receiving a command to terminate the data processing; counting the number of clock cycles that have elapsed since the command was received; and generating a signal when a predetermined number of clock cycles has passed.
7. The method of claim 6 wherein the third data value is written to a third memory location in the memory subsystem.
8. The method of claim 6 wherein performing the first function includes multiplying.
9. The method of claim 7 wherein configuring the first reconfigurable processor includes a fixed instruction set processor selecting configuration bits corresponding to the first function.
10. The method of claim 9 wherein the fixed instruction set processor performs a math function.
11. The method of claim 10 wherein the math function is a 64-bit floating point math function.
12. The method of claim 9 further comprising: signaling the fixed instruction set processor when the third data value is available.
13. The method of claim 12 wherein the signaling includes writing a status value to a status register.
14. The method of claim 6 wherein writing the second data value includes operatively passing the second data value from the first reconfigurable function unit to the second reconfigurable function unit.
15. A computer system comprising: at least one processor; at least one circuit of direct execution logic; a common memory space accessible by said at least one processor and said at least one circuit of direct execution; and a unified executable program comprising a first portion thereof executable by said at least one processor and a second portion thereof executable by said at least one circuit of direct execution logic; wherein said at least one circuit of direct execution logic is programmed to perform at least one identified algorithm on an operand received from said common memory space.
16. The computer system of claim 15 wherein said at least one processor comprises a microprocessor.
17. The computer system of claim 15 wherein said at least one circuit of direct execution logic comprises at least one field programmable gate array.
18. The computer system of claim 15 wherein said at least one circuit of direct execution logic is operative to access said common memory space independently of said at least one processor.
19. The computer system of claim 15 wherein said at least one identified algorithm is programmed into a memory device associated with said circuit of direct execution logic.
20. The computer system of claim 19 wherein said memory device comprises at least one read only memory device.
21. The computer system of claim 15 wherein said first portion of said unified executable program executable by said at least one processor is resident in said common memory space.
22. The computer system of claim 15 wherein said second portion of said unified executable program is resident in said at least one circuit of direct execution logic.
23. The computer system of claim 15 wherein said second portion of said unified executable program is resident in said at least one field programmable gate array.
24. The computer system of claim 15 wherein said at least one processor comprises a fixed instruction set processor.
25. A method for operating a computer system comprising: providing at least one processor; providing at least one circuit of direct execution logic; enabling access by said at least one processor and said at least one circuit of direct execution logic to a common memory space; executing a unified executable program on said computer system such that a first portion of said unified executable program is executable by said at least one processor and a second portion of said unified executable program is executable by said at least one circuit of direct execution logic; wherein said common memory space is accessible by said at least one circuit of direct execution logic independently of said at least one processor.
26. The method of claim 25 wherein said step of providing at least one processor is carried out by a microprocessor.
27. The method of claim 25 wherein said step of providing at least one processor is carried out by a fixed instruction set processor.
28. The method of claim 25 wherein said step of providing at least one circuit of direct execution logic is carried out by at least one field programmable gate array.
29. The method of claim 25 further comprising: programming said at least one circuit of direct execution logic to perform at least one identified algorithm received from said common memory space.
30. The method of claim 29 further comprising: storing said at least one identified algorithm in a memory device associated with said circuit of direct execution logic.
31. The method of claim 30 wherein said step of storing said at least one identified algorithm is carried out by a read only memory device.
32. The method of claim 25 further comprising: storing said first portion of said unified executable program in said common memory space.
33. The method of claim 25 further comprising: storing said second portion of said unified executable program in said at least one circuit of direct execution logic.
34. The method of claim 28 further comprising: storing said second portion of said unified executable program in said at least one field programmable gate array.
35. A system for processing data using a plurality of circuits of direct execution logic, said system comprising: at least one processor; a common memory space coupled to said at least one processor and said plurality of circuits of direct execution logic; a first one of said plurality of circuits of direct execution logic coupled to a first address in said common memory space and responsive to a first data value being written to said first address, said first one of said plurality of circuits of direct execution logic performing a first configured function in accordance with a unified executable program, generating a second data value and writing said second data value to a second address in said common memory space; a second one of said plurality of circuits of direct execution logic coupled to said second address in said common memory space and responsive to said second data value being written to said second address, said second one of said plurality of circuits of direct execution logic retrieving said second data value and performing a second configured function in accordance with said unified executable program; a first control logic block in a first communication path between said at least one processor and said common memory space for accessing data at specified addresses within said common memory space; a data bus and an address bus coupling said control logic block and said common memory space; a third communication path between said first one of said plurality of circuits of direct execution logic and said address bus; a second control logic block in said third communication path between said first one of said plurality of circuits of direct execution logic and said address bus; where said second control logic block comprises a command decoder for decoding commands from said at least one processor, a pipeline counter for counting clock cycles, an equality comparator for determining whether an output of said pipeline counter corresponds to a predetermined number of said clock cycles and status registers for receiving an output from said equality comparator.
36. The system of claim 35 wherein said second one of said plurality of circuits of direct execution logic generates a third data value.
37. The system of claim 35 further comprising a second communication path between said first one of said plurality of circuits of direct execution logic and said data bus.
38. The system of claim 35 wherein said at least one processor transmits commands on said address bus.
39. The system of claim 35 wherein said at least one processor periodically accesses said status register.
40. The system of claim 35 wherein said first and second ones of said plurality of circuits of direct execution logic comprise field programmable gate arrays.
41. The system of claim 35 wherein said first and second ones of said plurality of circuits of direct execution logic are operative to access said common memory space independently of said at least one processor.
42. The system of claim 35 wherein said first one of said plurality of circuits of direct execution logic is programmed to perform at least one identified algorithm on an operand received from said common memory space.
43. The system of claim 42 wherein said at least one identified algorithm is programmed into a memory device associated with said first one of said plurality of circuits of direct execution logic.
44. The system of claim 43 wherein said memory device comprises at least one read only memory device.
45. The system of claim 35 wherein a first portion of said unified executable program is resident in said common memory space for execution by said at least one processor.
46. The system of claim 35 wherein a second portion of said unified executable program is resident in said first one of said plurality of circuits of direct execution logic.
47. The system of claim 35 wherein said at least one processor comprises a fixed instruction set processor.
48. A method for processing data utilizing circuits of direct execution logic coupled to a common memory space, said method comprising: configuring a first circuit of direct execution logic to perform a first function; configuring a second circuit of direct execution logic to perform a second function; writing a first data value to a first memory address location in said common memory space; reading said first data value into said first circuit of direct execution logic; performing said first function in said first circuit of direct execution logic using said first data value to generate a second data value; writing said second data value to a second memory address within said common memory space; reading said second data value into said second circuit of direct execution logic; performing said second function in said second circuit of direct execution logic using said second data value to generate a third data value; receiving a command to terminate processing of said data; counting a number of clock cycles that have elapsed since said command was received; and generating a signal when a predetermined number of clock cycles has passed.
49. The method of claim 48 wherein said third data value is written to a third memory location in said common memory space.
50. The method of claim 48 wherein performing said first function includes multiplying.
51. The method of claim 50 wherein configuring said first circuit of direct execution logic includes at least one processor selecting configuration bits corresponding to said first function.
52. The method of claim 51 wherein said at least one processor comprises a fixed instruction set processor.
53. The method of claim 51 wherein said at least one processor performs a math function.
54. The method of claim 53 wherein said math function comprises a 64-bit floating point math function.
55. The method of claim 51 further comprising: signaling said at least one processor when said third data value is available.
56. The method of claim 55 wherein said signaling said at least one processor includes writing a status value to a status register.
57. The method of claim 48 wherein writing said second data value includes operatively passing said second data value from said first circuit of direct execution logic to said second circuit of direct execution logic.
58. The method of claim 51 wherein said configuring said first circuit of direct execution logic is carried out in accordance with a unified executable program.
59. The method of claim 51 wherein said at least one processor is operative in accordance with said unified executable program.
60. A computer system comprising: at least one processor; at least one circuit of direct execution logic; a common memory space accessible by said at least one processor and said at least one circuit of direct execution logic; and a unified executable program comprising a first portion thereof executable by said at least one processor and a second portion thereof executable by said at least one circuit of direct execution logic; wherein said at least one circuit of direct execution logic is operative to access said common memory space independently of said at least one processor.
61. The computer system of claim 60 wherein said at least one processor comprises a microprocessor.
62. The computer system of claim 60 wherein said at least one circuit of direct execution logic comprises at least one field programmable gate array.
63. The computer system of claim 60 wherein said at least one circuit of direct execution logic is programmed to perform at least one identified algorithm on an operand received from said common memory space.
64. The computer system of claim 63 wherein said at least one identified algorithm is programmed into a memory device associated with said circuit of direct execution logic.
65. The computer system of claim 64 wherein said memory device comprises at least one read only memory device.
66. The computer system of claim 60 wherein said first portion of said unified executable program executable by said at least one processor is resident in said common memory space.
67. The computer system of claim 60 wherein said second portion of said unified executable program is resident in said at least one circuit of direct execution logic.
68. The computer system of claim 60 wherein said second portion of said unified executable program is resident in said at least one field programmable gate array.
69. The computer system of claim 60 wherein said at least one processor comprises a fixed instruction set processor.
70. A system for processing data using a plurality of circuits of direct execution logic, said system comprising: at least one processor; a common memory space coupled to said at least one processor and said plurality of circuits of direct execution logic; a first one of said plurality of circuits of direct execution logic coupled to a first address in said common memory space and responsive to a first data value being written to said first address, said first one of said plurality of circuits of direct execution logic performing a first configured function in accordance with a unified executable program, generating a second data value and writing said second data value to a second address in said common memory space; and a second one of said plurality of circuits of direct execution logic coupled to said second address in said common memory space and responsive to said second data value being written to said second address, said second one of said plurality of circuits of direct execution logic retrieving said second data value and performing a second configured function in accordance with said unified executable program; wherein said first and second ones of said plurality of circuits of direct execution logic are operative to access said common memory space independently of said at least one processor.
71. The system of claim 70 further comprising: a first control logic block in a first communication path between said at least one processor and said common memory space for accessing data at specified addresses within said common memory space.
72. The system of claim 71 further comprising a data bus and an address bus coupling said control logic block and said common memory space.
73. The system of claim 72 further comprising a second communication path between said first one of said plurality of circuits of direct execution logic and said data bus.
74. The system of claim 72 further comprising a third communication path between said first one of said plurality of circuits of direct execution logic and said address bus.
75. The system of claim 74 further comprising a second control logic block in said third communication path between said first one of said plurality of circuits of direct execution logic and said address bus.
76. The system of claim 75 where said second control logic block comprises a command decoder for decoding commands from said at least one processor, a pipeline counter for counting clock cycles, an equality comparator for determining whether an output of said pipeline counter corresponds to a predetermined number of said clock cycles and status registers for receiving an output from said equality comparator.
77. The system of claim 76 wherein said at least one processor transmits commands on said address bus.
78. The system of claim 76 wherein said at least one processor periodically accesses said status register.
79. The system of claim 70 wherein said first and second ones of said plurality of circuits of direct execution logic comprise field programmable gate arrays.
80. The system of claim 70 wherein said first one of said plurality of circuits of direct execution logic is programmed to perform at least one identified algorithm on an operand received from said common memory space.
81. The system of claim 80 wherein said at least one identified algorithm is programmed into a memory device associated with said first one of said plurality of circuits of direct execution logic.
82. The system of claim 81 wherein said memory device comprises at least one read only memory device.
83. The system of claim 70 wherein a first portion of said unified executable program is resident in said common memory space for execution by said at least one processor.
84. The system of claim 70 wherein a second portion of said unified executable program is resident in said first one of said plurality of circuits of direct execution logic.
85. The system of claim 70 wherein said at least one processor comprises a fixed instruction set processor.
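The control block recited in the claims (a command decoder, a pipeline counter, an equality comparator, and a status register) can be sketched in software to show how the pieces interact when a terminate command arrives. This is an illustrative model only, under assumptions that are not in the record: the class name `ControlBlock`, the command string `"TERMINATE"`, the `clock()` method, and the pipeline depth of 4 are hypothetical, and one method call stands in for one clock cycle.

```python
class ControlBlock:
    """Toy model: counts cycles after a terminate command until the pipeline has drained."""

    def __init__(self, pipeline_depth: int) -> None:
        if pipeline_depth < 1:
            raise ValueError("pipeline_depth must be at least 1")
        self.pipeline_depth = pipeline_depth  # predetermined number of clock cycles
        self.counter = 0                      # pipeline counter
        self.counting = False
        self.status_done = False              # status register bit

    def decode(self, command: str) -> None:
        # Command decoder: only one command is modeled here (assumed name).
        if command == "TERMINATE":
            self.counter = 0
            self.counting = True
            self.status_done = False
        else:
            raise ValueError(f"unknown command: {command}")

    def clock(self) -> None:
        # One clock cycle: advance the counter, then apply the equality comparator.
        if not self.counting:
            return
        self.counter += 1
        if self.counter == self.pipeline_depth:
            self.status_done = True   # comparator output latched in the status register
            self.counting = False


block = ControlBlock(pipeline_depth=4)
block.decode("TERMINATE")

cycles = 0
while not block.status_done:   # the processor periodically checks the status register
    block.clock()
    cycles += 1
```

Once the status bit is set, the last operand has left the pipeline in this model, which is the condition under which the abstract says a process can be interrupted and later restarted.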
Patents cited by this patent (14)
Casselman Steven M., Computer network of distributed virtual computers which are EAC reconfigurable in response to instruction to be executed.
Cooke Laurence H. ; Phillips Christopher E. ; Wong Dale, Method for compiling high level programming languages into an integrated processor with reconfigurable logic.
Huppenthal Jon M. ; Leskar Paul A., Multiprocessor computer architecture incorporating a plurality of memory algorithm processors in the memory subsystem.
Lytle Craig S. (Mountain View CA) Faria Donald F. (San Jose CA), Programmable logic array integrated circuit incorporating a first-in first-out memory.
Shido Tatsuya (Kawasaki JPX) Kawamura Kaoru (Yokohama JPX) Umeda Masanobu (Yokohama JPX) Shibuya Toshiyuki (Inagi JPX) Miwatari Hideki (Yokohama JPX), SIMD system having logic units arranged in stages of tree structure and operation of stages controlled through respectiv.
De Oliveira Kastrup Pereira, Bernardo; Bink, Adrianus J.; Hoogerbrugge, Jan, System for executing computer program using a configurable functional unit, included in a processor, for executing configurable instructions having an effect that are redefined at run-time.
Natoli, Vincent D.; Richie, David A., Reconfigurable computing system that shares processing between a host processor and one or more reconfigurable hardware modules.
Muraki,Shigeru; Ogata,Masato; Kajihara,Kagenori; Liu,Xuezhen; Koshizuka,Kenji, Simulation system having image generating function and simulation method having image generating process.
Tewalt, Timothy J., System and method for retaining dram data when reprogramming reconfigurable devices with DRAM memory controllers incorporating a data maintenance block colocated with a memory module or subsystem.
McGarry, Patrick F.; Sorber, David B.; Bresnan, Timothy P.; Huo, Andrew; Agrawal, Varun K.; Stupar, Brian R.; Griffin, Christopher M.; Barthelemy, Jeremy L.; Harris, Robert M.; Gantz, Paul C., Systems and methods for performing primitive tasks using specialized processors.