Separating a high-level programming language program into hardware and software components
IPC classification information
Country / Type
United States(US) Patent
Granted
International Patent Classification (IPC 7th edition)
G06F-017/50
Application number
UP-0004834
(2007-12-21)
Registration number
US-7823117
(2010-11-15)
Inventor / Address
Bennett, David W.
Applicant / Address
Xilinx, Inc.
Agent / Address
Maunu, LeRoy D.
Citation information
Times cited: 7
Cited patents: 15
Abstract
Various approaches are described for implementing a high-level programming language program in hardware and software components. In one approach, a method comprises compiling the high-level programming language program into a target language program that includes a plurality of functional elements. Execution of the target language program is profiled to obtain execution counts of the functional elements. A subset of the functional elements are selected for implementation in programmable resources of a programmable device based on the profile data and availability of programmable resources. A bitstream is generated to implement a first sub-circuit that performs functions of the subset of functional elements, and the subset of functional elements is removed from the target language program. The programmable device is configured with the bitstream. The target language program is provided for execution by a processor.
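The profiling step in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each functional element of the target language program can be modeled as a named one-argument Python callable and that a `schedule` function decides which elements run for a given sample. The names `profile`, `elements`, `schedule`, and `samples` are hypothetical and do not come from the patent.

```python
from collections import Counter


def profile(elements, schedule, samples):
    """Run each sample through its scheduled elements and count executions.

    elements: dict mapping an element name to a one-argument callable
              (a stand-in for a functional element; an assumption).
    schedule: callable that returns, for one sample, the ordered list of
              element names to execute (also an assumption).
    samples:  iterable of sample input values.
    Returns a Counter mapping each element name to its execution count.
    """
    counts = Counter()
    for sample in samples:
        value = sample
        for name in schedule(sample):
            value = elements[name](value)  # execute the functional element
            counts[name] += 1              # record one execution
    return counts


# Hypothetical three-element program; "log" only runs for larger samples.
elements = {
    "scale": lambda x: x * 2,
    "clip": lambda x: min(x, 10),
    "log": lambda x: x,
}


def schedule(sample):
    return ["scale", "clip"] + (["log"] if sample > 4 else [])


profile_data = profile(elements, schedule, range(8))
```

Here `profile_data` plays the role of the stored profile data: a per-element execution count that a later selection step can compare against a threshold.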
Representative claims
What is claimed is:
1. A processor-implemented method for implementing a high-level programming language program in hardware and software components, comprising: compiling the high-level programming language program into a target language program that includes a plurality of functional elements; profiling execution of the target language program and storing profile data that specifies respective execution counts of the functional elements; selecting a subset of the functional elements for implementation in programmable resources of a programmable device based on the profile data and availability of the programmable resources to implement functions of the subset of functional elements; generating a bitstream that implements a first sub-circuit that performs functions of the subset of functional elements; wherein the generating the bitstream includes generating bits that implement a first soft processor and a second soft processor on the programmable device, wherein the first soft processor in executing a first part of the target language program provides input data to the first sub-circuit, and the second soft processor executing a second part of the target language program receives output data from the first sub-circuit; removing the subset of functional elements from the target language program; configuring the programmable device with the bitstream; and after the removing step, providing the target language program for execution by a processor.
2. The method of claim 1, wherein the profiling includes executing an interpreter on a soft processor that is implemented in programmable resources of the programmable device.
3. The method of claim 1, wherein the selecting of the subset of functional elements comprises: adding to the subset a functional element having a largest execution count and responsive to the execution count being greater than a threshold value; for each functional element not in the subset, adding the functional element to the subset in response to determining that the functional element provides input data to a functional element already in the subset and the functional element not in the subset having an execution count that is greater than the threshold value; and for each functional element not in the subset, adding the functional element to the subset in response to determining that the functional element receives output from a functional element already in the subset and the functional element not in the subset having an execution count that is greater than the threshold value.
4. The method of claim 3, wherein the selecting of a subset of functional elements comprises: determining a first quantity of the programmable resources required to implement the functions of the subset of functional elements; comparing the first quantity to a second quantity of the programmable resources available to implement the functions of the subset of functional elements; and wherein the generating, removing, and configuring are responsive to the second quantity being greater than the first quantity.
5. The method of claim 3, further comprising: generating a data flow graph having nodes corresponding to the functional elements in the target language program, wherein each edge that connects a first node to a second node represents that the functional element represented by the first node provides input data to the functional element represented by the second node; and wherein the determining that a functional element provides input data to another functional element and the determining that a functional element receives output from another functional element reference the data flow graph.
6. The method of claim 1, wherein the profiling includes simulating the execution of the target language program with a sample data set that is input to a processor external to the programmable device.
7. The method of claim 1, wherein the profiling includes simulating the execution of the target language program with a sample data set input to a soft processor that is implemented in the programmable resources of the programmable device.
8. The method of claim 1, wherein the profiling includes simulating the execution of the target language program with a sample data set input to a hard processor that is implemented on a single integrated circuit die with the programmable resources of the programmable device.
9. The method of claim 1, further comprising for each functional element, generating in the configuration bitstream configuration bits that implement an input FIFO buffer and an output FIFO buffer for input and output of data to and from the functional element.
10. The method of claim 1, wherein the programmable device comprises a field programmable gate array (FPGA).
11. An apparatus for implementing a high-level programming language program in hardware and software components, comprising: means for compiling the high-level programming language program into a target language program that includes a plurality of functional elements; means for profiling execution of the target language program and storing profile data that specifies respective execution counts of the functional elements; means for selecting a subset of the functional elements for implementation in programmable resources of a programmable device based on the profile data and availability of the programmable resources to implement functions of the subset of functional elements; means for generating a bitstream that implements a first sub-circuit that performs functions of the subset of functional elements; wherein the means for generating the bitstream generates bits that implement a first soft processor and a second soft processor on the programmable device, wherein the first soft processor in executing a first part of the target language program provides input data to the first sub-circuit, and the second soft processor executing a second part of the target language program receives output data from the first sub-circuit; means for removing the subset of functional elements from the target language program; means for configuring the programmable device with the bitstream; and means, responsive to completion of the removing step, for providing the target language program for execution by a processor.
12. An article of manufacture, comprising: a non-transitory processor-readable storage medium configured with processor-executable instructions for causing one or more processors to implement a high-level programming language program in hardware and software components by performing a series of steps including, compiling the high-level programming language program into a target language program that includes a plurality of functional elements; profiling execution of the target language program and storing profile data that specifies respective execution counts of the functional elements; selecting a subset of the functional elements for implementation in programmable resources of a programmable device based on the profile data and availability of the programmable resources to implement functions of the subset of functional elements; generating a bitstream that implements a first sub-circuit that performs functions of the subset of functional elements; wherein the generating of the bitstream generates bits that implement a first soft processor and a second soft processor on the programmable device, wherein the first soft processor in executing a first part of the target language program provides input data to the first sub-circuit, and the second soft processor executing a second part of the target language program receives output data from the first sub-circuit; removing the subset of functional elements from the target language program; configuring the programmable device with the bitstream; and after the removing step, providing the target language program for execution by a processor.
13. The article of manufacture of claim 12, wherein the selecting the subset of functional elements comprises: adding to the subset a functional element having a largest execution count and responsive to the execution count being greater than a threshold value; for each functional element not in the subset, adding the functional element to the subset in response to determining that the functional element provides input data to a functional element already in the subset and the functional element not in the subset having an execution count that is greater than the threshold value; and for each functional element not in the subset, adding the functional element to the subset in response to determining that the functional element receives output from a functional element already in the subset and the functional element not in the subset having an execution count that is greater than the threshold value.
14. The article of manufacture of claim 13, wherein the selecting the subset of functional elements comprises: determining a first quantity of the programmable resources required to implement the functions of the subset of functional elements; comparing the first quantity to a second quantity of the programmable resources available to implement the functions of the subset of functional elements; and wherein the generating, removing, and configuring are responsive to the second quantity being greater than the first quantity.
15. The article of manufacture of claim 13, wherein the series of steps further includes: generating a data flow graph having nodes corresponding to the functional elements in the target language program, wherein each edge that connects a first node to a second node represents that the functional element represented by the first node provides input data to the functional element represented by the second node; and wherein the determining that a functional element provides input data to another functional element and the determining that a functional element receives output from another functional element reference the data flow graph.
16. The article of manufacture of claim 12, wherein the profiling includes simulating the execution of the target language program with a sample data set input to a soft processor that is implemented in the programmable resources of the programmable device.
17. The article of manufacture of claim 12, wherein the profiling includes simulating the execution of the target language program with a sample data set input to a hard processor that is implemented on a single integrated circuit die with the programmable resources of the programmable device.
18. The article of manufacture of claim 12, wherein the series of steps further includes: for each functional element, generating in the configuration bitstream configuration bits that implement an input FIFO buffer and an output FIFO buffer for input and output of data to and from the functional element.
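The selection recited in claims 3 to 5 (and 13 to 15) can be sketched as follows. This is an illustrative reading, not the patent's implementation: the data flow graph is assumed to be a set of `(producer, consumer)` name pairs, resource quantities are assumed to be plain integers, and repeating the neighbor pass until the subset stops growing is an assumption the claims do not spell out. The names `select_subset`, `fits`, `counts`, `edges`, and `cost` are hypothetical.

```python
def select_subset(counts, edges, threshold):
    """Pick functional elements to move into programmable resources.

    counts:    dict mapping element name to its profiled execution count.
    edges:     set of (producer, consumer) pairs, i.e. the data flow graph.
    threshold: an element qualifies only if its count is greater than this.
    """
    subset = set()
    if not counts:
        return subset
    # Seed with the element that has the largest execution count,
    # provided that count exceeds the threshold.
    hottest = max(counts, key=counts.get)
    if counts[hottest] <= threshold:
        return subset
    subset.add(hottest)
    # Grow the subset along data flow edges. Repeating until nothing
    # changes is an assumption made for this sketch.
    changed = True
    while changed:
        changed = False
        for name, count in counts.items():
            if name in subset or count <= threshold:
                continue
            feeds = any((name, member) in edges for member in subset)
            fed_by = any((member, name) in edges for member in subset)
            if feeds or fed_by:
                subset.add(name)
                changed = True
    return subset


def fits(subset, cost, available):
    """Resource check: proceed only if available exceeds required."""
    required = sum(cost[name] for name in subset)
    return available > required


# Hypothetical profile data and data flow graph.
counts = {"read": 5, "fir": 900, "scale": 850, "fft": 700, "write": 5, "log": 800}
edges = {("read", "fir"), ("fir", "scale"), ("scale", "fft"), ("fft", "write")}
chosen = select_subset(counts, edges, threshold=100)
```

With this sample data, `log` exceeds the threshold but is left out because no data flow edge connects it to the subset, while `read` and `write` are connected but too rarely executed. The strict comparison in `fits` mirrors the claim language, under which generating, removing, and configuring proceed only when the available quantity is greater than the required quantity.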
Patents cited by this patent (15)
Earl A. Killian ; Ricardo E. Gonzalez ; Ashish B. Dixit ; Monica Lam ; Walter D. Lichtenstein ; Christopher Rowen ; John C. Ruttenberg ; Robert P. Wilson ; Albert Ren-Rui Wang ; Dror Eliezer, Automated processor generation system for designing a configurable processor and method for the same.
Rompaey Karl Van,BEX ; Verkest Diederik,BEX ; Vanhoof Jan,BEX ; Lin Bill,BEX ; Bolsens Ivo,BEX ; De Man Hugo,BEX, Design environment and a design method for hardware/software co-design.
Nguyen Le T. (Monte Sereno CA) Lentz Derek J. (Los Gatos CA) Miyayama Yoshiyuki (Santa Clara CA) Garg Sanjiv (Freemont CA) Hagiwara Yasuaki (Santa Clara CA) Wang Johannes (Redwood City CA) Lau Te-Li , High-performance, superscalar-based computer system with out-of-order instruction execution.
Master,Paul L.; Hogenauer,Eugene; Wu,Bicheng William; Chuang,Dan MingLun; Freeman Benson,Bjorn, Method, system and program for developing and scheduling adaptive integrated circuity and corresponding control or configuration information.
Martin,Nick; Stankovic,Dejan; Wells,Ben; Crasta,Denzil; Russell,Johnny F.; Rodway,Michael, System for designing re-programmable digital hardware platforms.