IPC분류정보
국가/구분 |
United States(US) Patent
등록
|
국제특허분류(IPC7판) |
|
출원번호 |
US-0192862
(2008-08-15)
|
등록번호 |
US-8332829
(2012-12-11)
|
발명자
/ 주소 |
|
출원인 / 주소 |
- Calos Fund Limited Liability Company
|
인용정보 |
피인용 횟수 :
3 인용 특허 :
4 |
초록
▼
Within a data processing system, one or more register files are assigned to respective states of a graph for each of a plurality of clock cycles. A plurality of edges are inserted to form connections between the states of the graph, with respective weights being assigned to each of the edges. A best
Within a data processing system, one or more register files are assigned to respective states of a graph for each of a plurality of clock cycles. A plurality of edges are inserted to form connections between the states of the graph, with respective weights being assigned to each of the edges. A best route through the graph is then determined based, at least in part, on the weights assigned to the edges.
대표청구항
▼
1. A computer-implemented method of communication scheduling within a processor including multiple register files, the method comprising: representing the multiple register files as vertices in a graph over multiple clock cycles such that the graph includes a vertex corresponding to each individual
1. A computer-implemented method of communication scheduling within a processor including multiple register files, the method comprising: representing the multiple register files as vertices in a graph over multiple clock cycles such that the graph includes a vertex corresponding to each individual register file during each of the multiple clock cycles;representing connections between the vertices in the graph as a plurality of edges that collectively describe possible sub-routes of data propagation among the multiple register files over the multiple clock cycles;assigning to each individual edge a weight representing a relative availability of a corresponding sub-route; anddetermining, via a machine-executable compiler executing on a computer, a schedule of communications within the processor, wherein the schedule of communications is represented as a route through the graph, and the route is based, at least in part, on weights assigned to the plurality of edges in the graph. 2. The method of claim 1, wherein the weights are assigned based, at least in part, on an amount of resource usage. 3. The method of claim 1, wherein the weights are assigned based, at least in part, on a fullness of the multiple register files. 4. The method of claim 1, wherein at least one of the plurality of edges form a connection between a first vertex corresponding to a first state of a data value in a first clock cycle and a second vertex corresponding to a second state of the data value in the first clock cycle. 5. The method of claim 1, wherein at least one of the plurality of edges form a connection between a first vertex corresponding to a first state of a data value in a first clock cycle and a second vertex corresponding to a second state of the data value in a second clock cycle. 6. The method of claim 5, wherein the first state of the data value corresponds to a first register file and wherein the second state of the data value also corresponds to the first register file. 7. The method of claim 6, wherein the at least one of the plurality of edges correspond to a continuous storing of the data value within the first register file. 8. The method of claim 5, wherein the first state of the data value corresponds to a first register file and wherein the second state of the data value corresponds to a second register file. 9. The method of claim 8, wherein the at least one of the plurality of edges comprise a pass operation for the data value. 10. The method of claim 5, wherein the first state of the data value corresponds to a first register file and wherein the second state of the data value corresponds to an on-chip memory. 11. The method of claim 10, wherein the at least one of the plurality of edges comprise a spill operation for the data value. 12. The method of claim 1, wherein the step of determining the schedule of communications represented as the route through the graph comprises determining the route through the graph based, at least in part, on a shortest-path algorithm. 13. The method of claim 1, wherein the step of determining the schedule of communications within the processor is determined in sub-quadratic time. 14. The method of claim 1, wherein the step of determining the schedule of communications within the processor comprises considering all possible permutations of routes for open communications. 15. The method of claim 1, wherein the step of determining the schedule of communications within the processor comprises arbitrarily selecting a specific route for each open communication. 16. The method of claim 1, wherein the step of assigning to each individual edge the weight representing the relative availability of the corresponding sub-route comprises one or more of: assigning a lower relative weight to edges to give preference to routes through the graph that avoid resource conflicts;assigning a lower relative weight to edges that reuse resources for multiple communications of the same output;assigning a lower relative weight to edges that minimize pass or spill operations; andassigning a lower relative weight to edges that do not occupy registers in near-full register files. 17. The method of claim 1, further comprising: updating the weights assigned to the plurality of edges during the communication scheduling to reflect a current schedule state. 18. The method of claim 1, wherein the processor includes at least one arithmetic logic unit and at least one crossbar configured to couple the at least one arithmetic logic unit to at least one of the multiple register files and the at least one arithmetic logic unit is configured to support a pass operation in which the at least one arithmetic logic unit drives a value from one of the multiple register files onto the at least one crossbar. 19. The method of claim 1, wherein the step of determining the schedule of communications within the processor includes adding a pass operation modeled as a series of edges and vertices that connects the vertex for a register file on cycle M with a vertex for a crossbar on cycle N, where N=M+pass latency−1, and the vertex for the crossbar on cycle N is connected to a set of register file vertices on cycle N+1. 20. The method of claim 1, wherein the graph includes for each register file a starts-in vertex and a stays-in vertex and each possible pass operation is represented by a series of vertices connecting at least one starts-in vertex to a crossbar vertex on cycle L, where L=pass latency−1. 21. A system configured to perform communication scheduling within a processor including multiple register files, the system comprising: means for representing the multiple register files as vertices in a graph over multiple clock cycles such that the graph includes a vertex corresponding to each individual register file during each of the multiple clock cycles;means for representing connections between the vertices in the graph as a plurality of edges that collectively describe possible sub-routes of data propagation among the multiple register files over the multiple clock cycles;means for assigning to each individual edge a weight representing a relative availability of a corresponding sub-route; andmeans for determining a schedule of communications within the processor, wherein the schedule of communications is represented as a route through the graph, and the route is based, at least in part, on the weights assigned to the plurality of edges in the graph. 22. The system of claim 21, wherein the means for determining the schedule of communications represented as the route through the graph comprises means for determining the route through the graph based, at least in part, on a shortest-path algorithm. 23. The system of claim 21, wherein the means for determining the schedule of communications within the processor comprises means for determining the route through the graph based, at least in part, on a shortest-path algorithm. 24. A non-transitory computer-readable medium having instructions stored thereon to perform communication scheduling within a processor including multiple register files, the instructions comprising: instructions for representing the multiple register files as vertices in a graph over multiple clock cycles such that the graph includes a vertex corresponding to each individual register file during each of the multiple clock cycles;instructions for representing connections between the vertices in the graph as a plurality of edges that collectively describe possible sub-routes of data propagation among the multiple register files over the multiple clock cycles;instructions for assigning to each individual edge a weight representing a relative availability of a corresponding sub-route; andinstructions for determining a schedule of communications within the processor, wherein the schedule of communications is represented as a route through the graph, and the route is based, at least in part, on weights assigned to the plurality of edges in the graph. 25. The non-transitory computer-readable medium of claim 24, wherein the instructions for determining the schedule of communications within the processor comprise instructions for determining the route through the graph based, at least in part, on a shortest-path algorithm. 26. A system comprising: a first processor configured to perform scheduling of communication within a second processor including multiple register files, wherein the first processor configured is to perform communication scheduling by: representing the multiple register files as vertices in a graph over multiple clock cycles such that the graph includes a vertex corresponding to each individual register file during each of the multiple clock cycles;representing connections between the vertices in the graph as a plurality of edges that collectively describe possible sub-routes of data propagation among the multiple register files over the multiple clock cycles;assigning to each individual edge a weight representing a relative availability of a corresponding sub-route; anddetermining a schedule of communications within the second processor, wherein the schedule of communications is represented as a route through the graph, and the route is based, at least in part, on weights assigned to the plurality of edges in the graph. 27. The system of claim 26, wherein the first processor is further configured to perform the determining of the schedule of communications within the second processor by determining the route through the graph based, at least in part, on a shortest-path algorithm. 28. A computer-implemented method for scheduling data transfers in a processor including multiple register files and one or more arithmetic logic units, the method comprising: determining, via a machine-executable compiler executing on a computer, a sequence of data transfers over multiple clock cycles to transfer an output of one of the arithmetic logic units to an input of one of the arithmetic logic units by using a shortest-path algorithm that solves a problem modeled as a route through a graph that includes (1) a vertex corresponding to each individual register file during each of the multiple clock cycles and (2) a plurality of edges that form connections between the vertices and describe possible data transfers among the multiple register files over the multiple clock cycles, wherein the shortest-path algorithm determines the route through the graph based, at least in part, on weights assigned to the plurality of edges in the graph, and each individual edge is assigned a weight representing a relative availability of a corresponding sub-route.
※ AI-Helper는 부적절한 답변을 할 수 있습니다.