Buffer sizing of a NoC through machine learning
IPC classification information
Country / Type
United States(US) Patent
Granted
International Patent Classification (IPC, 7th edition)
H04L-012/861
H04L-012/24
Application number
US-0402819
(2017-01-10)
Registration number
US-10063496
(2018-08-28)
Inventor / Address
Norige, Eric
Rao, Nishant
Kumar, Sailesh
Applicant / Address
NETSPEED SYSTEMS INC.
Agent / Address
Procopio, Cory, Hargreaves & Savitch LLP
Citation information
Times cited: 0
Cited patents: 89
Abstract
The present disclosure is directed to buffer sizing of NoC link buffers by utilizing incremental dynamic optimization and machine learning. A method for configuring buffer depths associated with one or more network on chip (NoC) is disclosed. The method includes deriving characteristics of buffers associated with the one or more NoC, determining first buffer depths of the buffers based on the characteristics derived, obtaining traces based on the characteristics derived, measuring trace skews based on the traces obtained, determining second buffer depths based on the trace skews measured, optimizing the buffer depths associated with the network on chip (NoC) based on the second buffer depths, and configuring the buffer depths associated with one or more network on chip (NoC) based on the buffer depths optimized.
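The "first buffer depths" step can be pictured with a small analytic estimate. The following is a minimal sketch only, not the patented method: the abstract and claims say depths are derived from arrival and departure characteristics but give no formula, and the patent obtains the rates through a machine learning based process, whereas here they are plain inputs. The function name `first_pass_depth`, its parameters, and the sizing rule (round-trip coverage plus burst backlog) are all assumptions made for illustration.

```python
import math

def first_pass_depth(arrival_rate, drain_rate, burst_size, rtt_cycles):
    """Estimate a starting buffer depth in flits (illustrative heuristic, not from the patent).

    arrival_rate, drain_rate: flits per cycle (hypothetical units)
    burst_size: flits per burst
    rtt_cycles: flow-control round trip time in cycles
    """
    # Flits in flight while flow-control credits make one round trip.
    rtt_depth = math.ceil(rtt_cycles * min(arrival_rate, drain_rate))
    # Fraction of a burst that backs up when arrivals outpace the drain.
    excess = max(0.0, 1.0 - drain_rate / arrival_rate) if arrival_rate > 0 else 0.0
    burst_depth = math.ceil(burst_size * excess)
    return max(1, rtt_depth + burst_depth)

print(first_pass_depth(arrival_rate=1.0, drain_rate=0.5, burst_size=16, rtt_cycles=8))  # 12
print(first_pass_depth(arrival_rate=0.5, drain_rate=1.0, burst_size=16, rtt_cycles=8))  # 4
```

A depth obtained this way would only be a starting point; the abstract describes refining it afterwards using trace skews and simulation.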
Representative claim
1. A method for generating a Network on Chip (NoC), comprising: executing a first process directed to derivation of arrival and departure characteristics of at least one buffer associated with the NoC; executing a second process directed to derivation of at least one buffer depth of the at least one buffer based on the arrival and the departure characteristics and further based on one or more characteristics of the NoC; and generating the NoC based on the at least one buffer depth; wherein the first process is machine learning based process configured to determine arrival rate of packets and drain rate of packets based on an arbitration process of the NoC.
2. The method according to claim 1, wherein the arrival and departure characteristics are selected from any or a combination of the arrival rate of the packets, burst size, round trip time (RTT), multicast packet size, the drain rate of the packets, store and forward feature, and arbitration frequency/link frequency.
3. The method according to claim 1 further comprising: executing a third process directed to optimize the at least one buffer depth to generate at least one second buffer depth through a first simulation of the NoC in isolation with the at least one buffer associated with the NoC; and executing a fourth process to optimize the at least one second buffer depth to generate at least one third buffer depth through a second simulation of the NoC and at least one system element associated with the NoC; wherein the generating the NoC based on the at least one buffer depth is based on the at least one third buffer depth.
4. The method according to claim 3, wherein the first simulation is adapted to generate an input trace behavior based on historical output trace behavior associated with at least one other NoC adjacent to the NoC.
5. The method according to claim 3, wherein the fourth process is configured to select the at least one buffer to decrease the at least one buffer depth based on a cost function, and wherein the decrease in the at least one buffer depth is performed repeatedly until a threshold is achieved for the cost function.
6. The method according to claim 3, wherein the fourth process is configured to: create a probability distribution of the at least one buffer depth for the at least one buffer based on the at least one second buffer depth; conduct one or more second simulations based on a sampling of the probability distribution of the at least one buffer depth; rank the one or more second simulations based on a cost function; and obtain the at least one third buffer depth for at least one buffer from the one or more second simulations ranked upon occurrence of a probability distribution convergence.
7. The method according to claim 3, wherein the one or more characteristics of the NoC comprises at least one trace skew, and wherein the at least one second process is a machine learning based process configured to select the at least one buffer depth to generate the at least one second buffer depth for optimization based on the at least one trace skew.
8. A system for generation of a Network on Chip (NoC), comprising: a memory coupled to the processor, wherein the memory stores one or more computer programs executable by the processor; wherein the computer programs are executable to: execute a first process wherein the first process derives arrival and departure characteristics of at least one buffer associated with the NoC; execute a second process wherein the second process derives at least one buffer depth of the at least one buffer based on the arrival and the departure characteristics and further based on one or more characteristics of the NoC; generate the NoC based on the at least one buffer depth; wherein the first process is machine learning based process configured to determine arrival rate of packets and drain rate of packets based on an arbitration process of the NoC.
9. The system according to claim 8, wherein the arrival and departure characteristics are selected from any or a combination of the arrival rate of the packets, burst size, round trip time (RTT), multicast packet size, the drain rate of the packets, store and forward feature, and arbitration frequency/link frequency.
10. The system according to claim 8, wherein the computer programs are further executable to: execute a third process wherein the third process optimizes the at least one buffer depth to generate at least one second buffer depth through a first simulation of the NoC in isolation with the at least one buffer associated with the NoC; and execute a fourth process wherein the fourth process optimizes the at least one second buffer depth to generate at least one third buffer depth through a second simulation of the NoC and at least one system element associated with the NoC; wherein the NoC generated based on the at least one buffer depth is based on the at least one third buffer depth.
11. The system according to claim 10, wherein the first simulation is adapted to generate an input trace behavior based on historical output trace behavior associated with at least one other NoC adjacent to the NoC.
12. The system according to claim 10, wherein the fourth process is configured to select the at least one buffer to decrease the at least one buffer depth based on a cost function, and wherein the decrease in the at least one buffer depth is performed repeatedly until a threshold is achieved for the cost function.
13. The system according to claim 10, wherein the fourth process is configured to: create a probability distribution of the at least one buffer depth for the at least one buffer based on the at least one second buffer depth; conduct one or more second simulations based on a sampling of the probability distribution of the at least one buffer depth; rank the one or more second simulations based on a cost function; and obtain the at least one third buffer depth for at least one buffer from the one or more second simulations ranked upon occurrence of a probability distribution convergence.
14. The system according to claim 8, wherein the one or more characteristics of the NoC comprises at least one trace skew, and wherein the at least one second process is a machine learning based process configured to select the at least one buffer depth to generate the at least one second buffer depth for optimization based on the at least one trace skew.
15. A non-transitory computer readable storage medium storing instructions for executing a process, the instructions comprising: executing a first process directed to derivation of arrival and departure characteristics of at least one buffer associated with the NoC; executing a second process directed to derivation of at least one buffer depth of the at least one buffer based on the arrival and the departure characteristics and further based on one or more characteristics of the NoC; and generating the NoC based on the at least one buffer depth, wherein the first process is a machine learning based process configured to determine arrival rate of packets and drain rate of packets based on an arbitration process of the NoC.
16. The non-transitory computer readable storage medium according to claim 15, wherein the arrival and departure characteristics are selected from any or a combination of the arrival rate of the packets, burst size, round trip time (RTT), multicast packet size, the drain rate of the packets, store and forward feature, and arbitration frequency/link frequency.
17. The non-transitory computer readable storage medium according to claim 15, the instructions further comprising: executing a third process directed to optimize the at least one buffer depth to generate at least one second buffer depth through a first simulation of the NoC in isolation with the at least one buffer associated with the NoC; and executing a fourth process to optimize the at least one second buffer depth to generate at least one third buffer depth through a second simulation of the NoC and at least one system element associated with the NoC; wherein the generating the NoC based on the at least one buffer depth is based on the at least one third buffer depth.
18. The non-transitory computer readable storage medium according to claim 17, wherein the first simulation is adapted to generate an input trace behavior based on historical output trace behavior associated with at least one other NoC adjacent to the NoC.
19. The non-transitory computer readable storage medium according to claim 17, wherein the fourth process is configured to select the at least one buffer to decrease the at least one buffer depth based on a cost function, and wherein the decrease in the at least one buffer depth is performed repeatedly until a threshold is achieved for the cost function.
20. The non-transitory computer readable storage medium according to claim 17, wherein the fourth process is configured to: create a probability distribution of the at least one buffer depth for the at least one buffer based on the at least one second buffer depth; conduct one or more second simulations based on a sampling of the probability distribution of the at least one buffer depth; rank the one or more second simulations based on a cost function; and obtain the at least one third buffer depth for at least one buffer from the one or more second simulations ranked upon occurrence of a probability distribution convergence.
21. The non-transitory computer readable storage medium according to claim 17, wherein the one or more characteristics of the NoC comprises at least one trace skew, and wherein the at least one second process is a machine learning based process configured to select the at least one buffer depth to generate the at least one second buffer depth for optimization based on the at least one trace skew.
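The two optimization strategies recited for the fourth process (cost-driven repeated depth reduction in claims 5, 12 and 19; sample, simulate, rank and converge in claims 6, 13 and 20) can be sketched as below. This is an illustrative reading under stated assumptions, not the patented implementation: the claims do not define the cost function, the threshold semantics, the form of the probability distribution, or the convergence test. Here `stall_cost` is a made-up stand-in for a simulation, the threshold is read as a cap on tolerated cost, the distribution is one categorical per buffer over depths from 1 to the second buffer depth, and convergence means every buffer has one depth with probability above 0.95. All function names and parameters are hypothetical.

```python
import random

def stall_cost(depths, demand):
    """Hypothetical stand-in for a simulation: flits of demand each buffer cannot hold."""
    return sum(max(0, need - depth) for depth, need in zip(depths, demand))

def greedy_shrink(depths, cost_fn, threshold):
    """Repeatedly decrease the buffer whose reduction costs least, until the
    cheapest possible decrease would push the cost above the threshold."""
    depths = list(depths)
    while True:
        best = None
        for i, d in enumerate(depths):
            if d <= 1:
                continue  # keep at least one entry per buffer
            trial = depths[:i] + [d - 1] + depths[i + 1:]
            cost = cost_fn(trial)
            if best is None or cost < best[0]:
                best = (cost, trial)
        if best is None or best[0] > threshold:
            return depths
        depths = best[1]

def sample_and_rank(second_depths, cost_fn, samples=60, elite=10, rounds=50, seed=0):
    """Sample depth configurations, rank them by cost, and refit the distributions."""
    rng = random.Random(seed)
    # One categorical distribution per buffer over depths 1..second depth, initially uniform.
    dists = [[1.0 / d] * d for d in second_depths]
    best = list(second_depths)
    for _ in range(rounds):
        configs = [[rng.choices(range(1, len(p) + 1), weights=p)[0] for p in dists]
                   for _ in range(samples)]
        configs.sort(key=cost_fn)  # rank the simulated configurations
        if cost_fn(configs[0]) < cost_fn(best):
            best = configs[0]
        for i, p in enumerate(dists):
            counts = [0] * len(p)
            for cfg in configs[:elite]:
                counts[cfg[i] - 1] += 1
            # Blend old probabilities with the elite frequencies (smoothing is an assumption).
            dists[i] = [0.5 * old + 0.5 * c / elite for old, c in zip(p, counts)]
        if all(max(p) > 0.95 for p in dists):  # assumed convergence test
            break
    return best

demand = [3, 5]  # made-up per-buffer requirement in flits
perf = lambda d: stall_cost(d, demand)
total = lambda d: 0.1 * sum(d) + perf(d)  # area term plus stall term, weights assumed
print(greedy_shrink([8, 8], perf, threshold=0))  # [3, 5]
print(sample_and_rank([8, 8], total))
```

In this toy setup the greedy pass uses a performance-only cost so that shrinking is free until a buffer drops below its demand, while the sampling pass uses a combined area and stall cost so that an optimum exists inside the search range. A real flow would replace both lambdas with actual NoC simulations.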
Patents cited by this patent (89)
Or-Bach, Zvi; Wurman, Ze'ev, 3D integrated circuit with logic.
Hahn Jong Seok,KRX ; Sim Won Sae,KRX ; Hahn Woo Jong,KRX ; Yoon Suk Han,KRX, Adaptive routing controller of a crossbar core module used in a crossbar routing switch.
Dapp Michael C. (Endwell NY) Barker Thomas N. (Vestal NY) Dieffenderfer James W. (Owego NY) Knowles Billy J. (Kingston NY) Lesmeister Donald M. (Vestal NY) Nier Richard E. (Apalachin NY) Rolfe David , Advanced parallel processor including advanced support hardware.
Miller,Ian D.; Harris,Jonathan C., Auto generation of a multi-staged processing pipeline hardware implementation for designs captured in high level languages.
Agrawal Rakesh ; Gehrke Johannes Ernst ; Gunopulos Dimitrios ; Raghavan Prabhakar, Automatic subspace clustering of high dimensional data for data mining applications.
Thubert, Pascal; Le Faucheur, Francois Laurent; Levy-Abegnoli, Eric M., Forwarding packets to a directed acyclic graph destination using link selection based on received link metrics.
Flaig Charles M. (Pasadena CA) Seitz Charles L. (San Luis Rey CA), Inter-computer message routing system with each computer having separate routinng automata for each dimension of the net.
Fuhrmann Amir Michael ; Rakib Selim Shlomo ; Azenkot Yehuda, Lower overhead method for data transmission using ATM and SCDMA over hybrid fiber coax cable plant.
Hilgendorf Rolf B. (Boeblingen DEX) Schlipf Thomas (Holzgerlingen DEX), Method and apparatus for avoiding deadlock in a computer system with two or more protocol-controlled buses interconnecte.
Okhmatovski, Vladimir; Yuan, Mengtao; Phelps, Rodney, Method and apparatus for broadband electromagnetic modeling of three-dimensional interconnects embedded in multilayered substrates.
Williams, Jr., John J.; Dejanovic, Thomas; Michelson, Jonathan E., Method and apparatus for using barrier phases to limit packet disorder in a packet switching system.
James David V. ; North Donald N. ; Stone Glen D., Method and system for avoiding starvation and deadlocks in a split-response interconnect of a computer system.
Levin Vladimir K.,RUX ; Karatanov Vjacheslav V.,RUX ; Jalin Valerii V.,RUX ; Titov Alexandr,RUX ; Agejev Vjacheslav M.,RUX ; Patrikeev Andrei,RUX ; Jablonsky Sergei V.,RUX ; Korneev Victor V.,RUX ; M, Method for deadlock-free message passing in MIMD systems using routers and buffers.
Kalmanek, Jr., Charles Robert; Lauck, Anthony G; Ramakrishnan, Kadangode K., Method for determining non-broadcast multiple access (NBMA) connectivity for routers having multiple local NBMA interfaces.
Bruce,Alistair Crone; Mathewson,Bruce James; Harris,Antony John, Method of arbitrating between a plurality of transfers to be routed over a corresponding plurality of paths provided by an interconnect circuit of a data processing apparatus.
Kodialam, Muralidharan S.; Lakshman, Tirnuell V.; Sengupta, Sudipta, Multicast routing with service-level guarantees between ingress egress-points in a packet network.
Hoover, Russell D.; Kriegel, Jon K.; Mejdrich, Eric O.; Shearer, Robert A., Network on chip with a low latency, high bandwidth application messaging interconnect.
Mejdrich, Eric O.; Schardt, Paul E.; Shearer, Robert A.; Tubbs, Matthew R., Performance event triggering through direct interthread communication on a network on chip.
Koza John R. ; Andre David ; Tackett Walter Alden, Simultaneous evolution of the architecture of a multi-part program while solving a problem using architecture altering operations.
Pleshek, Ronald A.; Webb, III, Charles A.; Cheney, Keith E.; Hilton, Gregory S.; Abkowitz, Patricia A.; Thakkar, Arun K.; Thaker, Himanshu M., Superset packet forwarding for overlapping filters and related systems and methods.
Prasad,Roy V.; Horng,Chi Song; Ramanujam,Ram S., System and method for reducing patterning variability in integrated circuit manufacturing through mask layout corrections.
Birrittella Mark S. (Chippewa Falls WI) Kessler Richard E. (Eau Claire WI) Oberlin Steven M. (Chippewa Falls WI) Passint Randal S. (Chippewa Falls WI) Thorson Greg (Altoona WI), System for allocating messages between virtual channels to avoid deadlock and to optimize the amount of message traffic.
Jayasimha, Doddaballapur N.; Chan, Jeremy; Tomlinson, Jay S., Use of common data format to facilitate link width conversion in a router with flexible link widths.