Reconfigurable data interface unit for compute systems
IPC classification information
Country / Type
United States (US) Patent
Granted
International Patent Classification (IPC, 7th edition)
G06F-013/16
G06F-015/78
G06F-013/42
G06F-013/40
Application number
US-0069700
(2016-03-14)
Registration number
US-10185699
(2019-01-22)
Inventor / Address
Wang, Qiang
Gu, Zhenguo
Li, Qiang
Wang, Zhuolei
Applicant / Address
Futurewei Technologies, Inc.
Agent / Address
Vierra Magen Marcus LLP
Citation information
Times cited: 0
Patents cited: 14
Abstract
A system-on-chip includes a reconfigurable data interface to prepare data streams for execution patterns of a processing unit in a flexible compute accelerate system. An apparatus is provided that includes a first set of line buffers configured to store a plurality of data blocks from a memory of a system-on-chip and a field composition circuit configured to generate a plurality of data segments from each of the data blocks. The field composition circuit is reconfigurable to generate the data segments according to a plurality of reconfiguration schemes. The apparatus includes a second set of line buffers configured to communicate with the field composition circuit to store the plurality of data segments for each data block, and a switching circuit configured to generate from the plurality of data segments a plurality of data streams according to an execution pattern of a processing unit of the system-on-chip.
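The data flow in the abstract (wide blocks decomposed into narrower segments, then regrouped into streams for a processing unit) can be illustrated with a small software model. This is only a sketch under stated assumptions: the names `decompose` and `switch`, the 32-bit block and 8-bit segment widths, and the encoding of an execution pattern as lists of `(block_index, segment_index)` pairs are all hypothetical and do not come from the patent, which describes hardware circuits rather than software.

```python
from typing import List, Tuple


def decompose(block: int, block_width: int, seg_width: int) -> List[int]:
    """Split one block into seg_width-bit segments, least significant first.

    Hypothetical stand-in for the field composition circuit: each segment
    is narrower than the block it came from.
    """
    if block_width % seg_width != 0:
        raise ValueError("seg_width must divide block_width")
    mask = (1 << seg_width) - 1
    return [(block >> shift) & mask for shift in range(0, block_width, seg_width)]


def switch(segments: List[List[int]],
           pattern: List[List[Tuple[int, int]]]) -> List[List[int]]:
    """Build one stream per pattern entry.

    pattern[s] lists (block_index, segment_index) pairs for stream s.
    This encoding of an execution pattern is an assumption for illustration.
    """
    return [[segments[b][i] for (b, i) in stream] for stream in pattern]


# First set of line buffers: two 32-bit blocks (assumed widths).
blocks = [0x11223344, 0xAABBCCDD]

# Second set of line buffers: 8-bit segments for each block.
segments = [decompose(b, 32, 8) for b in blocks]

# Stream 0 gathers the lowest byte of each block, stream 1 the highest.
pattern = [[(0, 0), (1, 0)], [(0, 3), (1, 3)]]
streams = switch(segments, pattern)
```

Changing `pattern` without touching `decompose` mirrors, loosely, the idea that the switching stage can be reconfigured separately from the decomposition stage.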
Representative claims
1. An apparatus, comprising: a first set of line buffers configured to receive and store, for a first data cycle, a plurality of data blocks from a memory of a system-on-chip (SoC) via at least one data bus, wherein each data block has a first data structure and a first bit width; a field composition circuit configured to generate a plurality of data segments from each of the data blocks according to a plurality of reconfiguration schemes, the generating including decomposing each data block of the plurality from the first set of line buffers into the plurality of data segments, and each data segment has a second bit width that is less than the first bit width; a second set of line buffers configured to communicate with the field composition circuit to store, for a second data cycle following the first data cycle, the plurality of data segments for each data block; a switching circuit configured to generate from the plurality of data segments a plurality of data streams according to an execution pattern of a processing unit of the SoC; a set of input/output (I/O) buffers configured to store, for a third data cycle following the second data cycle, the plurality of data streams; a set of streaming buffers storing data of a first processing unit based on selectively reading from each I/O buffer; and a reconfigurable data interface (RDIU) receiving the plurality of data blocks from a plurality of data buses.
2. The apparatus of claim 1, further comprising: a set of multiplexers coupled between the second set of line buffers and the set of I/O buffers, each multiplexer including a plurality of inputs coupled to a subset of the second set of line buffers and an output coupled to a corresponding I/O buffer, each multiplexer configured to select an input corresponding to a selected line buffer of the second set according to a reconfigurable MUX selector circuit.
3. The apparatus of claim 2, further comprising: a first set of address generation units (AGU) coupled to the second set of line buffers, each address generation unit configured to selectively read from an output of a corresponding line buffer of the second set according to an address indicated by the AGU for a corresponding data cycle.
4. The apparatus of claim 3, further comprising: a second set of AGUs coupled to the set of I/O buffers, each AGU of the second set configured to selectively read from an output of a corresponding I/O buffer according to an address indicated by the AGU for a corresponding data cycle.
5. The apparatus of claim 4, wherein: each AGU of the second set configured to selectively read from the output of the corresponding I/O buffer according to the execution pattern of the processing unit of the SoC.
6. The apparatus of claim 5, further comprising: a third set of AGUs coupled to the set of I/O buffers, each AGU of the third set configured to selectively write data to a corresponding I/O buffer according to an address indicated by the AGU for a corresponding data cycle.
7. The apparatus of claim 6, wherein the set of multiplexers is a first set of multiplexers, the apparatus further comprising: a second set of multiplexers between the second set of line buffers and the set of I/O buffers, each multiplexer of the second set including an input coupled to a subset of the I/O buffers and an output coupled to a corresponding line buffer of the second set, each multiplexer of the second set configured to select an input corresponding to a selected I/O buffer according to the reconfigurable MUX selector circuit.
8. The apparatus of claim 1, wherein: the execution pattern of the processing unit is a first execution pattern of a plurality of execution patterns of the first processing unit; and the switching circuit is reconfigurable to generate from the plurality of data segments the plurality of data streams according to the plurality of execution patterns of the processing unit.
9. A method of data processing by a system-on-chip, comprising: storing a plurality of data blocks in a first set of line buffers for a first data cycle, wherein each data block has a first data structure and a first bit width; generating from each of the plurality of data blocks a plurality of data segments, the generating including decomposing each data block of the plurality from the first set of line buffers into the plurality of data segments, and each data segment has a second bit width that is less than the first bit width; storing the plurality of data segments for each data block in a second set of line buffers for a second data cycle following the first data cycle; selectively reading from the second set of line buffers to combine portions of data segments from multiple data blocks to form a plurality of data streams; storing the plurality of data streams in a set of input/output (I/O) buffers for a third data cycle following the second data cycle and based on a plurality of execution patterns for a processing unit of a system-on-chip (SoC); storing data in a set of streaming buffers of a first processing unit based on selectively reading from each I/O buffer; and receiving at a reconfigurable data interface unit (RDIU) the plurality of data blocks from a plurality of data buses.
10. The method of claim 9, wherein selectively reading from the second set of line buffers comprises: selectively reading from each second line buffer according to an address indicated by a corresponding address generation unit (AGU) from a first set of AGUs coupled to the second set of line buffers.
11. The method of claim 9, further comprising: selectively reading from each I/O buffer according to an address indicated by a corresponding AGU from a second set of AGUs coupled to the set of I/O buffers.
12. The method of claim 9, wherein: selectively reading from each line buffer of the second set includes reading according to a first pattern defined by the first set of AGUs; and selectively reading from each I/O buffer includes reading according to a second pattern defined by the second set of AGUs.
13. The method of claim 12, further comprising: storing in the set of I/O buffers result data from the first processing unit; determining one or more memory addresses associated with the result data; storing in the second set of line buffers reorganized data based on the one or more memory addresses of the result data; and composing the reorganized data into data blocks for transmission on the plurality of data buffers.
14. The method of claim 13, wherein: composing the reorganized data into data blocks includes providing a specific address for each data block.
15. The method of claim 14, wherein: the data selectively read from the set of I/O buffers according to the second set of AGUs is a data stream in the second pattern that matches over a plurality of data cycles with one or more data paths inside the first processing unit.
16. The method of claim 9, wherein: the plurality of data streams have a second data structure that is different from the first data structure; and the plurality of data streams have a second bit width that is less than the first bit width.
17. A system-on-chip, comprising: one or more non-transitory memory devices comprising instructions; a plurality of buses coupled to the one or more memory devices; a plurality of compute systems coupled to the plurality of buses, each compute system comprising one or more processing units to execute the instructions to: store a plurality of data blocks in a first set of line buffers for a first data cycle, wherein each data block has a first data structure and a first bit width; generate from each of the plurality of data blocks a plurality of data segments, the generating including decomposing each data block of the plurality from the first set of line buffers into the plurality of data segments, and each data segment has a second bit width that is less than the first bit width; store the plurality of data segments for each data block in a second set of line buffers for a second data cycle following the first data cycle; selectively read from the second set of line buffers to combine portions of data segments from multiple data blocks to form a plurality of data streams; store the plurality of data streams in a set of input/output (I/O) buffers for a third data cycle following the second data cycle and based on a plurality of execution patterns for a processing unit of a system-on-chip (SoC); store data in a set of streaming buffers of a first processing unit based on selectively reading from each I/O buffer; and receive at a reconfigurable data interface unit (RDIU) the plurality of data blocks from a plurality of data buses.
18. The system-on-chip of claim 17, wherein the one or more processing units are unit is a field programmable gate arrays.
19. The system-on-chip of claim 17, wherein selectively reading from the second set of line buffers comprises selectively reading from each second line buffer according to an address indicated by a corresponding address generation unit (AGU) from a first set of AGUs coupled to the second set of line buffers.
20. The system-on-chip of claim 17, wherein the one or more processing units further execute the instructions to selectively reading from each I/O buffer according to an address indicated by a corresponding AGU from a second set of AGUs coupled to the set of I/O buffers.
21. The system-on-chip of claim 17, wherein: selectively reading from each line buffer of the second set includes reading according to a first pattern defined by the first set of AGUs; and selectively reading from each I/O buffer includes reading according to a second pattern defined by the second set of AGUs.
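The multiplexer and address generation unit (AGU) arrangement recited in claims 2 and 3 can be modelled loosely in software. The following is a minimal sketch, not the claimed circuit: the `StridedAGU` class, its base/stride/wrap-around addressing, and the `fill_io_buffers` helper are hypothetical, since the claims only say that each AGU indicates an address for a corresponding data cycle and that each multiplexer selects one line buffer as the source for its I/O buffer.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class StridedAGU:
    """Hypothetical AGU: emits base, base + stride, ... modulo depth.

    The strided, wrapping scheme is an assumption; the claims do not
    specify how addresses are generated.
    """
    base: int
    stride: int
    depth: int

    def addresses(self, cycles: int) -> Iterator[int]:
        for cycle in range(cycles):
            yield (self.base + cycle * self.stride) % self.depth


def fill_io_buffers(line_buffers: List[List[int]],
                    mux_select: List[int],
                    agus: List[StridedAGU],
                    cycles: int) -> List[List[int]]:
    """Fill one I/O buffer per multiplexer.

    mux_select[k] names the line buffer feeding I/O buffer k; the AGU
    attached to that line buffer chooses which entry is read each cycle.
    """
    io_buffers = []
    for source in mux_select:
        agu = agus[source]
        io_buffers.append([line_buffers[source][a] for a in agu.addresses(cycles)])
    return io_buffers


# Two line buffers of the second set, each four entries deep (assumed sizes).
line_buffers = [[10, 11, 12, 13], [20, 21, 22, 23]]
agus = [StridedAGU(base=0, stride=2, depth=4), StridedAGU(base=1, stride=1, depth=4)]

# I/O buffer 0 reads from line buffer 1, I/O buffer 1 from line buffer 0.
mux_select = [1, 0]
io_buffers = fill_io_buffers(line_buffers, mux_select, agus, cycles=3)
```

Here the read pattern is changed by editing `mux_select` or the AGU parameters rather than the data path itself, which is one plausible reading of "reconfigurable" in this context.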
Patents cited by this patent (14)
Ramchandran, Amit, Adaptable datapath for a digital processing system.
Master, Paul L.; Hogenauer, Eugene; Scheuermann, Walter James, Adaptive integrated circuitry with heterogeneous and reconfigurable matrices of diverse and adaptive computational units having fixed, application specific computational elements.
Peters, Eric C.; Rabinowitz, Stanley; Jacobs, Herbert R.; Fasciano, Peter J., Computer system and process for transferring multiple high bandwidth streams of data between multiple storage units and multiple applications in a scalable and reliable manner.
Campi, Fabio; Toma, Mario; Lodi, Andrea; Cappelli, Andrea; Canegallo, Roberto; Guerrieri, Roberto, Digital architecture for reconfigurable computing in digital signal processing.
Rashid, Richard F. (Woodinville, WA); Bolosky, William J. (Issaquah, WA); Fitzgerald, Robert P. (Redmond, WA), Method and system for combining data from multiple servers into a single continuous data stream using a switch.
Sim, Siew Young; Chan, Desmond Cho Hung; Huang, Tsan Fung; Chai, Wencheng; Isaacson, Trygve; Flood, Jr., James C.; Mills, George Harlow; Orzen, Matthew, Method and system for managing distributed content and related metadata.
Feldman, Israel; Trinker, Arie; Meltzer, Yochai; Eshpar, Allon; Lotem, Amnon, Method and system for modeling and processing vehicular traffic data and information and applying thereof.
Stager, Roger Keith; Trimmer, Don Alvin; Johnston, Craig Anthony; Chang, Yafen Peggy; Lau, Jerry Kai, Optimized disk repository for the storage and retrieval of mostly sequential data.