VLIW computer processing architecture having a scalable number of register files
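IPC classification information
Country / Type
United States (US) Patent
Granted
International Patent Classification (IPC, 7th edition)
G06F-015/80
G06F-015/76
Application number
US-0802289
(2001-03-08)
Inventors / Address
Saulsbury, Ashley
Parkin, Michael
Rice, Daniel S.
Applicant / Address
Sun Microsystems, Inc.
Agent / Address
Townsend and Townsend and Crew LLP
Citation information
Times cited: 7
Cited patents: 35
Abstract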
According to the invention, a processing core is disclosed. The processing core includes one or more processing pipelines and a number of register files. The processing pipelines have a total of N-number of processing paths, where each of the processing paths processes instructions on M-bit data words. Each of the number of register files has Q-number of registers that are each M-bits wide. The Q-number of registers within each of the plurality of register files are either private or global registers. When a value is written to one of said Q-number of said registers, which is a global register within one of said number of register files, the value is propagated to a corresponding global register in the other of the number of register files. When a value is written to one of said Q-number of the registers, which is a private register within one of said number of register files, the value is not propagated to a corresponding register in the other of said number of register files.
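The write behavior described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the names `RegisterFile`, `Core`, `write`, and `read` are hypothetical, a single private/global flag per register index is assumed to apply to every register file, and Q = M = 64 is borrowed from the dependent claims.

```python
from dataclasses import dataclass, field

Q = 64                 # registers per file (value from the dependent claims)
M = 64                 # register width in bits
MASK = (1 << M) - 1    # keeps stored values within M bits


@dataclass
class RegisterFile:
    """One register file holding Q registers of M bits each."""
    regs: list = field(default_factory=lambda: [0] * Q)


class Core:
    """Hypothetical model: several register files plus one mode flag per index."""

    def __init__(self, num_files: int):
        self.files = [RegisterFile() for _ in range(num_files)]
        # Assumption: False means private, True means global.
        self.is_global = [False] * Q

    def write(self, file_idx: int, reg: int, value: int) -> None:
        value &= MASK
        if self.is_global[reg]:
            # Global register: the value reaches the same index in every file.
            for rf in self.files:
                rf.regs[reg] = value
        else:
            # Private register: only the addressed file is updated.
            self.files[file_idx].regs[reg] = value

    def read(self, file_idx: int, reg: int) -> int:
        return self.files[file_idx].regs[reg]


core = Core(num_files=2)
core.is_global[3] = True      # register 3 is global, register 5 stays private
core.write(0, 3, 0xABCD)      # visible through both register files
core.write(0, 5, 7)           # visible only through register file 0
```

In this model a read of register 3 returns the same value from either file, while register 5 differs between files, which mirrors the private/global distinction in the abstract.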
Representative claims
What is claimed is:
1. A processing core comprising: one or more processing pipelines having a total of N-number of processing paths, each of said processing paths for processing instructions on M-bit data words; and a plurality of register files, each having Q-number of registers, said Q-number of registers being M-bits wide; wherein said Q-number of registers within each of said plurality of register files are both private and global registers, and wherein when a value is written to one of said Q-number of said registers which is a global register within one of said plurality of register files, said value is propagated to a corresponding global register in the other of said plurality of register files, and wherein when a value is written to one of said Q-number of said registers which is a private register within one of said plurality of register files, said value is not propagated to a corresponding register in the other of said plurality of register files, wherein each of said Q-number of registers is bi-modal to programmably operate in both private and global modes.
2. The processing core as recited in claim 1, wherein for even values of N that are greater than one, every two of said N-number of processing paths share one of said plurality of register files.
3. The processing core as recited in claim 1, wherein a processing instruction comprises N-number of P-bit instructions appended together to form a very long instruction word (VLIW), and said N-number of processing paths process N-number of P-bit instructions in parallel.
4. The processor chip as recited in claim 3, wherein M=64, Q=64, and P=32.
5. The processing core as recited in claim 1, wherein said processing pipeline comprises an execute stage which includes an execute unit for each of said N-number of M-bit processing paths, each of said execute units comprising an integer processing unit, a load/store processing unit, a floating point processing unit, or any combination of one or more of said integer processing units, said load/store processing units, and said floating point processing units.
6. The processing core as recited in claim 5, wherein an integer processing unit and a floating point processing unit share one of said plurality of register files.
7. The processing core as recited in claim 1, wherein Q=64, and a 64-bit special register stores bits indicating whether a register in a register file is a private register or a global register, each bit in the 64-bit special register corresponding to one of said registers in said register file.
8. The processing core as recited in claim 1, wherein each of said plurality of register files is connected to a bus, and a value written to a global register in one of said plurality of register files is propagated to a corresponding global register in the other of said plurality of register files across said bus.
9. The processing core as recited in claim 1, wherein said plurality of register files are connected together in serial, and a value written to a first global register in a first of said plurality of register files is propagated to a corresponding first global register in a second of said plurality of register files connected directly to said first of said plurality of register files.
10. A VLIW processing core comprising: one or more processing pipelines each including a fetch stage, a decode stage, an execute stage, and a write-back stage, said execute stage having an execute unit comprising an integer processing unit, a load/store processing unit, a floating point processing unit, or any combination of one or more of said integer processing units, said load/store processing units, or said floating point processing units; and a register file for each of said one or more processing pipelines; wherein: an integer processing unit and a floating point processing unit within said one or more processing pipelines both access said register file, the register file is comprised of Q-number of registers, said Q-number of registers comprise both private and global registers, whereby each of said Q-number of registers is dynamically configurable to operate in both private and global modes, when a value is written to a one of said Q-number of said registers that is a configured to global register mode within one of said plurality of register files, said value is propagated to a corresponding global register in another register file within the VLIW processing core, and when a value is written to the one of said Q-number of said registers that is configured to private register mode within one of said plurality of register files, said value is not propagated to a corresponding register in another register file within the VLIW processing core.
11. In a computer system, a scalable computer processing architecture, comprising: one or more processor chips, each comprising: a processing core, including: a processing pipeline having N-number of processing paths, each of said processing paths for processing instructions on M-bit data words; and a plurality of register files, each having Q-number of registers, said Q-number of registers being M-bits wide; an I/O link configured to communicate with other of said one or more processor chips, if more than one, or with I/O devices; a communication controller in electrical communication with said processing core and said I/O link; said communication controller for controlling the exchange of data between a first one of said one or more processor chips and said other of said one or more processor chips; wherein: said computer processing architecture can be scaled larger by connecting together two or more of said processor chips in parallel via said I/O links of said processor chips, so as to create multiple processing core pipelines which share data therebetween, said Q-number of registers within each of said plurality of register files comprise both private and global registers, whereby each of said Q-number of registers is bi-modal to switch between private and global modes, when a value is written to a one of said Q-number of said registers which is switched to global register mode within one of said plurality of register files, said value is propagated to a corresponding global register in the other of said plurality of register files, and when a value is written to the one of said Q-number of said registers which is switched to private register mode within one of said plurality of register files, said value is not propagated to a corresponding register in the other of said plurality of register files.
12. The computer processing architecture as recited in claim 11, wherein in said processing core of each of said processor chips, for even values of N that are greater than one, every two of said N-number of processing paths share one of said plurality of register files.
13. The computer processing architecture as recited in claim 11, wherein a processing instruction comprises N-number of P-bit instructions appended together to form a very long instruction word (VLIW), and said N-number of processing paths process N-number of P-bit instructions in parallel.
14. The computer processing architecture as recited in claim 13, wherein M=64, Q=64, and P=32.
15. The computer processing architecture as recited in claim 11, wherein said processing pipeline comprises an execute stage which includes an execute unit for each of said N-number of M-bit processing paths, each of said execute units comprising an integer processing unit, a load/store processing unit, a floating point processing unit, or any combination of one or more of said integer processing units, said load/store processing units, and said floating point processing units.
16. The computer processing architecture as recited in claim 15, wherein an integer processing unit and a floating point processing unit share one of said plurality of register files.
17. The computer processing architecture as recited in claim 11, wherein Q=64, and a 64-bit special register stores bits indicating whether a register in a register file is a private register or a global register, each bit in the 64-bit special register corresponding to one of said registers in said register file.
18. The computer processing architecture as recited in claim 11, wherein each of said plurality of register files is connected to a bus, and a value written to a global register in one of said plurality of register files is propagated to a corresponding global register in the other of said plurality of register files across said bus.
19. The computer processing architecture as recited in claim 18, wherein said plurality of register files are connected together in serial, and a value written to a first global register in a first of said plurality of register files is propagated to a corresponding first global register in a second of said plurality of register files connected directly to said first of said plurality of register files.
20. The processing core as recited in claim 1, wherein said Q-number of registers within each of said plurality of register files can switch between being either private or global registers.
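Two of the dependent claims lend themselves to a short illustration: the VLIW word built from N appended P-bit instructions (claims 3 and 4, with P = 32) and the 64-bit special register that marks each register as private or global (claim 7). The following is a minimal sketch under assumptions the claims do not state: the function names are hypothetical, the first sub-instruction is placed in the most significant bits, and a set bit is taken to mean "global".

```python
P = 32   # bits per sub-instruction (claim 4)
Q = 64   # registers per file, hence bits in the special register (claim 7)


def pack_vliw(instructions):
    """Append P-bit instructions into one long word.

    Assumption: the first instruction lands in the most significant bits.
    """
    word = 0
    for ins in instructions:
        if not 0 <= ins < (1 << P):
            raise ValueError("each sub-instruction must fit in P bits")
        word = (word << P) | ins
    return word


def unpack_vliw(word, n):
    """Split a packed word back into its n P-bit sub-instructions."""
    mask = (1 << P) - 1
    return [(word >> (P * (n - 1 - i))) & mask for i in range(n)]


def set_mode(mode_reg, reg, make_global):
    """Return the special register with the bit for `reg` set or cleared.

    Assumption: bit value 1 marks a global register, 0 a private one.
    """
    if not 0 <= reg < Q:
        raise ValueError("register index out of range")
    bit = 1 << reg
    return (mode_reg | bit) if make_global else (mode_reg & ~bit)


def is_global(mode_reg, reg):
    """Report whether the special register marks `reg` as global."""
    return (mode_reg >> reg) & 1 == 1


vliw = pack_vliw([0x11111111, 0x22222222, 0x33333333, 0x44444444])  # N = 4
mode = set_mode(0, 63, True)   # mark register 63 as global
```

With N = 4 and P = 32 the packed word is 128 bits wide, and `unpack_vliw(vliw, 4)` recovers the four sub-instructions that the N processing paths would handle in parallel.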
Patents cited by this patent (35)
Kumar Rajendra (Sunnyvale CA) Emerson Paul G. (San Jose CA), Cache memory system having secondary cache integrated with primary cache for use with VLSI circuits.
Jouppi Norman P. (Palo Alto CA) Eustace Alan (Palo Alto CA), Data processing system and method with small fully-associative cache and prefetch buffers.
Cook Peter W. (Mount Kisco NY), IC chips including ALUs and identical register files whereby a number of ALUs directly and concurrently write results to.
Cushing David E. (Chelmsford MA) Kelly Richard P. (Nashua NH) Ledoux Robert V. (Litchfield NH) Shen Jian-Kuo (Belmont MA), Mechanism for automatically updating multiple unit register file memories in successive cycles for a pipelined processin.
Engdahl Jonathan R. (Chardon OH) Gee David J. (Ann Arbor MI) Lucak Mark A. (Hudson OH) Adams Shawn L. (Rocky River OH), Method and apparatus for exchanging different classes of data during different time intervals.
Boggs Darrell D. (Aloha OR) Colwell Robert P. (Portland OR) Fetterman Michael A. (Hillsboro OR) Glew Andrew F. (Hillsboro OR) Gupta Ashwani K. (Beaverton OR) Hinton Glenn J. (Portland OR) Papworth Da, Method and apparatus for maintaining a macro instruction for refetching in a pipelined processor.
Thomas L. Drabenstott ; Gerald G. Pechanek ; Edwin F. Barry ; Charles W. Kurak, Jr., Methods and apparatus to support conditional execution in a VLIW-based array processor with subword execution.
Levy Henry M. ; Eggers Susan J. ; Lo Jack ; Tullsen Dean M., Shared register storage mechanisms for multithreaded computer systems with out-of-order execution.
Rim Min-Joong,KRX, System for fetching unit instructions and multi instructions from memories of different bit widths and converting unit instructions to multi instructions by adding NOP instructions.
Masubuchi Yoshio (Kawasaki JPX), Very large instruction word type computer for performing a data transfer between register files through a signal line pa.