Computer for Amdahl-compliant algorithms like matrix inversion
IPC classification information
Country / Type
United States (US) Patent
Granted
International Patent Classification (IPC, 7th edition)
G06F-007/00
G06F-017/16
Application number
US-0500103
(2010-10-07)
Registration number
US-8892620
(2014-11-18)
International application number
PCT/US2010/051876
(2010-10-07)
§371/§102 date
2012-04-04
International publication number
WO2011/044398
(2011-04-14)
Inventor / Address
Jennings, Earle
Landers, George
Applicant / Address
QSigma, Inc.
Agent / Address
Jennings, Earle
Citation information
Times cited: 1
Patents cited: 33
Abstract
A family of computers is disclosed and claimed that supports simultaneous processes from the single core up to multi-chip Program Execution Systems (PES). The instruction processing of the instructed resources is local, dispensing with the need for large VLIW memories. The cores through the PES have maximum performance for Amdahl-compliant algorithms like matrix inversion, because the multiplications do not stall and the other circuitry keeps up. Cores with log based multiplication generators improve this performance by a factor of two for sine and cosine calculations in single precision floating point and have even greater performance for log_e and e^x calculations. Apparatus specifying, simulating, and/or layouts of the computer (components) are disclosed. Apparatus the computer and/or its components are disclosed.
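For readers unfamiliar with the term, the abstract's "Amdahl-compliant" wording refers to algorithms that, on conventional computers, split into a parallel part and a sequential part. The standard Amdahl's law bound for such a split can be sketched as follows. This is general background, not taken from the patent; the function name `amdahl_speedup` and its parameters are hypothetical.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup under Amdahl's law.

    parallel_fraction: share of the run time that can be parallelized (0..1).
    workers: number of parallel execution units (>= 1).
    Both names are illustrative assumptions, not terms from the patent.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    sequential_fraction = 1.0 - parallel_fraction
    # The sequential part is not sped up; the parallel part is divided by workers.
    return 1.0 / (sequential_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    for n in (1, 2, 8, 64):
        print(n, round(amdahl_speedup(0.9, n), 3))
```

The sequential fraction caps the achievable speedup no matter how many workers are added, which is the limitation the abstract's wording alludes to.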
Representative claims
1. A computer, comprising: at least one multiplication generator configured to create a multiplication; and at least one other circuit configured to respond to said multiplication, with said computer configured to implement an Amdahl-compliant algorithm and stall said multiplication less than NMult percent with said other circuit keeping up with said multiplication, with said NMult less than ten, and with said Amdahl-compliant algorithm configured on conventional computers to include a parallel part and a sequential part.
2. The computer of claim 1, further comprising a core comprising a Simultaneous Process state Calculator (SPC) configured to generate at least two process indexes; and at least two instructed resources, each simultaneously instructed by a local instruction generated based upon one of said process indexes, with said multiplication generator as one of said instructed resources.
3. The computer of claim 2, further comprising at least one condition code based upon at least one arithmetic operation creating a multi-way condition to direct said SPC to alter at least one of said process indexes across greater than three index values.
4. The computer of claim 3, further comprising a range clamp configured to receive an input number to create a range limited input and a range determination as said condition code.
5. The computer of claim 2, further comprising at least two instruction zone indications configured to be received by an instruction zone selector to create a selected instruction zone presented to said SPC to direct generation of said process indexes as part of a Program Execution Unit (PEU).
6. The computer of claim 5, further comprising a task indication (Task ID) configured to be received by said instruction zone selector to create said selected instruction zone as said part of said PEU for a task indicated by said Task ID.
7. The computer of claim 2, further comprising at least one instance of a data memory capable of providing at least one input to at least one of said instructed resources, said multiplication generator and/or said other circuit.
8. The computer of claim 1, further comprising at least one comparator configured to receive at least two operand packages to create a resultant operand package based upon the status of an arithmetic result generated within the comparator, with each of the operand packages including at least one data configured for use as an operand to create said arithmetic result and an index list containing at least one index.
9. The computer of claim 8, comprising an instructed resource configured to respond to a process index contained in said index list in at least one of said operand packages and/or said resultant operand package.
10. The computer of claim 9, comprising another of said instructed resource configured to create said process index in said index list.
11. The computer of claim 1, further comprising at least one queue configured to provide data availability stimulus through its queue status, with said data availability contributing at least part of an output of one of a feed forward, an internal feedback, an external feedback, an input portal and another instructed resource.
12. The computer of claim 11, wherein said internal feedback is within said core; wherein said external feedback is between a first instance of said core and a second instance of said core through a landing bidirectionally communicating with an feedback input portal in each of said instances of said cores and a feedback output portal containing said queue in each of said instances of said cores; and wherein at least one of said feed forward, said internal feedback, said external feedback and said another instructed resource includes a second of said queues also configured to provide said data availability stimulus from a second of said queue status.
13. The computer of claim 12, wherein said internal feedback includes a first feedback input and a second feedback input, each configured to provide data to a separate of said queues; and wherein said external feedback forms a bidirectional binary tree with instances of said landings coupling upward to another instance of said landing to create a sequential feedback network; wherein said computer further comprises a communication network between said cores using at least one version of said landings configured to receive from an output portal of said core and configured to present to said input portal to affect said queue.
14. The computer of claim 13, further comprising another of said instructed resource coupled with at least one of a local instruction processor and a sub-process index generator, with said local instruction processor configured to at least partly respond to said data availability stimulus to generate a local instruction used to operate said instructed resource, and with said sub-process index generator configured to respond to said data availability stimulus to create a sub-process index configured for use by one of said instructed resources to at least partly generate another local instruction for said one of said instructed resources; and wherein said sequential feedback network has a fixed latency and continuous throughput.
15. An apparatus implementing at least part of said computer of claim 1 with at least one of a disk drive, a download package, and a computer readable memory, with said apparatus containing at least one of a specification, a simulation, a product of simulating, a netlist, and a layout component; wherein said Amdahl compliant algorithm includes a version of matrix inversion; wherein said comparator includes at least one member of the group consisting of a comparator, a Floating Point (FP) comparator and a comparative FP adder; wherein said multiplication generator includes one of a multiplier and a log-domain-circuit including an exponential calculator configured to receive a log-domain-result to create said multiplication.
16. An apparatus including at least part of said computer of claim 1, wherein said apparatus implements at least one of a disk drive, a handheld device, a wearable device, a cellular phone, a Digital Signal Processor (DSP), a numeric processor, a graphics accelerator, a base station, an access point, a micro-processor and a server.
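Claim 15 mentions a multiplication generator built from a log-domain circuit whose exponential calculator turns a log-domain result into the product. The arithmetic idea behind that (a product becomes a sum of logarithms followed by an exponential) can be sketched in software as below. This is only an illustration of the general identity, not the patented circuit; the function name `log_domain_multiply`, the base-2 choice, and the sign and zero handling are all assumptions.

```python
import math


def log_domain_multiply(a: float, b: float) -> float:
    """Multiply two finite floats via log2(|a|) + log2(|b|), then exponentiate.

    Hypothetical illustration only: base 2 and the explicit sign/zero
    handling are assumptions, not details taken from the claims.
    """
    # log2 is undefined at zero, so handle a zero operand directly.
    if a == 0.0 or b == 0.0:
        return 0.0
    # The product is negative exactly when the operand signs differ.
    sign = -1.0 if (a < 0.0) != (b < 0.0) else 1.0
    # Log-domain result: addition replaces multiplication.
    log_domain_result = math.log2(abs(a)) + math.log2(abs(b))
    # Exponential step converts the log-domain result back to a product.
    return sign * (2.0 ** log_domain_result)


if __name__ == "__main__":
    print(log_domain_multiply(3.0, 4.0))
    print(log_domain_multiply(-2.0, 8.0))
```

Results agree with ordinary multiplication only up to floating-point rounding, so comparisons against `a * b` should use a tolerance.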
Patents cited by this patent (33)
Langhammer, Martin; Nguyen, Triet M.; Lin, Yi-Wen, Adder-rounder circuitry for specialized processing block in programmable logic device.
Lloyd Scott Edward (Mesa AZ) Pan Shao Wei (Schaumburg IL) Wang Shay-Ping Thomas (Long Grove IL), Computer processor having a pipelined architecture which utilizes feedback and method of using same.
Guttag Karl M. (Missouri City TX) Simpson Richard (Bedford GB2) Walsh Brendan (Bedford GB2), Three input arithmetic logic unit forming mixed arithmetic and boolean combinations.
Guttag Karl M. ; Balmer Keith,GBX ; Gove Robert J. ; Read Christopher J. ; Golston Jeremiah E. ; Poland Sydney W. ; Ing-Simmons Nicholas,GBX ; Moyse Phillip,GBX, Three input arithmetic logic unit with barrel rotator and mask generator.
Guttag Karl M. ; Balmer Keith,GBX ; Gove Robert J. ; Read Christopher J. ; Golston Jeremiah E. ; Poland Sydney W. ; Ing-Simmons Nicholas,GBX ; Moyse Philip,GBX, Three input arithmetic logic unit with shifter and/or mask generator.