Energy-focused re-compilation of executables and hardware mechanisms based on compiler-architecture interaction and compiler-inserted control
IPC classification information
Country / Type
United States (US) Patent
Granted
International Patent Classification (IPC, 7th edition)
G06F-009/45
G06F-009/30
Application number
US-0212737
(2014-03-14)
Registration number
US-9569186
(2017-02-14)
Inventor / Address
Chheda, Saurabh
Carver, Kristopher
Ashok, Raksit
Applicant / Address
III HOLDINGS 2, LLC
Agent / Address
Schwabe, Williamson & Wyatt
Citation information
Times cited: 0
Cited patents: 226
Abstract
A method comprising analyzing and transforming a program executable at compile time such that a processor design objective is optimized. A method including analyzing an executable to estimate energy consumption of an application component in a processor. A method including transforming an executable to reduce energy consumption in a processor. A processor framework controlled by compiler-inserted control that statically exposes parallelism in an instruction sequence. A processor framework to reduce energy consumption in an instruction memory system with compiler-inserted control.
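The idea of transforming an executable so that compiler-inserted control reduces energy use can be illustrated with a minimal sketch. It is not the patented method: the toy instruction format `(opcode, unit)`, the `SLEEP`/`WAKE` pseudo-instructions, and the `min_idle` threshold are all assumptions introduced here for illustration. The sketch statically finds runs of instructions that leave one hardware unit unused and brackets each sufficiently long run with control instructions.

```python
from typing import List, Tuple

# Hypothetical toy format: (opcode, hardware unit the instruction uses).
Instr = Tuple[str, str]


def insert_power_control(instrs: List[Instr], unit: str, min_idle: int) -> List[Instr]:
    """Bracket idle runs of `unit` with hypothetical SLEEP/WAKE instructions.

    A run is a maximal sequence of consecutive instructions that do not use
    `unit`. Only runs of at least `min_idle` instructions are bracketed.
    """
    out: List[Instr] = []
    i, n = 0, len(instrs)
    while i < n:
        if instrs[i][1] == unit:
            out.append(instrs[i])
            i += 1
            continue
        # Scan forward to the end of the run that leaves `unit` idle.
        j = i
        while j < n and instrs[j][1] != unit:
            j += 1
        run = instrs[i:j]
        if len(run) >= min_idle:
            out.append(("SLEEP", unit))
            out.extend(run)
            if j < n:  # wake the unit only if it is used again
                out.append(("WAKE", unit))
        else:
            out.extend(run)
        i = j
    return out


program = [
    ("add", "alu"), ("mul", "mul"),
    ("add", "alu"), ("add", "alu"), ("add", "alu"),
    ("mul", "mul"),
]
transformed = insert_power_control(program, unit="mul", min_idle=3)
```

In this example the multiplier is idle for three consecutive instructions, so the transformed sequence carries a `SLEEP` before that run and a `WAKE` before the next multiply. A real tool would also have to account for control flow and wake-up latency, which this straight-line sketch ignores.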
Representative claim
1. A method, comprising: inputting source files into a source-level compiler to obtain a binary executable; analyzing a representation of the binary executable to identify information relating to micro-operations of components of a microprocessor; responsive to analyzing the representation of the binary executable to identify the information relating to the micro-operations of the components of the microprocessor, generating control information comprising instructions that contain data that renders unnecessary at least one of the micro-operations; transforming the representation of the binary executable obtained from the source-level compiler into a binary executable that is different than the binary executable obtained from the source-level compiler, the transformation including combining the generated control information with at least one of instructions of the representation of the binary executable; and predicting inactive periods of resources in the microprocessor using static information produced through a compilation of a computer program; wherein the control information is configured to: control the resources during the inactive periods so as to reduce energy consumption of the resources during the inactive periods; reduce an amount of voltage supplied to the resources during the inactive periods relative to an amount of voltage supplied to the resources during active periods of the resources; and precharge a cache of the resources to reduce leakage in bitlines associated with the cache.
2. The method of claim 1, wherein transforming the representation of the binary executable comprises removing a branched instruction from the instructions of the representation of the binary executable.
3. The method of claim 1, further comprising: generating the representation of the binary executable by replacing a first instruction sequence corresponding to the binary executable obtained from the source-level compiler with a second instruction sequence that is different than the first instruction sequence.
4. The method of claim 3, wherein the representation of the binary executable comprises program blocks, wherein analyzing the representation of the binary executable comprises identifying a subset of the program blocks associated with a specific criticality, and the method further comprises: generating the second instruction sequence responsive to identifying the subset of the program blocks associated with the specific criticality.
5. The method of claim 1, wherein transforming the representation of the binary executable further comprises rearranging the instructions of the representation of the binary executable.
6. The method of claim 5, wherein rearranging the instructions of the representation of the binary executable obtained from the source-level compiler further comprises reordering the instructions of the representation of the binary executable obtained from the source-level compiler to increase an amount of parallel instructions.
7. The method of claim 1, wherein analyzing the representation of the binary executable comprises identifying instructions that may be executed in parallel, and wherein the control information identifies instructions in the binary executable obtained responsive to the transformation that may be executed in parallel.
8. The method of claim 7, wherein the control information comprises a control bit preceding the instructions in the binary executable obtained responsive to the transformation that may be executed in parallel.
9. The method of claim 7, wherein identifying instructions that may be executed in parallel further comprises identifying parallel instructions in blocks of instructions associated with the binary executable obtained from the source-level compiler.
10. The method of claim 7, wherein identifying instructions that may be executed in parallel further comprises exposing parallel instructions in critical loops in the binary executable obtained from the source-level compiler.
11. The method of claim 7, wherein the control information comprises data to be removed from an instruction sequence of the binary executable obtained responsive to the transformation before a corresponding portion of the instruction sequence enters a pipeline of the microprocessor.
12. The method of claim 7, wherein the control information is usable during executing of the binary executable obtained responsive to the transformation to improve operation of the microprocessor relative to execution of the binary executable obtained from the source-level compiler.
13. The method of claim 12, wherein the control information is usable to reduce energy consumption of the microprocessor during execution of the binary executable obtained responsive to the transformation.
14. A memory device having instructions stored thereon that, in response to execution by a processing device, cause the processing device to perform operations comprising: analyzing a representation of a first binary executable to identify information relating to micro-operations of components of a microprocessor; responsive to analyzing the representation of the first binary executable to identify the information relating to the micro-operations of the components of the microprocessor, generating control information comprising instructions that contain data that renders unnecessary at least one of the micro-operations; transforming the representation of the first binary executable into a second binary executable that is different than the first binary executable, the transformation including combining the generated control information with at least one of instructions of the representation of the first binary executable; and predicting inactive periods of resources in the microprocessor using static information produced through a compilation of a computer program; wherein the control information is configured to: control the resources during the inactive periods so as to reduce energy consumption of the resources during the inactive periods; reduce an amount of voltage supplied to the resources during the inactive periods relative to an amount of voltage supplied to the resources during active periods of the resources; and precharge a cache of the resources to reduce leakage in bitlines associated with the cache.
15. The memory device of claim 14, wherein transforming the representation of the first binary executable further comprises rearranging the instructions of the representation of the first binary executable.
16. The memory device of claim 14, wherein transforming the representation of the first binary executable comprises removing a branched instruction from the instructions of the representation of the first binary executable.
17. The memory device of claim 14, wherein the cache comprises first and second memory structures, the first memory structure being smaller than the second memory structure; and wherein precharging is to be performed on the second memory structure but not on the first memory structure.
18. The memory device of claim 14, wherein the cache comprises first and second memory structures, the first memory structure being smaller than the second memory structure; wherein the static information identifies instructions in the computer program, the instructions in the computer program having instruction footprints that can be accommodated by the first memory structure; wherein the first memory structure is used for the instructions in the computer program; and wherein the control information is configured to reduce voltage to the second memory structure during execution of the instructions in the computer program.
19. The memory device of claim 14, wherein the operations further comprise generating the representation of the first binary executable by replacing a first instruction sequence corresponding to the first binary executable with a second instruction sequence that is different than the first instruction sequence.
20. The memory device of claim 19, wherein the representation of the first binary executable comprises program blocks, wherein analyzing the representation of the first binary executable comprises identifying a subset of the program blocks associated with a specific criticality, and the operations further comprise: generating the second instruction sequence responsive to identifying the subset of the program blocks associated with the specific criticality.
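Claims 7 and 8 describe control information that identifies instructions which may be executed in parallel, such as a control bit preceding those instructions. The following is a minimal sketch of one way such a bit could be computed statically, not the claimed implementation. The `Instr` record, the register names, and the rule "the bit is 1 when an instruction has no register dependence on its immediate predecessor" are assumptions made for illustration.

```python
from typing import List, NamedTuple, Tuple


class Instr(NamedTuple):
    """Hypothetical three-address instruction: one destination, some sources."""
    op: str
    dest: str
    srcs: Tuple[str, ...]


def independent(a: Instr, b: Instr) -> bool:
    """True if b has no register dependence on the earlier instruction a."""
    raw = a.dest in b.srcs   # b reads what a writes
    waw = a.dest == b.dest   # both write the same register
    war = b.dest in a.srcs   # b overwrites a register that a reads
    return not (raw or waw or war)


def mark_parallel(instrs: List[Instr]) -> List[int]:
    """Return one control bit per instruction.

    Bit i is 1 when instruction i could issue together with instruction
    i - 1 under the register-dependence test above, otherwise 0.
    """
    bits = []
    for i, ins in enumerate(instrs):
        bits.append(1 if i > 0 and independent(instrs[i - 1], ins) else 0)
    return bits


program = [
    Instr("add", "r1", ("r2", "r3")),
    Instr("sub", "r4", ("r5", "r6")),  # independent of the add
    Instr("mul", "r7", ("r1", "r4")),  # reads r4 written by the sub
    Instr("add", "r1", ("r8", "r9")),  # overwrites r1 read by the mul
]
bits = mark_parallel(program)
```

Only the `sub` receives a set bit here, because it neither reads nor writes a register touched by the preceding `add`. The sketch considers register dependences between adjacent instructions only; memory dependences and wider issue groups are outside its scope.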