Embodiments for programming a graphics pipeline, and modules within the graphics pipeline, are detailed herein. Several of these embodiments utilize offset registers associated with the instruction tables for the modules within the pipeline. The offset register serves as a pointer to locations in the instruction table, which allows instructions to be written to the instruction table without requiring that the shader programs have explicit addresses. One embodiment describes a method of programming a graphics pipeline. This method involves accessing a shader program stored in memory. A shader instruction is generated from this shader program and loaded into an instruction table associated with a target module in the graphics pipeline. The shader instruction is loaded into the instruction table at the location indicated by an offset register.
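The role of the offset register can be illustrated with a minimal sketch. It is not taken from the patent: the names `PipelineModule`, `load_instruction`, and `load_program` are hypothetical, and advancing the offset register by one slot per instruction is an assumption about how "an available position" is tracked.

```python
class PipelineModule:
    """Hypothetical model of one pipeline module (names are illustrative only)."""

    def __init__(self, name, table_size):
        self.name = name
        self.instruction_table = [None] * table_size
        # Offset register: points at the slot the next instruction goes into.
        self.offset_register = 0

    def load_instruction(self, instruction):
        """Write one instruction at the slot named by the offset register."""
        if self.offset_register >= len(self.instruction_table):
            raise IndexError(f"instruction table of {self.name} is full")
        slot = self.offset_register
        self.instruction_table[slot] = instruction
        # Assumption: the register advances to the next free slot after a load.
        self.offset_register += 1
        return slot


def load_program(module, program):
    """Load a shader program that carries no table addresses of its own."""
    return [module.load_instruction(instruction) for instruction in program]


alu = PipelineModule("alu", table_size=4)
first_slots = load_program(alu, ["MUL", "ADD"])   # first shader program
second_slots = load_program(alu, ["MOV"])         # second program lands after it
```

Neither program names a table address; the module's own offset register decides where each instruction lands, so the second program is appended after the first.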
Representative Claims
1. A method of programming a graphics pipeline, said method comprising: accessing a shader program stored in a memory; generating a shader instruction from the shader program; and loading the shader instruction into an instruction table associated with a target module in the graphics pipeline at a location indicated by an offset register, wherein the loading is responsive to a program sequencer sending the shader instruction, and wherein the target module comprises the offset register.
2. The method of claim 1, further comprising: configuring the program sequencer to control the graphics pipeline.
3. The method of claim 2, wherein said configuring the program sequencer comprises loading a plurality of command instructions into a command table associated with the program sequencer.
4. The method of claim 2, wherein the program sequencer is operable to receive a series of command instructions from a GPU driver.
5. The method of claim 1, wherein said accessing the shader program comprises performing a direct memory access (DMA) transfer of an instruction block associated with the shader program.
6. The method of claim 1, wherein said generating the shader instruction comprises generating a register packet.
7. The method of claim 6, wherein the register packet comprises an address field and a data field, and wherein the address field indicates which of a plurality of modules in the graphics pipeline for which the register packet is intended.
8. The method of claim 7, wherein said loading the shader instruction comprises: identifying the register packet as intended for the target module; extracting the shader instruction from the data field of the register packet; and inserting the shader instruction into the instruction table at a position indicated by the offset register.
9. The method of claim 1, wherein the shader program stored in memory does not include an explicit instruction table offset.
10. The method of claim 1, further comprising: updating the offset register to indicate an available position in the instruction table.
11. A graphics processing unit (GPU) for loading a shader program, said GPU comprising: an integrated circuit die comprising a plurality of stages of the GPU; a memory interface for interfacing with a graphics memory; and a host interface for interfacing with a computer system, and wherein the plurality of stages comprises a graphics pipeline configured to: access a shader program stored in the graphics memory; generate a shader instruction from the shader program; and load the shader instruction into an instruction table associated with one of the plurality of stages of the graphics pipeline at a position indicated by an offset register, wherein the shader instruction is loaded responsive to a program sequencer sending the shader instruction, and wherein the one of the plurality of stages of the graphics pipeline comprises the offset register.
12. The GPU of claim 11, wherein the program sequencer is configured to control the graphics pipeline.
13. The GPU of claim 12, wherein the program sequencer is configured by loading a plurality of command instructions into a command table associated with the program sequencer.
14. The GPU of claim 11, wherein said accessing the shader program comprises performing a direct memory access (DMA) transfer of an instruction block associated with the shader program.
15. The GPU of claim 11, wherein said generating the shader instruction comprises generating a register packet, having an address field and a data field, and wherein the address field indicates which of the plurality of stages in the graphics pipeline for which the register packet is intended.
16. The GPU of claim 15, wherein the graphics pipeline is configured to load the shader instruction by: identifying the register packet as intended for the target module; extracting the shader instruction from the data field of the register packet; and inserting the shader instruction into the instruction table at a position indicated by the offset register.
17. A handheld computer system device, comprising: a system memory; a central processing unit (CPU) coupled to the system memory; and a graphics processing unit (GPU) communicatively coupled to the CPU, wherein the GPU comprises a graphics pipeline for executing a shader program, and wherein the graphics pipeline is configured to: access a shader program stored in the system memory; generate a shader instruction from the shader program; and load the shader instruction into an instruction table associated with one of a plurality of stages of the graphics pipeline at a position indicated by an offset register, wherein the shader instruction is loaded responsive to a program sequencer sending the shader instruction, and wherein the one of the plurality of stages of the graphics pipeline comprises the offset register.
18. The handheld computer system device of claim 17, wherein the shader program stored in the system memory does not include an explicit instruction table offset.
19. The handheld computer system device of claim 17, wherein the graphics pipeline is further configured to update the offset register to indicate a next available position in the instruction table.
20. The handheld computer system device of claim 19, wherein the graphics pipeline is configured to: generate a second shader instruction from the shader program; and load a second shader instruction into the instruction table at the next available position indicated by the offset register.
21. The handheld computer system device of claim 19, wherein the graphics pipeline is configured to: access a second shader program stored in memory; generate a second shader instruction from the second shader program; and load the second shader instruction into the instruction table at the next available position indicated by the offset register.
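The register-packet flow recited in claims 6 to 8 and 15 to 16 can be sketched as follows. This is an illustrative model only, not the claimed implementation: `RegisterPacket`, `PipelineStage`, `dispatch`, the numeric stage addresses, and the choice to offer every packet to every stage are all assumptions made for the example.

```python
class RegisterPacket:
    """Hypothetical packet: an address field plus a data field."""

    def __init__(self, address, data):
        self.address = address  # which stage the packet is intended for
        self.data = data        # the shader instruction being carried


class PipelineStage:
    """Hypothetical stage that owns an instruction table and an offset register."""

    def __init__(self, address, table_size):
        self.address = address
        self.instruction_table = [None] * table_size
        self.offset_register = 0

    def receive(self, packet):
        # Step 1: identify the packet as intended for this stage.
        if packet.address != self.address:
            return False
        if self.offset_register >= len(self.instruction_table):
            raise IndexError("instruction table is full")
        # Step 2: extract the shader instruction from the data field.
        instruction = packet.data
        # Step 3: insert it at the position indicated by the offset register.
        self.instruction_table[self.offset_register] = instruction
        # Assumption: the register then moves to the next available position.
        self.offset_register += 1
        return True


def dispatch(stages, packets):
    """Stand-in for the program sequencer: offer each packet to every stage."""
    for packet in packets:
        for stage in stages:
            stage.receive(packet)


setup = PipelineStage(address=0x10, table_size=2)
shader = PipelineStage(address=0x20, table_size=2)
dispatch([setup, shader], [
    RegisterPacket(0x20, 0xA),
    RegisterPacket(0x10, 0xB),
    RegisterPacket(0x20, 0xC),
])
```

In this model the packets carry only a stage address, never a table position, which mirrors the claims' point that the shader program need not include an explicit instruction table offset.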