A graphics system includes a graphics processor and a cache memory system. The graphics processor includes processing units that perform various graphics operations to render graphics images. The cache memory system may include fully configurable caches, partially configurable caches, or a combination of configurable and dedicated caches. The cache memory system may further include a control unit, a crossbar, and an arbiter. The control unit may determine memory utilization by the processing units and assign the configurable caches to the processing units based on that utilization. The configurable caches may be assigned so that they are well utilized and memory access bottlenecks are avoided. The crossbar couples the processing units to their assigned caches. The arbiter facilitates data exchanges between the caches and a main memory.
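The assignment step performed by the control unit can be illustrated with a minimal sketch. The abstract does not state the assignment policy, so the proportional split below (largest-remainder rounding over per-unit request counts) is an assumption, as are the function name `assign_caches` and the unit names used in the example.

```python
from typing import Dict, List


def assign_caches(requests: Dict[str, int], num_caches: int) -> Dict[str, List[int]]:
    """Split configurable cache indices among processing units.

    `requests` maps a unit name to its observed data-request count
    (a stand-in for "memory utilization"). Each cache index is given
    to exactly one unit, i.e. the assignment is exclusive.
    The proportional policy is an assumption, not the patented method.
    """
    units = list(requests)
    if not units:
        return {}
    total = sum(requests.values())
    if total == 0:
        # No statistics yet: fall back to an even split (assumption).
        shares = {u: num_caches / len(units) for u in units}
    else:
        shares = {u: num_caches * requests[u] / total for u in units}

    # Start from the integer part of each share.
    counts = {u: int(shares[u]) for u in units}
    leftover = num_caches - sum(counts.values())

    # Hand remaining caches to the units with the largest fractional share.
    by_remainder = sorted(units, key=lambda u: shares[u] - counts[u], reverse=True)
    for u in by_remainder[:leftover]:
        counts[u] += 1

    # Turn counts into disjoint ranges of cache indices.
    assignment: Dict[str, List[int]] = {}
    next_cache = 0
    for u in units:
        assignment[u] = list(range(next_cache, next_cache + counts[u]))
        next_cache += counts[u]
    return assignment


if __name__ == "__main__":
    # Hypothetical statistics gathered while rendering the prior image.
    stats = {"depth_test": 300, "texture_mapping": 900}
    print(assign_caches(stats, 4))
```

With these hypothetical counts, the texture mapping engine issues three quarters of the requests and so receives three of the four configurable caches. A real control unit could equally weigh cache hit/miss statistics or apply per-unit minimums.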
Representative claim
1. An apparatus comprising: a plurality of processing units arranged in a pipeline, the plurality of processing units configured to perform graphics operations to render graphics images; a plurality of caches configured to store data for the plurality of processing units; a crossbar configured to couple the plurality of caches to the plurality of processing units; a control unit configured to ascertain memory utilization by the plurality of processing units and to pre-assign one or more of the plurality of caches to a selected processing unit of the plurality of processing units at the beginning of rendering a frame, image, or batch based on the memory utilization statistics, so that the one or more caches is coupled exclusively to the selected processing unit.
2. The apparatus of claim 1, wherein each of the plurality of caches is assignable to any one of a respective subset of the plurality of processing units.
3. The apparatus of claim 1, wherein the plurality of caches comprise one or more dedicated caches exclusively assigned to the processing unit and at least one configurable cache exclusively assignable to any one of the remaining processing units.
4. The apparatus of claim 3, wherein each configurable cache is assignable to any one of a respective subset of the remaining processing units.
5. The apparatus of claim 3, wherein the remaining processing units comprise a depth test engine and a texture mapping engine.
6. The apparatus of claim 1, wherein the control unit is configured to assign the plurality of caches for each graphics image to be rendered based on memory utilization for a prior graphics image.
7. The apparatus of claim 1, wherein the control unit is configured to ascertain memory utilization based on data requests by the processing units, cache hit/miss statistics, or a combination thereof.
8. The apparatus of claim 1, wherein the control unit is configured to detect changes in memory utilization by the plurality of processing units during rendering of an image and to re-assign the plurality of caches based on the detected changes in memory utilization.
9. The apparatus of claim 1, wherein the control unit is configured to exclusively assign the one or more of the plurality of caches to the processing unit based on memory utilization by a graphics application being executed.
10. The apparatus of claim 1, wherein the crossbar comprises a plurality of interface units, each interface unit configured to couple an associated processing unit to a set of caches assigned to the processing unit.
11. The apparatus of claim 10, wherein each interface unit comprises a state machine configured to determine whether data requested by the associated processing unit is stored in any one of the set of caches assigned to the processing unit.
12. The apparatus of claim 11, wherein the state machine for each interface unit receives cache hit/miss indicators from the plurality of caches and a control indicating the set of caches assigned to the associated processing unit.
13. The apparatus of claim 11, wherein the state machine for each interface unit is configured to fill one of the set of caches assigned to the associated processing unit when a cache miss occurs.
14. The apparatus of claim 1, wherein the plurality of caches are arranged in a hierarchical structure with at least two levels of caches.
15. The apparatus of claim 14, wherein at least one level in the hierarchical structure has a configurable number of caches.
16. The apparatus of claim 14, wherein at least one level in the hierarchical structure has configurable cache sizes.
17. The apparatus of claim 1, wherein the plurality of caches are arranged in a configurable number of levels in a hierarchical structure.
18. The apparatus of claim 1, wherein the plurality of caches have configurable cache sizes.
19. The apparatus of claim 1, further comprising: an arbiter coupled to the plurality of caches and configured to facilitate data exchanges between the plurality of caches and a main memory.
20. The apparatus of claim 1, wherein the plurality of processing units comprise a depth test engine and a texture mapping engine.
21. The apparatus of claim 20, wherein the plurality of processing units are arranged in a pipeline, and wherein the depth test engine is located earlier in the pipeline than the texture mapping engine.
22. An integrated circuit comprising: a plurality of processing units arranged in a pipeline, the plurality of processing units configured to perform graphics operations to render graphics images; a plurality of caches configured to store data for the plurality of processing units; a crossbar configured to couple the plurality of caches to the plurality of processing units; and a control unit configured to ascertain memory utilization by the plurality of processing units and to pre-assign one or more of the plurality of caches to a selected processing unit of the plurality of processing units at the beginning of rendering a frame, image, or batch based on the memory utilization statistics, so that the one or more caches is coupled exclusively to the selected processing unit.
23. A wireless device comprising: a graphics processor comprising a plurality of processing units arranged in a pipeline, the plurality of processing units configured to perform graphics operations to render graphics images; and a cache memory system comprising a plurality of caches configured to store data for the plurality of processing units, and a crossbar configured to couple the plurality of caches to the plurality of processing units; and a control unit configured to ascertain memory utilization by the plurality of processing units and to pre-assign one or more of the plurality of caches to a selected processing unit of the plurality of processing units at the beginning of rendering a frame, image, or batch based on the memory utilization statistics, so that the one or more caches is coupled exclusively to the selected processing unit.
24. The wireless device of claim 23, wherein the cache memory system further comprises an arbiter coupled to the plurality of caches and configured to facilitate data exchanges between the plurality of caches and a main memory.
25. A method comprising: determining memory utilization statistics by a plurality of processing units arranged in a pipeline, the plurality of processing units configured to perform graphics operations to render graphics images; pre-assigning a plurality of caches to a processing unit among the plurality of processing units at the beginning of rendering a frame, image, or batch based on the memory utilization statistics by the plurality of processing units; and exclusively coupling the processing unit to the plurality of caches based on the pre-assigning.
26. The method of claim 25, further comprising: coupling one or more caches directly to the processing unit among the plurality of processing units.
27. The method of claim 25, wherein the exclusively assigning the plurality of caches comprises exclusively assigning the plurality of caches to the processing unit for each graphics image to be rendered based on memory utilization for a prior graphics image.
28. An apparatus comprising: means for determining memory utilization statistics by a plurality of processing units arranged in a pipeline, the plurality of processing units configured to perform graphics operations to render graphics images; means for pre-assigning a plurality of caches to a processing unit among the plurality of processing units at the beginning of rendering a frame, image, or batch based on the memory utilization statistics by the plurality of processing units; and means for exclusively coupling the processing unit to the plurality of caches.
29. The apparatus of claim 28, further comprising: means for coupling one or more caches directly to the processing unit among the plurality of processing units.
30. The apparatus of claim 28, wherein the means for exclusively assigning the plurality of caches comprises means for exclusively assigning the plurality of caches to the processing unit for each graphics image to be rendered based on memory utilization for a prior graphics image.
31. A non-transitory computer-readable memory storing code for causing a computer to configure caches comprising: code for causing a computer to determine memory utilization statistics by a plurality of processing units arranged in a pipeline, the plurality of processing units configured to perform graphics operations to render graphics images; code for causing a computer to pre-assign a plurality of caches to a processing unit among the plurality of processing units at the beginning of rendering a frame, image, or batch based on the memory utilization statistics by the plurality of processing units; and code for causing a computer to exclusively couple the processing unit to the plurality of caches based on the pre-assignment.
32. The non-transitory computer-readable memory of claim 31, further comprising: code for causing a computer to couple one or more caches directly to the processing unit among the plurality of processing units.
33. The non-transitory computer-readable memory of claim 31, wherein the code for causing a computer to exclusively assign the plurality of caches comprises: code for causing a computer to exclusively assign the plurality of caches to the processing unit for each graphics image to be rendered based on memory utilization for a prior graphics image.
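The interface-unit behavior recited in the claims (check the caches assigned to a processing unit for the requested data, and fill one of them on a miss) can be sketched in software as follows. This is an illustrative model under stated assumptions, not the claimed hardware: the class names `Cache` and `InterfaceUnit`, the FIFO eviction, the address-modulo choice of which cache to fill, and the dictionary standing in for main memory are all hypothetical.

```python
from typing import Dict, List, Optional


class Cache:
    """Tiny cache model: address -> data, FIFO eviction (assumption)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.lines: Dict[int, int] = {}  # dicts keep insertion order

    def lookup(self, address: int) -> Optional[int]:
        return self.lines.get(address)

    def fill(self, address: int, data: int) -> None:
        if address not in self.lines and len(self.lines) >= self.capacity:
            oldest = next(iter(self.lines))  # first inserted line
            del self.lines[oldest]
        self.lines[address] = data


class InterfaceUnit:
    """Couples one processing unit to the set of caches assigned to it."""

    def __init__(self, assigned: List[Cache], main_memory: Dict[int, int]) -> None:
        if not assigned:
            raise ValueError("at least one cache must be assigned")
        self.assigned = assigned
        self.main_memory = main_memory  # stand-in for the arbiter path
        self.hits = 0
        self.misses = 0

    def read(self, address: int) -> int:
        # Hit check across every cache assigned to this unit.
        for cache in self.assigned:
            data = cache.lookup(address)
            if data is not None:
                self.hits += 1
                return data
        # Miss: fetch from main memory and fill one assigned cache.
        self.misses += 1
        data = self.main_memory[address]
        target = self.assigned[address % len(self.assigned)]  # assumed policy
        target.fill(address, data)
        return data


if __name__ == "__main__":
    memory = {addr: addr * 10 for addr in range(8)}
    unit = InterfaceUnit([Cache(2), Cache(2)], memory)
    unit.read(3)  # miss, fills the second cache
    unit.read(3)  # hit
    print(unit.hits, unit.misses)
```

The `hits` and `misses` counters are the kind of statistic a control unit could read back when deciding the cache assignment for the next frame, image, or batch.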