Multistream processing memory-and barrier-synchronization method and apparatus
Country/Type: United States (US) patent, granted
International Patent Classification (IPC, 7th edition): G06F-012/00; G06F-009/52; G06F-009/46
Application Number: US-0643741 (filed 2003-08-18)
Patent Number: US-7437521 (granted 2008-10-14)
Inventors: Scott, Steven L.; Faanes, Gregory J.; Stephenson, Brick; Moore, Jr., William T.; Kohn, James R.
Applicant: Cray Inc.
Agent: Schwegman, Lundberg & Woessner, P.A.
Citation Information: cited by 38 patents; cites 84 patents
Abstract
A method and apparatus to provide specifiable ordering between and among vector and scalar operations within a single streaming processor (SSP) via a local synchronization (Lsync) instruction that operates within a relaxed memory consistency model. Various aspects of that relaxed memory consistency model are described. Further, a combined memory synchronization and barrier synchronization (Msync) for a multistreaming processor (MSP) system is described. Also described is a global synchronization (Gsync) instruction that provides synchronization even beyond a single MSP system. Advantageously, the pipeline or queue of pending memory requests does not need to be drained before the synchronization operation, nor is the processor required to refrain from determining addresses for, and inserting into the pipeline, subsequent memory accesses.
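The key idea in the abstract — synchronizing without draining the pipeline — can be sketched in a few lines of Python. This is an illustrative software model only, not the patented hardware; the names `SyncMarker`, `issue_lsync`, and the queue contents are invented for the sketch.

```python
from collections import deque

class SyncMarker:
    """Token dropped into a pending-request queue in place of draining it."""
    def __init__(self, sync_id):
        self.sync_id = sync_id

def issue_lsync(queues, sync_id):
    # Instead of waiting for every pending request to complete, place a
    # matching marker in each queue: in-flight requests stay queued ahead
    # of the marker, and new requests may be appended behind it at once.
    for q in queues:
        q.append(SyncMarker(sync_id))

vector_q = deque(["vload A", "vstore B"])   # pending vector references
scalar_q = deque(["load X"])                # pending scalar references
issue_lsync([vector_q, scalar_q], sync_id=1)
vector_q.append("vload C")  # a later request enters without any stall
```

The marker pair establishes the ordering point; requests ahead of a marker commit before requests behind the corresponding marker in the other queue, which is what lets issue continue uninterrupted.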
Representative Claims
What is claimed is:

1. An apparatus comprising: a memory interface; a plurality of queues connected to the memory interface, including a first queue and a second queue, wherein each of the plurality of queues holds pending memory requests and enforces an ordering in the commitment of the pending memory requests to memory; one or more instruction-processing circuits, wherein each instruction-processing circuit is operatively coupled through the plurality of queues to the memory interface and wherein each of the plurality of instruction-processing circuits inserts one or more memory requests into at least one of the queues based on a first memory operation instruction, inserts a first synchronization marker into the first queue and inserts a second synchronization marker into the second queue based on a synchronization operation instruction, and inserts one or more memory requests into at least one of the queues based on a second memory operation instruction; and a first synchronization circuit, operatively coupled to the plurality of queues, that selectively halts processing of further memory requests from the first queue based on the first synchronization marker reaching a predetermined point in the first queue until the corresponding second synchronization marker reaches a predetermined point in the second queue; wherein each of the memory requests is a memory reference, wherein the memory reference is generated as a result of execution of instructions by the instruction-processing circuits, wherein the first queue is used for only synchronization markers and vector memory references, and the second queue is used for only synchronization markers and scalar memory references, wherein the synchronization operation instruction is an Lsync V,S-type instruction, wherein the instruction-processing circuits include a data cache and wherein the Lsync V,S-type instruction prevents subsequent scalar references from accessing the data cache until all vector references have been sent to an external cache and all vector writes have caused any necessary invalidations of the data cache.

2. The apparatus of claim 1, wherein, for a second synchronization operation instruction, a corresponding synchronization marker is inserted into only the first queue.

3. The apparatus of claim 2, wherein the second synchronization operation instruction is an Lsync-type instruction.

4. The apparatus of claim 1, wherein the first queue includes two subqueues, including a first subqueue and a second subqueue, wherein the first subqueue is for holding the vector memory references and synchronization markers associated with the vector memory references and wherein the second subqueue is for holding a plurality of store data elements and synchronization markers associated with the store data elements, wherein each store data element in the second subqueue corresponds to one of the memory requests in the first subqueue, and wherein the store data elements are loaded into the second subqueue decoupled from the loading of the memory requests into the first subqueue.

5.
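Claim 4's decoupled subqueues — store addresses queued independently of their store data — might be modeled as below. This is a hypothetical Python sketch for illustration; `VectorStoreUnit`, `queue_address`, `queue_data`, and `commit_store` are invented names, not terms from the patent.

```python
from collections import deque

class VectorStoreUnit:
    """Decoupled address/data subqueues: an address can be queued before
    its store data has been generated, and vice versa (claim 4)."""
    def __init__(self):
        self.addr_subq = deque()   # memory requests (store addresses)
        self.data_subq = deque()   # store data elements

    def queue_address(self, addr):
        self.addr_subq.append(addr)

    def queue_data(self, value):
        self.data_subq.append(value)

    def commit_store(self, memory):
        # A store commits only once both its address and its data have
        # arrived; the two subqueues advance in lockstep here even though
        # they were filled at different times.
        if self.addr_subq and self.data_subq:
            memory[self.addr_subq.popleft()] = self.data_subq.popleft()
            return True
        return False

mem = {}
unit = VectorStoreUnit()
unit.queue_address(0x100)            # addresses known early
unit.queue_address(0x104)
stalled = unit.commit_store(mem)     # False: no data available yet
unit.queue_data(7)                   # data arrives later, decoupled
unit.commit_store(mem)               # now the first store commits
```

The decoupling lets address generation run ahead of data production (or vice versa), which is the point of loading the two subqueues independently.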
A method comprising: providing a memory interface; providing a plurality of queues connected to the memory interface, including a first queue and a second queue, wherein each of the plurality of queues holds pending memory requests and enforces an ordering in the commitment of the pending memory requests to memory; providing one or more instruction-processing circuits, wherein each instruction-processing circuit is operatively coupled through the plurality of queues to the memory interface; inserting one or more memory requests into at least one of the queues based on a first memory operation instruction executed in one of the instruction-processing circuits; inserting a first synchronization marker into the first queue and inserting a second synchronization marker into the second queue based on a synchronization operation instruction executed in one of the instruction-processing circuits; inserting one or more memory requests into at least one of the queues based on a second memory operation instruction executed in one of the instruction-processing circuits; processing memory requests from the first queue; and selectively halting further processing of memory requests from the first queue based on the first synchronization marker reaching a predetermined point in the first queue until the corresponding second synchronization marker reaches a predetermined point in the second queue; wherein each of the memory requests is a memory reference, wherein the first queue stores only synchronization markers and vector memory references, and wherein the second queue stores only synchronization markers and scalar memory references, wherein the synchronization operation instruction is an Lsync V,S-type instruction, wherein providing the instruction-processing circuits includes providing a data cache and wherein performing the Lsync V,S-type instruction includes preventing subsequent scalar references from accessing the data cache until all vector references have been sent to an external cache and all vector writes have caused any necessary invalidations of the data cache.

6. The method of claim 5, wherein the first queue includes two subqueues, including a first subqueue and a second subqueue, wherein the first subqueue stores the vector memory references and synchronization markers associated with the vector memory references and wherein the second subqueue stores a plurality of store data elements and synchronization markers associated with the store data elements, wherein each store data element in the second subqueue corresponds to one of the memory requests in the first subqueue, and wherein the store data elements are inserted into the second subqueue decoupled from the inserting of the memory requests into the first subqueue.

7. An apparatus comprising: a memory interface; a plurality of queues connected to the memory interface, including a first queue and a second queue, wherein each of the plurality of queues holds pending memory requests and enforces an ordering in the commitment of the pending memory requests to memory; one or more instruction-processing circuits, wherein each instruction-processing circuit is operatively coupled through the plurality of queues to the memory interface and wherein each of the plurality of instruction-processing circuits includes: means for inserting one or more memory requests into at least one of the queues based on a first memory operation instruction executed in one of the instruction-processing circuits; means for inserting a first synchronization marker into the first queue and inserting a second synchronization marker into the second queue based on a synchronization operation instruction executed in one of the instruction-processing circuits; means for inserting one or more memory requests into at least one of the queues based on a second memory operation instruction executed in one of the instruction-processing circuits; means for processing memory requests from the first queue; and means for selectively halting further processing of memory requests from the first queue based on the first synchronization marker reaching a predetermined point in the first queue until the corresponding second synchronization marker reaches a predetermined point in the second queue; wherein each of the memory requests is a memory reference, wherein the means for inserting into the first queue operates for only vector memory requests and synchronization markers, and the means for inserting into the second queue operates for only scalar memory requests and synchronization markers, wherein the synchronization operation instruction is an Lsync V,S-type instruction, wherein the instruction-processing circuits include a data cache and wherein the Lsync V,S-type instruction prevents subsequent scalar references from accessing the data cache until all vector references have been sent to an external cache and all vector writes have caused any necessary invalidations of the data cache.

8.
A system comprising: a plurality of processors, including a first processor and a second processor, wherein each of the processors includes: a memory interface; a plurality of Lsync queues connected to the memory interface, including a first Lsync queue and a second Lsync queue, wherein each of the plurality of Lsync queues holds pending memory requests and enforces an ordering in the commitment of the pending memory requests to memory; one or more instruction-processing circuits, wherein each instruction-processing circuit is operatively coupled through the plurality of Lsync queues to the memory interface and wherein each of the plurality of instruction-processing circuits inserts one or more memory requests into at least one of the Lsync queues based on a first memory operation instruction, inserts a first Lsync synchronization marker into the first Lsync queue and inserts a second Lsync synchronization marker into the second Lsync queue based on a synchronization operation instruction, and inserts one or more memory requests into at least one of the Lsync queues based on a second memory operation instruction; and an Lsync synchronization circuit, operatively coupled to the plurality of Lsync queues, that selectively halts processing of further memory requests from the first Lsync queue based on the first Lsync synchronization marker reaching a predetermined point in the first Lsync queue until the corresponding second Lsync synchronization marker reaches a predetermined point in the second Lsync queue; and one or more Msync circuits, wherein each of the Msync circuits is connected to the plurality of processors and wherein each of the Msync circuits includes: a plurality of Msync queues, including a first Msync queue and a second Msync queue, each of the plurality of Msync queues for holding a plurality of pending memory requests received from the Lsync queues, wherein the first Msync queue stores only Msync synchronization markers and memory requests from the first processor, and the second Msync queue stores only Msync synchronization markers and memory requests from the second processor; and an Msync synchronization circuit, operatively coupled to the plurality of Msync queues, that selectively halts further processing of the memory requests from the first Msync queue based on an Msync synchronization marker reaching a predetermined point in the first Msync queue until a corresponding Msync synchronization marker from the second processor reaches a predetermined point in the second Msync queue; wherein each of the memory requests is a memory reference, wherein the memory reference is generated as a result of execution of instructions by instruction-processing circuits in each processor, wherein each processor includes a data cache and wherein each Msync synchronization circuit includes an external cache, wherein the data cache and the external cache are used to perform an Lsync V,S-type instruction, wherein the Lsync V,S-type instruction prevents subsequent scalar references from accessing the data cache until all vector references have been sent to the external cache in a corresponding Msync synchronization circuit and all vector writes have caused any necessary invalidations of the data cache.

9. The system of claim 8, wherein the Msync synchronization circuit includes a plurality of stall lines, wherein each of the stall lines is connected to one of the plurality of Msync queues and wherein each of the stall lines is for halting further processing of the memory requests from a corresponding Msync queue.

10.
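The Msync behavior in claims 8 and 9 — each per-processor queue stalling at its marker until every other queue's marker arrives — amounts to a barrier enforced at the memory system. A minimal sketch, assuming a software model with one queue per processor (the function name and queue contents are illustrative, not from the patent):

```python
from collections import deque

MARKER = "MSYNC"  # stand-in for an Msync synchronization marker

def drain_until_barrier(msync_queues):
    """Pop requests from each queue; a queue stalls at its MSYNC marker
    until every queue's marker reaches the head, then all markers retire
    together (the barrier)."""
    committed = []
    while any(msync_queues):
        progressed = False
        for q in msync_queues:
            if q and q[0] != MARKER:
                committed.append(q.popleft())
                progressed = True
        if not progressed:
            # Every non-empty queue is stalled at a marker: if all queues
            # have reached theirs, the barrier completes and they retire.
            if all(q and q[0] == MARKER for q in msync_queues):
                for q in msync_queues:
                    q.popleft()
            else:
                break  # a queue ran dry without a marker; stop draining
    return committed

q0 = deque(["p0-store", MARKER, "p0-load"])
q1 = deque(["p1-load", "p1-store", MARKER, "p1-load2"])
done = drain_until_barrier([q0, q1])
```

The invariant this models is that every request ahead of a marker commits before any request behind the corresponding marker in another processor's queue, without either processor draining its pipeline first.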
A method comprising: providing a plurality of processors, including a first processor and a second processor, wherein each of the processors includes a memory interface, a plurality of Lsync queues connected to the memory interface, including a first Lsync queue and a second Lsync queue, wherein each of the plurality of Lsync queues holds pending memory requests and enforces an ordering in the commitment of the pending memory requests to memory, and one or more instruction-processing circuits, each of the instruction-processing circuits operatively coupled through the plurality of Lsync queues to the memory interface; providing one or more Msync circuits, wherein each of the Msync circuits is connected to the plurality of processors and wherein each of the Msync circuits includes a plurality of Msync queues, including a first Msync queue and a second Msync queue, each of the plurality of Msync queues operatively coupled to the plurality of Lsync queues in one of the plurality of processors; inserting one or more memory requests into at least one of the Lsync queues based on a first memory operation instruction executed in one of the instruction-processing circuits; inserting a first Lsync synchronization marker into the first Lsync queue and inserting a second Lsync synchronization marker into the second Lsync queue based on a synchronization operation instruction executed in one of the instruction-processing circuits; inserting one or more memory requests into at least one of the Lsync queues based on a second memory operation instruction executed in one of the instruction-processing circuits; processing memory requests from the first Lsync queue; selectively halting further processing of memory requests from the first Lsync queue based on the first Lsync synchronization marker reaching a predetermined point in the first Lsync queue until the corresponding second Lsync synchronization marker reaches a predetermined point in the second Lsync queue; inserting Msync synchronization markers and memory requests received from the Lsync queues in the first processor into the first Msync queue; inserting Msync synchronization markers and memory requests received from the Lsync queues in the second processor into the second Msync queue; and selectively halting further processing of the memory requests from the first Msync queue based on an Msync synchronization marker reaching a predetermined point in the first Msync queue until a corresponding Msync synchronization marker from the second processor reaches a predetermined point in the second Msync queue; wherein each of the memory requests is a memory reference, wherein selectively halting further processing of memory requests from the first Lsync queue includes performing an Lsync V,S-type instruction, wherein performing the Lsync V,S-type instruction includes preventing subsequent scalar references from accessing a data cache in the processor until all vector references have been sent to an external cache in a corresponding Msync synchronization circuit and all vector writes have caused any necessary invalidations of the data cache.

11. The method of claim 10, wherein selectively halting further processing of the memory requests from the first Msync queue includes sending a stall signal to the Msync queues.
Patents cited by this patent (84)
Nugent Steven F. (Portland OR), Adaptive message routing for multi-dimensional networks.
Bruckert William (Northboro MA) Bissett Thomas D. (Derry NH) Kovalcin David (Grafton MA) Nene Ravi (Chelmsford MA), Apparatus and method for documenting faults in computing modules.
Oberlin Steven M. (Chippewa Falls WI) Fromm Eric C. (Eau Claire WI), Barrier synchronization for distributed memory massively parallel processing systems.
Hall Barbara A. (Endwell NY) Huang Kevin C. (Endicott NY) Jabusch John D. (Endwell NY) Ngai Agnes Y. (Endwell NY), Central processing unit checkpoint retry for store-in and store-through cache systems.
Chen Steve S. (Chippewa Falls) Simmons Frederick J. (Neillsville) Spix George A. (Eau Claire) Wilson Jimmie R. (Eau Claire) Miller Edward C. (Eau Claire) Eckert Roger E. (Eau Claire) Beard Douglas R., Cluster architecture for a highly parallel scalar/vector multiprocessor system.
Mendelsohn Noah R. (Arlington MA) Perchik James (Cambridge MA) Hancock Thomas R. (Somerville MA), Component replacement control for fault-tolerant data processing system.
Nagashima, Shigeo; Torii, Shunichi; Omoda, Koichiro; Inagami, Yasuhiro, Data processing system including scalar data processor and vector data processor.
Papadopoulos Gregory M. (Acton MA) Nikhil Rishiyur S. (Arlington MA) Greiner Robert J. (Chandler AZ) Arvind (Arlington MA), Data processing system with synchronization coprocessor for multiple threads.
Papadopoulos Gregory M. (Burlington MA) Nikhil Rishiyur S. (Arlington MA) Greiner Robert J. (Chandler AZ) Arvind (Arlington MA), Data processing system with synchronization coprocessor for multiple threads.
Ogura Takao (Kawasaki JPX) Amemiya Shigeo (Kawasaki JPX) Tezuka Koji (Kawasaki JPX) Chujo Takafumi (Kawasaki JPX), Distributed control of telecommunication network for setting up an alternative communication path.
Flaig Charles M. (Pasadena CA) Seitz Charles L. (San Luis Rey CA), Inter-computer message routing system with each computer having separate routinng automata for each dimension of the net.
Thomas Basil Smith, III ; Robert Brett Tremaine, Memory system for permitting simultaneous processor access to a cache line and sub-cache line sectors fill and writeback to a system memory.
Beard Douglas R. (Eleva WI) Phelps Andrew E. (Eau Claire WI) Woodmansee Michael A. (Eau Claire WI) Blewett Richard G. (Altoona WI) Lohman Jeffrey A. (Eau Claire WI) Silbey Alexander A. (Eau Claire WI, Method and apparatus for chaining vector instructions.
Peterson John C. (Alta Loma CA) Chow Edward (San Dimas CA) Madan Herb S. (Marina del Rey CA), Method and apparatus for eliminating unsuccessful tries in a search tree.
Dion Rodgers ; Darrell Boggs ; Amit Merchant ; Rajesh Kota ; Rachel Hsu ; Keshavan Tiruvallur, Method and apparatus for processing an event occurrence within a multithreaded processor.
Fossum Tryggve (Northboro MA) Hetherington Ricky C. (Northboro MA) Fite ; Jr. David B. (Northboro MA) Manley Dwight P. (Holliston MA) McKeen Francis X. (Westboro MA) Murray John E. (Acton MA), Method and apparatus using a cache and main memory for both vector processing and scalar processing by prefetching cache.
Shiojiri Hirohisa (Tokyo JPX) Koga Toshio (Tokyo JPX), Method of adaptively multiplexing a plurality of video channel data using channel data assignment information obtained f.
Neches Philip M. (Pasadena CA), Multi processor sorting network for sorting while transmitting concurrently presented messages by message content to del.
Barrett Linda (Raleigh NC) Long Lynn D. (Chapel Hill NC) Menditto Louis F. (Raleigh NC) Stagg Arthur J. (Raleigh NC) Ward Raymond E. (Durham NC), Multi-path channel (MPC) interface with user transparent, unbalanced, dynamically alterable computer input/output channe.
den Haan, Petrus A. M.; Hopmans, Franciscus P. M., Multi-processor computer system with distributed memory and an interprocessor communication mechanism, and method for operating such mechanism.
Baum Richard I. (Poughkeepsie NY) Brotman Charles H. (Poughkeepsie NY) Rymarczyk James W. (Poughkeepsie NY), Multiprocessing packet switching connection system having provision for error correction and recovery.
Frink Craig R. (Chelmsford MA) Bryg William R. (Saratoga CA) Chan Kenneth K. (San Jose CA) Hotchkiss Thomas R. (Groton MA) Odineal Robert D. (Roseville CA) Williams James B. (Lowell MA) Ziegler Micha, Multiprocessor system for maintaining cache coherency by checking the coherency in the order of the transactions being i.
Nesheim William A. ; Guzovskiy Aleksandr, Multiprocessor system having mapping table in each node to map global physical addresses to local physical addresses of.
Deneau, Thomas M., Multiprocessor system implementing virtual memory using a shared memory, and a page replacement method for maintaining paged memory coherence.
Childs Philip L. (Endicott NY) Olnowich Howard T. (Endicott NY) Skovira Joseph F. (Binghamton NY), SYNC-NET- a barrier synchronization apparatus for multi-stage networks.
Nickolls John R. (Los Altos CA) Zapisek John (Cupertino CA) Kim Won S. (Fremont CA) Kalb Jeffery C. (Saratoga CA) Blank W. Thomas (Palo Alto CA) Wegbreit Eliot (Palo Alto CA) Van Horn Kevin (Mountain, Scalable processor to processor and processor-to-I/O interconnection network and method for parallel processing arrays.
Beard Douglas R. (Eleva WI) Phelps Andrew E. (Eau Claire WI) Woodmansee Michael A. (Eau Claire WI) Blewett Richard G. (Altoona WI) Lohman Jeffrey A. (Eau Claire WI) Silbey Alexander A. (Eau Claire WI, Scalar/vector processor.
Nakazato, Satoshi, Shared memory type vector processing system, including a bus for transferring a vector processing instruction, and control method thereof.
Dutton Patrick Francis ; Gregor Steven Lee ; Li Hehching Harry, Storage subsystem including an error correcting cache and means for performing memory to memory transfers.
Horie Takeshi (Kawasaki JPX) Ikesaka Morio (Yokohama JPX) Ishihata Hiroaki (Tokyo JPX), System for controlling communication between parallel computers.
Richard L. Frank ; Gopalan Arun ; Michael J. Cusson ; Daniel E. O'Shaughnessy, System for efficiently maintaining translation lockaside buffer consistency in a multi-threaded, multi-processor virtual memory system.
Van Loo William C. (Palo Alto CA) Ebrahim Zahir (Mountain View CA) Nishtala Satyanarayana (Cupertino CA) Normoyle Kevin (San Jose CA) Loewenstein Paul (Palo Alto CA) Coffin ; III Louis F. (San Jose C, Writeback cancellation processing system for use in a packet switched cache coherent multiprocessor system.
Ohlgren, Harry Carl Håkan; Lindquist, Carl Tobias, Allocating audio processing among a plurality of processing units with a global synchronization pulse.
Scott, Steven L.; Faanes, Gregory J., Decoupling of write address from its associated write data in a store to a shared memory in a multiprocessor system.
Guthrie, Guy L.; Helterhoff, Harmony L.; Jeremiah, Thomas L.; Ng, Alvan W.; Starke, William J.; Stuecheli, Jeffrey A.; Williams, Philip G., Empirically based dynamic control of acceptance of victim cache lateral castouts.
Cargnoni, Robert A.; Guthrie, Guy L.; Helterhoff, Harmony L.; Starke, William J.; Stuecheli, Jeffrey A.; Williams, Phillip G., Empirically based dynamic control of transmission of victim cache lateral castouts.
Ould-Ahmed-Vall, Elmoustapha; Doshi, Kshitij A.; Sair, Suleyman; Yount, Charles R., Instruction and logic to provide stride-based vector load-op functionality with mask duplication.
Ould-Ahmed-Vall, Elmoustapha; Doshi, Kshitij A.; Sair, Suleyman; Yount, Charles R., Instruction and logic to provide vector loads with strides and masking functionality.
Guthrie, Guy L.; Le, Hien M.; Ng, Alvan W.; Siegel, Michael S.; Williams, Derek E.; Williams, Phillip G., Lateral castout (LCO) of victim cache line in data-invalid state.
Sprangle, Eric; Rohillah, Anwar; Cavin, Robert; Forsyth, Tom; Abrash, Michael, Processor and system using a mask register to track progress of gathering and prefetching elements from memory.
Faanes, Gregory J.; Lundberg, Eric P.; Scott, Steven L.; Baird, Robert J., System and method for processing memory instructions using a forced order queue.
Sprangle, Eric; Rohillah, Anwar; Cavin, Robert; Forsyth, Andrew T.; Abrash, Michael, System and method for using a mask register to track progress of gathering and scattering elements between data registers and memory.
Daly, Jr., George William; Guthrie, Guy Lynn; Leavens, Ross Boyd; McDonald, Joseph Gerald; Siegel, Michael Steven; Starke, William John; Williams, Derek Edward, Techniques for write-after-write ordering in a coherency managed processor system that employs a command pipeline.
Eichenberger, Alexandre E.; Gschwind, Michael K.; Salapura, Valentina, Vector loads with multiple vector elements from a same cache line in a scattered load operation.