[특허]Runtime system for executing an application in a parallel-processing computer system

Runtime system for executing an application in a parallel-processing computer system 원문보기

IPC분류정보
국가/구분	United States(US) Patent 등록
국제특허분류(IPC7판)	G06F-009/45
출원번호	US-0714591 (2007-03-05)
등록번호	US-8381202 (2013-02-19)
발명자 / 주소	Papakipos, Matthew N. Demetriou, Christopher G. Tuck, Nathan D. Grant, Brian K.
출원인 / 주소	Google Inc.
인용정보	피인용 횟수 : 6 인용 특허 : 46

초록 ▼

A runtime system implemented in accordance with the present invention provides an application platform for parallel-processing computer systems. Such a runtime system enables users to leverage the computational power of parallel-processing computer systems to accelerate/optimize numeric and array-intensive computations in their application programs. This enables greatly increased performance of high-performance computing (HPC) applications.

대표청구항 ▼

1. A computer-implemented method, comprising: at a parallel-processing computer system that includes two or more processing elements including a first processing element of a first architecture and a second processing element of a second architecture different from the first architecture: at the first processing element: receiving one or more compute kernels, wherein the one or more compute kernels are configured to execute on the two or more processing elements; anddynamically arranging for execution of at least one of the one or more compute kernels on at least one of the two or more processing elements in response to or in anticipation of a request for a result associated with the at least one of the one or more compute kernels, further comprising:arranging the execution of the at least one of the one or more compute kernels on the second processing element after receiving the request for a result associated with the at least one of the one or more compute kernels; andreceiving a callback function from the second processing element, wherein the call back function includes a completion signal before a completion of executing the at least one of the one or more compute kernels on the second processing element;after determining that the second processing element is unavailable for executing the at least one of the one or more compute kernels: identifying another compute kernel among the one or more compute kernels, wherein the another compute kernel is an equivalent version of the at least one of the one or more compute kernels prepared for the first compute kernel; andarranging the execution of the another compute kernel on the first processing element. 2. The method of claim 1, further comprising: prior to receiving the one or more compute kernels, receiving one or more operation requests;dynamically selecting at least one of the two or more processing elements for the one or more operation requests; anddynamically preparing the one or more compute kernels for the one or more operation requests. 3. The method of claim 2, further comprising: generating a programming language-independent, processor-independent intermediate representation for the one or more operation requests. 4. The method of claim 2, wherein the one or more operation requests are from an application being executed on the parallel-processing computer system. 5. The method of claim 2, wherein said dynamic execution of the compute kernels is triggered by an operation request subsequent to the one or more operation requests. 6. The method of claim 1, wherein the two or more processing elements include single-core/multi-core central processing units, graphics processing units, or single-core/multi-core co-processors. 7. The method of claim 1, further comprising: generating a pending operation table;for each of the one or more compute kernels, inserting one or more entries into the pending operation table;in response to or in anticipation of the request for a result associated with the at least one of the one or more compute kernels, updating entries associated with the at least one of the one or more compute kernels in the pending operation table before the execution of the at least one of the one or more compute kernels; andremoving the updated entries from the pending operation table after the execution of the at least one of the one or more compute kernels. 8. The method of claim 1, wherein the first processing element is a CPU and the second processing element is a GPU. 9. A parallel-processing computer system, comprising: memory;two or more processing elements including a first processing element of a first architecture and a second processing element of a second architecture different from the first architecture; andat least one program stored in the memory and executed by the two or more processing elements, the at least one program including:instructions performed by the first processing element for receiving one or more compute kernels, wherein the one or more compute kernels are configured to execute on the two or more processing elements; andinstructions performed by the first processing element for dynamically arranging for execution of at least one of the one or more compute kernels on at least one of the two or more processing elements in response to or in anticipation of a request for a result associated with the at least one of the one or more compute kernels, further comprising: instructions for arranging the execution of the at least one of the one or more compute kernels on the second processing element after receiving the request for a result associated with the at least one of the one or more compute kernels; andinstructions for receiving a callback function from the second processing element, wherein the callback function includes a completion signal before a completion of executing the at least one of the one or more compute kernels on the second processing element;instructions for, after determining that the second processing element is unavailable for executing the at least one of the one or more compute kernels: identifying another compute kernel among the one or more compute kernels, wherein the another compute kernel is an equivalent version of the at least one of the one or more compute kernels prepared for the first compute kernel; andarranging the execution of the another compute kernel on the first processing element. 10. The computer system of claim 9, further comprising: instructions for generating a programming language-independent, processor-independent intermediate representation for the one or more operation requests. 11. The computer system of claim 10, wherein the one or more operation requests are from an application being executed on the parallel-processing computer system. 12. The computer system of claim 10, wherein said dynamic execution of the compute kernels is triggered by an operation request subsequent to the one or more operation requests. 13. The computer system of claim 9, wherein the two or more processing elements include single-core/multi-core central processing units, graphics processing units, or single-core/multi-core co-processors. 14. The computer system of claim 9, wherein the request is associated with another compute kernel that will be suspended until after the execution of the at least one of the one or more compute kernels. 15. The computer system of claim 14, wherein the result associated with the at least one of the one or more compute kernels is an input to the another compute kernel. 16. The computer system of claim 9, further comprising: instructions for generating a pending operation table;instructions for, for each of the one or more compute kernels, inserting one or more entries into the pending operation table;instructions for, in response to or in anticipation of the request for a result associated with the at least one of the one or more compute kernels, updating entries associated with the at least one of the one or more compute kernels in the pending operation table before the execution of the at least one of the one or more compute kernels; andinstructions for removing the updated entries from the pending operation table after the execution of the at least one of the one or more compute kernels. 17. The computer system of claim 9, wherein the first processing element is a CPU and the second processing element is a GPU. 18. A computer program product for use in conjunction with a parallel-processing computer system, the computer program product comprising a computer readable storage medium and a computer program mechanism embedded therein, the computer program mechanism comprising: at the parallel-processing computer system that includes two or more processing elements including a first processing element of a first architecture and a second processing element of a second architecture different from the first architecture, instructions performed by the first processing element for receiving one or more compute kernels, wherein the one or more compute kernels are configured to execute on the two or more processing elements; andinstructions performed by the first processing element for dynamically arranging for execution of at least one of the one or more compute kernels on at least one of the two or more processing elements in response to or in anticipation of a request for a result associated with the at least one of the one or more compute kernels, further comprising:instructions for arranging the execution of the at least one of the one or more compute kernels on the second processing element after receiving the request for a result associated with the at least one of the one or more compute kernels; andinstructions for receiving a callback function from the second processing element, wherein the callback function includes a completion signal before a completion of executing the at least one of the one or more compute kernels on the second processing element;instructions for, after determining that the second processing element is unavailable for executing the at least one of the one or more compute kernels:identifying another compute kernel among the one or more compute kernels, wherein the another compute kernel is an equivalent version of the at least one of the one or more compute kernels prepared for the first compute kernel; andarranging the execution of the another compute kernel on the first processing element. 19. The method of claim 1, wherein the request is associated with another compute kernel that will be suspended until after the execution of the at least one of the one or more compute kernels. 20. The method of claim 19, wherein the result associated with the at least one of the one or more compute kernels is an input to the another compute kernel.

이 특허에 인용된 특허 (46)

Wu, Gansha; Lueh, Guei Yuan; Shi, Xiaohua, Apparatus and methods for restoring synchronization to object-oriented software applications in managed runtime environments.
상세보기
Gyugyi, Paul J., Apparatus, system, and method for offloading packet classification.
상세보기
Gyugyi, Paul J.; Danilak, Radoslav, Apparatus, system, and method for offloading pattern matching scanning.
상세보기
Grunkemeyer,Brian M.; Hawkins,Jonathan C.; Brumme,Christopher W.; Kakivaya,Gopala Krishna R.; Olson,Lance E.; Robsman,Dmitry; Sanders, II,Henry L., Asynchronous pattern.
상세보기
Shaylor, Nicholas, Code generation for a bytecode compiler.
상세보기
Morshed, Farokh; Meagher, Robert, Collection of timing and coverage data through a debugging interface.
상세보기
Col, Gerard M.; Henry, G. Glenn; Hooker, Rodney E., Compare branch instruction pairing within a single integer pipeline.
상세보기
Radigan, James J., Compiling for multiple virtual machines targeting different processor architectures.
상세보기
Ihara Sigeo (Tokorozawa JPX) Tanaka Teruo (Hachioji JPX) Iwasawa Kyoko (Tokyo JPX) Hamanaka Naoki (Tokyo JPX), Compiling method for determining programs to be executed parallelly by respective processors in a parallel computer whic.
상세보기
Tang Jun ; So John Ling Wing, Computer operating process allocating tasks between first and second processors at run time based upon current processor load.
상세보기
Yoav Lavi IL; Amnon Rom IL; Robert Knuth DE; Rivka Blum IL; Meny Yanni IL; Haim Granot IL; Anat Hershko IL; Georgiy Shenderovitch IL; Elliot Cohen IL; Eran Weingatren IL, Configurable long instruction word architecture and instruction set.
상세보기
Orelind, Greger J.; Jaenicke, August A., Correlation-based information extraction from markup language documents.
상세보기
King Michael Roy,ATX, Data processing system having monitoring of software activity.
상세보기
Held James P., Debugger for debugging tasks in an operating system virtual device driver.
상세보기
Cary Lee Bates ; Jeffrey Michael Ryan, Debugger thread identification points.
상세보기
Creamer,Thomas E.; Hilf,Bill H.; Katz,Neil A.; Moore,Victor S., Debugging a grid environment using ghost agents.
상세보기
Ankireddipally, Lakshmi Narasimha; Yeh, Ryh-Wei; Nichols, Dan; Devesetti, Ravi, Distributed transaction processing system.
상세보기
Gebhart, Alexander, Grid compute node software application deployment.
상세보기
Alain Charles Azagury IL; Michael Factor IL; Gera Goft IL; Shlomit Pinter IL; Esther Yeger-Lotem IL, Group communication system with flexible member model.
상세보기
Miller, Robert; Morey, Vicki Lynn; Thayib, Kiswanto; Williams, Laurie Ann, Merge protocol for clustered computer system.
상세보기
Choi, Jong-Deok, Method and apparatus for deterministic replay of java multithreaded programs on multiprocessors.
상세보기
Ebcioglu Mahmut Kemal ; Groves Randall Dean, Method and apparatus for dynamic conversion of computer instructions.
상세보기
Burke, David, Method and apparatus for executing standard functions in a computer system using a field programmable gate array.
상세보기
Wolczko Mario I. ; Ungar David M., Method and apparatus for improving compiler performance during subsequent compilations of a source program.
상세보기
Kielstra,Allan Henry; Stepanian,Levon Sassoon; Stoodley,Kevin Alexander, Method and apparatus for transforming Java Native Interface function calls into simpler operations during just-in-time compilation.
상세보기
Maloney,Rian R., Method and system for online communication between a check sorter and a check processing system.
상세보기
Itou Yoshihiro,JPX ; Nakajima Kei,JPX, Method of reducing unnecessary barrier instructions.
상세보기
Rangachari,Achutha Raman, Methods and apparatus for executing instructions in parallel.
상세보기
Strom,Daniel J.; Zeliger,Ohad, Methods and apparatus providing remote operation of an application programming interface.
상세보기
Blaner Bartholomew (Underhill Center VT) Larsen Larry D. (Raleigh NC), Multiple condition code branching system in a multi-processor environment.
상세보기
Bell, Colin; Branscomb, Brian, Network device with a distributed switch fabric timing system.
상세보기
Gupta Rajiv ; Worley ; Jr. William S., Out-of-order execution using encoded dependencies between instructions in queues to determine stall values that control.
상세보기
Aguilar, Jr., Maximino; Nutter, Mark Richard; Stafford, James Michael, Processor dedicated code handling in a multi-processor environment.
상세보기
Souloglou,Jason; Rawsthorne,Alasdair, Program code conversion.
상세보기
Wright, Gregory M.; Wolczko, Mario I.; Seidl, Matthew L., Reducing the overhead involved in executing native code in a virtual machine through binary reoptimization.
상세보기
Barbour Anthony L. ; Blaho Bruce E., Runtime processor detection and installation of highly tuned processor specific routines.
상세보기
Hon Keat W. Chan ; Andrew J. Edwards ; Amitabh Srivastava ; Hoi H. Vo, Shared library optimization for heterogeneous programs.
상세보기
Spix George A. (Eau Claire WI) Wengelski Diane M. (Eau Claire WI) Hawkinson Stuart W. (Eau Claire WI) Johnson Mark D. (Eau Claire WI) Burke Jeremiah D. (Eau Claire WI) Thompson Keith J. (Eau Claire W, System and method for controlling a highly parallel multiprocessor using an anarchy based scheduler for parallel executi.
상세보기
Craig Chambers ; Susan J. Eggers ; Brian K. Grant ; Markus Mock ; Matthai Philipose, System and method for performing selective dynamic compilation using run-time information.
상세보기
Bordes, Jean Pierre; Davis, Curtis; Hegde, Manju, System with PPU/GPU architecture.
상세보기
Demetriou, Christopher G.; Papakipos, Matthew N.; Gibbs, Noah L., Systems and methods for debugging an application running on a parallel-processing computer system.
상세보기
Crutchfield, William Y.; Grant, Brian K.; Papakipos, Matthew N., Systems and methods for dynamically choosing a processing element for a compute kernel.
상세보기
Wilt, Nicholas P.; Miller, James, Systems and methods for managing drivers in a computing system.
상세보기
Wannamaker, Jeffrey; Scheyen, Peter G. N., Targeted runtime compilation.
상세보기
Ankireddipally, Lakshmi Narasimha; Yeh, Ryh-Wei; Nichols, Dan; Devesetti, Ravi, Transaction data structure for process communications among network-distributed applications.
상세보기
Pharies,Stefan H.; Srinivasan,Sowmy K.; Jethanandani,Natasha H.; Christensen,Yann Erik; Kharitidi,Elena A.; Purdy,Douglas M., Type bridges.
상세보기

이 특허를 인용한 특허 (6)

Erdmann, Bozena; Lelkens, Armand Michel Marie; Schreyer, Oliver, Compiler and compiling method for a networked control system comprising a plurality of devices.
상세보기
Aggag, Khalid; Narayanan, Suriya, Incrementally compiling software artifacts from an interactive development environment.
상세보기
Bond, Barry Clayton, Managing callback operations in emulated environments.
상세보기
Luszczek, Piotr R.; Little, John N.; Martin, Jocelyn Luke; Stefansson, Halldor N.; Ellis, Edric; Anderson, Penelope L.; Baker, Brett; Dean, Loren; Lurie, Roy E., Performing parallel processing of distributed arrays.
상세보기
Garrett, Charles D., Transmission point pattern extraction from executable code in message passing environments.
상세보기
Tian, Xinmin; Kozhukhov, Sergey Stanislavoich; Preis, Sergey Victorovich; Geva, Robert Yehuda; Pyjov, Konstantin Anatolyevich; Sato, Hideki; Girkar, Milind Baburao; Kasov, Aleksei Gurievich; Panchenko, Nikolay Vladimirovich, Vectorization of scalar functions including vectorization annotations and vectorized function signatures matching.
상세보기

IPC	Description
A	생활필수품
A62	인명구조; 소방(사다리 E06C)
A62B	인명구조용의 기구, 장치 또는 방법(특히 의료용에 사용되는 밸브 A61M 39/00; 특히 물에서 쓰이는 인명구조 장치 또는 방법 B63C 9/00; 잠수장비 B63C 11/00; 특히 항공기에 쓰는 것, 예. 낙하산, 투출좌석 B64D; 특히 광산에서 쓰이는 구조장치 E21F 11/00)
A62B-1/08	.. 윈치 또는 풀리에 제동기구가 있는 것

내보내기 구분	파일저장 인쇄 메일전송
구성항목	기본정보 상세정보 관리번호, 국가코드, 자료구분, 상태, 출원번호, 출원일자, 공개번호, 공개일자, 등록번호, 등록일자, 발명명칭(한글), 발명명칭(영문), 출원인(한글), 출원인(영문), 출원인코드, 대표IPC 관리번호, 국가코드, 자료구분, 상태, 출원번호, 출원일자, 공개번호, 공개일자, 공고번호, 공고일자, 등록번호, 등록일자, 발명명칭(한글), 발명명칭(영문), 출원인(한글), 출원인(영문), 출원인코드, 대표출원인, 출원인국적, 출원인주소, 발명자, 발명자E, 발명자코드, 발명자주소, 발명자 우편번호, 발명자국적, 대표IPC, IPC코드, 요약, 미국특허분류, 대리인주소, 대리인코드, 대리인(한글), 대리인(영문), 국제공개일자, 국제공개번호, 국제출원일자, 국제출원번호, 우선권, 우선권주장일, 우선권국가, 우선권출원번호, 원출원일자, 원출원번호, 지정국, Citing Patents, Cited Patents
저장형식	Text(ASCII format) Excel format PIAS분석(.xls)
메일정보	받는사람 (필수) @ 보내는사람 (선택) @ 제목 내용 KISTI 검색결과 이메일 서비스
안내	총 건의 자료가 검색되었습니다. 다운받으실 자료의 인덱스를 입력하세요. (1-10,000) 검색결과의 순서대로 최대 10,000건 까지 다운로드가 가능합니다. 데이타가 많을 경우 속도가 느려질 수 있습니다.(최대 2~3분 소요) 다운로드 파일은 UTF-8 형태로 저장됩니다. 파일의 내용이 제대로 보이지 않을실 때는 웹브라우저 상단의 보기 -> 인코딩 -> 자동선택 여부를 확인하십시오. ~ Text(ASCII format) Excel format

연합인증

Runtime system for executing an application in a parallel-processing computer system 원문보기

초록 ▼

대표청구항 ▼

연구과제 타임라인

이 특허에 인용된 특허 (46)

이 특허를 인용한 특허 (6)

관련 콘텐츠

특허 원문 보기

IPC 상위 출원인

AI-Helper ※ AI-Helper는 오픈소스 모델을 사용합니다.

선택된 텍스트

연합인증

Runtime system for executing an application in a parallel-processing computer system 원문보기

초록 ▼

대표청구항 ▼

연구과제 타임라인

전체(0) 논문(0) 특허(0) 보고서(0)

전체(0) 논문(0) 특허(0) 보고서(0)

이 특허에 인용된 특허 (46)

이 특허를 인용한 특허 (6)

관련 콘텐츠

특허 원문 보기

IPC 상위 출원인

AI-Helper ※ AI-Helper는 오픈소스 모델을 사용합니다.

선택된 텍스트