Systems and methods for debugging an application running on a parallel-processing computer system
IPC classification information
Country / Type
United States (US) Patent
Granted
International Patent Classification (IPC, 7th edition)
G06F-009/44
G06F-011/00
Application number
US-0237829
(2011-09-20)
Registration number
US-8429617
(2013-04-23)
Inventors / Address
Demetriou, Christopher G.
Papakipos, Matthew N.
Gibbs, Noah L.
Applicant / Address
Google Inc.
Agent / Address
Morgan, Lewis & Bockius LLP
Citation information
Times cited: 36
Cited patents: 44
Abstract
A runtime system implemented in accordance with the present invention provides an application platform for parallel-processing computer systems. Such a runtime system enables users to leverage the computational power of parallel-processing computer systems to accelerate/optimize numeric and array-intensive computations in their application programs. This enables greatly increased performance of high-performance computing (HPC) applications.
Representative claims
1. A computer-implemented method, comprising: at a computer having memory and a plurality of processing elements: receiving one or more operation requests issued by an application and directed to a parallel-processing computer system, wherein the one or more operation requests include a first operation request and a second operation request; generating a first set of compute kernels corresponding to a merge of the first and second operation requests; and generating a second set of compute kernels corresponding to the first operation request alone; and executing the first and second sets of compute kernels on the parallel-processing computer system, such that the first set of compute kernels can be debugged based on the execution of the second set of compute kernels.
2. The method of claim 1, wherein the one or more operation requests are associated with a user-initiated data examination request.
3. The method of claim 2, wherein the user-initiated data examination request is embedded in the application.
4. The method of claim 2, wherein the user-initiated data examination request is launched through a program debugging tool.
5. The method of claim 2, wherein the user-initiated data examination request is a user-specified breakpoint.
6. The method of claim 1, wherein results from the execution of the second set of compute kernels are used for debugging results from the execution of the first set of compute kernels.
7. The method of claim 1, wherein the second set of compute kernels is not a subset of the first set of compute kernels.
8. The method of claim 1, wherein the second set of compute kernels is a subset of the first set of compute kernels.
9. The method of claim 1, wherein the first set of compute kernels is executed on one or more graphical processing units and the second set of compute kernels is executed on one or more central processing units.
10. The method of claim 1, wherein the first set of compute kernels includes at least one pre-existing compute kernel that was previously generated in response to one or more earlier operation requests issued by the application.
11. A parallel-processing computer system, comprising: memory; a plurality of processing elements; and at least one program stored in the memory and executed by the plurality of processing elements, the at least one program including: instructions for receiving one or more operation requests issued by an application and directed to the parallel-processing computer system, wherein the one or more operation requests include a first operation request and a second operation request; instructions for generating a first set of compute kernels corresponding to a merge of the first and second operation requests; and instructions for generating a second set of compute kernels corresponding to the first operation request alone; and instructions for executing the first and second sets of compute kernels on the parallel-processing computer system, such that the first set of compute kernels can be debugged based on the execution of the second set of compute kernels.
12. The computer system of claim 11, wherein the one or more operation requests are associated with a user-initiated data examination request.
13. The computer system of claim 12, wherein the user-initiated data examination request is embedded in the application.
14. The computer system of claim 12, wherein the user-initiated data examination request is launched through a program debugging tool.
15. The computer system of claim 12, wherein the user-initiated data examination request is a user-specified breakpoint.
16. The computer system of claim 11, wherein results from the execution of the second set of compute kernels are used for debugging results from the execution of the first set of compute kernels.
17. The computer system of claim 11, wherein the second set of compute kernels is not a subset of the first set of compute kernels.
18. The computer system of claim 11, wherein the second set of compute kernels is a subset of the first set of compute kernels.
19. The computer system of claim 11, wherein the first set of compute kernels is executed on one or more graphical processing units and the second set of compute kernels is executed on one or more central processing units.
20. The computer system of claim 11, wherein the first set of compute kernels includes at least one pre-existing compute kernel that was previously generated in response to one or more earlier operation requests issued by the application.
21. A computer program product for use in conjunction with a parallel-processing computer system, the computer program product comprising a non-transitory computer readable storage medium and a computer program mechanism embedded therein, the computer program mechanism comprising: instructions for receiving one or more operation requests issued by an application and directed to the parallel-processing computer system, wherein the one or more operation requests include a first operation request and a second operation request; instructions for generating a first set of compute kernels corresponding to a merge of the first and second operation requests; and instructions for generating a second set of compute kernels corresponding to the first operation request alone; and instructions for executing the first and the second sets of compute kernels on the parallel-processing computer system, such that the first set of compute kernels can be debugged based on the execution of the second set of compute kernels.
22. The computer program product of claim 21, wherein the one or more operation requests are associated with a user-initiated data examination request.
23. The computer program product of claim 22, wherein the user-initiated data examination request is embedded in the application.
24. The computer program product of claim 22, wherein the user-initiated data examination request is launched through a program debugging tool.
25. The computer program product of claim 22, wherein the user-initiated data examination request is a user-specified breakpoint.
26. The computer program product of claim 21, wherein results from the execution of the second set of compute kernels are used for debugging results from the execution of the first set of compute kernels.
27. The computer program product of claim 21, wherein the second set of compute kernels is not a subset of the first set of compute kernels.
28. The computer program product of claim 21, wherein the second set of compute kernels is a subset of the first set of compute kernels.
29. The computer program product of claim 21, wherein the first set of compute kernels is executed on one or more graphical processing units and the second set of compute kernels is executed on one or more central processing units.
30. The computer program product of claim 21, wherein the first set of compute kernels includes at least one pre-existing compute kernel that was previously generated in response to one or more earlier operation requests issued by the application.
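The flow recited in claim 1 can be illustrated with a small Python sketch. This is only one possible reading of the claim language, not the patented implementation: the helper names (`make_kernel`, `scale`, `offset`, `run_with_debug`) are hypothetical, a "compute kernel" is modeled as a plain Python function over a list, and "merging" two operation requests is modeled as fusing them into a single pass. Because the merged kernel never materializes the result of the first operation, a second kernel for the first operation alone is generated and executed so that the intermediate value can be examined.

```python
from typing import Callable, List, Sequence, Tuple

# A "compute kernel" is modeled here as a function over a whole array (assumption).
Kernel = Callable[[Sequence[float]], List[float]]


def make_kernel(*ops: Callable[[float], float]) -> Kernel:
    """Build one kernel that applies the given elementwise ops in a single pass."""
    def kernel(data: Sequence[float]) -> List[float]:
        out = []
        for x in data:
            for op in ops:  # fused: no intermediate array is stored
                x = op(x)
            out.append(x)
        return out
    return kernel


def scale(x: float) -> float:
    """Hypothetical first operation request."""
    return x * 2.0


def offset(x: float) -> float:
    """Hypothetical second operation request."""
    return x + 1.0


def run_with_debug(data: Sequence[float]) -> Tuple[List[float], List[float], bool]:
    # First set of kernels: a merge of the first and second operation requests.
    merged = make_kernel(scale, offset)
    # Second set of kernels: the first operation request alone.
    first_only = make_kernel(scale)

    final = merged(data)             # result the application asked for
    intermediate = first_only(data)  # value a debugger or breakpoint could inspect

    # Replay the second operation on the intermediate value and compare it
    # with the merged result to check that the merged kernel is consistent.
    replay = [offset(v) for v in intermediate]
    return final, intermediate, replay == final


if __name__ == "__main__":
    final, intermediate, consistent = run_with_debug([1.0, 2.0, 3.0])
    print(final, intermediate, consistent)
```

In this sketch both kernel sets run on the same processor. Claims 9, 19, and 29 additionally cover running the merged set on one or more GPUs and the reference set on one or more CPUs, which the sketch does not model.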
Patents cited by this patent (44)
Wu, Gansha; Lueh, Guei Yuan; Shi, Xiaohua, Apparatus and methods for restoring synchronization to object-oriented software applications in managed runtime environments.
Tang Jun ; So John Ling Wing, Computer operating process allocating tasks between first and second processors at run time based upon current processor load.
Kielstra,Allan Henry; Stepanian,Levon Sassoon; Stoodley,Kevin Alexander, Method and apparatus for transforming Java Native Interface function calls into simpler operations during just-in-time compilation.
Gupta Rajiv ; Worley ; Jr. William S., Out-of-order execution using encoded dependencies between instructions in queues to determine stall values that control.
Wright, Gregory M.; Wolczko, Mario I.; Seidl, Matthew L., Reducing the overhead involved in executing native code in a virtual machine through binary reoptimization.
Spix George A. (Eau Claire WI) Wengelski Diane M. (Eau Claire WI) Hawkinson Stuart W. (Eau Claire WI) Johnson Mark D. (Eau Claire WI) Burke Jeremiah D. (Eau Claire WI) Thompson Keith J. (Eau Claire W, System and method for controlling a highly parallel multiprocessor using an anarchy based scheduler for parallel executi.
Craig Chambers ; Susan J. Eggers ; Brian K. Grant ; Markus Mock ; Matthai Philipose, System and method for performing selective dynamic compilation using run-time information.
Demetriou, Christopher G.; Papakipos, Matthew N.; Gibbs, Noah L., Systems and methods for debugging an application running on a parallel-processing computer system.
Crutchfield, William Y.; Grant, Brian K.; Papakipos, Matthew N., Systems and methods for dynamically choosing a processing element for a compute kernel.
Ankireddipally, Lakshmi Narasimha; Yeh, Ryh-Wei; Nichols, Dan; Devesetti, Ravi, Transaction data structure for process communications among network-distributed applications.
Choi, Byoung Ju; Seo, Joo Young; Yang, Sueng Wan; Kim, Young Su; Oh, Jung Suk; Kwon, Hae Young; Jang, Seung Yeun, Communication test apparatus and method.
Stefansson, Halldor N.; Ellis, Edric, Saving and loading graphical processing unit (GPU) arrays providing high computational capabilities in a computing environment.