Systems and methods for generating reference results using parallel-processing computer system
IPC classification information
Country / Type
United States (US) Patent
Granted
International Patent Classification (IPC 7th edition)
G06F-009/44
G06F-011/00
G06F-009/50
Application number
US-0603361
(2012-09-04)
Registration number
US-8972943
(2015-03-03)
Inventor / Address
Papakipos, Matthew N.
Grant, Brian K.
Demetriou, Christopher G
Applicant / Address
Google Inc.
Agent / Address
Morgan, Lewis & Bockius LLP
Citation information
Times cited: 4
Cited patents: 46
Abstract
A method for debugging an application includes obtaining first and second fusible operation requests; if there is a break point between the first and the second operation request, generating a first set of compute kernels including programs corresponding to the first operation request, but not to the second operation request; and generating a second set of compute kernels including programs corresponding the second operation request, but not to the first operation request; if there is no break point between the first and the second operation request, generating a third set of compute kernels which include programs corresponding to a merge of the first and second operation requests; and arranging for execution of either the first and second, or the third set of compute kernels, further including debugging the first or second set of compute kernels when there is a break point set between the first and second operation requests.
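The branching described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: `OperationRequest`, `ComputeKernel`, `generate_kernels`, and `execute` are hypothetical names, operation requests are modeled as element-wise functions, "merging" is assumed to mean function composition, and the debugging step is reduced to a callback that inspects intermediate values at the break point.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Set, Tuple


@dataclass
class OperationRequest:
    """Hypothetical stand-in for an operation request from the application."""
    name: str
    fn: Callable[[float], float]  # element-wise operation, assumed fusible


@dataclass
class ComputeKernel:
    """Hypothetical compute kernel: one program covering one or more requests."""
    covers: List[str]
    program: Callable[[float], float]


def generate_kernels(
    first: OperationRequest,
    second: OperationRequest,
    breakpoints: Set[Tuple[str, str]],
) -> List[ComputeKernel]:
    """Return separate kernels if a break point sits between the two requests,
    otherwise a single merged (fused) kernel."""
    if (first.name, second.name) in breakpoints:
        # Break point set: keep the requests in separate kernels so that
        # execution can stop between them.
        return [
            ComputeKernel([first.name], first.fn),
            ComputeKernel([second.name], second.fn),
        ]
    # No break point: merge both requests into one kernel (assumed here to
    # be plain composition of the two element-wise functions).
    return [
        ComputeKernel([first.name, second.name], lambda x: second.fn(first.fn(x)))
    ]


def execute(
    kernels: Sequence[ComputeKernel],
    data: Sequence[float],
    on_break: Optional[Callable[[ComputeKernel, List[float]], None]] = None,
) -> List[float]:
    """Run the kernels in order; call on_break between consecutive kernels."""
    values = list(data)
    for index, kernel in enumerate(kernels):
        values = [kernel.program(v) for v in values]
        if on_break is not None and index < len(kernels) - 1:
            on_break(kernel, values)  # inspect intermediate state here
    return values


scale = OperationRequest("scale", lambda x: x * 2.0)
offset = OperationRequest("offset", lambda x: x + 1.0)

split = generate_kernels(scale, offset, {("scale", "offset")})
fused = generate_kernels(scale, offset, set())
```

In this sketch the split and fused paths produce the same final values; only the split path exposes the intermediate result after `scale`, which is the point at which a debugger could stop.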
Representative claim
1. A computer-implemented method, comprising: at a computer having memory and a plurality of processing elements: obtaining a plurality of operation requests from an application, the plurality of operation requests including a first operation request and a second operation request, wherein the first operation request and the second operation request are fusible; in accordance with a determination that there is a break point set between the first operation request and the second operation request, generating a first set of compute kernels, the first set of compute kernels including programs corresponding to the first operation request, but not to the second operation request; generating a second set of compute kernels, the second set of compute kernels including programs corresponding the second operation request, but not to the first operation request; and in accordance with a determination that there is no break point set between the first operation request and the second operation request, generating a third set of compute kernels, the third set of compute kernels including programs corresponding to a merge of the first and second operation requests; and arranging for execution of either the first and the second sets of compute kernels, or the third set of compute kernels, on the plurality of processing elements, further including debugging the first or the second set of compute kernels during the execution, when there is a break point set between the first operation request and the second operation request.
2. The method of claim 1, wherein the first set and the second set of compute kernels are executed on one or more central processing units, and the third set of compute kernels is executed on one or more graphical processing units.
3. The method of claim 1, wherein the break point set between the first operation request and the second operation request is embedded in the application.
4. The method of claim 1, wherein the break point set between the first operation request and the second operation request is set through a program debugging tool.
5. The method of claim 1, wherein the break point set between the first operation request and the second operation request is set by a user debugging the application.
6. The method of claim 1, wherein results from the execution of the first and the second sets of compute kernels are used for debugging results from the execution of the third set of compute kernels.
7. The method of claim 6, wherein debugging the results from the execution of the third set of compute kernels includes: comparing the results from the execution of the first and the second sets of compute kernels with the results from the execution of the third set of compute kernels.
8. A parallel-processing computer system, comprising: memory; one or more types of processing elements including a target processing element of a first type and a reference processing element of a second type; and at least one program stored in the memory and executed by the one or more types of processing elements, the at least one program including: instructions for obtaining a plurality of operation requests from an application, the plurality of operation requests including a first operation request and a second operation request, wherein the first operation request and the second operation request are fusible; instructions for: in accordance with a determination that there is a break point set between the first operation request and the second operation request, generating a first set of compute kernels, the first set of compute kernels including programs corresponding to the first operation request, but not to the second operation request; generating a second set of compute kernels, the second set of compute kernels including programs corresponding the second operation request, but not to the first operation request; and instructions for: in accordance with a determination that there is no break point set between the first operation request and the second operation request, generating a third set of compute kernels, the third set of compute kernels including programs corresponding to a merge of the first and second operation requests; and instructions for arranging for execution of either the first and the second sets of compute kernels, or the third set of compute kernels, on the plurality of processing elements, further including debugging the first or the second set of compute kernels during the execution, when there is a break point set between the first operation request and the second operation request.
9. The computer system of claim 8, wherein the first set and the second set of compute kernels are executed on one or more central processing units, and the third set of compute kernels is executed on one or more graphical processing units.
10. The computer system of claim 8, wherein the break point set between the first operation request and the second operation request is embedded in the application.
11. The computer system of claim 8, wherein the break point set between the first operation request and the second operation request is set through a program debugging tool.
12. The computer system of claim 8, wherein the break point set between the first operation request and the second operation request is set by a user debugging the application.
13. The computer system of claim 8, wherein results from the execution of the first and the second sets of compute kernels are used for debugging results from the execution of the third set of compute kernels.
14. The computer system of claim 13, wherein debugging the results from the execution of the third set of compute kernels includes: comparing the results from the execution of the first and the second sets of compute kernels with the results from the execution of the third set of compute kernels.
15. A computer program product for use in conjunction with a parallel-processing computer system, the computer program product comprising a non-transitory computer readable storage medium and a computer program mechanism embedded therein, the computer program mechanism comprising: instructions for obtaining a plurality of operation requests from an application, the plurality of operation requests including a first operation request and a second operation request, wherein the first operation request and the second operation request are fusible; instructions for: in accordance with a determination that there is a break point set between the first operation request and the second operation request, generating a first set of compute kernels, the first set of compute kernels including programs corresponding to the first operation request, but not to the second operation request; generating a second set of compute kernels, the second set of compute kernels including programs corresponding the second operation request, but not to the first operation request; and instructions for: in accordance with a determination that there is no break point set between the first operation request and the second operation request, generating a third set of compute kernels, the third set of compute kernels including programs corresponding to a merge of the first and second operation requests; and instructions for arranging for execution of either the first and the second sets of compute kernels, or the third set of compute kernels, on the plurality of processing elements, further including debugging the first or the second set of compute kernels during the execution, when there is a break point set between the first operation request and the second operation request.
16. The computer program product of claim 15, wherein the first set and the second set of compute kernels are executed on one or more central processing units, and the third set of compute kernels is executed on one or more graphical processing units.
17. The computer program product of claim 15, wherein the break point set between the first operation request and the second operation request is embedded in the application.
18. The computer program product of claim 15, wherein the break point between the first operation request and the second operation request is set through a program debugging tool.
19. The computer program product of claim 15, wherein the break point set between the first operation request and the second operation request is set by a user debugging the application.
20. The computer program product of claim 15, wherein results from the execution of the first and the second sets of compute kernels are used for debugging results from the execution of the third set of compute kernels.
21. The computer program product of claim 19, wherein debugging the results from the execution of the third set of compute kernels includes: comparing the results from the execution of the first and the second sets of compute kernels with the results from the execution of the third set of compute kernels.
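Claims 6, 7, 13, 14, 20, and 21 describe using the results of the separate (unfused) kernels to debug the results of the merged kernel by comparing the two. The Python sketch below shows one way such a comparison could look. The function names `run_reference`, `run_fused`, and `compare_results` are hypothetical, and the use of a relative tolerance is an assumption of this example: the claims say only that the results are compared, not how.

```python
import math
from typing import Callable, List, Sequence, Tuple


def run_reference(
    ops: Sequence[Callable[[float], float]], data: Sequence[float]
) -> List[float]:
    """Apply each operation as its own pass, standing in for the separate
    first and second sets of compute kernels."""
    values = list(data)
    for op in ops:
        values = [op(v) for v in values]
    return values


def run_fused(
    fused: Callable[[float], float], data: Sequence[float]
) -> List[float]:
    """Apply a single merged operation, standing in for the third set."""
    return [fused(v) for v in data]


def compare_results(
    reference: Sequence[float],
    target: Sequence[float],
    rel_tol: float = 1e-6,  # assumed tolerance, not taken from the patent
) -> List[Tuple[int, float, float]]:
    """Return (index, reference_value, target_value) for every mismatch."""
    if len(reference) != len(target):
        raise ValueError("result lengths differ")
    mismatches = []
    for index, (ref, tgt) in enumerate(zip(reference, target)):
        if not math.isclose(ref, tgt, rel_tol=rel_tol, abs_tol=1e-12):
            mismatches.append((index, ref, tgt))
    return mismatches


data = [0.0, 1.0, 2.0]
ops = [lambda x: x * 2.0, lambda x: x + 1.0]

reference = run_reference(ops, data)
good = compare_results(reference, run_fused(lambda x: x * 2.0 + 1.0, data))
# A deliberately wrong merge (operations applied in the wrong order).
bad = compare_results(reference, run_fused(lambda x: (x + 1.0) * 2.0, data))
```

An empty list from `compare_results` indicates that the merged kernel agrees with the reference within the chosen tolerance; a non-empty list pinpoints the elements where the merged kernel diverges.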
Patents cited by this patent (46)
Wu, Gansha; Lueh, Guei Yuan; Shi, Xiaohua, Apparatus and methods for restoring synchronization to object-oriented software applications in managed runtime environments.
Tang Jun ; So John Ling Wing, Computer operating process allocating tasks between first and second processors at run time based upon current processor load.
Alain Charles Azagury IL; Michael Factor IL; Gera Goft IL; Shlomit Pinter IL; Esther Yeger-Lotem IL, Group communication system with flexible member model.
Kielstra,Allan Henry; Stepanian,Levon Sassoon; Stoodley,Kevin Alexander, Method and apparatus for transforming Java Native Interface function calls into simpler operations during just-in-time compilation.
Gupta Rajiv ; Worley ; Jr. William S., Out-of-order execution using encoded dependencies between instructions in queues to determine stall values that control.
Wright, Gregory M.; Wolczko, Mario I.; Seidl, Matthew L., Reducing the overhead involved in executing native code in a virtual machine through binary reoptimization.
Spix George A. (Eau Claire WI) Wengelski Diane M. (Eau Claire WI) Hawkinson Stuart W. (Eau Claire WI) Johnson Mark D. (Eau Claire WI) Burke Jeremiah D. (Eau Claire WI) Thompson Keith J. (Eau Claire W, System and method for controlling a highly parallel multiprocessor using an anarchy based scheduler for parallel executi.
Craig Chambers ; Susan J. Eggers ; Brian K. Grant ; Markus Mock ; Matthai Philipose, System and method for performing selective dynamic compilation using run-time information.
Demetriou, Christopher G.; Papakipos, Matthew N.; Gibbs, Noah L., Systems and methods for debugging an application running on a parallel-processing computer system.
Crutchfield, William Y.; Grant, Brian K.; Papakipos, Matthew N., Systems and methods for dynamically choosing a processing element for a compute kernel.
Ankireddipally, Lakshmi Narasimha; Yeh, Ryh-Wei; Nichols, Dan; Devesetti, Ravi, Transaction data structure for process communications among network-distributed applications.
Kee, Hojin; Yi, Haoran; Ly, Tai A.; Petersen, Newton G.; Lewis, James M.; Blasig, Dustyn K.; Arnesen, Adam T.; Riche, Taylor L., Correlation analysis of program structures.
Kee, Hojin; Yi, Haoran; Ly, Tai A.; Petersen, Newton G.; Lewis, James M.; Blasig, Dustyn K.; Arnesen, Adam T.; Riche, Taylor L., Correlation analysis of program structures.