[Patent] Method for load balancing an n-dimensional array of parallel processing elements

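IPC classification information
Country / Type	United States (US) Patent, Granted
International Patent Classification (IPC 7th edition)	G06F-009/46 G06F-015/00 G06F-015/76
Application number	US-0689365 (2003-10-20)
Registration number	US-7472392 (2008-12-30)
Priority information	GB-0309204.6 (2003-04-23)
Inventor / Address	Beaumont, Mark
Applicant / Address	Micron Technology, Inc.
Agent / Address	Jones Day
Citation information	Times cited: 0; Patents cited: 22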

Abstract

One aspect of the present invention relates to a method for balancing the load of an n-dimensional array of processing elements (PEs), wherein each dimension of the array includes the processing elements arranged in a plurality of lines and wherein each of the PEs has a local number of tasks associated therewith. The method comprises balancing at least one line of PEs in a first dimension, balancing at least one line of PEs in a next dimension, and repeating the balancing at least one line of PEs in a next dimension for each dimension of the n-dimensional array. The method may further comprise selecting one or more lines within said first dimension and shifting the number of tasks assigned to PEs in said selected one or more lines.
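
The dimension-by-dimension pass described in the abstract can be sketched for the two-dimensional case as follows. This is an illustrative Python sketch under stated assumptions, not the patented implementation: the function names `balance_line` and `balance_2d` are hypothetical, tasks are modelled as plain integer counts, and each line is rebalanced in one arithmetic step rather than by transfers between neighbouring PEs. The per-PE offset E_r = r is an assumption, chosen so that the truncated shares add up to the line total.

```python
from typing import List


def balance_line(tasks: List[int]) -> List[int]:
    """Return per-PE task counts for one line after even redistribution.

    Every PE ends with either X or X + 1 tasks, where X = total // n.
    """
    n = len(tasks)
    total = sum(tasks)  # total number of tasks on the line
    # Assumption: PE r uses offset E_r = r, so the truncated shares
    # (total + r) // n sum exactly to `total`.
    return [(total + r) // n for r in range(n)]


def balance_2d(grid: List[List[int]]) -> List[List[int]]:
    """Balance every line of the first dimension, then of the next one."""
    # First dimension: each row is one line of PEs.
    rows = [balance_line(row) for row in grid]
    # Next dimension: each column of the row-balanced grid is one line.
    cols = [balance_line(list(col)) for col in zip(*rows)]
    # Transpose back to row-major layout.
    return [list(row) for row in zip(*cols)]


if __name__ == "__main__":
    print(balance_2d([[8, 0, 0], [0, 0, 0], [0, 0, 1]]))
    # -> [[0, 1, 1], [1, 1, 1], [1, 1, 2]]
```

In this sketch every line of the last dimension balanced ends with counts that differ by at most one, but the array as a whole can still be uneven after a single pass (here the counts range from 0 to 2). The optional line-shifting step mentioned in the abstract appears to be aimed at that residual imbalance.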

Representative claims

What is claimed is:

1. A method for balancing the work load of an n-dimensional array of processing elements (PEs), wherein each dimension of said array includes said processing elements arranged in a plurality of lines and wherein each of said processing elements has a local number of tasks associated therewith, the method comprising: balancing a work load across at least one line of processing elements in a first dimension by redistributing the tasks amongst the processing elements in said line; balancing a work load across at least one line of processing elements in a next dimension by redistributing the tasks amongst the processing elements in said line; and repeating said balancing at least one line of processing elements in a next dimension by redistributing the task among the processing elements in said line for each dimension of said n-dimensional array until the work load is balanced across all said processing elements; and wherein said balancing a work load comprises: calculating a total number of tasks for said line, wherein said total number of tasks for said line equals the sum of said local number of tasks for each of said processing elements on said line; notifying each of said processing elements on said line of said total number of tasks for said line; calculating a local mean number of tasks for each of said processing elements on said line; calculating a local deviation from local mean number for each of said processing elements on said line; determining a first local cumulative deviation for each of said processing elements on said line; determining a second local cumulative deviation for each of said processing elements on said line; and redistributing tasks among said processing elements on said line in response to at least one of said first local cumulative deviation and said second local cumulative deviation.

2. The method of claim 1 wherein two or more lines in at least one of said first dimension and said next dimension are balanced in parallel.

3. The method of claim 1 wherein said calculating a total number of tasks for said line comprises sequentially summing said local number of tasks for each of said processing elements on said line from a first end of said line to a second end of said line.

4. The method of claim 1 wherein said calculating said total number of tasks for said line includes solving the equation V = Σ vi (summed over i = 0 to N-1), where V represents said total number of tasks for said line, N represents the number of processing elements on said line, and vi represents said local number of tasks for a local PEr on said line.

5. The method of claim 1 wherein said notifying step includes passing said total number of tasks from a second end of said line to a first end of said line.

6. The method of claim 1 wherein said calculating a local mean number of tasks includes solving the equation Mr=Trunc((V+Er)/N), where Mr represents said local mean for a local processing element PEr on said line, N represents the total number of PEs on said line, V is the total number of tasks, and Er is a number in the range of 0 to (N-1).

7. The method of claim 6 wherein each processing element has a different Er value.

8. The method of claim 6 wherein said Trunc function is responsive to Er such that said total number of tasks for said line is equal to the sum of the local mean number of tasks for each processing element on said line.

9. The method of claim 6 wherein said local mean Mr=Trunc((V+Er)/N) for each local PEr on said line is equal to either X or (X+1), where X is equal to local mean.

10. The method of claim 1 wherein said calculating a local deviation for each processing element on said line includes finding a difference between said local number of tasks for each PEr and said local mean number of tasks for each PEr.

11. The method of claim 1 wherein said determining a first local cumulative deviation includes sequentially summing said local deviations for each PEr from a first end of said line to an adjacent upstream PEr-1 on said line.

12. The method of claim 1 wherein said determining a second local cumulative deviation includes finding a difference between the negative of said local deviation for each PEr and said first local cumulative deviation for each PEr.

13. The method of claim 1 wherein said redistributing tasks among said processing elements on said line comprises: transferring a task from a local PEr to a left-adjacent PEr-1 if said first local cumulative deviation for said local PEr is a negative value; and transferring a task from said local PEr to a right-adjacent PEr+1 if said second local cumulative deviation for said local PEr is a negative value.

14. The method of claim 1 wherein said redistributing tasks among said processing elements on said line comprises: transferring a task from a local PEr to a left-adjacent PEr-1 if said second local cumulative deviation for said local PEr is a positive value; and transferring a task from said local PEr to a right-adjacent PEr+1 if said first local cumulative deviation for said local PEr is a positive value.

15. The method of claim 1 wherein said calculating a local mean number of tasks; said calculating a local deviation; said determining a first local cumulative deviation; said determining a second local cumulative deviation; and said redistributing tasks are completed in parallel for each processing element on said line.

16. The method of claim 15 wherein said calculating a local mean number of tasks; said calculating a local deviation; said determining a first local cumulative deviation; said determining a second local cumulative deviation; and said redistributing tasks are completed in parallel for each line in a selected dimension.

17. The method of claim 1 wherein said calculating a local deviation, said determining a first local cumulative deviation, said determining a second local cumulative deviation, and said redistributing tasks among said processing elements are repeated until said local deviation, said first local cumulative deviation, and said second local cumulative deviation for each of said processing elements is zero.

18. A method for balancing a work load across one dimension of an n-dimensional array of processing elements (PEs), wherein each of said n-dimensions is traversed by a plurality of lines and wherein each of said lines has a plurality of processing elements with a local number of tasks associated therewith, the method comprising: balancing said plurality of lines in one dimension by redistributing tasks amongst the processing elements in each of said plurality of lines; balancing said plurality of lines in a next higher dimension; repeating said balancing said plurality of lines in a next higher dimension for each remaining dimension of said n-dimensional array, wherein each of said balanced lines includes PEs with either a number of local tasks equal to X or a number of local tasks equal to (X+1), where X equals a local mean; substituting the value zero (0) for each processing element having X local number of tasks; substituting the value one (1) for each processing element having (X+1) local number of tasks; and shifting said values for each processing element within said balanced lines until a sum of said processing elements relative to a second dimension has only two different values, wherein shifting said values represents moving a task.

19. The method of claim 18 wherein said balancing said plurality of lines in one dimension comprises: calculating a total number of tasks present within at least one of said lines; notifying each processing element on said line of said total number of tasks for said line; determining each processing element's share of said total number of tasks on said line; calculating a local deviation from said previous steps; determining a first local cumulative deviation for each processing element on said line using said local deviation; determining a second local cumulative deviation for each processing element on said line using said local deviation; and redistributing tasks among each processing element on said line in response to at least one of said first local cumulative deviation and said second local cumulative deviation.

20. The method of claim 19 wherein said notifying each processing element comprises: serially summing said total number of tasks present on said line; and transmitting said total number of tasks to each processing element on said line.

21. The method of claim 19 wherein said determining each processing element's share of said total number of tasks comprises: calculating a local mean number of tasks for each processing element on said line; and calculating a local deviation from said local mean number of tasks for each processing element on said line by finding the difference between said local number of tasks and said local mean number of tasks for each processing element on said line.

22. The method of claim 21 wherein said calculating a local mean number of tasks for each processing element on said line comprises using a rounding function Mr=Trunc((V+Er)/N), where Mr represents said local mean of a local processing element PEr, N represents the total number of processing elements on said line, V is the total number of tasks, and Er represents a number in the range of 0 to (N-1).

23. The method of claim 22 wherein said Trunc function is responsive to Er such that said total number of tasks for said line is equal to the sum of the local mean number of tasks for each of said processing elements in said line.

24. The method of claim 22 wherein said local mean Mr=Trunc((V+Er)/N) for each local processing element on said line is equal to either X or (X+1), where X is equal to a local mean.

25. The method of claim 19 wherein said determining a first local cumulative deviation for each processing element on said line includes summing said local deviations for each upstream processing element on said line.

26. The method of claim 19 wherein said determining a second local cumulative deviation for each processing element on said line includes finding the difference between the negative of said local deviation and said first local cumulative deviation for each processing element on said line.

27. The method of claim 19 wherein said redistributing tasks among each processing element on said line in response to at least one of said first local cumulative deviation and said second local cumulative deviation comprises: transferring a task from a first processing element on said line to a second processing element on said line if said first local cumulative deviation for said first processing element is a negative value; and transferring a task from said second processing element on said line to said first processing element on said line if said first local cumulative deviation for said second processing element is a positive value.

28. The method of claim 19 wherein said redistributing tasks among each processing element on said line in response to at least one of said first local cumulative deviation and said second local cumulative deviation comprises: transferring a task to a first processing element on said line from a second processing element on said line if said second local cumulative deviation for said first processing element is a negative value; and transferring a task to said second processing element on said line from said first processing element on said line if said second local cumulative deviation for said second processing element is a positive value.

29. The method of claim 21 wherein said calculating a local deviation, said determining a first local cumulative deviation, said determining a second local cumulative deviation, and said redistributing tasks among said processing elements are repeated until said local deviation, said first local cumulative deviation, and said second local cumulative deviation for each of said processing elements is zero.

30. A computer memory storing a set of instructions which, when executed, perform a method for balancing a work load across one dimension of an n-dimensional array of processing elements (PEs), wherein each of said n-dimensions is traversed by a plurality of lines and where each of said lines has a plurality of processing elements with a local number of tasks associated therewith, the method comprising: balancing said plurality of lines in one dimension by redistributing tasks amongst the processing elements in each of said plurality of lines; balancing said plurality of lines in a next higher dimension; repeating said balancing said plurality of lines in a next higher dimension for each remaining dimension of said n-dimensional array, wherein each of said balanced lines includes PEs with either a number of local tasks equal to X or a number of local tasks equal to (X+1), where X equals a local mean; substituting the value zero (0) for each processing element having X local number of tasks; substituting the value one (1) for each processing element having (X+1) local number of tasks; and shifting said values for each processing element within said balanced lines until a sum of said processing elements relative to a second dimension has only two different values, wherein shifting said values represents moving a task.
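
The per-line steps recited in claims 1, 6, 10 to 13 and 17 can be traced with a small simulation. The following Python sketch is one possible reading of those claims, not the patented implementation: the function name `balance_line_stepwise` is hypothetical, the offset E_r = r is an assumption (the claims only require a value in the range 0 to N-1, different for each PE), and the rule that a PE must currently hold a task before it can send one is added here so the simulation never produces negative counts.

```python
from typing import List, Tuple


def balance_line_stepwise(tasks: List[int]) -> Tuple[List[int], int]:
    """Balance one line by single-task transfers between adjacent PEs.

    Returns the balanced task counts and the number of transfer rounds.
    """
    tasks = list(tasks)
    n = len(tasks)
    total = sum(tasks)  # V, the total number of tasks on the line
    # Local mean M_r = Trunc((V + E_r) / N), assuming E_r = r.
    means = [(total + r) // n for r in range(n)]
    rounds = 0
    while True:
        # Local deviation D_r = v_r - M_r (claim 10).
        dev = [t - m for t, m in zip(tasks, means)]
        if not any(dev):
            return tasks, rounds
        # First cumulative deviation: sum of deviations upstream of PE r.
        first = [sum(dev[:r]) for r in range(n)]
        # Second cumulative deviation: -D_r minus the first one (claim 12).
        second = [-d - f for d, f in zip(dev, first)]
        moves = [0] * n
        for r in range(n):
            available = tasks[r]  # snapshot: all PEs act simultaneously
            # Negative first deviation: the left side is short of tasks.
            if first[r] < 0 and available > 0:
                moves[r] -= 1
                moves[r - 1] += 1
                available -= 1
            # Negative second deviation: the right side is short of tasks.
            if second[r] < 0 and available > 0:
                moves[r] -= 1
                moves[r + 1] += 1
        tasks = [t + m for t, m in zip(tasks, moves)]
        rounds += 1


if __name__ == "__main__":
    print(balance_line_stepwise([0, 0, 4]))  # -> ([1, 1, 2], 2)
```

Because the deviations on a line sum to zero, the first cumulative deviation of the first PE and the second cumulative deviation of the last PE are always zero in this sketch, so no transfer is ever attempted past either end of the line.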

Patents cited by this patent (22)

Tomowaki Takahashi JP, Dual-imaging optical system.
Hardwick Jonathan C.,GBX, Dynamic load balancing among processors in a parallel computer.
Wheat Stephen R. (Albuquerque NM), Dynamic load balancing of applications.
Rich Henry H., Linear expression evaluator.
Hartung Michael H. (Tucson AZ) Nolta Arthur H. (Tucson AZ) Reed David G. (Tucson AZ) Tayler Gerald E. (Tucson AZ), Load balancing in a multiunit system.
Glover Michael A. (10 Hemlock Way Durham NH 03824), Massively parallel SIMD processor which selectively transfers individual contiguously disposed serial memory elements.
Elliott Duncan G.,CAX ; Snelgrove W. Martin,CAX, Memory device with multiple processors having parallel access to the same memory area.
Pechanek Gerald G. ; Revilla Juan G., Merged array controller and processing element.
David Karger ; Eric Lehman ; F. Thomson Leighton ; Matthew Levine ; Daniel Lewin ; Rina Panagrahy, Method and apparatus for distributing requests among a plurality of resources.
Vignes Jean P. (Rueil-Malmaison FRX) Ung Vincent (La Varenne FRX), Method and apparatus of providing a result of a numerical calculation with the number of exact significant figures.
Kawase, Kei; Moriyama, Takao; Nakamura, Fusashi, Method for dynamically changing load balance and computer.
Eilert, Catherine K.; Kubala, Jeffrey P.; Nick, Jeffrey M.; Yocom, Peter B., Method, system and program products for managing central processing unit resources of a computing environment.
Harrison R. Loyd (Fullerton CA) Davies Steven P. (Ontario CA), Modular array processor architecture having a plurality of interconnected load-balanced parallel processing nodes.
Naganuma Jiro (Zama JPX) Ogura Takeshi (Chigasaki JPX), Multiprocessor system and a method of load balancing thereof.
Hinsley Christopher Andrew,GBX, Operating system for use with computer networks incorporating two or more data processors linked together for parallel processing and incorporating improved dynamic load-sharing techniques.
Kenichi Maeda JP; Nobuyuki Takeda JP; Yasukazu Okamoto JP, Parallel computer with improved access to adjacent processor and memory elements.
Bahr James E. (Rochester MN) Corrigan Michael J. (Rochester MN) Knipfer Diane L. (Rochester MN) McMahon Lynn A. (Rochester MN) Metzger Charlotte B. (Elgin MN), Process for dispatching tasks among multiple information processors.
Taylor James L. (Eastleigh GBX), SIMD array processor with global instruction control and reprogrammable instruction decoders.
Jonathan Coulombe JP; Seiichiro Iwase JP, SIMD control parallel processor with simplified configuration.
Wilkinson Paul Amba ; Dieffenderfer James Warren ; Kogge Peter Michael ; Schoonover Nicholas Jerome, SIMD/MIMD array processor with vector processing.
Rich Henry H., Shared access texturing of computer graphic images.
Matsuoka Hidetoshi (Kawasaki JPX) Hirose Fumiyasu (Kawasaki JPX), Uniform load distributing method for use in executing parallel processing in parallel computer.

