[Patent] Predication in a vector processor

IPC classification information
Country / Type	United States (US) Patent, Granted
International Patent Classification (IPC 7th edition)	G06F-009/30 G06F-009/38 G06F-015/80 G11C-007/10 G11C-008/12
Application number	US-0569349 (2012-08-08)
Registration number	US-9575756 (2017-02-21)
Inventors / Address	Fleischer, Bruce M.; Fox, Thomas W.; Jacobson, Hans M.; Nair, Ravi
Applicant / Address	INTERNATIONAL BUSINESS MACHINES CORPORATION
Agent / Address	Cantor Colburn LLP
Citation information	Times cited: 0; Patents cited: 43

Abstract

Embodiments relate to vector processor predication in an active memory device. An aspect includes a system for vector processor predication in an active memory device. The system includes memory in the active memory device and a processing element in the active memory device. The processing element is configured to perform a method including decoding an instruction with a plurality of sub-instructions to execute in parallel. One or more mask bits are accessed from a vector mask register in the processing element. The one or more mask bits are applied by the processing element to predicate operation of a unit in the processing element associated with at least one of the sub-instructions.
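
As a rough illustration of the predication described in the abstract, the sketch below models a vector mask register as a list of booleans and applies it to a lane-wise add. The function name `predicated_add` and the choice to leave a blocked lane holding its previous destination value are assumptions made for illustration; the abstract does not say what a masked-off lane retains.

```python
from typing import List


def predicated_add(dest: List[int], a: List[int], b: List[int],
                   mask: List[bool]) -> List[int]:
    """Lane-wise add in which mask bit i gates lane i (hypothetical model)."""
    if not (len(dest) == len(a) == len(b) == len(mask)):
        raise ValueError("operands and mask must have the same lane count")
    # A set mask bit lets the lane execute. A clear bit blocks it, and the
    # destination lane keeps its old value (an assumption of this sketch).
    return [x + y if m else d for d, x, y, m in zip(dest, a, b, mask)]


result = predicated_add([0, 0, 0, 0], [1, 2, 3, 4], [10, 20, 30, 40],
                        [True, False, True, False])
print(result)  # [11, 0, 33, 0]
```

In this model only lanes 0 and 2 perform the addition; lanes 1 and 3 are skipped, which is the software analogue of blocking part of an arithmetic operation.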

Representative claims

1. A system for vector processor predication in an active memory device, the system comprising: memory in the active memory device; and a processing element in the active memory device, the processing element comprising a vector mask register, an arithmetic logic unit, and a load store unit, the processing element configured to perform a method comprising: setting one or more mask bits in the vector mask register in the processing element; applying the one or more mask bits by the processing element to predicate operation of the arithmetic logic unit or the load-store unit in the processing element associated with at least one of a plurality of sub-instructions; performing a compare of operands in the processing element using predication of a compare instruction to perform less than a maximum supported number of comparisons in parallel based on the one or more mask bits; storing compare results of the compare instruction as mask bit values of the vector mask register; analyzing a compare instruction syntax bit of the compare instruction to select between performing an OR-reduction and an AND-reduction on the mask bit values stored in response to performing less than the maximum supported number of comparisons in parallel by the predication of the compare instruction; reducing the mask bit values to a summary condition by performing a logical OR combination of the compare results based on determining that the OR-reduction is selected by the compare instruction syntax bit; reducing the mask bit values to the summary condition by performing a logical AND combination of the compare results based on determining that the AND-reduction is selected by the compare instruction syntax bit; writing the summary condition to a condition register; and using the summary condition of the condition register to determine a branch direction of a conditional branch instruction in the processing element.

2. The system of claim 1, wherein applying the one or more mask bits by the processing element to predicate operation further comprises blocking one or more of: execution of at least one element of the sub-instructions and execution of at least one execution slot operating on a sub-element of at least one of the sub-instructions.

3. The system of claim 1, wherein applying the one or more mask bits by the processing element to predicate operation further comprises blocking one or more of: a memory access sub-instruction and part of an arithmetic operation.

4. The system of claim 1, wherein the processing element is further configured to perform: performing one or more of clock gating and data gating to one or more of: the arithmetic logic unit, the load-store unit, a vector computation register file, and a scalar computation register file based on the one or more mask bits.

5. The system of claim 1, wherein the processing element is further configured to perform: populating mask bit values of the vector mask register from one or more of: the memory and the arithmetic logic unit; and performing logical operations by the processing element on the mask bit values to modify the mask bit values of the vector mask register.

6. The system of claim 1, wherein performing the logical OR combination of the compare results further comprises including a current value of the condition register in the logical OR combination of the compare results, and performing the logical AND combination of the compare results further comprises including the current value of the condition register in the logical AND combination of the compare results.

7. A system for vector processor predication in an active memory device, the system comprising: memory in the active memory device, wherein the active memory device is a three-dimensional memory cube and the memory is divided into three-dimensional blocked regions as memory vaults; and a processing element in the active memory device, the processing element comprising a vector mask register, an arithmetic logic unit, and a load store unit, the processing element configured to perform a method comprising: fetching, in the processing element, an instruction from an instruction buffer in the processing element; decoding, in the processing element, the instruction comprising a plurality of sub-instructions to execute in parallel; setting one or more mask bits in the vector mask register in the processing element; applying the one or more mask bits by the processing element to predicate operation of the arithmetic logic unit or the load-store unit in the processing element associated with at least one of the sub-instructions; performing a compare of operands in the processing element using predication of a compare instruction to perform less than a maximum supported number of comparisons in parallel based on the one or more mask bits; storing compare results of the compare instruction as mask bit values of the vector mask register; analyzing a compare instruction syntax bit of the compare instruction to select between performing an OR-reduction and an AND-reduction on the mask bit values stored in response to performing less than the maximum supported number of comparisons in parallel by the predication of the compare instruction; reducing the mask bit values to a summary condition by performing a logical OR combination of the compare results based on determining that the OR-reduction is selected by the compare instruction syntax bit; reducing the mask bit values to the summary condition by performing a logical AND combination of the compare results based on determining that the AND-reduction is selected by the compare instruction syntax bit; writing the summary condition to a condition register; using the summary condition of the condition register to determine a branch direction of a conditional branch instruction in the processing element; and accessing the memory through one or more memory controllers in the active memory device for data operated upon by the instruction.

8. The system of claim 7, wherein applying the one or more mask bits by the processing element to predicate operation further comprises blocking one or more of: execution of at least one element of the sub-instructions and execution of at least one execution slot operating on a sub-element of at least one of the sub-instructions.

9. The system of claim 7, wherein applying the one or more mask bits by the processing element to predicate operation further comprises blocking one or more of: a memory access sub-instruction to prevent an access of the memory, and part of an arithmetic operation.

10. The system of claim 7, wherein the vector mask register is comprised of a plurality of vector mask entries, each comprising a plurality of elements of the mask bits, forming two-dimensional vector masks in the vector mask register, and further comprising: generating multiple mask bits per cycle per element based on single instruction, multiple data-in-space compare operations to form the two-dimensional vector masks in the vector mask register; and using the two-dimensional vector masks with two-dimensional vector data, the two-dimensional vector masks corresponding to data sub-elements in the two-dimensional vector data to predicate.

11. The system of claim 7, wherein the processing element is further configured to perform: performing one or more of clock gating and data gating to one or more of: the arithmetic logic unit, the load-store unit, a vector computation register file, and a scalar computation register file based on the one or more mask bits.

12. The system of claim 7, wherein the processing element is further configured to perform: populating mask bit values of the vector mask register from one or more of: the memory and the arithmetic logic unit; and performing logical operations by the processing element on the mask bit values to modify the mask bit values of the vector mask register.

13. The system of claim 7, wherein performing the logical OR combination of the compare results further comprises including a current value of the condition register in the logical OR combination of the compare results, and performing the logical AND combination of the compare results further comprises including the current value of the condition register in the logical AND combination of the compare results.
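
The compare-and-reduce sequence recited in claims 1 and 7 can be sketched in software as follows. This is a minimal model under stated assumptions, not the claimed hardware: the names `masked_compare_lt` and `reduce_mask`, the `use_or` flag standing in for the compare instruction syntax bit, the choice of a less-than comparison, and the decision to reduce only over lanes that predication left enabled are all hypothetical. The optional `cr` argument folds a current condition register value into the reduction, in the spirit of claims 6 and 13.

```python
from typing import List, Optional, Tuple


def masked_compare_lt(a: List[int], b: List[int], enable: List[bool],
                      old_mask: List[bool]) -> Tuple[List[bool], List[bool]]:
    """Compare a[i] < b[i] only in lanes where enable[i] is set.

    Returns the updated mask register and the list of results that were
    actually produced (one per enabled lane). Disabled lanes keep their
    old mask bit, which is an assumption of this sketch.
    """
    new_mask = list(old_mask)
    results = []
    for i, en in enumerate(enable):
        if en:
            new_mask[i] = a[i] < b[i]
            results.append(new_mask[i])
    return new_mask, results


def reduce_mask(results: List[bool], use_or: bool,
                cr: Optional[bool] = None) -> bool:
    """Reduce compare results to one summary condition.

    use_or=True selects an OR-reduction, use_or=False an AND-reduction.
    If cr is given, the current condition register value joins the reduction.
    """
    terms = list(results)
    if cr is not None:
        terms.append(cr)
    return any(terms) if use_or else all(terms)


a = [1, 5, 3, 9]
b = [2, 4, 8, 0]
enable = [True, True, True, False]  # lane 3 is predicated off

mask, results = masked_compare_lt(a, b, enable, [False] * 4)
cr_or = reduce_mask(results, use_or=True)    # some enabled lane satisfied a < b
cr_and = reduce_mask(results, use_or=False)  # not every enabled lane did

# The summary condition then steers a conditional branch.
branch = "taken" if cr_or else "not taken"
print(mask, cr_or, cr_and, branch)
```

Here three of four possible comparisons run, the results land in the mask register, and the single summary bit is what a subsequent conditional branch would test.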

Patents cited by this patent (43)

Cutler David N. (Bellevue WA) Orbits David A. (Redmond WA) Bhandarkar Dileep (Shrewsbury MA) Cardoza Wayne (Merrimack NH) Witek Richard T. (Littleton MA), Apparatus and method for recovering from missing page faults in vector data processing operations.
Gostin Gary B. ; Barr Matthew F. ; McGuffey Ruth A. ; Roan Russell L., Apparatus, systems and method for improving memory bandwidth utilization in vector processing systems.
Sandorfi,Miklos, Central processing unit.
Clark, Lawrence T.; Patterson, Dan W., Circuits and methods for processors with multiple redundancy techniques for mitigating radiation errors.
Arya Siamak, Conditional vector processing.
Fleck Rod G. ; Mattela Venkat ; Chesters Eric ; Afsar Muhammad, Data processing device with loop pipeline.
Morton Steven G., Digital signal processor containing scalar processor and a plurality of vector processors operating from a single instruction.
Mimar, Tibet, Efficient handling of vector high-level language conditional constructs in a SIMD processor.
Papworth David B. ; Hinton Glenn J. ; Fetterman Michael A. ; Colwell Robert P. ; Glew Andrew F., Exception handling in a processor that performs speculative out-of-order instruction execution.
Raikin, Shlomo; Valentine, Robert, Gather cache architecture.
Ferren, Bran; Hillis, W. Daniel; Mangione-Smith, William Henry; Myhrvold, Nathan P.; Tegreene, Clarence T; Wood, Jr., Lowell L., Hardware-error tolerant computing.
Ferren,Bran; Hillis,W. Daniel; Mangione Smith,William Henry; Myhrvold,Nathan P.; Tegreene,Clarence T.; Wood, Jr.,Lowell L., Hardware-error tolerant computing.
Mukherjee,Shubhendu S.; Reinhardt,Steven K.; Emer,Joel S., Incremental checkpointing in a multi-threaded architecture.
Fujii Hiroaki (Kokubunji CA JPX) Hamanaka Naoki (Palo Alto CA) Tanaka Teruo (Hachoiji JPX) Inagami Yasuhiro (Kodaira JPX) Tamaki Yoshiko (Kodaira JPX), Information processing apparatus having a register file used interchangeably both as scalar registers of register window.
Ichimura Katsuhiko,JPX ; Nakata Takeshi,JPX ; Fukutome Goro,JPX, Information processing device and method for sequence control and data processing.
Haigh Stephen G. (Redwood City CA) Baji Toru (Burlingame CA), Instruction preprocessor for conditionally combining short memory instructions into virtual long instructions.
Scheuerlein, Roy E., Integrated circuit incorporating dual organization memory array.
Thayer John S. ; Favor John G. ; Weber Frederick D., Load and store instructions which perform unpacking and packing of data bits in separate vector and integer cache storage.
Luick, David Arnold; Mejdrich, Eric Oliver; Muff, Adam James, Load misaligned vector with permute and mask insert.
Liao, Yu-Chung C.; Sandon, Peter A.; Cheng, Howard; Van Hook, Timothy J., Method and apparatus for obtaining a scalar value directly from a vector register.
O'Connor, James Michael; Tremblay, Marc, Method frame storage using multiple memory circuits.
Thomas L. Drabenstott ; Gerald G. Pechanek ; Edwin F. Barry ; Charles W. Kurak, Jr., Methods and apparatus to support conditional execution in a VLIW-based array processor with subword execution.
Anderson, Timothy D.; Hoyle, David; Steiss, Donald E.; Krueger, Steven D., Microprocessor with non-aligned memory access.
Gschwind, Michael K.; Olsson, Brett, Multi-addressable register file.
Cho Seongrai ; Park Heonchul ; Song Seungyoon Peter, Multifunction data aligner in wide data width processor.
Clery ; III William B., Multiple thread multiple data predictive coded parallel processing system and method.
Dorojevets,Mikhail; Ogura,Eiji, Parallel vector processing.
Reinhardt,Steven K.; Mukherjee,Shubhendu S.; Emer,Joel S., Periodic checkpointing in a redundantly multi-threaded architecture.
Gschwind, Michael Karl; Hofstee, Harm Peter; Hopkins, Martin Edward, SIMD datapath coupled to scalar/vector/address/conditional data register file with selective subpath scalar processing mode.
Brodnax Timothy B. (Austin TX) Bialas ; Jr. John S. (Bealeton VA) King Steven A. (Herndon VA) LeBlanc Johnny J. (Austin TX) Rickard Dale A. (Manassas VA) Spencer Clark J. (Praha CSX) Stanley Daniel L, Shadow register file for instruction rollback.
Gower,Kevin C.; Kellogg,Mark W.; Maule,Warren E.; Smith, III,Thomas B.; Tremaine,Robert B., System, method and storage medium for providing data caching and data compression in a memory subsystem.
Tremaine, Robert B., Systems and methods for providing data modification operations in memory subsystems.
Gower, Kevin C.; Maule, Warren E.; Tremaine, Robert B., Systems and methods for providing distributed technology independent memory controllers.
Zumkehr, John F.; Abouelnaga, Amir A., Systems and methods for use in reduced instruction set computer processors for retrying execution of instructions resulting in errors.
Sandon, Peter A.; West, R. Michael P., Two dimensional addressing of a matrix-vector register array.
Green Thomas S., Using three-dimensional storage to make variable-length instructions appear uniform in two dimensions.
Hui, Ronald Chi-Chun, Vector processing with high execution throughput.
Beard Douglas R. (Eleva WI) Phelps Andrew E. (Eau Claire WI) Woodmansee Michael A. (Eau Claire WI) Blewett Richard G. (Altoona WI) Lohman Jeffrey A. (Eau Claire WI) Silbey Alexander A. (Eau Claire WI, Vector processor having registers for control by vector resisters.
Kashiyama Masamori (Hadano JPX) Ishii Koichi (Hadano JPX) Kawabe Shun (Machida JPX) Usami Masami (Ome JPX), Vector processor performing data operations in one half of a total time period of write operation and the read operation.
Elwood Matthew Paul ; Hinds Christopher Neal, Vector register addressing.
Glossner, III,Clair John; Hokenek,Erdem; Meltzer,David; Moudgill,Mayan, Vector register file with arbitrary vector addressing.
Fossum Tryggve (Northboro MA) Manley Dwight P. (Holliston MA) McKeen Francis X. (Westboro MA) Tehranian Michael M. (Boxboro MA), Vector register system for executing plural read/write commands concurrently and independently routing data to plural re.
Oberlin Steven M. ; Fromm Eric C. ; Passint Randal S., Virtual to logical to physical address translation for distributed memory massively parallel processing systems.
