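IPC Classification Information
Country / Type | United States (US) Patent, Granted
International Patent Classification (IPC 7th edition) |
Application No. | US-0082971 (2008-04-14)
Registration No. | US-8438003 (2013-05-07)
Inventor / Address | Agarwal, Rakesh; Baltaretu, Oana
Applicant / Address | Cadence Design Systems, Inc.
Agent / Address |
Citation Information | Times cited: 0; Cited patents: 58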
Abstract
A method of improved simulator processing is provided. The method according to the current invention includes grouping frequently accessed data into one set id to improve memory hierarchy performance. The method further includes simulating predication in a non-predicated architecture to improve CPU performance. The simulated predication includes pseudo-predicated implementation of read-operation vector element access, pseudo-predicated implementation of write-operation vector element access, and predicated implementation of multi-way branches with assignment statements having a same left-hand-side (lhs). The method further includes determining a selection path in a multi-sensitive “always” block to reduce taken branches. The multi-sensitive “always” block selection path determination includes generating instance-specific code to save port allocation storage, and generating inlined instance-specific code to combine sensitive actions. The method further includes regenerating code affected by the assignment statement to implement value-change callback.
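The idea of predicating a multi-way branch whose arms all assign to the same left-hand side can be illustrated with a small sketch. This is not the patented implementation: the function names, the three-arm shape, and the mask-based select are assumptions chosen for illustration, and Python itself gives no guarantee of branch-free execution. The sketch only shows the data-flow shape that generated machine code might follow.

```python
def assign_branching(c0, c1, a, b, d):
    """Reference form: an if / else-if / else chain assigning one lhs."""
    if c0:
        lhs = a
    elif c1:
        lhs = b
    else:
        lhs = d
    return lhs


def assign_predicated(c0, c1, a, b, d):
    """Hypothetical predicated form: every arm is evaluated, masks pick one.

    Each mask is -1 (all bits set) or 0, so `value & mask` either keeps
    the value or clears it. Exactly one mask is -1 at a time.
    """
    m0 = -int(bool(c0))            # first arm taken
    m1 = -int(bool(c1)) & ~m0      # second arm taken only if the first is not
    m2 = ~(m0 | m1)                # fall-through "else" arm
    return (a & m0) | (b & m1) | (d & m2)
```

Because every arm writes the same lhs, the chain collapses to a single assignment whose right-hand side is selected by masks, which is what makes this class of multi-way branch a natural candidate for predication.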
Representative Claims
1. A computer-implemented method of improving simulator processing, the method comprising: allocating data used by a simulation scheduler; simulating predication in a non-predicated architecture, wherein the simulated predication comprises: determination of a maximum pseudo-predicated instruction sequence length by considering target machine microarchitecture characteristics; implementation of multi-valued read-operation and multi-valued write-operation vector element access, wherein any of the multi-value read-operation and the multi-valued write-operation can be expressed as 0/1/X/Z bits; and implementation of multi-way branches with assignment statements having a same left-hand-side (lhs); determining a selection path in a multi-sensitive “always” block to reduce taken multi-way branches, and generating code; wherein allocating data used by a simulation scheduler further comprises: probing a line size of a processor cache; providing a software override of a value of the probed line size; and selecting one or more of a core routine algorithm and data structure for the simulation scheduler, wherein a sum of line sizes is not greater than a d1_linesize, wherein the d1_linesize is a line size of a level 1 data cache.
2. The computer-implemented method of claim 1, wherein a start address of the data structure is aligned at an address that is a multiple of the d1_linesize.
3. The computer-implemented method of claim 1, wherein a user specifies a set id of a class of central routines (S) as either a fixed value between a range of 0 and S−1 inclusive, or as a randomly chosen value in the range of 0 and S−1.
4. The computer-implemented method of claim 1, further comprising: applying programming constructs, wherein the programming constructs are unique to hardware description language (HDL).
5. The computer-implemented method of claim 1, wherein target machine microarchitecture characteristics are measured and the maximum pseudo-predicated instruction sequence length is determined, wherein a compiler-user-specified parameter can override the measured characteristics.
6. The computer-implemented method of claim 1, wherein a first phantom element at index −1 of each vector is introduced to conduct a pseudo-predicated evaluation of each vector.
7. The computer-implemented method of claim 1, wherein a second phantom element at index −2 of each vector is introduced and when the vector has X/Z bits the −2 index is a temporary storage location.
8. The computer-implemented method of claim 1, wherein assignment statements of the multi-way branch are converted to allow for the predication in a non-predicated architecture.
9. The computer-implemented method of claim 8, wherein assignment statements of the multi-way branch only having an “else” clause are converted to allow for the predication in a non-predicated architecture.
10. The computer-implemented method of claim 1, wherein code is inlined for each instance of a small module that directly encodes an actual parameter address.
11. The computer-implemented method of claim 10, wherein the module is viewed at compile time.
12. The computer-implemented method of claim 1, wherein if X/Z bits are present, a separate code area is branched for handling.
13. The computer-implemented method of claim 1, wherein condition checks are done only by mainline code, whereas code for statement bodies for each condition is stored in a separate code area.
14. The computer-implemented method of claim 13, wherein nesting of the separate code area is provided.
15. The computer-implemented method of claim 1, wherein an acc_vcl_add( ) command is executed when the generated code for an assignment is affected by a temporal call.
16. The computer-implemented method of claim 1, further comprising: assigning a unique id to each one of a format specifier.
17. The computer-implemented method of claim 16, wherein an I/O command only sends the format specifier id and data values to an I/O subsystem.
18. The computer-implemented method of claim 17, wherein the I/O subsystem runs on a separate processor/thread to offload a main simulation processor.
19. A system comprising: a memory; and a processor configured to: simulate predication in a non-predicated architecture, wherein the simulated predication comprises: determination of a maximum pseudo-predicated instruction sequence length by considering target machine microarchitecture characteristics; implementation of multi-valued read-operation and multi-valued write-operation vector element access, wherein any of the multi-valued read-operation and multi-valued write-operation can be expressed as 0/1/X/Z bits; implementation of multi-way branches with assignment statements having a same left-hand-side (lhs); and determining a selection path in a multi-sensitive “always” block to reduce taken multi-way branches, and generating code; wherein to allocate data used by a simulation scheduler further comprises to: probe a line size of a processor cache; provide a software override of a value of the probed line size; and select one or more of a core routine algorithm and data structure for the simulation scheduler, wherein a sum of line sizes is not greater than a d1_linesize, wherein the d1_linesize is a line size of a level 1 data cache.
20. The system of claim 19, wherein the processor is further configured to: apply programming constructs, wherein the programming constructs are unique to hardware description language (HDL).
21. The system of claim 19, wherein a start address of the data structure is aligned at an address that is a multiple of the d1_linesize.
22. The system of claim 19, wherein a user specifies a set id of a class of central routines (S) as either a fixed value between a range of 0 and S−1 inclusive, or as a randomly chosen value in the range of 0 and S−1.
23. The system of claim 19, wherein target machine microarchitecture characteristics are measured and the maximum pseudo-predicated instruction sequence length is determined, wherein a compiler-user-specified parameter can override the measured characteristics.
24. A non-transitory computer readable storage medium containing program instructions for improving simulator processing, wherein execution of program instructions by one or more processors of a computer causes the one or more processors to carry out the steps of: simulating predication in a non-predicated architecture, wherein the simulated predication comprises: determination of a maximum pseudo-predicated instruction sequence length by considering target machine microarchitecture characteristics; implementation of multi-valued read-operation and multi-valued write-operation vector element access, wherein any of the multi-valued read-operation and multi-valued write-operation can be expressed as 0/1/X/Z bits; implementation of multi-way branches with assignment statements having a same left-hand-side (lhs); determining a selection path in a multi-sensitive “always” block to reduce taken multi-way branches, and generating code; wherein allocating data used by a simulation scheduler further comprises: probing a line size of a processor cache; providing a software override of a value of the probed line size; and selecting one or more of a core routine algorithm and data structure for the simulation scheduler, wherein a sum of line sizes is not greater than a d1_linesize, wherein the d1_linesize is a line size of a level 1 data cache.
25. The non-transitory computer readable storage medium of claim 24, further comprising: applying programming constructs, wherein the programming constructs are unique to hardware description language (HDL).
26. The non-transitory computer readable storage medium of claim 24, wherein a start address of the data structure is aligned at an address that is a multiple of the d1_linesize.
27. The non-transitory computer readable storage medium of claim 24, wherein a user specifies a set id of a class of central routines (S) as either a fixed value between a range of 0 and S−1 inclusive, or as a randomly chosen value in the range of 0 and S−1.
28. The non-transitory computer readable storage medium of claim 24, wherein target machine microarchitecture characteristics are measured and the maximum pseudo-predicated instruction sequence length is determined, wherein a compiler-user-specified parameter can override the measured characteristics.
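The phantom element at index −1 mentioned in claim 6 can be pictured with a short sketch. The claims do not spell out the mechanism, so the following is only one plausible reading: an out-of-range write is redirected to a dummy slot rather than skipped by a branch. The class name `PhantomVector`, its methods, and the storage layout (physical slot 0 standing in for logical index −1) are all hypothetical, and Python is used only to show the index arithmetic, not to achieve branch-free execution.

```python
class PhantomVector:
    """Vector of n elements plus one phantom slot at logical index -1.

    Assumed layout: physical slot 0 is the phantom element, and logical
    element i lives in physical slot i + 1.
    """

    def __init__(self, n, fill=0):
        self.n = n
        self._slots = [fill] * (n + 1)

    def write(self, i, value):
        # Mask is -1 (all bits set) when i is in range, otherwise 0.
        in_range = -int(0 <= i < self.n)
        # Select i when in range, otherwise the phantom index -1.
        idx = (i & in_range) | (-1 & ~in_range)
        # The store always happens; an out-of-range store lands harmlessly
        # in the phantom slot instead of being guarded by a branch.
        self._slots[idx + 1] = value

    def elements(self):
        """Return the real elements, excluding the phantom slot."""
        return self._slots[1:]
```

For example, after `v = PhantomVector(4)`, the calls `v.write(2, 7)` and `v.write(9, 5)` leave `v.elements()` as `[0, 0, 7, 0]`: the second store goes to the phantom slot and never touches a real element.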