IPC classification information
Country / Type |
United States (US) Patent
Granted
|
International Patent Classification (IPC, 7th edition) |
|
Application number |
UP-0644502
(2006-12-22)
|
Registration number |
US-7788464
(2010-09-20)
|
Inventors / Address |
- Sheu, John Te-Jui
- Cohen, Ernest S.
- Hendel, Matthew D.
- Wang, Landy
- Vega, Rene Antonio
- Nanavati, Sharvil A.
|
Applicant / Address |
|
Agent / Address |
|
Citation information |
Times cited: 7
Cited patents: 27 |
Abstract
Various operations are provided that improve the scalability of virtual TLBs in multi-processor virtual machines, and they include: implicitly locking SPTs using per-processor generation counters; waiting for pending fills on other virtual processors to complete before servicing a GVA invalidation using the counters; write-protecting or unmapping guest pages in a deferred two-stage process or reclaiming SPTs in a deferred two-stage process; periodically coalescing two SPTs that shadow the same GPT with the same attributes; sharing SPTs between two SASes only at a specified level in an SPTT; flushing the entire virtual TLB using a generation counter; allocating an SPT for a GPT from the NUMA node on which the GPT resides; having an instance for each NUMA node on which a virtual machine runs; and correctly handling the serializing instructions executed by a guest in a virtual machine with more than one virtual processor sharing the virtual TLB.
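The first item in the abstract, implicit locking of SPTs through per-processor generation counters, may be easier to follow with a toy model. The sketch below is illustrative only and is not taken from the patent: the class and function names are hypothetical, the parity convention (odd means a walk is in progress, even means idle) is an assumption, and the model is single-threaded with no atomic operations or memory barriers.

```python
class VirtualProcessor:
    """Hypothetical model of one virtual processor's walk generation counter."""

    def __init__(self):
        # Assumption: even value = not touching SPTs, odd value = walk in progress.
        self.walk_gen = 0

    def begin_walk(self):
        assert self.walk_gen % 2 == 0, "walk already in progress"
        self.walk_gen += 1  # move to the "accessing SPTs" value

    def end_walk(self):
        assert self.walk_gen % 2 == 1, "no walk in progress"
        self.walk_gen += 1  # move to the "finished accessing SPTs" value


def take_snapshot(vps):
    """Record every processor's counter after an SPT has been locked against new references."""
    return [vp.walk_gen for vp in vps]


def safe_to_reclaim(vps, snapshot):
    """An unreferenced SPT may be reused once no walk that was active at snapshot time is still running.

    A processor that was idle (even) at snapshot time cannot hold a pointer into the
    locked SPT; a processor that was walking (odd) must have advanced past that value.
    """
    return all(snap % 2 == 0 or vp.walk_gen > snap
               for vp, snap in zip(vps, snapshot))
```

In this model, a reclaimer would call `take_snapshot` after blocking new references to the table and then poll `safe_to_reclaim` instead of taking a lock on each table during every walk.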
Representative claims
What is claimed:
1. A method for improving the scalability of virtual TLBs in multi-processor virtual machines, comprising: providing at least one virtual TLB; and sharing said at least one virtual TLB between at least two virtual processors from a plurality of virtual processors in a virtual machine environment, wherein said sharing involves performing at least one operation; said at least two virtual processors sharing memory in a non-uniform memory architecture (NUMA) node; allocating memory for said at least one virtual TLB from the NUMA node; and forwarding guest virtual address invalidation requests to at least one of said virtual TLBs based on a heuristic; wherein said at least one operation comprises: maintaining walk generation counters for corresponding said plurality of virtual processors in said virtual machine environment, wherein said walk generation counters are configured to be incremented to a first set of values when said virtual processors start accessing shadow page tables associated with said at least one virtual TLB, and wherein said walk generation counters are configured to be incremented to a second set of values when said virtual processors have finished accessing said shadow page tables; and preventing the repurposing of said shadow page tables with a non-zero reference count at the time of or since the last transition between said first set of values and said second set of values for one or more of shadow page table generation counters, thereby effectively locking said shadow page tables implicitly via said shadow page table generation counters.
2. The method according to claim 1, in association with said shadow page tables, further comprising: accessing a shadow page table (a) only through a reference from a higher-level shadow page table or (b) through a reference from a virtual processor if said shadow page table is a top-level shadow page table, only when a walk generation counter associated with a virtual processor accessing the shadow page table is at said first set of values; locking exclusive said shadow page table when said shadow page table is unreferenced to prevent new references to said shadow page table from being created; taking a snapshot of said walk generation counters corresponding to said virtual processors after locking exclusive said shadow page table; and wherein said shadow page table is reclaimed after all said generation counters corresponding to said virtual processors have arrived at said second set of values as verified by said snapshot.
3. The method according to claim 2, further comprising: providing a first list and a second list, wherein said first list is configured to represent a list of unreferenced shadow page tables, and said second list is configured to represent a list of locked shadow page tables that are prevented from being referenced; pushing at least one shadow page table associated with said shadow page tables that becomes unreferenced onto said first list; moving said at least one shadow page table from said first list onto said second list after locking said at least one shadow page table to prevent new references; taking said snapshot of said walk generation counters after pushing said at least one shadow page table onto said second list; deferring the freeing of said at least one shadow page table on said second list until said virtual processors have incremented their walk generation counters past said second values according to said snapshot.
4. The method according to claim 3, further comprising triggering the insertion and removal of at least one shadow page table from said second list based on heuristics such as the number of free shadow page table and the rate of allocations of shadow page table.
5. A method for improving the scalability of virtual TLBs in multi-processor virtual machines, comprising: providing at least one virtual TLB; and sharing said at least one virtual TLB between at least two virtual processors from a plurality of virtual processors in a virtual machine environment, wherein said sharing involves performing at least one operation; said at least two virtual processors sharing memory in a non-uniform memory architecture (NUMA) node; allocating memory for said at least one virtual TLB from the NUMA node; and forwarding guest virtual address invalidation requests to at least one of said virtual TLBs based on a heuristic; wherein said at least one operation comprises: using fill generation counters corresponding to said plurality of virtual processors, wherein said fill generation counters are configured to be incremented to a first set of values prior to starting a fill in said virtual TLB, and wherein said fill generation counters are configured to be incremented to a second set of values after said fill; and wherein an invalidation request is performed only after all fill generation counters corresponding to said plurality of virtual processors in said virtual machine environment have arrived at said second set of values.
6. A method for improving the scalability of virtual TLBs in multi-processor virtual machines, comprising: providing at least one virtual TLB; and sharing said at least one virtual TLB between at least two virtual processors from a plurality of virtual processors in a virtual machine environment, wherein said sharing involves performing at least one operation; said at least two virtual processors sharing memory in a non-uniform memory architecture (NUMA) node; allocating memory for said at least one virtual TLB from the NUMA node; and forwarding guest virtual address invalidation requests to at least one of said virtual TLBs based on a heuristic; wherein said at least one operation comprises: defining a virtual TLB mapped state, a hardware TLB mapped state, and an un-mapped state for at least one guest page associated with said virtual TLB; transitioning from said virtual TLB mapped state to said hardware TLB mapped state upon invalidation of all shadow page table translations in said virtual TLB that point to said at least one guest page; initiating a hardware TLB flush on every physical processor that the virtual machine environment has used based on heuristics such as the number of guest pages in the hardware TLB mapped state; and transitioning from said hardware TLB mapped state to said un-mapped state, wherein any hardware TLB translations on all physical processors underlying said shadow page table translations are eliminated in batched manner.
7. The method according to claim 1, wherein said at least one operation further comprises: determining a NUMA node on which a guest page table to be shadowed resides; and allocating a page for a shadow page table from the memory of said NUMA node, wherein said shadow page table caches translations for said guest page table, and wherein said allocating increases the likelihood that said shadow page table is on the same NUMA node as a processor that is walking said shadow page table.
8. A method for improving the scalability of virtual TLBs in multi-processor virtual machines, comprising: providing at least one virtual TLB; and sharing said at least one virtual TLB between at least two virtual processors from a plurality of virtual processors in a virtual machine environment, wherein said sharing involves performing at least one operation; said at least two virtual processors sharing memory in a non-uniform memory architecture (NUMA) node; allocating memory for said at least one virtual TLB from the NUMA node; and forwarding guest virtual address invalidation requests to at least one of said virtual TLBs based on a heuristic; wherein said at least one operation comprises: maintaining a stale generation counter for a shadow page table in said virtual TLB; incrementing said stale generation counter if said shadow page table becomes stale; write-protecting a non-terminal guest page table so that said shadow page table can be made not stale by removing stale entries; taking a snapshot of said stale generation counter for said shadow page table and any other shadow page table, while walking a tree of shadow page tables down to a terminal shadow page table to perform a fill at a first time; checking the most recent state of said stale generation counter and any other generation counter for each shadow page table along said walk of said tree of shadow page tables against said snapshot at a second time after the terminal guest page table entry had been read; and wherein if said checking yields an incremented stale generation counter for at least one non-terminal shadow page table, restarting the virtual TLB fill.
9. A method for improving the scalability of virtual TLBs in multi-processor virtual machines, comprising: providing at least one virtual TLB; and sharing said at least one virtual TLB between at least two virtual processors from a plurality of virtual processors in a virtual machine environment, wherein said sharing involves performing at least one operation; said at least two virtual processors sharing memory in a non-uniform memory architecture (NUMA) node; allocating memory for said at least one virtual TLB from the NUMA node; and forwarding guest virtual address invalidation requests to at least one of said virtual TLBs based on a heuristic; wherein said at least one operation comprises: allocating and linking in a new shadow page table, for said virtual TLB, to shadow a guest page table instead of zeroing and linking in an existing shadow page table that already shadows said guest page table when performing a fill that requires linking in said existing shadow page table.
10. The method according to claim 1, wherein said at least one operation further comprises: coalescing a first shadow page table and a second shadow page table when said first shadow page table and said second shadow page table shadow a guest page table with substantially the same attributes, wherein said coalescing is performed according to a heuristic.
11. A method for improving the scalability of virtual TLBs in multi-processor virtual machines, comprising: providing at least one virtual TLB; and sharing said at least one virtual TLB between at least two virtual processors from a plurality of virtual processors in a virtual machine environment, wherein said sharing involves performing at least one operation; said at least two virtual processors sharing memory in a non-uniform memory architecture (NUMA) node; allocating memory for said at least one virtual TLB from the NUMA node; and forwarding guest virtual address invalidation requests to at least one of said virtual TLBs based on a heuristic; wherein said at least one operation comprises: permitting only shadow page tables at a specific level of a shadow page table tree to be shared between at least two shadow address spaces; keeping a single back reference for a given shadow page table since said given shadow page table not at said specific level is not shared and has a reference count of at most one; and unlinking said shadow page table not at said specific level from its only parent by following said single back reference.
12. A method for improving the scalability of virtual TLBs in multi-processor virtual machines, comprising: providing at least one virtual TLB; and sharing said at least one virtual TLB between at least two virtual processors from a plurality of virtual processors in a virtual machine environment, wherein said sharing involves performing at least one operation; said at least two virtual processors sharing memory in a non-uniform memory architecture (NUMA) node; allocating memory for said at least one virtual TLB from the NUMA node; and forwarding guest virtual address invalidation requests to at least one of said virtual TLBs based on a heuristic; said at least one operation configured for flushing said virtual TLB using a generation counter, further comprising: maintaining a virtual TLB generation counter for a virtual machine; incrementing said virtual TLB generation counter to a first value prior to starting a reset of said virtual TLB associated with said virtual machine; forcing every virtual processor corresponding to said plurality of virtual processors in said virtual machine to switch to a new shadow address space to reset said virtual TLB; and incrementing said virtual TLB generation counter to a second value after completing said reset, and wherein said first and second values represent different generations of said virtual TLB, wherein said reset resides between said generations.
13. The method according to claim 12, further comprising: tagging a shadow page table in said virtual TLB upon allocation with a snapshot of said virtual TLB generation counter; tagging information on whether a guest page is mapped with said snapshot of said virtual TLB generation counter; and using only shadow page tables that belong to the current generation of said generations.
14. A system for improving the scalability of virtual TLBs in multi-processor virtual machines, comprising: a first subsystem comprising at least one virtual TLB; a second subsystem comprising at least two virtual processors from a plurality of virtual processors in a single virtual machine that share said at least one virtual TLB; a third subsystem that maintains walk generation counters for corresponding said plurality of virtual processors in said virtual machine environment, wherein said walk generation counters are configured to be incremented to a first set of values when said virtual processors start accessing shadow page tables associated with said at least one virtual TLB, and wherein said walk generation counters are configured to be incremented to a second set of values when said virtual processors have finished accessing said shadow page tables; and a fourth subsystem that prevents the repurposing of said shadow page tables with a non-zero reference count at the time of or since the last transition between said first set of values and said second set of values for one or more of shadow page table generation counters, thereby effectively locking said shadow page tables implicitly via said shadow page table generation counters.
15. The system according to claim 14, further comprising: a fifth subsystem that determines a NUMA node on which a guest page table to be shadowed resides; and a sixth subsystem that allocates a page for a shadow page table from the memory of said NUMA node, wherein said shadow page table caches translations for said guest page table, and wherein said allocating increases the likelihood that said shadow page table is on the same NUMA node as a processor that is walking said shadow page table.
16. A computer readable storage medium having stored thereon computer executable instructions for improving the scalability of virtual TLBs in multi-processor virtual machines, comprising: a first instruction that provides use of at least one virtual TLB; and a second instruction that provides the sharing of said at least one virtual TLB between at least two virtual processors from a plurality of virtual processors in a single virtual machine environment, wherein said sharing involves performing at least one operation; a third instruction that maintains walk generation counters for corresponding said plurality of virtual processors in said virtual machine environment, wherein said walk generation counters are configured to be incremented to a first set of values when said virtual processors start accessing shadow page tables associated with said at least one virtual TLB, and wherein said walk generation counters are configured to be incremented to a second set of values when said virtual processors have finished accessing said shadow page tables; and a fourth instruction that prevents the repurposing of said shadow page tables with a non-zero reference count at the time of or since the last transition between said first set of values and said second set of values for one or more of shadow page table generation counters, thereby effectively locking said shadow page tables implicitly via said shadow page table generation counters.
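Claim 5 makes an invalidation wait until every fill that was in flight has completed. The following sketch is a simplified, single-threaded illustration of why that ordering matters; it is not the patented implementation. `FillTracker`, its methods, the dictionary standing in for the virtual TLB, and the odd/even counter convention are all assumptions made for this example.

```python
class FillTracker:
    """Toy virtual TLB with one fill generation counter per virtual processor."""

    def __init__(self, num_vps):
        # Assumption: odd value = fill in progress, even value = no fill in progress.
        self.fill_gen = [0] * num_vps
        self.tlb = {}  # guest virtual address -> cached translation

    def begin_fill(self, vp):
        self.fill_gen[vp] += 1  # counter moves to its "fill started" value

    def end_fill(self, vp, gva, translation):
        self.tlb[gva] = translation
        self.fill_gen[vp] += 1  # counter moves to its "fill finished" value

    def snapshot(self):
        return list(self.fill_gen)

    def pending_fills(self, snap):
        """Processors whose fill was in flight at snapshot time and still has not finished."""
        return [vp for vp, s in enumerate(snap)
                if s % 2 == 1 and self.fill_gen[vp] == s]

    def try_invalidate(self, gva, snap):
        """Perform the invalidation only when no snapshotted fill is still pending."""
        if self.pending_fills(snap):
            return False  # caller retries later
        self.tlb.pop(gva, None)
        return True
```

If the invalidation ran while a fill was still in flight, that fill could insert a translation computed from the old guest page table entry after the invalidation had already completed. Waiting for the counters to advance ensures any such entry is in the table, and therefore removed, when the invalidation runs.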
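The three guest-page states in claim 6 form a small state machine in which the expensive hardware TLB flush is deferred and batched. The sketch below is one possible reading under stated assumptions: `GuestPageTracker`, the `flush_threshold` parameter, and the counter that stands in for a real flush on every physical processor are hypothetical, and the only heuristic modeled is the number of pages waiting in the hardware TLB mapped state.

```python
from enum import Enum


class PageState(Enum):
    VTLB_MAPPED = "virtual TLB mapped"
    HW_TLB_MAPPED = "hardware TLB mapped"
    UNMAPPED = "un-mapped"


class GuestPageTracker:
    """Toy model of the deferred two-stage unmapping of guest pages."""

    def __init__(self, flush_threshold=2):
        self.state = {}
        self.flush_threshold = flush_threshold  # hypothetical batching heuristic
        self.hw_flushes = 0  # stands in for flushing every physical processor used

    def map_page(self, page):
        self.state[page] = PageState.VTLB_MAPPED

    def invalidate_shadow_entries(self, page):
        """Stage 1: all SPT translations to the page are gone, hardware TLBs may still cache it."""
        if self.state.get(page) is PageState.VTLB_MAPPED:
            self.state[page] = PageState.HW_TLB_MAPPED
        pending = [p for p, s in self.state.items() if s is PageState.HW_TLB_MAPPED]
        if len(pending) >= self.flush_threshold:
            self._flush_hardware_tlbs(pending)

    def _flush_hardware_tlbs(self, pending):
        """Stage 2: one flush retires every page waiting in the hardware TLB mapped state."""
        self.hw_flushes += 1
        for page in pending:
            self.state[page] = PageState.UNMAPPED
```

With a threshold of two, invalidating the shadow entries for two pages costs a single simulated flush rather than one flush per page, which is the batching effect the claim describes.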
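Claim 8 describes an optimistic fill: snapshot the stale generation counters on the way down the shadow page table tree, read the terminal guest entry, then re-check the counters and restart if a non-terminal table went stale in between. A minimal sketch of that retry loop follows. The names `ShadowPageTable` and `fill`, the `read_terminal_entry` callback, and the `max_attempts` bound are assumptions introduced for illustration.

```python
class ShadowPageTable:
    """Toy shadow page table carrying only its stale generation counter."""

    def __init__(self, name):
        self.name = name
        self.stale_gen = 0

    def mark_stale(self):
        self.stale_gen += 1


def fill(path, read_terminal_entry, max_attempts=3):
    """Attempt a virtual TLB fill along `path` (root first, terminal SPT last).

    Returns (entry, attempts_used). The fill restarts whenever a non-terminal
    SPT's stale generation counter changed between the snapshot and the re-check.
    """
    for attempt in range(1, max_attempts + 1):
        # First time: snapshot the counters while walking down the tree.
        snap = [spt.stale_gen for spt in path]
        entry = read_terminal_entry()
        # Second time: compare the non-terminal SPTs against the snapshot.
        unchanged = all(spt.stale_gen == s
                        for spt, s in zip(path[:-1], snap[:-1]))
        if unchanged:
            return entry, attempt
    raise RuntimeError("fill kept racing with guest page table updates")
```

The design choice illustrated here is that readers take no lock on the upper-level tables; they pay only for a retry in the uncommon case where a concurrent update made one of those tables stale.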
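Claims 12 and 13 flush the whole virtual TLB by bumping a generation counter and treating any shadow page table tagged with an older generation as unusable. The sketch below shows the tagging idea in a toy form. `VirtualTlb`, the dictionary-based tables and processors, and the `"sas"` key standing in for a shadow address space are hypothetical, and the reset is modeled as a simple loop with no concurrency.

```python
class VirtualTlb:
    """Toy virtual TLB whose contents are invalidated wholesale by a generation counter."""

    def __init__(self):
        self.generation = 0

    def allocate_spt(self, gpt):
        # Tag the new shadow page table with a snapshot of the current generation.
        return {"gpt": gpt, "gen": self.generation}

    def is_usable(self, spt):
        # Only tables that belong to the current generation may be used.
        return spt["gen"] == self.generation

    def flush(self, virtual_processors):
        self.generation += 1  # first value: the reset has started
        for vp in virtual_processors:
            vp["sas"] = {}  # every processor switches to a new, empty shadow address space
        self.generation += 1  # second value: the reset has completed
```

After `flush`, tables allocated earlier fail the `is_usable` check without anyone having to visit and zero them individually, so the cost of the flush does not grow with the number of cached tables.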