[Patent] Scalability of virtual TLBs for multi-processor virtual machines


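IPC classification information
Country / Type	United States (US) Patent, granted
International Patent Classification (IPC, 7th edition)	G06F-012/10
Application number	UP-0644502 (2006-12-22)
Registration number	US-7788464 (2010-09-20)
Inventors / Address	Sheu, John Te-Jui; Cohen, Ernest S.; Hendel, Matthew D.; Wang, Landy; Vega, Rene Antonio; Nanavati, Sharvil A.
Applicant / Address	Microsoft Corporation
Agent / Address	Woodcock Washburn LLP
Citation information	Times cited: 7; Patents cited: 27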

Abstract

Various operations are provided that improve the scalability of virtual TLBs in multi-processor virtual machines, and they include: implicitly locking SPTs using per-processor generation counters; waiting for pending fills on other virtual processors to complete before servicing a GVA invalidation using the counters; write-protecting or unmapping guest pages in a deferred two-stage process or reclaiming SPTs in a deferred two-stage process; periodically coalescing two SPTs that shadow the same GPT with the same attributes; sharing SPTs between two SASes only at a specified level in a SPTT; flushing the entire virtual TLB using a generation counter; allocating a SPT to GPT from a NUMA node on which the GPT resides; having an instance for each NUMA node on which a virtual machine runs; and, correctly handling the serializing instructions executed by a guest in a virtual machine with more than one virtual processor sharing the virtual TLB.
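
The "waiting for pending fills" operation in the abstract can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions, not the patented implementation: the class name `FillCounters`, its methods, the odd/even encoding of the counter (odd meaning a fill is in progress), and the snapshot-based wait are all hypothetical choices made for illustration.

```python
class FillCounters:
    """Per-virtual-processor fill generation counters (hypothetical model).

    Assumed encoding: odd = a virtual TLB fill is in progress,
    even = no fill in progress.
    """

    def __init__(self, vp_count):
        self.counters = [0] * vp_count

    def begin_fill(self, vp):
        # Increment to the "first" (odd) value before starting a fill.
        assert self.counters[vp] % 2 == 0, "fill already in progress"
        self.counters[vp] += 1

    def end_fill(self, vp):
        # Increment to the "second" (even) value once the fill is done.
        assert self.counters[vp] % 2 == 1, "no fill in progress"
        self.counters[vp] += 1

    def snapshot(self):
        return list(self.counters)

    def pending_since(self, snap):
        """Virtual processors still inside a fill that was running at snapshot time."""
        return [vp for vp, (old, now) in enumerate(zip(snap, self.counters))
                if old % 2 == 1 and now == old]

    def can_invalidate(self, snap):
        # The invalidation may proceed once no snapshotted fill is still pending.
        return not self.pending_since(snap)


fills = FillCounters(vp_count=3)
fills.begin_fill(1)                 # VP 1 is mid-fill
snap = fills.snapshot()             # an invalidation request arrives
print(fills.can_invalidate(snap))   # False: must wait for VP 1
fills.begin_fill(2)                 # a fill started later is not waited on
fills.end_fill(1)
print(fills.can_invalidate(snap))   # True
```

In this model the invalidating processor never takes a lock against the filling processors; it only compares counter values against its snapshot, which is the scalability point the abstract is making.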

Representative claims

What is claimed:

1. A method for improving the scalability of virtual TLBs in multi-processor virtual machines, comprising: providing at least one virtual TLB; and sharing said at least one virtual TLB between at least two virtual processors from a plurality of virtual processors in a virtual machine environment, wherein said sharing involves performing at least one operation; said at least two virtual processors sharing memory in a non-uniform memory architecture (NUMA) node; allocating memory for said at least one virtual TLB from the NUMA node; and forwarding guest virtual address invalidation requests to at least one of said virtual TLBs based on a heuristic; wherein said at least one operation comprises: maintaining walk generation counters for corresponding said plurality of virtual processors in said virtual machine environment, wherein said walk generation counters are configured to be incremented to a first set of values when said virtual processors start accessing shadow page tables associated with said at least one virtual TLB, and wherein said walk generation counters are configured to be incremented to a second set of values when said virtual processors have finished accessing said shadow page tables; and preventing the repurposing of said shadow page tables with a non-zero reference count at the time of or since the last transition between said first set of values and said second set of values for one or more of shadow page table generation counters, thereby effectively locking said shadow page tables implicitly via said shadow page table generation counters.

2. The method according to claim 1, in association with said shadow page tables, further comprising: accessing a shadow page table (a) only through a reference from a higher-level shadow page table or (b) through a reference from a virtual processor if said shadow page table is a top-level shadow page table, only when a walk generation counter associated with a virtual processor accessing the shadow page table is at said first set of values; locking exclusive said shadow page table when said shadow page table is unreferenced to prevent new references to said shadow page table from being created; taking a snapshot of said walk generation counters corresponding to said virtual processors after locking exclusive said shadow page table; and wherein said shadow page table is reclaimed after all said generation counters corresponding to said virtual processors have arrived at said second set of values as verified by said snapshot.

3. The method according to claim 2, further comprising: providing a first list and a second list, wherein said first list is configured to represent a list of unreferenced shadow page tables, and said second list is configured to represent a list of locked shadow page tables that are prevented from being referenced; pushing at least one shadow page table associated with said shadow page tables that becomes unreferenced onto said first list; moving said at least one shadow page table from said first list onto said second list after locking said at least one shadow page table to prevent new references; taking said snapshot of said walk generation counters after pushing said at least one shadow page table onto said second list; deferring the freeing of said at least one shadow page table on said second list until said virtual processors have incremented their walk generation counters past said second values according to said snapshot.

4. The method according to claim 3, further comprising triggering the insertion and removal of at least one shadow page table from said second list based on heuristics such as the number of free shadow page table and the rate of allocations of shadow page table.

5. A method for improving the scalability of virtual TLBs in multi-processor virtual machines, comprising: providing at least one virtual TLB; and sharing said at least one virtual TLB between at least two virtual processors from a plurality of virtual processors in a virtual machine environment, wherein said sharing involves performing at least one operation; said at least two virtual processors sharing memory in a non-uniform memory architecture (NUMA) node; allocating memory for said at least one virtual TLB from the NUMA node; and forwarding guest virtual address invalidation requests to at least one of said virtual TLBs based on a heuristic; wherein said at least one operation comprises: using fill generation counters corresponding to said plurality of virtual processors, wherein said fill generation counters are configured to be incremented to a first set of values prior to starting a fill in said virtual TLB, and wherein said fill generation counters are configured to be incremented to a second set of values after said fill; and wherein an invalidation request is performed only after all fill generation counters corresponding to said plurality of virtual processors in said virtual machine environment have arrived at said second set of values.

6. A method for improving the scalability of virtual TLBs in multi-processor virtual machines, comprising: providing at least one virtual TLB; and sharing said at least one virtual TLB between at least two virtual processors from a plurality of virtual processors in a virtual machine environment, wherein said sharing involves performing at least one operation; said at least two virtual processors sharing memory in a non-uniform memory architecture (NUMA) node; allocating memory for said at least one virtual TLB from the NUMA node; and forwarding guest virtual address invalidation requests to at least one of said virtual TLBs based on a heuristic; wherein said at least one operation comprises: defining a virtual TLB mapped state, a hardware TLB mapped state, and an un-mapped state for at least one guest page associated with said virtual TLB; transitioning from said virtual TLB mapped state to said hardware TLB mapped state upon invalidation of all shadow page table translations in said virtual TLB that point to said at least one guest page; initiating a hardware TLB flush on every physical processor that the virtual machine environment has used based on heuristics such as the number of guest pages in the hardware TLB mapped state; and transitioning from said hardware TLB mapped state to said un-mapped state, wherein any hardware TLB translations on all physical processors underlying said shadow page table translations are eliminated in batched manner.

7. The method according to claim 1, wherein said at least one operation further comprises: determining a NUMA node on which a guest page table to be shadowed resides; and allocating a page for a shadow page table from the memory of said NUMA node, wherein said shadow page table caches translations for said guest page table, and wherein said allocating increases the likelihood that said shadow page table is on the same NUMA node as a processor that is walking said shadow page table.

8. A method for improving the scalability of virtual TLBs in multi-processor virtual machines, comprising: providing at least one virtual TLB; and sharing said at least one virtual TLB between at least two virtual processors from a plurality of virtual processors in a virtual machine environment, wherein said sharing involves performing at least one operation; said at least two virtual processors sharing memory in a non-uniform memory architecture (NUMA) node; allocating memory for said at least one virtual TLB from the NUMA node; and forwarding guest virtual address invalidation requests to at least one of said virtual TLBs based on a heuristic; wherein said at least one operation comprises: maintaining a stale generation counter for a shadow page table in said virtual TLB; incrementing said stale generation counter if said shadow page table becomes stale; write-protecting a non-terminal guest page table so that said shadow page table can be made not stale by removing stale entries; taking a snapshot of said stale generation counter for said shadow page table and any other shadow page table, while walking a tree of shadow page tables down to a terminal shadow page table to perform a fill at a first time; checking the most recent state of said stale generation counter and any other generation counter for each shadow page table along said walk of said tree of shadow page tables against said snapshot at a second time after the terminal guest page table entry had been read; and wherein if said checking yields an incremented stale generation counter for at least one non-terminal shadow page table, restarting the virtual TLB fill.

9. A method for improving the scalability of virtual TLBs in multi-processor virtual machines, comprising: providing at least one virtual TLB; and sharing said at least one virtual TLB between at least two virtual processors from a plurality of virtual processors in a virtual machine environment, wherein said sharing involves performing at least one operation; said at least two virtual processors sharing memory in a non-uniform memory architecture (NUMA) node; allocating memory for said at least one virtual TLB from the NUMA node; and forwarding guest virtual address invalidation requests to at least one of said virtual TLBs based on a heuristic; wherein said at least one operation comprises: allocating and linking in a new shadow page table, for said virtual TLB, to shadow a guest page table instead of zeroing and linking in an existing shadow page table that already shadows said guest page table when performing a fill that requires linking in said existing shadow page table.

10. The method according to claim 1, wherein said at least one operation further comprises: coalescing a first shadow page table and a second shadow page table when said first shadow page table and said second shadow page table shadow a guest page table with substantially the same attributes, wherein said coalescing is performed according to a heuristic.

11. A method for improving the scalability of virtual TLBs in multi-processor virtual machines, comprising: providing at least one virtual TLB; and sharing said at least one virtual TLB between at least two virtual processors from a plurality of virtual processors in a virtual machine environment, wherein said sharing involves performing at least one operation; said at least two virtual processors sharing memory in a non-uniform memory architecture (NUMA) node; allocating memory for said at least one virtual TLB from the NUMA node; and forwarding guest virtual address invalidation requests to at least one of said virtual TLBs based on a heuristic; wherein said at least one operation comprises: permitting only shadow page tables at a specific level of a shadow page table tree to be shared between at least two shadow address spaces; keeping a single back reference for a given shadow page table since said given shadow page table not at said specific level is not shared and has a reference count of at most one; and unlinking said shadow page table not at said specific level from its only parent by following said single back reference.

12. A method for improving the scalability of virtual TLBs in multi-processor virtual machines, comprising: providing at least one virtual TLB; and sharing said at least one virtual TLB between at least two virtual processors from a plurality of virtual processors in a virtual machine environment, wherein said sharing involves performing at least one operation; said at least two virtual processors sharing memory in a non-uniform memory architecture (NUMA) node; allocating memory for said at least one virtual TLB from the NUMA node; and forwarding guest virtual address invalidation requests to at least one of said virtual TLBs based on a heuristic; said at least one operation configured for flushing said virtual TLB using a generation counter, further comprising: maintaining a virtual TLB generation counter for a virtual machine; incrementing said virtual TLB generation counter to a first value prior to starting a reset of said virtual TLB associated with said virtual machine; forcing every virtual processor corresponding to said plurality of virtual processors in said virtual machine to switch to a new shadow address space to reset said virtual TLB; and incrementing said virtual TLB generation counter to a second value after completing said reset, and wherein said first and second values represent different generations of said virtual TLB, wherein said reset resides between said generations.

13. The method according to claim 12, further comprising: tagging a shadow page table in said virtual TLB upon allocation with a snapshot of said virtual TLB generation counter; tagging information on whether a guest page is mapped with said snapshot of said virtual TLB generation counter; and using only shadow page tables that belong to the current generation of said generations.

14. A system for improving the scalability of virtual TLBs in multi-processor virtual machines, comprising: a first subsystem comprising at least one virtual TLB; a second subsystem comprising at least two virtual processors from a plurality of virtual processors in a single virtual machine that share said at least one virtual TLB; a third subsystem that maintains walk generation counters for corresponding said plurality of virtual processors in said virtual machine environment, wherein said walk generation counters are configured to be incremented to a first set of values when said virtual processors start accessing shadow page tables associated with said at least one virtual TLB, and wherein said walk generation counters are configured to be incremented to a second set of values when said virtual processors have finished accessing said shadow page tables; and a fourth subsystem that prevents the repurposing of said shadow page tables with a non-zero reference count at the time of or since the last transition between said first set of values and said second set of values for one or more of shadow page table generation counters, thereby effectively locking said shadow page tables implicitly via said shadow page table generation counters.

15. The system according to claim 14, further comprising: a fifth subsystem that determines a NUMA node on which a guest page table to be shadowed resides; and a sixth subsystem that allocates a page for a shadow page table from the memory of said NUMA node, wherein said shadow page table caches translations for said guest page table, and wherein said allocating increases the likelihood that said shadow page table is on the same NUMA node as a processor that is walking said shadow page table.

16. A computer readable storage medium having stored thereon computer executable instructions for improving the scalability of virtual TLBs in multi-processor virtual machines, comprising: a first instruction that provides use of at least one virtual TLB; and a second instruction that provides the sharing of said at least one virtual TLB between at least two virtual processors from a plurality of virtual processors in a single virtual machine environment, wherein said sharing involves performing at least one operation; a third instruction that maintains walk generation counters for corresponding said plurality of virtual processors in said virtual machine environment, wherein said walk generation counters are configured to be incremented to a first set of values when said virtual processors start accessing shadow page tables associated with said at least one virtual TLB, and wherein said walk generation counters are configured to be incremented to a second set of values when said virtual processors have finished accessing said shadow page tables; and a fourth instruction that prevents the repurposing of said shadow page tables with a non-zero reference count at the time of or since the last transition between said first set of values and said second set of values for one or more of shadow page table generation counters, thereby effectively locking said shadow page tables implicitly via said shadow page table generation counters.
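
The walk generation counters and the two-list deferred reclamation recited in the claims can be sketched as a single-threaded model. This is an illustrative sketch only, not the claimed implementation: the class `ShadowPageTableReclaimer`, its method names, the odd/even counter encoding (odd meaning the virtual processor is walking shadow page tables), and the string stand-ins for shadow page tables are all assumptions.

```python
class ShadowPageTableReclaimer:
    """Hypothetical model of deferred, two-stage shadow page table reclamation."""

    def __init__(self, vp_count):
        self.walk = [0] * vp_count   # assumed encoding: odd = VP is walking SPTs
        self.unreferenced = []       # first list: SPTs with no remaining references
        self.locked = []             # second list: [batch, counter snapshot] entries
        self.free = []               # SPTs that are safe to repurpose

    def begin_walk(self, vp):
        assert self.walk[vp] % 2 == 0, "walk already in progress"
        self.walk[vp] += 1

    def end_walk(self, vp):
        assert self.walk[vp] % 2 == 1, "no walk in progress"
        self.walk[vp] += 1

    def release(self, spt):
        # An SPT that lost its last reference goes onto the first list.
        self.unreferenced.append(spt)

    def lock_unreferenced(self):
        # Stage 1: move the batch onto the second list, then snapshot the counters.
        batch, self.unreferenced = self.unreferenced, []
        if not batch:
            return
        entry = [batch, None]
        self.locked.append(entry)
        entry[1] = list(self.walk)

    def reclaim(self):
        # Stage 2: free a batch once every VP that was walking at snapshot
        # time has advanced its counter past the snapshotted value.
        still_locked = []
        for batch, snap in self.locked:
            walkers_done = all(old % 2 == 0 or now != old
                               for old, now in zip(snap, self.walk))
            if walkers_done:
                self.free.extend(batch)
            else:
                still_locked.append([batch, snap])
        self.locked = still_locked
        return list(self.free)


r = ShadowPageTableReclaimer(vp_count=2)
r.begin_walk(0)            # VP 0 may still hold a pointer into an SPT
r.release("spt-A")
r.lock_unreferenced()
print(r.reclaim())         # [] : VP 0 has not finished its walk
r.end_walk(0)
print(r.reclaim())         # ['spt-A']
```

The walking processors only increment their own counters, so a walk never contends on a shared lock; the cost of synchronization is shifted to the comparatively rare reclamation path.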

Patents cited by this patent (27)

Gannon Patrick M. (Poughkeepsie NY); Gum Peter H. (Poughkeepsie NY); Hough Roger E. (Highland NY); Murray Robert E. (Woodstock NY), Apparatus and method for TLB purge reduction in a multi-level machine system.
Lim, Beng-Hong; Le, Bich C.; Bugnion, Edouard, Deferred shadowing of segment descriptors in a virtual machine monitor for a segmented computer architecture.
Eberhard Raymond J. (Endicott NY); Goodin Douglas J. (Berkshire NY); Rundle Jr. Alfred T. (Endwell NY), Dynamic validity facility for fast purging of translation bypass buffers.
Cohen, Ernest S., Lazy flushing of translation lookaside buffers.
Agesen, Ole; Subrahmanyam, Pratap; Adams, Keith M., Maintaining coherency of derived data in a computer system.
Willman, Bryan Mark; England, Paul; Peinado, Marcus, Memory isolation through address translation data edit control.
Lopez-Aguado Herbert (Mountain View CA); Mehring Peter A. (Sunnyvale CA), Method and apparatus for the reduction of tablewalk latencies in a translation look aside buffer.
Moore Charles R. (Austin TX); Muhich John S. (Austin TX), Method and system for maintaining translation lookaside buffer coherency in a multiprocessor data processing system.
Agesen, Ole; Subrahmanyam, Pratap, Method and system for performing virtual to physical address translations in a virtual machine monitor.
Sutton Peter G. (Yorktown Heights NY), Method for reducing translation look aside buffer purges in a multitasking system.
Gaertner, Ute; Hagspiel, Norbert; Lehnert, Frank; Pfeffer, Erwin; Schelm, Kerstin, Method for sharing a translation lookaside buffer between CPUs.
Bauman, Ellen Marie; Dosch, David Lee; Graham, Charles Scott; Holthaus, Brian Gerard; Lipps, Daniel Robert; Moertl, Daniel Frank; Movall, Paul Edward; Wetzel, Daniel Paul, Method of mapping multiple address spaces into single PCI bus.
Nesheim William A.; Guzovskiy Aleksandr, Multiprocessor system having mapping table in each node to map global physical addresses to local physical addresses of.
Willman, Bryan Mark; England, Paul, Page granular curtained memory via mapping control.
Saxena Sunil (Sunnyvale CA), Profile guided TLB and cache optimization.
Gum Peter H. (Poughkeepsie NY); Hough Roger E. (Highland NY); Tallman Peter H. (Poughkeepsie NY); Curlee III Thomas O. (Poughkeepsie NY), Selective guest system purge control.
DeLano Eric R.; Buckley Michael A.; Weir Duncan C., Software assisted hardware TLB miss handler.
Laudon James P.; Lenoski Daniel E., System and method for maintaining coherency of virtual-to-physical memory translations in a multiprocessor computer.
Ginter Karl L.; Shear Victor H.; Spahn Francis J.; Van Wie David M., System and methods for secure transaction management and electronic rights protection.
Richard L. Frank; Gopalan Arun; Michael J. Cusson; Daniel E. O'Shaughnessy, System for efficiently maintaining translation lockaside buffer consistency in a multi-threaded, multi-processor virtual memory system.
Ginter Karl L.; Shear Victor H.; Sibert W. Olin; Spahn Francis J.; Van Wie David M., Systems and methods for secure transaction management and electronic rights protection.
Ginter Karl L.; Shear Victor H.; Spahn Francis J.; Van Wie David M., Systems and methods for secure transaction management and electronic rights protection.
Chen, Xiaoxin; Munoz, Alberto J., TLB miss fault handler and method for accessing multiple page tables.
Henry Stracovsky, Techniques for improving memory access in a virtual memory system.
Bugnion Edouard; Devine Scott W.; Rosenblum Mendel, Virtual machine monitors for scalable multiprocessors.
White Steven W.; McWilliams G. Jeannette; Kemp Jack Wayne, Virtual memory mapping method and system for memory management of pools of logical partitions for bat and TLB entries in.
Neiger, Gilbert; Chou, Stephen; Cota-Robles, Erik; Jeyasingh, Stalinselvaraj; Kagi, Alain; Kozuch, Michael; Uhlig, Richard; Schoenberg, Sebastian, Virtual translation lookaside buffer.

Patents citing this patent (7)

Ben-Yehuda, Shmuel; Shalev, Leah; Wasserman, Orit Luba; Yassour, Ben-Ami, Direct memory access in a computing environment.
Ohmacht, Martin, Generation-based memory synchronization in a multiprocessor system with weakly consistent memory accesses.
Li, Yadong, NUMA-aware scaling for network devices.
Vincent, Pradeep, Virtual memory management to reduce address cache flushing during I/O operations.
Devine, Scott W.; Rogel, Lawrence S.; Bungale, Prashanth P.; Fry, Gerald A., Virtualization with in-place translation.
Devine, Scott W.; Rogel, Lawrence S.; Bungale, Prashanth P.; Fry, Gerald A., Virtualization with multiple shadow page tables.
Devine, Scott W.; Rogel, Lawrence S.; Bungale, Prashanth P.; Fry, Gerald A., Virtualization with shadow page tables.
