System and method for coordinating cluster state information
IPC classification information
Country / Type
United States (US) Patent
Granted
International Patent Classification (IPC, 7th edition)
G06F-012/00
G06F-015/167
Application number
UP-0760484
(2007-06-08)
Registration number
US-7685358
(2010-04-21)
Inventor / Address
Larson, Richard O.
Rowe, Alan L.
sen Sarma, Joydeep
Applicant / Address
NetApp, Inc.
Agent / Address
Cesari and McKenna, LLP
Citation information
Times cited: 10
Patents cited: 28
Abstract
A method for managing a cluster of file servers is disclosed. The method has the first step of writing coordinating information for a plurality of servers of the cluster of servers to a master mailbox record, the master mailbox record written to a specific location on each disk of a set of lock disks, the set of lock disks having a plurality of disks, the plurality of disks chosen so that in the event of failure of a server of the plurality of servers, at least one lock disk will be available to the remaining servers. The method has the second step of writing a second copy of the coordinating information to the master mailbox record of the set of lock disks.
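The two-copy write described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the `LockDisk` class with two in-memory slots, the sequence number, the CRC32 check, the JSON encoding, and the choice to write the first copy to every disk before starting the second copy.

```python
import json
import zlib
from dataclasses import dataclass, field


@dataclass
class LockDisk:
    """Hypothetical stand-in for a lock disk: two fixed slots for the record."""
    name: str
    slots: list = field(default_factory=lambda: [None, None])
    failed: bool = False

    def write_slot(self, index: int, blob: bytes) -> None:
        if self.failed:
            raise IOError(f"{self.name}: write failed")
        self.slots[index] = blob


def encode(seq: int, info: dict) -> bytes:
    """Serialize the record with a sequence number and a CRC32 (both assumed)."""
    payload = json.dumps({"seq": seq, "info": info}, sort_keys=True).encode()
    return zlib.crc32(payload).to_bytes(4, "big") + payload


def decode(blob):
    """Return the decoded record, or None if the slot is empty or corrupt."""
    if blob is None or len(blob) < 4:
        return None
    crc, payload = blob[:4], blob[4:]
    if zlib.crc32(payload).to_bytes(4, "big") != crc:
        return None
    return json.loads(payload)


def careful_write(disks, seq, info):
    """Write the first copy to every disk, then the second copy.

    Returns the names of disks whose write failed, so the caller can
    drop them from the lock disk set.
    """
    blob = encode(seq, info)
    failed = set()
    for slot in (0, 1):
        for disk in disks:
            if disk.name in failed:
                continue
            try:
                disk.write_slot(slot, blob)
            except IOError:
                failed.add(disk.name)
    return failed


def read_current(disks):
    """Return the valid record with the highest sequence number, if any."""
    best = None
    for disk in disks:
        if disk.failed:
            continue
        for blob in disk.slots:
            record = decode(blob)
            if record and (best is None or record["seq"] > best["seq"]):
                best = record
    return best
```

Under these assumptions, a write that is interrupted between the two copies leaves at least one slot holding a complete record, and a reader can pick the newest valid copy across all reachable lock disks.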
Representative claim
What is claimed is:
1. A method for managing a cluster of file servers, comprising: writing coordinating information for a plurality of servers of the cluster of servers to a master mailbox record, the master mailbox record written to a specific location on each disk of a set of lock disks, the set of lock disks having a plurality of disks, the plurality of disks chosen so that in the event of failure of a server of the plurality of servers, at least one lock disk will be available to the remaining servers; and writing a second copy of the coordinating information to the master mailbox record of the set of lock disks.
2. The method of claim 1, further comprising: writing the master mailbox record to each disk of the plurality of lock disks.
3. The method of claim 1, further comprising: adding a disk to the set of lock disks by the following steps, storing a unique identifier associated with a new lock disk in a copy of the master mailbox record maintained in computer memory; setting a flag in the master mailbox record identifying that the new lock disk is in a referral state; writing the master mailbox record to the set of lock disks, including the new lock disk; setting a flag in a new version of the master mailbox record maintained in computer memory identifying that the new lock disk is online; and writing the new version of the master mailbox record to the set of lock disks, including the new lock disk.
4. The method of claim 1, further comprising: removing a selected lock disk from the set of lock disks by the following steps, setting a flag indicating that selected lock disk is no longer in an online state; the flag set in a copy of the master mailbox record maintained in computer memory; writing the copy of the master mailbox record to the set of lock disks; setting a flag identifying that the lock disk is no longer in a referral state, the flag set in a new copy of the master mailbox record maintained in computer memory; and writing the new master mailbox record maintained in computer memory to the set of lock disks.
5. The method of claim 1, further comprising: determining if the write operation of the master mailbox record to the plurality of lock disks failed for a particular lock disk; and removing the particular lock disk from the set of lock disks.
6. The method of claim 1, further comprising: writing the master mailbox record by a careful write procedure, the careful write procedure performing the steps of writing the first copy of the coordinating information to the master mailbox record and then writing the second copy of the coordinating information to the master mailbox record.
7. The method of claim 1, further comprising: storing data by the servers of the cluster of servers, the data stored on a set of storage devices, and more than one server has access to a particular storage device of the set of storage devices; and utilizing the coordinating information written to the master mailbox record to prevent partitioning problems with data stored by the servers of the cluster of servers on the set of storage devices.
8. The method of claim 1, further comprising: storing data by the servers of the cluster of servers, the data stored on a set of storage devices, and more than one server has access to a particular storage device of the set of storage devices; writing the master mailbox record to the set of lock disks by a controlling server of the cluster of servers; and reading the master mailbox record by all servers of the cluster of servers.
9. The method of claim 1, further comprising: seizing control of the set of lock disks by a particular file server of the cluster of file servers by the following steps, retrieving a master mailbox record from the set of lock disks; establishing reservations on the set of lock disks; setting a flag in the master mailbox record identifying that the particular file server is seizing control of the set of lock disks; setting a flag in a new copy of the master mailbox record maintained in computer memory indicating that the particular server is in control of the master mailbox record; and writing the new copy of the master mailbox record to the lock disk.
10. The method of claim 1, further comprising: writing the master mailbox record to a plurality of non lock disks which are accessible.
11. The method of claim 1, further comprising: reading a master mailbox record from each lock disk of the set of lock disks; and determining if the coordinating information is current.
12. A cluster of file servers as an apparatus, comprising: a processor to write coordinating information for a plurality of servers of the cluster of servers to a master mailbox record, the master mailbox record written to a specific location on each disk of a set of lock disks, the set of lock disks having a plurality of disks, the plurality of disks chosen so that in the event of failure of a server of the plurality of servers, at least one lock disk will be available to the remaining servers; and an operating system to write a second copy of the coordinating information to the master mailbox record of the set of lock disks.
13. The apparatus as in claim 12, further comprising: the operating system to write the master mailbox record to each disk of the plurality of lock disks.
14. The apparatus as in claim 12, further comprising: the operating system to add a new lock disk to the set of lock disks by the following steps, storing a unique identifier associated with a new lock disk in a copy of the master mailbox record maintained in computer memory; setting a flag in the master mailbox record identifying that the new lock disk is in a referral state; writing the master mailbox record to the set of lock disks, including the new lock disk; setting a flag in a new version of the master mailbox record maintained in computer memory identifying that the new lock disk is online; and writing the new version of the master mailbox record to the set of lock disks, including the new lock disk.
15. The apparatus as in claim 12, further comprising: the operating system to remove a selected lock disk from the set of lock disks by the following steps, setting a flag indicating that selected lock disk is no longer in an online state; the flag set in a copy of the master mailbox record maintained in computer memory; writing the copy of the master mailbox record to the set of lock disks; setting a flag identifying that the lock disk is no longer in a referral state, the flag set in a new copy of the master mailbox record maintained in computer memory; and writing the new master mailbox record maintained in computer memory to the set of lock disks.
16. The apparatus as in claim 12, further comprising: the operating system to determine if the write operation of the master mailbox record to the plurality of lock disks failed for a particular lock disk; and a processor to remove the particular lock disk from the set of lock disks.
17. The apparatus as in claim 12, further comprising: the operating system to write the master mailbox record by a careful write procedure, the careful write procedure performing the steps of writing the first copy of the coordinating information to the master mailbox record and then writing the second copy of the coordinating information to the master mailbox record.
18. The apparatus as in claim 12, further comprising: the operating system to store data by the servers of the cluster of servers, the data stored on a set of storage devices, and more than one server has access to a particular storage device of the set of storage devices; and the operating system to utilize the coordinating information written to the master mailbox record to prevent partitioning problems with data stored by the servers of the cluster of servers on the set of storage devices.
19. The apparatus as in claim 12, further comprising: the servers of the cluster of servers to store data, the data stored on a set of storage devices, and more than one server has access to a particular storage device of the set of storage devices; the operating system to write the master mailbox record to the set of lock disks by a controlling server of the cluster of servers; and the operating system to read the master mailbox record by all servers of the cluster of servers.
20. The apparatus as in claim 12, further comprising: a particular server of the cluster of file servers to seize control of the set of lock disks by the following steps, retrieving a master mailbox record from the set of lock disks; establishing reservations on the set of lock disks; setting a flag in the master mailbox record identifying that the particular file server is seizing control of the set of lock disks; setting a flag in a new copy of the master mailbox record maintained in computer memory indicating that the particular server is in control of the master mailbox record; and writing the new copy of the master mailbox record to the lock disk.
21. The apparatus as in claim 12, further comprising: the operating system to write the master mailbox record to a plurality of non lock disks which are accessible.
22. The apparatus as in claim 12, further comprising: the operating system to read a master mailbox record from each lock disk of the set of lock disks; and a processor to determine if the coordinating information is current.
23. A computer readable storage media, comprising: said computer readable storage media containing instructions for execution on a processor for the practice of a method of managing a cluster of file servers, the method having the steps of, writing coordinating information for a plurality of servers of the cluster of servers to a master mailbox record, the master mailbox record written to a specific location on each disk of a set of lock disks, the set of lock disks having a plurality of disks, the plurality of disks chosen so that in the event of failure of a server of the plurality of servers, at least one lock disk will be available to the remaining servers; writing a second copy of the coordinating information to the master mailbox record of the set of lock disks.
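The two-step membership changes recited in claims 3 and 4 (referral state first, then online when adding; online cleared first, then referral when removing) might be modeled roughly as follows. This is a sketch under stated assumptions, not the patented implementation: the `MailboxStore` class, the dictionary record layout, the `version` counter, and the rule that a write targets every disk whose referral flag is still set are all hypothetical.

```python
import copy


class MailboxStore:
    """Hypothetical in-memory model: disk id -> last record written to it."""

    def __init__(self, disk_ids):
        self.on_disk = {disk_id: None for disk_id in disk_ids}
        self.writes = []  # log of (version, target disks), for illustration

    def write(self, record):
        # Assumption: the record goes to every disk still flagged as referral.
        targets = sorted(d for d, f in record["disks"].items() if f["referral"])
        for disk_id in targets:
            self.on_disk[disk_id] = copy.deepcopy(record)
        self.writes.append((record["version"], targets))


def new_version(record):
    """Copy the in-memory record and bump its version (versioning is assumed)."""
    updated = copy.deepcopy(record)
    updated["version"] += 1
    return updated


def add_lock_disk(store, record, disk_id):
    """Add a disk in two writes: referral state first, then online."""
    step1 = new_version(record)
    step1["disks"][disk_id] = {"referral": True, "online": False}
    store.write(step1)  # includes the new disk

    step2 = new_version(step1)
    step2["disks"][disk_id]["online"] = True
    store.write(step2)  # includes the new disk
    return step2


def remove_lock_disk(store, record, disk_id):
    """Remove a disk in two writes: clear online first, then clear referral."""
    step1 = new_version(record)
    step1["disks"][disk_id]["online"] = False
    store.write(step1)  # the departing disk still receives this version

    step2 = new_version(step1)
    step2["disks"][disk_id]["referral"] = False
    store.write(step2)  # the departing disk is no longer written
    return step2
```

One reason to stage the change this way, at least in this sketch, is that every intermediate record names the disk that is joining or leaving, so a server reading any lock disk midway through the change can still tell which disks may hold a newer record.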
Patents cited by this patent (28)
Oeda, Takashi; Honda, Kiyoshi; Matsunami, Naoto; Yoshida, Minoru, Computer system including a device with a plurality of identifiers.
Heath David M. (Nashua NH) Kraley Michael F. (Lexington MA) Pant Sangam (Winchester MA), Management facility for server entry and application utilization in a multi-node server configuration.
Byers Russell Francis,CAX ; Duchaine Joseph Marcel Gilles,CAX ; Schuett Michael Leonard,CAX ; Grootenboer Cornelius Jacob,GBX, Method and controller for controlling shutdown of a processing unit.
Ohran Richard S. ; Rollins Richard N. ; Ohran Michael R. ; Marsden Wally, Method for improving recovery performance from hardware and software errors in a fault-tolerant computer system.
Hitz David ; Malcolm Michael ; Lau James ; Rakitzis Byron, Method for maintaining consistent states of a file system and for creating user-accessible read-only copies of a file s.
Wallach Walter A. ; Findlay Bruce ; Pellicer Thomas J. ; Chrabaszcz Michael, Method for providing a fault tolerant network using distributed server processes to remap clustered network resources to other servers during server failure.
McCown Patricia M. (Cresskill NJ) Conway Timothy J. (Highland Park NJ) Jessen Karl M. (Bayonne NJ), Methods and apparatus for monitoring system performance.
Ekrot Alexander C. ; Singer James H. ; Hemphill John M. ; Autor Jeffrey S. ; Galloway William C. ; Alexander Dennis J., Multi-server fault tolerance using in-band signalling.
Hitz David (Sunnyvale CA) Schwartz Allan (Saratoga CA) Lau James (Cupertino CA) Harris Guy (Mountain View CA), Multiple facility operating system architecture.
Hitz David ; Schwartz Allan ; Lau James ; Harris Guy, Multiple software-facility component operating system for co-operative processor control within a multiprocessor computer system.
Row Edward J. (Mountain View CA) Boucher Laurence B. (Saratoga CA) Pitts William M. (Los Altos CA) Blightman Stephen E. (San Jose CA), Parallel I/O network file server architecture.
Row Edward J. (Mountain View CA) Boucher Laurence B. (Saratoga CA) Pitts William M. (Los Altos CA) Blightman Stephen E. (San Jose CA), Parallel I/O network file server architecture.
Beardsley Brent Cameron (Tucson AZ) Hathorn Roger Gregory (Tucson AZ) Holley Bret Wayne (Tucson AZ) Iskiyan James Lincoln (Tucson AZ), Remote copy system for setting request interconnect bit in each adapter within storage controller and initiating request.
Clowes Richard F. (New York NY) Tims Fred W. (Springfield Center NY), Workstation-implemented data storage re-routing for server fault-tolerance on computer networks.
Kerner, Matthew; Narayanaswamy, Swetha; Kess, Barbara; Meng, Yi; Shi, Weijuan; Berg, Michael Ryan; Winjum, Randall K., Detection and mitigation of disk failures.
Ziskind, Elisha; Sevigny, Marc; Rajagopal, Sridhar; Vavrick, Rostislav; Passerini, Ronald, Failure detection and recovery of host computers in a cluster.