[Patent] System and method for establishing bi-directional failover in a two node cluster


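IPC classification information
Country / Type	United States (US) Patent, Granted
International Patent Classification (IPC, 7th edition)	G06F-011/00
Application number	US-0858418 (2004-06-01)
Registration number	US-7478263 (2009-01-13)
Inventor / Address	Kownacki, Ronald William; Bertschi, Jason S.
Applicant / Address	Network Appliance, Inc.
Agent / Address	Cesari and McKenna LLP
Citation information	Times cited: 56; Patents cited: 16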

Abstract

A system and method for permitting bi-directional failover in two node clusters utilizing quorum-based data replication. In response to detecting an error in its partner the surviving node establishes itself as the primary of the cluster and sets a first persistent state in its local unit. A temporary epsilon value for quorum voting purposes is then assigned to the surviving node, which causes it to be in quorum. A second persistent state is stored in the local unit and the surviving node comes online as a result of being in quorum.
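
The quorum effect described in the abstract can be illustrated with a small sketch. It is not taken from the patent: the function name `in_quorum` and the choice to model epsilon as an extra half vote are assumptions made only for illustration. The patent states only that epsilon gives greater voting weight to the node that holds it.

```python
from fractions import Fraction


def in_quorum(healthy_nodes: int, total_nodes: int, holds_epsilon: bool = False) -> bool:
    """Return True if the healthy partition holds a strict voting majority.

    Assumptions (not from the patent): every node casts one vote, and
    epsilon is modelled as an extra half vote for the node that holds it.
    """
    votes = Fraction(healthy_nodes)
    if holds_epsilon:
        votes += Fraction(1, 2)  # hypothetical tie-breaking weight
    return votes > Fraction(total_nodes, 2)


# Both nodes healthy: 2 of 2 votes is a majority.
print(in_quorum(2, 2))
# One node failed, no epsilon: 1 of 2 votes is only a tie, so no quorum.
print(in_quorum(1, 2))
# Survivor holds the temporary epsilon: 1.5 votes beats the tie.
print(in_quorum(1, 2, holds_epsilon=True))
```

Under these assumptions, a lone survivor in a two node cluster can never reach a majority by itself, which is why the temporary epsilon is what brings it into quorum.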

Representative claims

What is claimed is:
1. A method for providing bi-directional failover for data replication services in a two node cluster, comprising: detecting a failure of one of the nodes; and in response to detecting the failure, exiting a conventional quorum state and entering a high availability state, wherein in the high availability state a single node is designated as a stand alone node that is a full read/write replica for the data replication services of the cluster, thereby enabling management services reliant on updates for replicated data to function normally, and wherein the conventional quorum state requires a majority of the nodes to be healthy to have quorum and neither node is initially configured with an epsilon value.
2. The method of claim 1 wherein the failure in one of the nodes comprises a failure in communication between the nodes.
3. The method of claim 1 wherein the replicated data/services comprise a VFS location database.
4. The method of claim 1 wherein the replicated data/services comprises a management framework.
5. The method of claim 1 wherein the replicated data/services comprises a high availability manager.
6. The method of claim 1, wherein each node is healthy when the node is active and responding to one or more client requests.
7. The method of claim 1, wherein the epsilon value gives greater weight in voting to the node assigned the epsilon value.
8. A method for providing a bi-directional failover in a cluster comprising a first node and a second node, comprising: providing the first node and the second node configured in a conventional quorum state, wherein the conventional quorum state requires a majority of the nodes to be healthy to have quorum and neither node is initially configured with an epsilon value; detecting, by the first node, an error condition on the second node; setting, by the first node, a local cached activity lock identifying the first node as active in the cluster; setting a first persistent state in a local unit of the first node; assigning a temporary epsilon value to the first node, wherein the first node enters into quorum as a result of the temporary epsilon value; and setting a second persistent state in the local unit of the first node, wherein the second persistent states is a high availability state where the first node is designated as a stand alone node that is a full read/write replica of the cluster.
9. The method of claim 8 wherein the step of detecting the error condition further comprises: detecting a lack of a heartbeat signal from the second node.
10. The method of claim 8 wherein the first persistent state comprises a HA_PREACTIVE state.
11. The method of claim 8 wherein the local unit comprises a storage device.
12. The method of claim 8 further comprising: detecting, by the first node, the post-failure presence of the second node; performing a resynchronization routine between the first and second nodes; removing the temporary epsilon value from the first node; clearing the local cached activity lock from the local unit of the first node; clearing an activity lock from a D-blade; and wherein the first and second nodes are in quorum and capable of processing write operations.
13. The method of claim 12 wherein the step of performing a resynchronization routine between the first and second nodes exchanges deltas in order to ensure that both database replicas (RDB) on the first and second node are identical.
14. A computer readable medium for providing a bi-directional failover in a cluster comprising a first node and a second node, the computer readable medium including program instructions for performing the steps of: providing the first node and the second node configured in a conventional quorum state, wherein the conventional quorum state requires a majority of the nodes to be healthy to have quorum and neither node is initially configured with an epsilon value; detecting, by the first node, an error condition on the second node; setting, by the first node, a local cached activity lock identifying the first node as a primary of the cluster; setting a first persistent state in a local unit of the first node; assigning a temporary epsilon value to the first node, wherein the first node enters into quorum as a result of the temporary epsilon value; and setting a second persistent state in the local unit of the first node, wherein the second persistent states is a high availability state where the first node is designated as a stand alone node that is a full read/write replica of the cluster.
15. The computer readable medium of claim 14 wherein the computer readable medium further includes program instructions for performing the steps of: detecting, by the first node, the post-failure presence of the second node; performing a resynchronization routine between the first and second nodes; removing the temporary epsilon value from the first node; clearing the local cached activity lock from the local unit of the first node; clearing an activity lock from a D-blade; and wherein the first and second nodes are in quorum and capable of processing write operations.
16. A system for providing a bi-directional failover in a cluster comprising a first node and a second node, the system comprising: a storage operating system executed by a processor on the first node and the storage operating system having a replicated database (RDB), the RDB comprising a quorum manager configured to assign a temporary epsilon value to the first node in response to detecting an error condition in the second node, the temporary epsilon causing the first node to be in quorum and to allow the second node to come online to form the cluster between the first node and the second node, wherein the RDB further comprises a recovery manager configured to set a lock in a data structure identifying the first node as the owner of an HA activity lock in the cluster and further configured to set a first persistent state value in a local unit of the first node.
17. The system of claim 16 wherein the recovery manager is further configured to set a second persistent state to indicate that the recovery manager is in a HA_ACTIVE state in the local unit of the first node in response to the quorum manager assigning the temporary epsilon to the first node and thereby establishing quorum.
18. A computer readable medium for providing bi-directional failover among nodes of a two node replicated data cluster, the computer readable medium including program instructions for performing the steps of: detecting a failure of one of the nodes; and in response to detecting the failure, exiting a conventional quorum state and entering a high availability state, wherein in the high availability state a single node is designated as a full read/write replica within the cluster data replication service, the full read/write replica modifying configuration information relating to one or more replicated services provided by the replicated services cluster, and wherein the conventional quorum state requires a majority of the nodes to be healthy to have quorum and neither node is initially configured with an epsilon value.
19. A system to provide bi-directional failover for data replication services in a two node cluster, comprising: in response to detecting a failure, a disk element module executed by a processor, the disk element module configured, to designate a first node of the two nodes as a stand alone node that is a full read/write replica for the data replication services of the cluster, thereby enabling management services reliant on updates for replicated data to function normally; and a quorum manager configured to assign a temporary epsilon value to the first node in response to detecting an error condition in a second node, the temporary epsilon causing the first node to be in quorum and to allow the second node to come online to form the cluster between the first node and the second node.
20. The system of claim 19 wherein the failure in one of the nodes comprises a failure in communication between the nodes.
21. The system of claim 19 wherein the replicated data/services comprise a VFS location database.
22. The system of claim 19 wherein the replicated data/services comprises a management framework.
23. The system of claim 19 wherein the replicated data/services comprises a high availability manager.
24. The system of claim 19, further comprising: an operating system to exit a conventional quorum state and enter a high availability state.
25. A method for providing bi-directional failover for data replication services in a two node cluster, comprising: detecting a failure of one of the nodes; and in response to detecting the failure, exiting a conventional quorum state and entering a high availability state, wherein a single node is designated as a full read/write replica for the data replication services of the cluster by storing in a lock associated with the single node in a disk element, thereby enabling management services reliant on updates for replicated data to function normally, and wherein the conventional quorum state requires a majority of the nodes to be healthy to have quorum and neither node is initially configured with an epsilon value.
26. A method for providing a bi-directional failover in a cluster comprising a first node and a second node, comprising: providing the first node and the second node configured in a conventional quorum state, wherein the conventional quorum state requires a majority of the nodes to be healthy to have quorum and neither node is initially configured with an epsilon value; detecting, by the first node, an error condition on the second node; setting, by the first node, a local cached activity lock identifying the first node as active in the cluster; setting a first persistent state in a local unit of the first node; assigning a temporary epsilon value to the first node, wherein the first node enters into quorum as a result of the temporary epsilon value; setting a second persistent state in the local unit of the first node, wherein the second persistent states is a high availability state where the first node is designated as a stand alone node that is a full read/write replica of the cluster; detecting, by the first node, the post-failure presence of the second node; performing a resynchronization routine between the first and second nodes; and removing the temporary epsilon value from the first node.
27. The method of claim 26 wherein the step of detecting the error condition further comprises: detecting a lack of a heartbeat signal from the second node.
28. The method of claim 26 wherein the first persistent state comprises a HA_PREACTIVE state.
29. The method of claim 26 wherein the local unit comprises a storage device.
30. The method of claim 26 further comprising: clearing the local cached activity lock from the local unit of the first node; clearing an activity lock from a D-blade; and wherein the first and second nodes are in quorum and capable of processing write operations.
31. The method of claim 26 wherein the step of performing a resynchronization routine between the first and second nodes exchanges deltas in order to ensure that both database replicas (RDB) on the first and second node are identical.
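
The takeover and recovery steps recited in claims 8 and 12 can be sketched as a small state machine. This is a minimal illustration under stated assumptions, not the patented implementation. The state names `HA_PREACTIVE` and `HA_ACTIVE` come from the claims; everything else is hypothetical: the `NORMAL` state name, the `Node` class, its fields and methods, the dictionary standing in for the RDB replica, and the return to `NORMAL` after recovery. Resynchronization is simplified to replaying the survivor's logged updates on the partner, and the D-blade activity lock is not modelled.

```python
from dataclasses import dataclass, field
from enum import Enum


class HAState(Enum):
    NORMAL = "NORMAL"              # hypothetical name for the conventional quorum state
    HA_PREACTIVE = "HA_PREACTIVE"  # first persistent state (claim 10)
    HA_ACTIVE = "HA_ACTIVE"        # second persistent state (claim 17)


@dataclass
class Node:
    name: str
    persistent_state: HAState = HAState.NORMAL  # stands in for the local unit
    activity_lock: bool = False                 # local cached activity lock
    epsilon: bool = False                       # temporary epsilon value
    db: dict = field(default_factory=dict)      # stand-in for the RDB replica
    deltas: list = field(default_factory=list)  # updates made while standalone

    def take_over(self) -> None:
        """Steps the survivor performs after detecting a partner error."""
        self.activity_lock = True
        self.persistent_state = HAState.HA_PREACTIVE
        self.epsilon = True  # survivor is now in quorum
        self.persistent_state = HAState.HA_ACTIVE

    def write_standalone(self, key: str, value: str) -> None:
        """Apply an update as the full read/write replica and log the delta."""
        if self.persistent_state is not HAState.HA_ACTIVE:
            raise RuntimeError("node is not the standalone read/write replica")
        self.db[key] = value
        self.deltas.append((key, value))

    def give_back(self, partner: "Node") -> None:
        """Steps performed once the partner is detected again."""
        for key, value in self.deltas:  # simplified one-way resynchronization
            partner.db[key] = value
        self.deltas.clear()
        self.epsilon = False
        self.activity_lock = False
        self.persistent_state = HAState.NORMAL  # assumed; not stated in the claims
```

A short usage example: create `a = Node("a")` and `b = Node("b")`, call `a.take_over()`, apply `a.write_standalone("vol1", "node-a")`, then call `a.give_back(b)`. Afterwards both replicas hold the same contents and `a` no longer holds epsilon or the activity lock.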

Patents cited by this patent (16)

Oeda, Takashi; Honda, Kiyoshi; Matsunami, Naoto; Yoshida, Minoru, Computer system including a device with a plurality of identifiers.
Kumar, Krishna; Murphy, Declan J.; Hisgen, Andrew L., Controlled take over of services by remaining nodes of clustered computing system.
Schoenthal Scott ; Rowe Alan ; Kleiman Steven R., Coordinating persistent status information with multiple file servers.
Kolovson Curtis P., Fast database failover.
Major Drew (Orem UT) Powell Kyle (Orem UT) Neibaur Dale (Orem UT), Fault tolerant computer system.
Quach, Nhon, Firmware mechanism for correcting soft errors.
Byers Russell Francis,CAX ; Duchaine Joseph Marcel Gilles,CAX ; Schuett Michael Leonard,CAX ; Grootenboer Cornelius Jacob,GBX, Method and controller for controlling shutdown of a processing unit.
Ohran Richard S. ; Rollins Richard N. ; Ohran Michael R. ; Marsden Wally, Method for improving recovery performance from hardware and software errors in a fault-tolerant computer system.
Wallach Walter A. ; Findlay Bruce ; Pellicer Thomas J. ; Chrabaszcz Michael, Method for providing a fault tolerant network using distributed server processes to remap clustered network resources to other servers during server failure.
Mott Jack E. (Idaho Falls ID), Method of system state analysis.
Skinner,Steven George; Phuong,Matthew Ky; Preston,Mark Lynn, Method to revive and reconstitute majority node set clusters.
McCown Patricia M. (Cresskill NJ) Conway Timothy J. (Highland Park NJ) Jessen Karl M. (Bayonne NJ), Methods and apparatus for monitoring system performance.
Ekrot Alexander C. ; Singer James H. ; Hemphill John M. ; Autor Jeffrey S. ; Galloway William C. ; Alexander Dennis J., Multi-server fault tolerance using in-band signalling.
Cramer, Samuel M.; Schoenthal, Scott, Negotiating takeover in high availability cluster.
Gunda,Kalyan C.; Herr,Brian D., Two node virtual shared disk cluster recovery.
Clowes Richard F. (New York NY) Tims Fred W. (Springfield Center NY), Workstation-implemented data storage re-routing for server fault-tolerance on computer networks.

Patents citing this patent (56)

MacDonald McAlister, Grant Alexander; Milovanovic, Milovan, Cloning and recovery of data volumes.
MacDonald McAlister, Grant Alexander; Milovanovic, Milovan, Cloning and recovery of data volumes.
McAlister, Grant Alexander MacDonald; Milovanovic, Milovan, Cloning and recovery of data volumes.
Groover, Michael P.; Han, Robin; Lin, Edward H.; Su, Yan; Tang, Wei; Zhao, Ming Zhi; Zhou, Yi, Cloud infrastructure for reducing storage facility code load suspend rate by redundancy check.
Groover, Michael P.; Han, Robin; Lin, Edward H.; Su, Yan; Tang, Wei; Zhao, Ming Zhi; Zhou, Yi, Cloud infrastructure for reducing storage facility code load suspend rate by redundancy check.
Chen, Wei; Teodosiu, Dan; Teodorescu, Cristian George; Liu, Xuezheng; Zhang, Zheng, Collection-based object replication.
Fukuyama, Masayuki; Nakayama, Jun; Masuda, Kouji, Control method for information processing system, information processing system, and program.
Sivasubramanian, Swaminathan; McAlister, Grant Alexander MacDonald; Franklin, Paul David; Sheth, Rajesh Sudhakar; Horsley, James, Control service for data management.
Sivasubramanian, Swaminathan; McAlister, Grant A. M.; Franklin, Paul David; Sheth, Rajesh Sudhakar; Horsley, James, Control service for relational data management.
Sivasubramanian, Swaminathan; McAlister, Grant Alexander MacDonald; Franklin, Paul David; Sheth, Rajesh Sudhakar; Horsley, James, Control service for relational data management.
Hasha, Richard L.; Xun, Lu; Kakivaya, Gopala Krishna R.; Malkhi, Dahlia, Data consistency within a federation infrastructure.
Hasha, Richard L.; Xun, Lu; Kakivaya, Gopala Krishna R.; Malkhi, Dahlia, Data consistency within a federation infrastructure.
Lipcon, Todd; Myers, Aaron T.; Collins, Eli, Data node fencing in a distributed file system.
Banka, Deepti; Usgaonkar, Ameya Prakash, Distributed control protocol for high availability in multi-node storage cluster.
Epstein, Amir; Factor, Michael E.; Kolodner, Elliot K.; Sotnikov, Dmitry, Durability and availability evaluation for distributed storage systems.
Nishanov, Gor; D'Amato, Andrea; Tamhane, Amitabh Prakash; Dion, David A., Dynamic quorum for distributed systems.
McAlister, Grant Alexander MacDonald; Sivasubramanian, Swaminathan, Failover and recovery for replicated data instances.
McAlister, Grant Alexander MacDonald; Sivasubramanian, Swaminathan, Failover and recovery for replicated data instances.
Sivasubramanian, Swaminathan; McAlister, Grant Alexander MacDonald, Failover and recovery for replicated data instances.
Dennehy, Mark; Mooney, Robert, Generating database sequences in a replicated database environment.
Johri, Abhishek; Sato, Takahito; Saito, Hideo; Kawaguchi, Tomohiro, Information storage system.
Hasha, Richard L.; Xun, Lu; Kakivaya, Gopala Krishna R., Inter-proximity communication within a rendezvous federation.
Hasha, Richard L.; Xun, Lu; Kakivaya, Gopala Krishna R., Inter-proximity communication within a rendezvous federation.
Hasha, Richard L.; Xun, Lu; Kakivaya, Gopala Krishna R.; Malkhi, Dahlia, Maintaining consistency within a federation infrastructure.
McAlister, Grant A. M., Managing security groups for data instances.
McAlister, Grant Alexander MacDonald, Managing security groups for data instances.
Srivas, Mandayam C.; Ravindra, Pindikura; Saradhi, Uppaluri Vijaya; Pande, Arvind Arun; Sanapala, Chandra Guru Kiran Babu; Renu, Lohit Vijaya; Vellanki, Vivekanand; Kavacheri, Sathya; Hadke, Amit, Map-reduce ready distributed file system.
Srivas, Mandayam C.; Ravindra, Pindikura; Saradhi, Uppaluri Vijaya; Pande, Arvind Arun; Sanapala, Chandra Guru Kiran Babu; Renu, Lohit Vijaya; Vellanki, Vivekanand; Kavacheri, Sathya; Hadke, Amit, Map-reduce ready distributed file system.
Srivas, Mandayam C.; Ravindra, Pindikura; Saradhi, Uppaluri Vijaya; Pande, Arvind Arun; Sanapala, Chandra Guru Kiran Babu; Renu, Lohit Vijaya; Vellanki, Vivekanand; Kavacheri, Sathya; Hadke, Amit Ashoke, Map-reduce ready distributed file system.
Srivas, Mandayam C.; Ravindra, Pindikura; Saradhi, Uppaluri Vijaya; Pande, Arvind Arun; Sanapala, Chandra Guru Kiran Babu; Renu, Lohit Vijaya; Vellanki, Vivekanand; Kavacheri, Sathya; Hadke, Amit Ashoke, Map-reduce ready distributed file system.
Srivas, Mandayam C.; Ravindra, Pindikura; Saradhi, Uppaluri Vijaya; Pande, Arvind Arun; Sanapala, Chandra Guru Kiran Babu; Renu, Lohit Vijaya; Vellanki, Vivekanand; Kavacheri, Sathya; Hadke, Amit Ashoke, Map-reduce ready distributed file system.
Srivas, Mandayam C.; Ravindra, Pindikura; Saradhi, Uppaluri Vijaya; Pande, Arvind Arun; Sanapala, Chandra Guru Kiran Babu; Renu, Lohit Vijaya; Vellanki, Vivekanand; Kavacheri, Sathya; Hadke, Amit Ashoke, Map-reduce ready distributed file system.
Sharma, Sumit; Katkar, Amol Shivram, Method and apparatus for partitioning a computer cluster through coordination point devices.
Hisgen, Andrew L.; Früauf, Thorsten; Roush, Ellard T.; Solter, Nicholas A., Method and system for a weak membership tie-break.
De Gaetano, Rosella, Method, system and computer program for a secure backup license server in a license management system.
Sivasubramanian, Swaminathan; McAlister, Grant Alexander MacDonald; Milovanovic, Milovan, Monitoring and automatic scaling of data volumes.
McAlister, Grant Alexander MacDonald; Sivasubramanian, Swaminathan; Hunter, Jr., Barry B.; Brazil, Silas M., Monitoring of replicated data instances.
Sivasubramanian, Swaminathan; McAlister, Grant Alexander MacDonald; Hunter, Jr., Barry B.; Brazil, Silas M., Monitoring of replicated data instances.
Verdoorn, Jr., William Garrett; Walls, Andrew Dale, Multi-node configuration of processor cards connected via processor fabrics.
Regni, Giorgio; Rancurel, Vianney; Pineau, David; Gimenez, Guillaume; Saffroy, Jean-Marc; Artuso, Benoit; Verma, Mudit, Object storage system capable of performing snapshots, branches and locking.
Critchley, Craig A.; Wortendyke, David A.; Marucheck, Michael J.; Hasha, Richard L., Optimizing access to federation infrastructure-based resources.
McAlister, Grant Alexander MacDonald; Sivasubramanian, Swaminathan, Provisioning and managing replicated data instances.
Groover, Michael P.; Han, Robin; Lin, Edward H.; Su, Yan; Tang, Wei; Zhao, Ming Zhi; Zhou, Yi, Reducing storage facility code load suspend rate by redundancy check.
Kakivaya, Gopala Krishna R.; Hasha, Richard L.; Rodeheffer, Thomas Lee, Rendezvousing resource requests with corresponding resources.
Tarta, Mihail Gavril; Kakivaya, Gopal; Subbarayalu, Preetha Lakshmi, Replicable differential store data structure.
Sheth, Rajesh Sudhakar; Warman, Leon Robert; Gangadhar, Narayan, Self-service administration of a database.
Sivasubramanian, Swaminathan; McAlister, Grant Alexander MacDonald; Sheth, Rajesh Sudhakar, Self-service configuration for data environment.
Aguilera, Marcos K.; Veitch, Alistair; Spence, Susan, Snapshots in distributed storage systems.
Thiel, Gregory; Kuppusamy, Manoharan; Bansal, Yogesh, Split brain protection in computer clusters.
Kakivaya, Gopala Krishna R.; Xun, Lu; Hasha, Richard L., Subfederation creation and maintenance in a federation infrastructure.
Brown, Cory D.; Fernandez, Anthony, System and method for handling database failover.
Oliver, Brian K.; Peralta, Patrick; Mackin, Paul F.; Arliss, Noah, System and method for supporting failover during synchronization between clusters in a distributed data grid.
Oliver, Brian K.; Peralta, Patrick; Mackin, Paul F.; Arliss, Noah, System and method for supporting parallel asynchronous synchronization between clusters in a distributed data grid.
Oliver, Brian K.; Peralta, Patrick; Mackin, Paul F.; Arliss, Noah, System and method for supporting partition level journaling for synchronizing data in a distributed data grid.
Jain, Sandeep; Tiwary, Prakash Chandra; Garg, Aparna, System and method for synchornisation of data and recovery of failures during synchronization between two systems.
Panasko, Brian; Snyder, Tom; Moore, Chad, Techniques for maintaining device coordination in a storage cluster system.

