Dedicated client-side signature generator in a networked storage system
IPC classification information
Country / Type
United States (US) Patent
Granted
International Patent Classification (IPC, 7th edition)
G06F-017/30
G06F-011/14
Application number
US-0916458
(2013-06-12)
Registration number
US-9218375
(2015-12-22)
Inventor / Address
Muller, Marcus S.
Ngo, David
Applicant / Address
COMMVAULT SYSTEMS, INC.
Agent / Address
Knobbe, Martens, Olson & Bear, LLP
Citation information
Times cited: 16
Patents cited: 140
Abstract
A storage system according to certain embodiments includes a client-side signature repository that includes information representative of a set of data blocks stored in primary storage. During storage operations of a client, the system can generate signatures corresponding to data blocks that are being stored in primary storage. The system can store the generated signatures in the client-side signature repository along with information regarding the location of the corresponding data block within primary storage. As additional instances of the data block are stored in primary storage, the system can store the location of the additional instances in the client-side signature repository.
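As a rough illustration of the repository described in the abstract, the following Python sketch keeps one record per unique signature and appends a location entry each time another instance of the same data block is stored. The names SignatureRepository, record_block and locations, the use of SHA-256 as the signature function, and the (client_id, path, offset) location format are assumptions made for this example; the abstract does not specify any of them.

```python
import hashlib
from collections import defaultdict


class SignatureRepository:
    """Hypothetical in-memory signature repository (illustrative only)."""

    def __init__(self):
        # signature -> list of (client_id, path, offset) location entries
        self._blocks = defaultdict(list)

    @staticmethod
    def sign(data: bytes) -> str:
        # SHA-256 is an assumed signature function, not one named by the patent.
        return hashlib.sha256(data).hexdigest()

    def record_block(self, data: bytes, client_id: str, path: str, offset: int) -> str:
        """Generate the block's signature and store where this instance lives."""
        signature = self.sign(data)
        entry = (client_id, path, offset)
        if entry not in self._blocks[signature]:
            self._blocks[signature].append(entry)
        return signature

    def locations(self, signature: str):
        """Return every known primary-storage location for a signature."""
        return list(self._blocks.get(signature, []))


repo = SignatureRepository()
sig_a = repo.record_block(b"shared block", "client-1", "/data/a.doc", 0)
sig_b = repo.record_block(b"shared block", "client-2", "/data/b.doc", 4096)
```

Because both calls store identical bytes, they yield the same signature, and the repository ends up with a single record holding two location entries, one per client.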
Representative claim
1. A method of maintaining a signature repository accessible by multiple client computing devices in a data storage system, the method comprising: tracking storage of a plurality of data units in a primary storage subsystem, the plurality of tracked data units corresponding to primary data generated by one or more applications executing on a plurality of client computing devices that form the primary storage subsystem, each data unit of the plurality of tracked data units forming at least a portion of at least one file stored in the primary storage subsystem, the primary data for each of the client computing devices stored in a primary data store associated with a respective client computing device, the primary storage subsystem in communication with a secondary storage subsystem that is separate from the primary storage subsystem and is configured to maintain secondary copies of at least some of the primary data; generating, by a signature agent executing on one or more processors in the primary storage subsystem, signatures corresponding to the plurality of tracked data units; and maintaining a signature repository including a signature block for at least each unique signature of the generated signatures, where each signature block comprises: the unique signature; and one or more data unit entries, each entry corresponding to a distinct data unit of the plurality of tracked data units and associated with the unique signature that is stored in the primary storage subsystem and that is generated by an application of the applications executing on a distinct client computing device, wherein each entry identifies a client computing device of the plurality of client computing devices that stores the corresponding distinct data unit, wherein at least one of the signature blocks includes at least a first entry indicating that a first primary data store associated with a first client computing device of the plurality of client computing devices stores a first data unit that is associated with the unique signature and a second entry indicating that a second primary data store associated with a second client computing device of the plurality of client computing devices stores a second data unit that is associated with the unique signature, wherein the first data unit forms at least a portion of a first file stored in the first primary data store and the second data unit forms at least a portion of a second file stored in the second primary data store.
2. The method of claim 1, wherein the first entry further includes location information identifying a location of the first data unit in the first primary data store and wherein the second entry further includes location information identifying a location of the data unit in the second primary data store.
3. The method of claim 1, further comprising: receiving a query including a plurality of signatures; comparing the plurality of signatures included in the query with signature blocks in the signature repository to identify a first set of signatures received in the query that correspond to data units that reside in a primary data store of at least one client computing device of the plurality of client computing devices; and for at least some of the signatures in the first set of signatures, accessing the corresponding data units from the primary data store of the at least one client computing device.
4. The method of claim 3, further comprising, for the signatures included in the query that are not included in the first set of signatures, accessing the corresponding data units from the secondary storage subsystem.
5. The method of claim 4, wherein the plurality of signatures included in the query correspond to a set of data units which represent a backed up version of a set of the primary data that is to be restored to the first primary data store, and wherein at least some of the data units corresponding to the signatures in the first set of signatures are restored from the second client computing device.
6. The method of claim 1, further comprising: in response to receipt of instructions to backup at least a subset of the primary data of the first primary data store, comparing a set of signatures corresponding to data units in the subset of the primary data with entries in the signature repository, the data units in the subset of the primary data comprising at least the first data unit; based at least in part on the comparing, identifying a set of matching data units that match the data units in the subset of the primary data and that reside in at least one other primary data store other than the first primary data store, the set of matching data units comprising at least the second data unit; and accessing the set of matching data units from the at least one other primary data store for retrieval as part of a backup set of data units.
7. The method of claim 6, further comprising: based at least in part on the comparing, identifying a set of data units of the data units in the subset of the primary data that do not have a corresponding matching data unit; accessing the set of matching data units from the first primary data store; associating the set of matching data units accessed from the at least one other primary data store with the set of data units accessed from the first primary data store to generate the backup set of data units corresponding to the data units in the subset of the primary data; and communicating the backup set to the secondary storage subsystem.
8. The method of claim 1, wherein the secondary storage subsystem comprises deduplicated data.
9. The method of claim 1, wherein the primary data store of at least one of the plurality of client computing devices comprises deduplicated data.
10. A storage system, comprising: a signature repository agent executing on one or more processors in a primary storage subsystem, the primary storage subsystem comprising: a plurality of client computing devices; and a plurality of data agents executing on the plurality of client computing devices, the plurality of data agents configured to track storage of a plurality of data units in the primary storage subsystem, the plurality of data units corresponding to primary data generated by one or more applications executing on the plurality of client computing devices, each data unit forming at least a portion of at least one file stored in the primary storage subsystem, the primary data for each of the client computing devices stored in a primary data store associated with a respective client computing device, and the primary storage subsystem in communication with a secondary storage subsystem that is separate from the primary storage subsystem and that is configured to maintain secondary copies of at least some of the primary data, and wherein the signature repository agent is configured to maintain a signature repository including a signature block for at least each unique signature generated by one or more signature agents, each signature block comprising: the unique signature; and one or more data unit entries, each entry corresponding to a distinct data unit associated with the unique signature that is stored in the primary storage subsystem and that is generated by an application of the applications executing on a distinct client computing device, wherein each entry identifies a client computing device of the plurality of client computing devices that stores the corresponding distinct data unit, wherein at least one of the signature blocks includes at least a first entry indicating that a first primary data store associated with a first client computing device of the plurality of client computing devices stores a first data unit that is associated with the unique signature and a second entry indicating that a second primary data store associated with a second client computing device of the plurality of client computing devices stores a second data unit that is associated with the unique signature, wherein the first data unit forms at least a portion of a first file stored in the first primary data store and the second data unit forms at least a portion of a second file stored in the second primary data store.
11. The system of claim 10, wherein the first entry further includes location information identifying a location of the first data unit in the first primary data store and wherein the second entry further includes location information identifying a location of the data unit in the second primary data store.
12. The system of claim 10, wherein at least one of the one or more signature agents resides on each client computing device.
13. The system of claim 10, wherein the one or more signature agents execute on one or more computing devices that are separate from the plurality of client computing devices.
14. The system of claim 10, wherein the signature repository agent is further configured to: receive a query from the secondary storage subsystem, the query including a plurality of signatures; compare the plurality of signatures with signature blocks in the signature repository to identify a first set of signatures from the plurality of signatures that correspond to data units of the plurality of data units that reside in a primary data store of at least one client computing device of the plurality of client computing devices; and for at least some signatures in the first set of signatures, request the corresponding data units from the at least one client computing device.
15. The system of claim 14, wherein the signature repository agent is further configured to request from the secondary storage subsystem the data units corresponding to signatures of the plurality of signatures that are not included in the first set of signatures.
16. The system of claim 15, wherein the plurality of signatures correspond to a set of data units which represent a backed up version of a set of the primary data that is to be restored to the first primary data store, and wherein at least some of the data units corresponding to the signatures in the first set of signatures are restored from the second client computing device.
17. The system of claim 10, wherein the signature repository agent is further configured to: in response to receipt of instructions to backup at least a subset of the primary data of the first primary data store, compare a backup set of signatures corresponding to data units in the subset of the primary data with signature blocks in the signature repository, the data units in the subset of the primary data comprising at least the first data unit; based at least in part on the comparison, identify a set of matching data units that match the data units in the subset of the primary data and that reside in at least one other primary data store other than the first primary data store, the set of matching data units comprising at least the second data unit; and request the set of matching data units from the at least one other primary data store for retrieval as part of a backup set of data units.
18. The system of claim 17, wherein the signature repository agent is further configured to: based at least in part on the comparison, identify a set of data units of the data units in the subset of the primary data that do not have a corresponding matching data unit; request the set of data units from the first primary data store; associate the set of matching data units accessed from the at least one other primary data store with the set of data units accessed from the first primary data store to generate the backup set of data units corresponding to the data units in the subset of the primary data; and communicate the backup set of data units to the secondary storage subsystem.
19. The system of claim 10, wherein the secondary storage subsystem comprises deduplicated data.
20. The system of claim 10, wherein the primary data store of at least one of the plurality of client computing devices comprises deduplicated data.
21. A computer-readable, non-transitory storage medium having one or more computer-executable modules for maintaining a signature repository accessible by multiple client computing devices in a data storage system, the one or more computer-executable modules comprising: a first module in communication with a plurality of client computing devices that form a primary storage subsystem, the primary storage subsystem comprising: the plurality of client computing devices; and a plurality of data agents executing on the plurality of client computing devices, the plurality of data agents configured to track storage of a plurality of data units in the primary storage subsystem, the plurality of data units corresponding to primary data generated by one or more applications executing on the plurality of client computing devices, each data unit forming at least a portion of at least one file stored in the primary storage subsystem, the primary data for each of the client computing devices stored in a data store associated with a respective client computing device, and the primary storage subsystem in communication with a secondary storage subsystem that is separate from the primary storage subsystem and is configured to maintain secondary copies of at least some of the primary data, wherein the first module is configured to maintain a signature repository including a signature block for at least each unique signature associated with the plurality of data units, where each signature block comprises: the unique signature; and one or more data unit entries, each entry corresponding to a distinct data unit associated with the unique signature that is stored in the primary storage subsystem and that is generated by an application of the applications executing on a distinct client computing device, wherein each entry identifies a client computing device of the plurality of client computing devices that stores the corresponding distinct data unit, wherein at least one of the signature blocks includes at least a first entry indicating that a first primary data store associated with a first client computing device of the plurality of client computing devices stores a first data unit that is associated with the unique signature and a second entry indicating that a second primary data store associated with a second client computing device of the plurality of client computing devices stores a second data unit that is associated with the unique signature.
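The query handling of claims 3 and 4 and the backup source selection of claims 6 and 7 can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not the claimed implementation: the repository is modelled as a plain dict mapping each signature to a list of (client_id, location) entries, and the function names resolve_query and plan_backup, the signature strings, and the location strings are all hypothetical.

```python
def resolve_query(repository, query_signatures):
    """Split queried signatures by where their data units can be fetched.

    repository: dict mapping signature -> list of (client_id, location)
    entries, an assumed stand-in for the claimed signature blocks.
    """
    from_primary = {}    # signature -> entries found in primary data stores
    from_secondary = []  # signatures to request from secondary storage
    for signature in query_signatures:
        entries = repository.get(signature)
        if entries:
            from_primary[signature] = entries
        else:
            from_secondary.append(signature)
    return from_primary, from_secondary


def plan_backup(repository, source_client, backup_signatures):
    """Pick a source for each data unit when backing up source_client."""
    from_others = {}  # signature -> matching entry on a different client
    from_source = []  # signatures with no match elsewhere
    for signature in backup_signatures:
        matches = [entry for entry in repository.get(signature, [])
                   if entry[0] != source_client]
        if matches:
            from_others[signature] = matches[0]
        else:
            from_source.append(signature)
    return from_others, from_source


# Hypothetical repository contents for illustration.
repository = {
    "sig-1": [("client-1", "/a:0"), ("client-2", "/b:0")],
    "sig-2": [("client-1", "/a:1")],
}
primary, secondary = resolve_query(repository, ["sig-1", "sig-2", "sig-3"])
others, own = plan_backup(repository, "client-1", ["sig-1", "sig-2"])
```

In this example, sig-3 has no entry in the repository and would be requested from secondary storage, while a backup of client-1 could read the sig-1 data unit from client-2 and only needs to read the sig-2 data unit from client-1 itself.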
Patents cited by this patent (140)
Ranade, Dilip Madhusudan; Shelat, Radha; Kabra, Navin, Adaptive caching for a distributed file sharing system.
Yuval Ofek ; Zoran Cakeljic ; Samuel Krikler IL; Sharon Galtzur IL; Michael Hirsch IL; Dan Arnon ; Peter Kamvysselis, Apparatus and methods for copying, backing up, and restoring data using a backup segment size larger than the storage block size.
Griffin David (Maynard MA) Campbell Jonathan (Acton MA) Reilly Michael (Sterling MA) Rosenbaum Richard (Pepperell MA), Arrangement with cooperating management server node and network service node.
Nakano Toshio (Odawara JPX) Nozawa Masafumi (Odawara JPX) Kurano Akira (Odawara JPX) Hisano Kiyoshi (Odawara JPX) Hoshino Masayuki (Odawara JPX), Backup control method and system in data processing system using identifiers for controlling block data transfer.
Kitajima Hiroyuki (Yokohama) Yamamoto Akira (Yokohama) Doi Takashi (Hadano) Nozawa Masafumi (Odawara JPX), Buffered peripheral system and method for backing up and retrieving data to and from backup memory device.
Ludmila Cherkasova ; Martin F. Arlitt ; Richard J. Friedrich ; Tai Jin, Caching protocol method and system based on request frequency and relative storage duration.
Cole Leo J. (Raleigh NC) Frantz Curtis J. (Durham NC) Lee Jeannette (Raleigh NC) Ordanic Zvonimir (Raleigh NC) Plank Larry K. (Rochester MN), Centralized management in a computer network.
Carpenter Kelly S. (Fremont CA) Dearing Gerard M. (San Jose CA) Nick Jeffrey M. (Fishkill NY) Strickland Jimmy P. (Saratoga CA) Swanson Michael D. (Poughkeepsie NY) Wilkinson Wendell W. (Hyde Park NY, Coherence controls for store-multiple shared data coordinated by cache directory entries in a shared electronic storage.
Senator Steven T. ; Fuller Billy J., Computer system method and apparatus providing for various versions of a file without requiring data copy or log operati.
Fecteau Jean G. (Toronto NY CAX) Gdaniec Joseph M. (Vestal NY) Hennessy James P. (Endicott NY) MacDonald John F. (Vestal NY) Osisek Damian L. (Vestal NY), Computer system which supports asynchronous commitment of data.
Dunphy William E. (Westminster CO) Halladay Steven M. (Louisville CO) Moy Michael E. (Lafayette CO) Munro Frederick G. (Broomfield CO), Data storage and protection system.
Yanai Moshe (Framingham MA) Vishlitzky Natan (Brookline MA) Alterescu Bruno (Newton MA) Castel Daniel (Framingham MA) Shklarsky Gadi (Brookline MA), Data storage system controlled remote data mirroring with respectively maintained data indices.
Fortier Richard W. (Acton MA) Mastors Robert M. (Ayer MA) Taylor Tracy M. (Upton MA) Wallace John J. (Franklin MA), Digital data processor with improved backup storage.
Kenley Gregory (Northboro MA) Ericson George (Schrewsbury MA) Fortier Richard (Acton MA) Holland Chuck (Northboro MA) Mastors Robert (Ayer MA) Pownell James (Natick MA) Taylor Tracy (Upton MA) Wallac, Digital data storage system with improved data migration.
Christenson,Nikolai Paul; Fritchie,Scott Ernest Lystig; Larson,James Stephen, Electronic mail system with methodology providing distributed message store.
Xu Yikang ; Vahalia Uresh K. ; Jiang Xiaoye ; Gupta Uday ; Tzelnic Percy, File server system using file system storage, data movers, and an exchange of meta data among data movers for file locking and direct access to shared file systems.
Lagueux, Jr., Richard A.; Stave, Joel H.; Yeaman, John B.; Stevens, Brian E.; Higgins, Robert M.; Collins, James M., Graphical user interface for configuration of a storage system.
Urevig Paul D. ; Malnati James R. ; Ethen Donald J. ; Weber Herbert L., Grouping shared resources into one or more pools and automatically re-assigning shared resources from where they are not currently needed to where they are needed.
Cane David ; Hirschman David, High performance backup via selective file saving which can perform incremental backups and exclude files and uses a cha.
Barney Rock D. ; Schwols Keith ; Nelson Ellen M., Integration of a database into file management software for protecting, tracking and retrieving data.
Martin Charles W. (Richardson TX) Reid Fredrick S. (Plano TX) Forbus Gary L. (Dallas TX) Adams Steve M. (Plano TX) Shannon C. Patrick (Garland TX) Pirpich Eric A. (Garland TX), Mass data storage and retrieval system.
Kedem Nadav,ILX, Mass storage subsystem and backup arrangement for digital data processing system which permits information to be backed up while host computer(s) continue(s) operating in connection with information .
Long Robert M., Media element library with non-overlapping subset of media elements and non-overlapping subset of media element drives accessible to first host and unaccessible to second host.
Kullick Steven E. ; Spirakis Charles S. ; Titus Diane J., Method and apparatus for transferring archival data among an arbitrarily large number of computer devices in a networked.
Eastridge Lawrence E. (Tucson AZ) Kern Robert F. (Tucson AZ) Kern Ronald M. (Tucson AZ) Mikkelsen Claus W. (Morgan Hill CA) Ratliff James M. (Tucson AZ), Method and system for automated backup copy ordering in a time zero backup copy session.
Eastridge Lawrence E. (Tucson AZ) Kern Robert F. (Tucson AZ) Micka William F. (Tucson AZ) Mikkelsen Claus W. (Morgan Hill CA) Ratliff James M. (Tucson AZ), Method and system for automated termination and resumption in a time zero backup copy process.
Walter A. Hubis ; William G. Deitz, Method and system for controlling access share storage devices in a network environment by configuring host-to-volume mapping data structures in the controller memory for granting and denying access .
Chron, Edward Gustav; Menon, Jaishankar Moothedath, Method and system for providing consistent data modification information to clients in a storage system.
Aoyama Yuki,JPX ; Takahashi Toru,JPX ; Wakayama Satoshi,JPX, Method of and an apparatus for displaying version information and configuration information and a computer-readable recording medium on which a version and configuration information display program i.
Haustein, Nils; Klein, Craig A.; Troppens, Ulf; Winarski, Daniel J., Method of and system for deduplicating backed up data in a client-server environment.
Wahlert, Brian M; Berkowitz, Brian T; van Ingen, Catharine; Rangegowda, Dharshan; Jazayeri, Mike, Method, system, and apparatus for creating saved searches and auto discovery groups for a data protection system.
Palliyil, Sudarshan; Venkateshamurthy, Shivakumara; Vijayaraghavan, Srinivas Belur; Aswathanarayana, Tejasvi, Methods, apparatus and computer programs for enhanced access to resources within a network.
Pisello Thomas (De Bary FL) Crossmier David (Casselberry FL) Ashton Paul (Oviedo FL), Network management system having virtual catalog overview of files distributively stored across network domain.
Prahlad, Anand; May, Andreas; Lunde, Norman R.; Zhou, Lixin; Kumar, Avinash; Ngo, David, Snapshot storage and management system with indexing and user interface.
Crockett Robert N. (Tucson AZ) Kern Ronald M. (Tucson AZ) Micka William F. (Tucson AZ), Software directed microcode state save for distributed storage controller.
Friend,John; Belshe,Michael; Collins,Roger; Bennett,Mike, System and method for full wireless synchronization of a data processing apparatus with a messaging system.
Mutalik Madhav ; Senie Faith M., System and method for performing file-handling operations in a digital data processing system using an operating system-independent file map.
Moulton, Gregory Hagan, System and method for unorchestrated determination of data sequences using sticky byte factoring to determine breakpoints in digital sequences.
Patel, Sujal M.; Mikesell, Paul A., System and methods for providing a distributed file system utilizing metadata to track information about data stored throughout the system.
Huai ReiJane (Old Brookville NY) Daly Robert (Ronkonkoma NY) Curti Walter (Dix Hills NY) Mohan Deepak (Huntington NY) Chueh James Kuang-Ru (Bayside NY) Louie Larry (Forest Hills NY), System and parallel streaming and data stripping to back-up a network.
Stoppani ; Jr. Peter (Woodinville WA), System for allocating storage spaces based upon required and optional service attributes having assigned piorities.
Bamford Roger J. (Woodside CA) Howard Forrest W. (Berkeley CA) Kabcenell Dirk A. (Portola Valley CA) Miner Robert N. (San Francisco CA), System for database integrity with multiple logs assigned to client subsets.
Flynn Rex A. (Belmont MA) Anick Peter G. (Marlboro MA), System for reconstructing prior versions of indexes using records indicating changes between successive versions of the.
Saether Christian D. (Seattle WA) Stoppani ; Jr. Peter (Woodinville WA), System of device independent file directories using a tag between the directories and file descriptors that migrate with.
Prahlad, Anand; Schwartz, Jeremy A.; Ngo, David; Brockway, Brian; Muller, Marcus S., Systems and methods for classifying and transferring information in a storage network.
Borghetti, Stefano; Sgro', Antonio Mario; Corte, Gianluca Della; Gianfagna, Leonida, Thread based view and archive for simple mail transfer protocol (SMTP) clients devices and methods.