Intelligent data sourcing in a networked storage system
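IPC classification information
Country / Type
United States(US) Patent
Granted
International Patent Classification (IPC 7th edition)
G06F-017/30
G06F-011/14
Application number
US-0916467
(2013-06-12)
Registration number
US-9218376
(2015-12-22)
Inventor / Address
Muller, Marcus S.
Ngo, David
Applicant / Address
COMMVAULT SYSTEMS, INC.
Agent / Address
Knobbe, Martens, Olson & Bear, LLP
Citation information
Times cited: 17
Cited patents: 140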
Abstract
A storage system according to certain embodiments includes a repository of client-side data block signature information representative of a set of data blocks stored in a primary storage subsystem. In some cases, the system sources data blocks for secondary copy and restore operations from the primary storage subsystem instead of the secondary storage subsystem. Where multiple primary storage components (e.g., multiple client computing devices) contain copies of a data block involved in a secondary copy or restore operation, the system can decide which client to source the data block from based on sourcing criteria.
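The abstract relies on per-block signatures that let the system recognize the same data block on different clients. A minimal sketch of that idea follows. The fixed block size, the use of SHA-256, and the names `BLOCK_SIZE`, `iter_blocks`, and `block_signatures` are assumptions made for illustration; this record does not state which signature function or block size the patent uses.

```python
import hashlib
from typing import Iterator, List, Tuple

# Hypothetical block size, kept tiny so the example is easy to follow.
BLOCK_SIZE = 4


def iter_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, block) pairs for consecutive fixed-size blocks."""
    for offset in range(0, len(data), block_size):
        yield offset, data[offset:offset + block_size]


def block_signatures(data: bytes, block_size: int = BLOCK_SIZE) -> List[Tuple[int, str]]:
    """Return (offset, signature) for every block; SHA-256 is an assumed choice."""
    return [
        (offset, hashlib.sha256(block).hexdigest())
        for offset, block in iter_blocks(data, block_size)
    ]


# Two files created independently on two clients that happen to share a block.
file_on_client_a = b"AAAABBBBCCCC"
file_on_client_b = b"ZZZZBBBB"

sigs_a = block_signatures(file_on_client_a)
sigs_b = block_signatures(file_on_client_b)

# The block b"BBBB" exists on both clients, so its signature appears in both lists.
shared = {sig for _, sig in sigs_a} & {sig for _, sig in sigs_b}
```

Because matching blocks produce matching signatures, only the signatures need to be compared to find out which clients hold a copy of a given block.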
Representative claim
1. A method of sourcing data from storage associated with a pool of computing devices during a data storage operation associated with one of the computing devices in the pool, the method comprising:
obtaining signatures corresponding to data units that form a data set associated with a data storage operation, the data set corresponding to a version of one or more files of primary data of a first computing device in a pool of a plurality of computing devices, each respective computing device in the pool storing primary data generated by one or more software applications executing on the respective computing device, the primary data stored in at least one storage device associated with the respective computing device,
wherein the storage devices of the computing devices in the pool store a plurality of data units of primary data including at least the data set stored in the at least one storage device of the first computing device,
wherein each file of a plurality of files of primary data stored in the storage devices comprises at least one data unit of the plurality of data units,
wherein at least a first data unit of the data set forms at least a portion of a first file of primary data stored in the at least one storage device of the first computing device and a second data unit of the plurality of data units matches the first data unit and forms at least a portion of a second file of primary data stored in the at least one storage device of a second computing device of the plurality of computing devices, and
wherein the first file and the second file are generated by the one or more software applications executing on the first computing device and the second computing device, respectively;
populating, by one or more processors, a shared signature repository that includes:
signatures corresponding to at least each data unit of the plurality of data units, wherein a first signature corresponds to the first data unit and the second data unit; and
for each signature included in the signature repository, an indication as to one or more of the computing devices whose at least one storage device includes an independently generated data unit that corresponds to the signature, wherein each independently generated data unit forms at least a portion of a distinct file residing on the respective storage device, and wherein the shared signature repository includes at least a first indication that indicates a first location of the first data unit in the at least one storage device of the first computing device and a second location of the second data unit in the at least one storage device of the second computing device;
comparing the obtained signatures, including a signature of the first data unit, with the signature repository to identify one or more matching data units, including the second data unit, stored in the respective at least one storage device of the computing devices in the pool, wherein each of the one or more matching data units forms at least a portion of a read/write file residing in the respective storage device and is stored in a native format of the respective software application that generated the respective matching data unit;
consulting, by one or more processors, a priority policy; and
based on the priority policy, and for at least the first data unit in the data set, determining to access the second data unit rather than the first data unit for the data storage operation.

2. The method of claim 1, wherein the priority policy includes an indication as to a relative priority of one or more of the computing devices with respect to one or more others of the computing devices.

3. The method of claim 1, wherein the data storage operation comprises a backup copy operation in which a copy of the data set is stored in secondary storage that is separate from the pool of computing devices and from each at least one storage device of the respective computing devices.

4. The method of claim 1, wherein the data storage operation comprises a restore operation in which the data set is restored to the at least one storage device associated with the first client computing device.

5. The method of claim 4, wherein at least some of the data units in the data set are sourced from secondary storage which is separate from the pool of computing devices and is separate from each at least one storage device of the respective computing devices.

6. The method of claim 1, wherein said consulting is performed by a data sourcing module executing on one or more processors of a computing device that is separate from the pool, and wherein the signature repository is separate from each of the at least one storage devices.

7. The method of claim 1, wherein at least 10 percent of the data units are sourced from the at least one storage device of computing devices in the pool other than the first computing device.

8. The method of claim 1, wherein at least 25 percent of the data units are sourced from the at least one storage device of computing devices in the pool other than the first computing device.

9. The method of claim 1, wherein at least 50 percent of the data units are sourced from the at least one storage device of computing devices in the pool other than the first computing device.

10. The method of claim 1, wherein all of the data units are sourced from the at least one storage device of one or more computing devices in the pool other than the first computing device.

11. A storage system for sourcing data from storage associated with a pool of computing devices during a data storage operation associated with one of the computing devices in the pool, the storage system comprising:
a global signature repository including:
signatures corresponding to at least each data unit of a plurality of data units of primary data, the plurality of data units stored in storage devices associated with a plurality of computing devices in a pool,
wherein each file of a plurality of files of primary data stored in the storage devices comprises at least one data unit of the plurality of data units,
wherein at least a first data unit of the plurality of data units forms at least a portion of a first file of primary data stored in a first storage device associated with a first computing device of the plurality of computing devices and a second data unit of the plurality of data units matches the first data unit and forms at least a portion of a second file of primary data stored in a second storage device associated with a second computing device of the plurality of computing devices,
wherein the first file and the second file are generated by one or more software applications executing on the first computing device and the second computing device, respectively, and
wherein a first signature of the signatures corresponds to the first data unit and the second data unit; and
for each signature included in the signature repository, an indication as to one or more of the plurality of computing devices whose at least one storage device includes an independently generated data unit that corresponds to the signature, wherein each independently generated data unit forms at least a portion of a distinct file residing on the respective storage device, and wherein the global signature repository includes at least a first indication that indicates a first location of the first data unit in the first storage device and a second location of the second data unit in the second storage device; and
a repository agent executing in one or more processors and configured to:
obtain signatures corresponding to data units that form a data set associated with a data storage operation, the data set corresponding to a version of one or more files of primary data of the first computing device and including the first data unit;
compare the obtained signatures, including a signature of the first data unit, with the signature repository to identify one or more matching data units, including the second data unit, stored in the respective at least one storage device of the computing devices in the pool, wherein each of the one or more matching data units forms at least a portion of a read/write file residing in the respective storage device and is stored in a native format of the respective software application that generated the respective matching data unit;
consult a priority policy; and
based on the priority policy, and for at least the first data unit in the data set, determine to access the second data unit rather than the first data unit for the data storage operation.

12. The system of claim 11, wherein the priority policy includes an indication as to a relative priority of one or more of the computing devices with respect to one or more others of the computing devices.

13. The system of claim 11, wherein the data storage operation comprises a backup copy operation in which a copy of the data set is stored in secondary storage that is separate from the plurality of computing devices and from each at least one storage device of the respective computing devices.

14. The system of claim 11, wherein the data storage operation comprises a restore operation in which the data set is restored to the at least one storage device associated with the first client computing device.

15. The system of claim 14, wherein at least some of the data units in the data set are sourced from secondary storage which is separate from the plurality of computing devices and is separate from each at least one storage device of the respective computing devices.

16. The system of claim 11, wherein the repository agent executes on one or more processors of a computing device that is separate from the pool and wherein the signature repository is separate from the storage devices.

17. The system of claim 11, wherein at least 10 percent of the data units are sourced from the at least one storage device of computing devices in the pool other than the first computing device.

18. The system of claim 11, wherein at least 25 percent of the data units are sourced from the at least one storage device of computing devices in the pool other than the first computing device.

19. The system of claim 11, wherein at least 50 percent of the data units are sourced from the at least one storage device of computing devices in the pool other than the first computing device.

20. The system of claim 11, wherein all of the data units are sourced from the at least one storage device of one or more computing devices in the pool other than the first computing device.
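The steps recited in claims 1 and 11 (populate a shared signature repository, compare the signatures of a data set against it, consult a priority policy, and pick a source per data unit) can be illustrated with a small sketch. Everything named here is hypothetical: `Location`, `SignatureRepository`, `choose_source`, the device names, the 4-byte unit size, SHA-256 as the signature, and the representation of the priority policy as a device-to-rank mapping (lower rank preferred) are assumptions for illustration, not the claimed implementation.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Location:
    device: str  # computing device in the pool
    path: str    # file that contains the data unit
    offset: int  # byte offset of the data unit within that file


def signature(unit: bytes) -> str:
    """Signature of one data unit (SHA-256 is an assumed choice)."""
    return hashlib.sha256(unit).hexdigest()


class SignatureRepository:
    """Maps each signature to every known location of a matching data unit."""

    def __init__(self) -> None:
        self._locations: Dict[str, List[Location]] = defaultdict(list)

    def populate(self, device: str, path: str, data: bytes, unit_size: int) -> None:
        for offset in range(0, len(data), unit_size):
            sig = signature(data[offset:offset + unit_size])
            self._locations[sig].append(Location(device, path, offset))

    def lookup(self, sig: str) -> List[Location]:
        return list(self._locations.get(sig, []))


def choose_source(candidates: List[Location], priority: Dict[str, int]) -> Location:
    """Pick the candidate on the most preferred device (lowest rank wins)."""
    return min(candidates, key=lambda loc: priority.get(loc.device, float("inf")))


UNIT = 4  # hypothetical data unit size in bytes
repo = SignatureRepository()
repo.populate("laptop-1", "/docs/report.txt", b"AAAABBBBCCCC", UNIT)
repo.populate("desktop-2", "/home/notes.txt", b"ZZZZBBBBAAAA", UNIT)

# Assumed policy: prefer reading from desktop-2 over laptop-1.
priority = {"desktop-2": 0, "laptop-1": 1}

# Data set to back up: the file on laptop-1.
data_set = b"AAAABBBBCCCC"
plan = [
    choose_source(repo.lookup(signature(data_set[offset:offset + UNIT])), priority)
    for offset in range(0, len(data_set), UNIT)
]

# Share of data units sourced from devices other than the first computing device.
from_others = sum(loc.device != "laptop-1" for loc in plan)
percent_from_others = 100 * from_others / len(plan)
```

In this toy run the units `AAAA` and `BBBB` are read from `desktop-2` even though `laptop-1` holds them too, while `CCCC` exists only on `laptop-1` and is read from there. The final two lines show how the proportion addressed by claims 7 to 10 and 17 to 20 could be measured under these assumptions.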
Patents cited by this patent (140)
Ranade, Dilip Madhusudan; Shelat, Radha; Kabra, Navin, Adaptive caching for a distributed file sharing system.
Yuval Ofek ; Zoran Cakeljic ; Samuel Krikler IL; Sharon Galtzur IL; Michael Hirsch IL; Dan Arnon ; Peter Kamvysselis, Apparatus and methods for copying, backing up, and restoring data using a backup segment size larger than the storage block size.
Griffin David (Maynard MA) Campbell Jonathan (Acton MA) Reilly Michael (Sterling MA) Rosenbaum Richard (Pepperell MA), Arrangement with cooperating management server node and network service node.
Nakano Toshio (Odawara JPX) Nozawa Masafumi (Odawara JPX) Kurano Akira (Odawara JPX) Hisano Kiyoshi (Odawara JPX) Hoshino Masayuki (Odawara JPX), Backup control method and system in data processing system using identifiers for controlling block data transfer.
Kitajima Hiroyuki (Yokohama) Yamamoto Akira (Yokohama) Doi Takashi (Hadano) Nozawa Masafumi (Odawara JPX), Buffered peripheral system and method for backing up and retrieving data to and from backup memory device.
Ludmila Cherkasova ; Martin F. Arlitt ; Richard J. Friedrich ; Tai Jin, Caching protocol method and system based on request frequency and relative storage duration.
Cole Leo J. (Raleigh NC) Frantz Curtis J. (Durham NC) Lee Jeannette (Raleigh NC) Ordanic Zvonimir (Raleigh NC) Plank Larry K. (Rochester MN), Centralized management in a computer network.
Carpenter Kelly S. (Fremont CA) Dearing Gerard M. (San Jose CA) Nick Jeffrey M. (Fishkill NY) Strickland Jimmy P. (Saratoga CA) Swanson Michael D. (Poughkeepsie NY) Wilkinson Wendell W. (Hyde Park NY, Coherence controls for store-multiple shared data coordinated by cache directory entries in a shared electronic storage.
Senator Steven T. ; Fuller Billy J., Computer system method and apparatus providing for various versions of a file without requiring data copy or log operati.
Fecteau Jean G. (Toronto NY CAX) Gdaniec Joseph M. (Vestal NY) Hennessy James P. (Endicott NY) MacDonald John F. (Vestal NY) Osisek Damian L. (Vestal NY), Computer system which supports asynchronous commitment of data.
Dunphy William E. (Westminster CO) Halladay Steven M. (Louisville CO) Moy Michael E. (Lafayette CO) Munro Frederick G. (Broomfield CO), Data storage and protection system.
Yanai Moshe (Framingham MA) Vishlitzky Natan (Brookline MA) Alterescu Bruno (Newton MA) Castel Daniel (Framingham MA) Shklarsky Gadi (Brookline MA), Data storage system controlled remote data mirroring with respectively maintained data indices.
Fortier Richard W. (Acton MA) Mastors Robert M. (Ayer MA) Taylor Tracy M. (Upton MA) Wallace John J. (Franklin MA), Digital data processor with improved backup storage.
Kenley Gregory (Northboro MA) Ericson George (Schrewsbury MA) Fortier Richard (Acton MA) Holland Chuck (Northboro MA) Mastors Robert (Ayer MA) Pownell James (Natick MA) Taylor Tracy (Upton MA) Wallac, Digital data storage system with improved data migration.
Christenson,Nikolai Paul; Fritchie,Scott Ernest Lystig; Larson,James Stephen, Electronic mail system with methodology providing distributed message store.
Xu Yikang ; Vahalia Uresh K. ; Jiang Xiaoye ; Gupta Uday ; Tzelnic Percy, File server system using file system storage, data movers, and an exchange of meta data among data movers for file locking and direct access to shared file systems.
Lagueux, Jr., Richard A.; Stave, Joel H.; Yeaman, John B.; Stevens, Brian E.; Higgins, Robert M.; Collins, James M., Graphical user interface for configuration of a storage system.
Urevig Paul D. ; Malnati James R. ; Ethen Donald J. ; Weber Herbert L., Grouping shared resources into one or more pools and automatically re-assigning shared resources from where they are not currently needed to where they are needed.
Cane David ; Hirschman David, High performance backup via selective file saving which can perform incremental backups and exclude files and uses a cha.
Barney Rock D. ; Schwols Keith ; Nelson Ellen M., Integration of a database into file management software for protecting, tracking and retrieving data.
Martin Charles W. (Richardson TX) Reid Fredrick S. (Plano TX) Forbus Gary L. (Dallas TX) Adams Steve M. (Plano TX) Shannon C. Patrick (Garland TX) Pirpich Eric A. (Garland TX), Mass data storage and retrieval system.
Kedem Nadav,ILX, Mass storage subsystem and backup arrangement for digital data processing system which permits information to be backed up while host computer(s) continue(s) operating in connection with information .
Long Robert M., Media element library with non-overlapping subset of media elements and non-overlapping subset of media element drives accessible to first host and unaccessible to second host.
Kullick Steven E. ; Spirakis Charles S. ; Titus Diane J., Method and apparatus for transferring archival data among an arbitrarily large number of computer devices in a networked.
Eastridge Lawrence E. (Tucson AZ) Kern Robert F. (Tucson AZ) Kern Ronald M. (Tucson AZ) Mikkelsen Claus W. (Morgan Hill CA) Ratliff James M. (Tucson AZ), Method and system for automated backup copy ordering in a time zero backup copy session.
Eastridge Lawrence E. (Tucson AZ) Kern Robert F. (Tucson AZ) Micka William F. (Tucson AZ) Mikkelsen Claus W. (Morgan Hill CA) Ratliff James M. (Tucson AZ), Method and system for automated termination and resumption in a time zero backup copy process.
Walter A. Hubis ; William G. Deitz, Method and system for controlling access share storage devices in a network environment by configuring host-to-volume mapping data structures in the controller memory for granting and denying access .
Chron, Edward Gustav; Menon, Jaishankar Moothedath, Method and system for providing consistent data modification information to clients in a storage system.
Aoyama Yuki,JPX ; Takahashi Toru,JPX ; Wakayama Satoshi,JPX, Method of and an apparatus for displaying version information and configuration information and a computer-readable recording medium on which a version and configuration information display program i.
Haustein, Nils; Klein, Craig A.; Troppens, Ulf; Winarski, Daniel J., Method of and system for deduplicating backed up data in a client-server environment.
Wahlert, Brian M; Berkowitz, Brian T; van Ingen, Catharine; Rangegowda, Dharshan; Jazayeri, Mike, Method, system, and apparatus for creating saved searches and auto discovery groups for a data protection system.
Palliyil, Sudarshan; Venkateshamurthy, Shivakumara; Vijayaraghavan, Srinivas Belur; Aswathanarayana, Tejasvi, Methods, apparatus and computer programs for enhanced access to resources within a network.
Pisello Thomas (De Bary FL) Crossmier David (Casselberry FL) Ashton Paul (Oviedo FL), Network management system having virtual catalog overview of files distributively stored across network domain.
Prahlad, Anand; May, Andreas; Lunde, Norman R.; Zhou, Lixin; Kumar, Avinash; Ngo, David, Snapshot storage and management system with indexing and user interface.
Crockett Robert N. (Tucson AZ) Kern Ronald M. (Tucson AZ) Micka William F. (Tucson AZ), Software directed microcode state save for distributed storage controller.
Friend,John; Belshe,Michael; Collins,Roger; Bennett,Mike, System and method for full wireless synchronization of a data processing apparatus with a messaging system.
Mutalik Madhav ; Senie Faith M., System and method for performing file-handling operations in a digital data processing system using an operating system-independent file map.
Moulton, Gregory Hagan, System and method for unorchestrated determination of data sequences using sticky byte factoring to determine breakpoints in digital sequences.
Patel, Sujal M.; Mikesell, Paul A., System and methods for providing a distributed file system utilizing metadata to track information about data stored throughout the system.
Huai ReiJane (Old Brookville NY) Daly Robert (Ronkonkoma NY) Curti Walter (Dix Hills NY) Mohan Deepak (Huntington NY) Chueh James Kuang-Ru (Bayside NY) Louie Larry (Forest Hills NY), System and parallel streaming and data stripping to back-up a network.
Stoppani ; Jr. Peter (Woodinville WA), System for allocating storage spaces based upon required and optional service attributes having assigned priorities.
Bamford Roger J. (Woodside CA) Howard Forrest W. (Berkeley CA) Kabcenell Dirk A. (Portola Valley CA) Miner Robert N. (San Francisco CA), System for database integrity with multiple logs assigned to client subsets.
Flynn Rex A. (Belmont MA) Anick Peter G. (Marlboro MA), System for reconstructing prior versions of indexes using records indicating changes between successive versions of the.
Saether Christian D. (Seattle WA) Stoppani ; Jr. Peter (Woodinville WA), System of device independent file directories using a tag between the directories and file descriptors that migrate with.
Prahlad, Anand; Schwartz, Jeremy A.; Ngo, David; Brockway, Brian; Muller, Marcus S., Systems and methods for classifying and transferring information in a storage network.
Borghetti, Stefano; Sgro', Antonio Mario; Corte, Gianluca Della; Gianfagna, Leonida, Thread based view and archive for simple mail transfer protocol (SMTP) clients devices and methods.