A distributed, deduplicated storage system according to certain embodiments is arranged in a parallel configuration including multiple deduplication nodes. Deduplicated data is distributed across the deduplication nodes. The deduplication nodes can be networked together and communicate with one another using a light-weight, customized communication scheme (e.g., a scheme based on FTP or HTTP). In some cases, deduplication management information including deduplication signatures and/or other metadata is stored separately from the deduplicated data in deduplication management nodes, improving performance and scalability.
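The separation described above, with block data on deduplication nodes and signatures kept in a separate management index, might be sketched roughly as follows. This is an illustration only: the class names `DedupNode` and `ManagementIndex`, the use of SHA-256 as the signature, and the signature-based placement of new blocks are all assumptions, since the abstract does not specify a hash function, data layout, or placement policy.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class DedupNode:
    """Hypothetical deduplication node that holds block data only."""
    name: str
    blocks: dict = field(default_factory=dict)  # location -> block bytes

    def store(self, data: bytes) -> int:
        # Use the next free slot as the block's location on this node.
        location = len(self.blocks)
        self.blocks[location] = data
        return location


@dataclass
class ManagementIndex:
    """Hypothetical management node: signature -> (node name, location)."""
    entries: dict = field(default_factory=dict)


def signature(data: bytes) -> str:
    # Assumption: SHA-256 stands in for the unspecified hash signature.
    return hashlib.sha256(data).hexdigest()


def store_block(data: bytes, nodes: list, index: ManagementIndex) -> tuple:
    """Store a block once; return the link to the stored copy."""
    sig = signature(data)
    if sig in index.entries:
        # Duplicate block: reuse the existing link, store nothing new.
        return index.entries[sig]
    # Assumed placement policy: spread new blocks across nodes by signature.
    node = nodes[int(sig, 16) % len(nodes)]
    link = (node.name, node.store(data))
    index.entries[sig] = link
    return link
```

Storing the same block twice through `store_block` returns the same link and leaves a single stored copy, which is the deduplicated behavior the abstract describes.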
Representative Claim
1. A method of performing a storage operation in a distributed, deduplicated storage system, comprising: creating with a first deduplication node of a plurality of deduplication nodes, a first hash signature of a first data block of a plurality of data blocks associated with a file, a first header that at least identifies a first media agent that stored a copy of the first data block in a first storage device, and a first link to at least a location of the copy of the first data block in the first storage device; creating with a second deduplication node, a second hash signature of at least a second data block associated with the file, a second header that at least identifies a second media agent that stored a copy of the second data block in a second storage device, and a second link to at least a location of the second data block in the second storage device; sending from the second deduplication node to the first deduplication node, a copy of the second hash signature, a copy of the second header, and a copy of the second link; receiving a first request from a client computing device to restore the file comprising the plurality of data blocks; in response to the first request and using computer hardware, determining with the first deduplication node that the copy of the first data block of the plurality of data blocks in the requested file is stored at the first storage device; accessing with the first media agent the first data block in the first storage device; further determining with the first deduplication node that the copy of the second data block is stored on the second storage device based at least in part on accessing the copy of the second hash signature, the copy of the second header, and the copy of the second link stored in association with the first deduplication node; sending a second request from the first media agent to the second media agent via a lightweight network that requests the second data block from the second media agent, wherein the second request comprises at least the copy of the second header, and the copy of second link; and accessing with the second media agent, the second data block from the second storage device based at least in part on the copy of the second header and the copy of the second link in the second request.
2. The method of claim 1, wherein the second media agent transmits a copy of the particular data block in response to said request for the second data block, and wherein the method further comprises receiving the transmitted copy of the second data block from the second media agent.
3. The method of claim 1, wherein the first and second media agents communicate hash signatures, headers and links without using network shares.
4. The method of claim 3, wherein the first and second media agents communicate hash signatures, headers and links between one another without having a shared static mount path configuration.
5. The method of claim 1, wherein sending the second request for the second data block is performed using a file-transfer protocol (FTP)-based service routine.
6. The method of claim 1, wherein sending the second request for the second data block is performed using a hyper-text transfer protocol (HTTP)-based service routine.
7. The method of claim 1, further comprising, before receiving the request to restore the file: performing a copy operation in which a file including a plurality of data blocks is copied in a deduplicated fashion, wherein the copied file comprises a plurality of data blocks, at least one of which is the second data block, further wherein the second link corresponding to the second data block was received by the first deduplication node during the copy operation and from one of at least one deduplication management node that is separate from the plurality of deduplication nodes and that stores deduplication management information.
8. The method of claim 7, wherein there are a plurality of deduplication management nodes performing the file copy operation comprises, for the second data block in the file: determining which of the deduplication management nodes to inquire of as to the presence of the second data block; consulting the determined deduplication management node as to whether the particular data block is already stored in the first or second storage devices; if the second data block is already stored on a deduplication node, receiving the second link to the second data block from the determined deduplication management node.
9. The method of claim 8, further comprising determining which of the plurality of deduplication management nodes to inquire of comprises performing a modulo operation on the second hash signature of the second data block.
10. A distributed deduplicated storage system, comprising: a plurality of deduplication nodes each comprising one or more processors and storage, the deduplication nodes in communication with one another via a network and a plurality of data blocks corresponding to a plurality of deduplicated files distributed across the deduplication nodes, a first deduplication node of the plurality of deduplication nodes creates a first hash signature of a first data block of the plurality of data blocks associated with a file, a first header that at least identifies a first media agent that stored a copy of the first data block in a first storage device, and a first link to at least a location of the copy of the first data block in the first storage device and a second deduplication node of the plurality of deduplication nodes creates a second hash signature of at least a second data block associated with the file, a second header that at least identifies a second media agent that stored a copy of the second data block in a second storage device, and a second link to at least a location of the second data block in the second storage device, wherein the second deduplication node sends a copy of second hash signature, the second header, and the second link to the first deduplication node; computer hardware configured to: receive a request for the file comprised of a plurality of data blocks; in response to the request, determine with the first deduplication node that the copy of the first data block of the plurality of data blocks exists at the first storage device; access with the first media agent the particular data block from the first storage device based at least in part on the first header and the first link stored in association with the first deduplication node; determine with the first deduplication node that the copy of the second data block exists at the second storage device based at least in part on the copy of the second hash signature, the copy of the second header, and the copy of the second link stored in association with the first deduplication node; and sending a second request from the first media agent to the second media agent via a lightweight network to obtain the second data block from the second storage device, the second media agent accesses the second data block from the second storage device based at least in part on the copy of the second header and the copy of the second link in the second request.
11. The distributed deduplicated storage system of claim 10, wherein the second media agent server transmits a copy of the second data block to the first media agent in response to the second request for the second data block.
12. The distributed deduplicated storage system of claim 10, wherein the first and second media agents communicate without using network shares.
13. The distributed deduplicated storage system of claim 10, wherein the first and second media agents communicate without having a shared static mount path configuration.
14. The distributed deduplicated storage system of claim 10, wherein the second media agent is configured to respond to the second request using a file-transfer protocol (FTP)-based service routine.
15. The distributed deduplicated storage system of claim 10, wherein the media agent is configured to perform the request using a hyper-text transfer protocol (HTTP)-based service routine.
16. The distributed deduplicated storage system of claim 10, further comprising determining which of the plurality of deduplication nodes to inquire of comprises performing a modulo operation on the second hash signature of the second data block.
17. The distributed deduplicated storage of claim 10 wherein the second deduplication node uses the second hash signature to locate a storage location of the second data block.
18. The distributed deduplicated storage of claim 10 wherein the first deduplication node further provides block offset information associated with the second data block to the second media agent.
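The restore flow of claims 1 and 10 and the modulo selection of claims 9 and 16 could be illustrated with a minimal Python sketch. Everything named here (`BlockRecord`, `MediaAgent`, `pick_management_node`, the `peers` table) is hypothetical, and a direct in-process method call stands in for the FTP- or HTTP-based request that the claims recite; the claims do not prescribe any of these details.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockRecord:
    signature: str  # hash signature of the block, as a hex string
    header: str     # identifies the media agent that stored the block
    link: int       # location of the block in that agent's storage device


class MediaAgent:
    """Hypothetical media agent with a dict standing in for its storage device."""

    def __init__(self, name: str):
        self.name = name
        self.device = {}  # location -> block bytes
        self.peers = {}   # agent name -> MediaAgent (stand-in for the network)

    def read(self, header: str, link: int) -> bytes:
        # Serve a request that carries the header and link of the wanted block.
        if header != self.name:
            raise ValueError("request addressed to a different media agent")
        return self.device[link]

    def restore(self, records: list) -> bytes:
        """Reassemble a file from its block records, in order."""
        parts = []
        for rec in records:
            if rec.header == self.name:
                # The block is on this agent's own storage device.
                parts.append(self.device[rec.link])
            else:
                # The block is elsewhere: send header and link to that agent.
                parts.append(self.peers[rec.header].read(rec.header, rec.link))
        return b"".join(parts)


def pick_management_node(signature_hex: str, node_count: int) -> int:
    # Modulo operation on the hash signature selects the node to consult.
    return int(signature_hex, 16) % node_count
```

In this sketch the first agent needs no shared mount path to reach the second block: the copied header says which agent to ask, and the copied link says where that agent should look.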