Data object store and server for a cloud storage environment, including data deduplication and data management across multiple cloud storage sites
IPC classification information
Country / Type
United States (US) Patent
Registered
International Patent Classification (IPC, 7th edition)
G06F-017/30
G06Q-030/02
G06Q-050/18
H04L-029/06
G06F-003/06
H04L-029/08
G06F-011/34
Application number
US-0258252
(2016-09-07)
Registration number
US-10248657
(2019-04-02)
Inventors / Address
Prahlad, Anand
Muller, Marcus S.
Kottomtharayil, Rajiv
Kavuri, Srinivas
Gokhale, Parag
Vijayan, Manoj Kumar
Applicant / Address
Commvault Systems, Inc.
Agent / Address
Perkins Coie LLP
Citation information
Times cited: 0
Cited patents: 170
Abstract
Data storage operations, including content-indexing, containerized deduplication, and policy-driven storage, are performed within a cloud environment. The systems support a variety of clients and cloud storage sites that may connect to the system in a cloud environment that requires data transfer over wide area networks, such as the Internet, which may have appreciable latency and/or packet loss, using various network protocols, including HTTP and FTP. Methods are disclosed for content indexing data stored within a cloud environment to facilitate later searching, including collaborative searching. Methods are also disclosed for performing containerized deduplication to reduce the strain on a system namespace, effectuate cost savings, etc. Methods are disclosed for identifying suitable storage locations, including suitable cloud storage sites, for data files subject to a storage policy. Further, systems and methods for providing a cloud gateway and a scalable data object store within a cloud environment are disclosed, along with other features.
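To make the idea of containerized deduplication concrete, here is a minimal sketch, not the patented implementation. It assumes fixed blocks no larger than the container size, SHA-256 digests as block identity, and an in-memory index; the names `Container` and `deduplicate` are hypothetical. It shows how packing unique blocks into a few bounded container files keeps the number of stored objects, and so the namespace load, small.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Container:
    """One container file: a bounded run of unique blocks (hypothetical layout)."""
    max_size: int
    blocks: list = field(default_factory=list)
    used: int = 0

    def has_room(self, block: bytes) -> bool:
        return self.used + len(block) <= self.max_size

    def append(self, block: bytes) -> int:
        """Store the block and return its byte offset inside this container."""
        offset = self.used
        self.blocks.append(block)
        self.used += len(block)
        return offset


def deduplicate(blocks, container_size):
    """Pack unique blocks into containers; return (containers, index, refs).

    index maps a block digest to (container number, byte offset).
    refs holds, for each input block in order, the digest needed to restore it.
    Assumes every block is no larger than container_size.
    """
    containers = [Container(container_size)]
    index = {}
    refs = []
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        if digest not in index:
            # Only previously unseen blocks consume container space.
            if not containers[-1].has_room(block):
                containers.append(Container(container_size))
            offset = containers[-1].append(block)
            index[digest] = (len(containers) - 1, offset)
        refs.append(digest)
    return containers, index, refs
```

In this sketch the container size is a plain parameter; the claims list factors such as latency, bandwidth, and pricing that a real system might weigh when choosing it.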
Representative claims
1. A method for storing a secondary copy, of an original data set, on a cloud storage site using a cloud gateway, wherein the cloud gateway is coupled between multiple computers and one or more cloud storage sites via a network, the method comprising: identifying data blocks within a cache of the cloud gateway that satisfy certain criteria, wherein the original data set comprises data blocks, and wherein the certain criteria are from a storage policy; performing block-level deduplication of the identified data blocks to create a deduplicated set of data, wherein the block-level deduplication includes: determining a size for a container file to utilize when deduplicating the identified data blocks; and deduplicating at least some of the identified data blocks to create one or more container files containing deduplicated data, wherein at least one of the container files has the determined size; and storing the deduplicated set of data on the cloud storage site by: buffering data, to a data buffer, for transmission to the cloud storage site; repeating the following steps while the data buffer is not full: receiving a file system request to write a group of data to the cloud storage site; and adding the group of data to the buffer; converting a file system request to one or more application program interface calls associated with the cloud storage site; and transmitting contents of the data buffer to the cloud storage site using the one or more application program interface calls associated with the cloud storage site.
2. The method of claim 1, further comprising identifying the cloud storage site on which to store the secondary copy of the original data set by: identifying two or more candidate cloud storage sites; accessing a storage policy having a set of preferences and storage criteria, wherein the set of preferences and storage criteria includes at least two of the following: one or more preferred cloud storage sites, one or more preferred classes or quality of cloud storage sites, requirements regarding deduplication of the original data set, requirements regarding encryption of the original data set, requirements regarding compression of the original data set, quality of a network connection available to the cloud storage site, one or more data retention periods, data characteristics of at least some data in the original data set, estimated or historic usage associated with operating one or more system components, frequency with which the original data set was accessed or modified during a particular time period, a specified level of fault tolerance, or one or more geographical locations or political states in which data storage devices for a cloud storage site exist; and selecting at least one of the two or more of the candidate cloud storage sites based at least in part on the set of preferences and storage criteria in the storage policy.
3. The method of claim 1 wherein the contents of the data buffer are transmitted to the cloud storage site using at least one of hypertext transfer protocol (HTTP) and HTTP over Transport Layer Security/Secure Sockets Layer.
4. The method of claim 1 wherein the certain criteria include time-based criteria.
5. A system for creating a secondary copy of an original data set using a cloud storage site, the system comprising a memory and processor that are configured to: identify sub-objects of the original data set that satisfy certain criteria, wherein the certain criteria are related a storage policy, and wherein the original data set is received from one or more client computers; perform deduplication of the identified data sub-objects to create a deduplicated set of data; and, forward the deduplicated set of data to the cloud storage site, wherein the forwarding includes: converting file system requests into application program interface calls associated with the cloud storage site; and, forwarding the data to the cloud storage site using the one or more application program interface calls associated with the cloud storage site.
6. The system of claim 5, wherein the memory and processor are further configured to: determine a size for a container file and for deduplicating at least some of the data sub-objects to create one or more container files containing deduplicated data, wherein at least one of the container files has the determined size.
7. The system of claim 5, wherein the forwarding further includes: buffering data, to a data buffer, for transmission to the cloud storage site by: receiving a file system request to write a group of data to the cloud storage site; and adding the group of data to the data buffer.
8. The system of claim 5, wherein the certain criteria include time-based criteria, wherein the deduplication includes block-level deduplication, and wherein the block-level deduplication includes: determining a size for a container file to utilize when deduplicating the identified data blocks; and deduplicating at least some of the identified data blocks to create one or more container files containing deduplicated data, wherein at least one of the container files has the determined size; and wherein the container file is forwarded to the cloud storage site.
9. The system of claim 5, wherein the forwarding further includes: buffering data, to a data buffer, for transmission to the cloud storage site by repeating the following steps while the data buffer is not full: receiving a file system request to write a group of data to the cloud storage site; and adding the group of data to the buffer.
10. A computer-implemented method for copying multiple files at a cloud storage site, wherein the cloud storage site is coupled to a computer executing a file system for accessing a secondary storage computing device, the method comprising: receiving a copy operation request to copy n number of files at the cloud storage site, wherein each of the n number of files includes metadata and data, and wherein the n number of files exceeds a threshold; establishing a container size determined by one or more factors; processing the n number of files by: copying the metadata of each of the n number of files to a first container; copying at least a portion of the data for the n number of files into a second container, wherein the second container is separate from the first container; and updating a data structure, wherein the data structure: tracks, for each of the n number of files, a location of the metadata for that file in the first container, and tracks, for the at least a portion of the data for the n number of files, a location of the data in the second container.
11. The computer-implemented method of claim 10 wherein the threshold is a number of files that the file system can operate on without system degradation.
12. The computer-implemented method of claim 10 wherein the threshold is related to at least of one of the factors.
13. The computer-implemented method of claim 10 wherein the factors include at least one of: a latency associated with a network connection to the cloud storage site, or a bandwidth associated with a network connection to the cloud storage site, or whether the cloud storage site imposes a restriction on a namespace associated with the computer or the file system, or whether the cloud storage site permits sparsification of data files, or a pricing structure associated with the cloud storage site, or a maximum specified container file size, or a minimum specified container file size.
14. The computer-implemented method of claim 10 wherein the size of at least one of the first and second containers is no greater than the established container size.
15. A tangible computer-readable storage medium whose contents cause a data storage system to perform a method of migrating data from local primary storage to secondary storage located on a remote cloud storage site, the method comprising: identifying no more than n−1 data blocks, located within the local primary storage, that satisfy a criteria, wherein the n−1 data blocks represent a portion of a data file consisting of n blocks, and wherein the n blocks contain data written by a file system associated with the local primary storage; and determining a size for a container file in which to store some or all of the no more than n−1 data blocks; transferring data contained by the identified no more than n−1 data blocks from the primary storage to the secondary storage located on a cloud storage site, wherein transferring data includes writing data first to a container file of the determined size; and updating an index with information associating the transferred data with information identifying blocks within the secondary storage that contain the transferred data, wherein the information includes at least one uniform resource locator or logical address that identifies at least one logical location from which the transferred data may be accessed.
16. The tangible computer-readable storage medium of claim 15 wherein the index further comprises information associating the transferred data with information identifying tape offsets for secondary storage that contain the transferred data.
17. The tangible computer-readable storage medium of claim 15, further comprising: receiving a copy operation request to copy m number of files at the cloud storage site, wherein each of the m number of files includes metadata and data, and wherein the m number of files exceeds a size threshold.
18. The tangible computer-readable storage medium of claim 15, wherein determining the size for the container file considers: a latency associated with a network connection to the secondary storage computing device; or a bandwidth associated with a network connection to the secondary storage computing device.
19. The tangible computer-readable storage medium of claim 15, wherein determining the size for the container file considers: whether the cloud storage site imposes a restriction on a namespace associated with the computer or the file system; or whether the cloud storage site permits sparsification of data files.
20. The tangible computer-readable storage medium of claim 15, wherein determining the size for the container file considers: a pricing structure associated with the cloud storage site; a maximum specified container file size; or a minimum specified container file size.
21. A system for storing, on a cloud storage site, a secondary copy of an original data set, the system comprising: at least one processor; memory coupled to the at least one processor, wherein the memory stores contents that, when executed by the at least one processor performs a method of: identifying a cloud storage site on which to store a secondary copy of a primary data set; updating an index of content to reflect at least some data content in the primary data set; deduplicating at least some of the data content in the primary data set; creating one or more container files containing the deduplicated data; and transferring the one or more container files to the cloud storage site, wherein the transferring includes: converting file system requests into application program interface calls associated with the cloud storage site; and, forwarding the one or more container files to the cloud storage site using one or more application program interface calls associated with the cloud storage site.
22. The system of claim 21 wherein the transferring further includes: buffering the one or more container files, to a data buffer, for transmission to the cloud storage site by repeating the following steps while the data buffer is not full: receiving a file system request to write a group of data to the cloud storage site; and adding the group of data to the buffer.
23. The system of claim 21 wherein the memory and processor are further configured to: determine a size for a container file based on one or more factors, wherein the factors include at least one of: a latency associated with a network connection to the secondary storage computing device, or a bandwidth associated with a network connection to the secondary storage computing device, or whether the cloud storage site permits sparsification of data files, or a pricing structure associated with the cloud storage site, or a maximum specified container file size, or a minimum specified container file size; and wherein at least one of the container files has the determined size.
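The buffering and request-conversion steps recited in claims 1, 9, and 22 can be illustrated with a short sketch. It is an illustration under stated assumptions rather than the claimed system: `WriteRequest`, `FakeCloudSite`, `put_object`, and `CloudGateway` are hypothetical names, the cloud API is replaced by an in-memory stand-in that records calls, and the buffer is considered full once a byte capacity is reached.

```python
from dataclasses import dataclass


@dataclass
class WriteRequest:
    """A file-system-style write request (hypothetical shape)."""
    path: str
    data: bytes


class FakeCloudSite:
    """Stand-in for a cloud storage API; it only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def put_object(self, key: str, body: bytes) -> None:
        self.calls.append(("PUT", key, body))


class CloudGateway:
    """Buffers writes, then converts them into API calls when the buffer fills."""

    def __init__(self, site, buffer_capacity: int):
        self.site = site
        self.buffer_capacity = buffer_capacity
        self.buffer = []
        self.buffered_bytes = 0

    def write(self, request: WriteRequest) -> None:
        """Add a group of data to the buffer; flush once capacity is reached."""
        self.buffer.append(request)
        self.buffered_bytes += len(request.data)
        if self.buffered_bytes >= self.buffer_capacity:
            self.flush()

    def flush(self) -> None:
        """Convert each buffered file-system request into one API call and send it."""
        for request in self.buffer:
            key = request.path.lstrip("/")  # assumed path-to-object-key mapping
            self.site.put_object(key, request.data)
        self.buffer.clear()
        self.buffered_bytes = 0
```

Batching writes this way trades a little latency on each request for fewer round trips over a wide area network, which is the concern the abstract raises about latency and packet loss.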