A content addressable storage array element (CASAE) of a storage system is configured to eliminate duplicate data stored on its storage resources. The CASAE independently determines whether data associated with a write operation has already been written to a location on its storage resources. To that end, the CASAE performs a content addressable storage computation on each data block written to those resources in order to prevent storage of two or more blocks with the same data. If data of a block has been previously stored on the resources, the CASAE cooperates with a file system executing on the system to provide a reference (block pointer) to the same data block rather than duplicate the stored data. Otherwise, the CASAE stores the data block at a new location on the resources and provides a block pointer to that location.
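The write path described in the abstract can be sketched in a few lines of Python. This is a minimal in-memory illustration under stated assumptions, not the patented implementation: the class name `Casae`, the `Entry` record, and the Python list standing in for the storage resources are all hypothetical. MD5 is used for the key only because the claims name it as one possible hash function.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Entry:
    """One mapping-table entry: a block pointer and a reference count."""
    pbn: int        # physical block number of the stored block
    refcount: int   # number of references handed out for that block


class Casae:
    """Toy model of a content addressable storage array element (hypothetical)."""

    def __init__(self):
        self.blocks = []   # stand-in for storage resources; index = block number
        self.table = {}    # hash key -> Entry

    def write(self, data: bytes) -> int:
        """Store data unless it is already present; return its block number."""
        key = hashlib.md5(data).digest()
        entry = self.table.get(key)
        # A key match alone is not trusted: the contents are compared as well.
        if entry is not None and self.blocks[entry.pbn] == data:
            entry.refcount += 1
            return entry.pbn
        # No duplicate found: write to the next available location.
        pbn = len(self.blocks)
        self.blocks.append(data)
        if entry is None:
            self.table[key] = Entry(pbn, 1)
        # Simplification: on a key collision with different contents the new
        # block is stored but not indexed, so it is never deduplicated later.
        return pbn
```

Writing the same 4 KB block twice to a `Casae` instance returns the same block number both times and leaves a single stored copy with a reference count of 2.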
Representative Claim
What is claimed is:
1. A method for managing storage resources of a storage system, the method comprising: performing, on a remote storage array at the logical unit level, a content addressable storage computation to compute a key from content of a first data block in response to receiving a client request to write the first data block to the storage system; comparing, on the remote storage array, the computed key with keys of entries in a mapping table to determine if there is a match; in response to determining there is a match, comparing, on the remote storage array, the content of the first data block with content of a second data block previously stored on the resources of the remote storage array; and in response to determining that the comparison of the data block contents results in a match, incrementing a reference count on the previously stored data block, cooperating with a file system executing on the storage system to provide the storage system with a physical block number of the second data block to the storage system rather than storing duplicate data block contents on the storage resources of the remote storage array; and wherein the remote storage array operates in parallel with one or more additional remote storage arrays to allow aggregation of resources among the remote storage arrays, the parallel operation performing content addressable storage computations associated with the write operations on each of the one or more remote storage arrays.
2. The method of claim 1, further comprising: if one of the keys do not match and the data block contents do not match: writing the first data block to a next available block location on the storage resources; and providing a block number of the next available location to the storage system.
3. The method of claim 1, further comprising: computing the key using a hash function.
4. A computer system configured for managing storage resources of a storage system, comprising: a remote content addressable storage array element configured to compute, on a remote storage array at the logical unit level by a processor, a key based on content of a first data block in response to receiving a client request to write the first data block to the storage device, determine whether the key has been generated for a second data block previously stored on the resources, if so compare, on the remote storage array, the data block contents of the remote storage array, if there is a match increment a reference count on the previously stored data block and cooperate with a file system executing on the storage system to provide the storage system with a physical block number of the second data block to the storage system rather than writing the first data block to the storage device; and wherein the remote storage array operates in parallel with one or more additional remote storage arrays to allow aggregation of resources among the remote storage arrays, the parallel operation performing content addressable storage computations associated with the write operations on each of the one or more remote storage arrays.
5. The system of claim 4, further comprising: the physical block number is a physical block number of a logical unit stored on the storage resources.
6. The system of claim 4, further comprising: if one of the key has not been generated for the second data block previously stored on the resources, the content addressable storage array element is further configured to write the first data block to a next available block location on the storage resources and return a block number of the next available location to a storage system coupled to the element.
7. The system of claim 4, further comprising: the key is computed using a hash function.
8. The system of claim 7, further comprising: the hash function is a MD5 hashing algorithm.
9. The system of claim 4, further comprising: a mapping data structure used by the content addressable storage array element to determine whether the key has been generated for the second data block.
10. The system of claim 9, further comprising: the mapping data structure is a table having a plurality of entries, each entry containing a hash key, a pointer configured to reference a data block previously stored on the storage resources and a reference count.
11. The system of claim 4, further comprising: the content addressable storage array element is further configured to prevent storage of duplicate data block contents on the storage resources.
12. An apparatus configured to manage storage resources of a storage system, the apparatus comprising: performing, by a processor, at the logical unit level, a content addressable storage computation to compute a key from content of a first data block in response to receiving a client request to write the first data block to the storage system; means for comparing the computed key with keys of entries in a mapping table to determine if there is a match; in response to determining that there is a match, means for comparing the content of the first data block with content of a second data block previously stored on the resources of a remote storage array; in response to determining the comparison of the data block contents results in a match, means for incrementing a reference count on the previously stored data block, and means for cooperating with a file system executing on the storage system to provide the storage system with a physical block number of the second data block to the storage system rather than storing duplicate data block contents on the storage resources of the remote storage array; and wherein the remote storage array operates in parallel with one or more additional remote storage arrays to allow aggregation of resources among the remote storage arrays, the parallel operation performing content addressable storage computations associated with the write operations on each of the one or more remote storage arrays.
13. The apparatus of claim 12, further comprising: if one of the keys do not match and the data block contents do not match: means for writing the first data block to a next available block location on the storage resources; and means for providing a block number of the next available location to the storage system.
14. The apparatus of claim 12, further comprising: means for computing the key using a hash function.
15. A computer readable medium containing executable program instructions executed by a processor, comprising: program instructions that perform, on a remote storage array at the logical unit level, a content addressable storage computation to compute a key from content of a first data block in response to receiving a client request to write the first data block to the storage system; program instructions that compare, on the remote storage array, the computed key with keys of entries in a mapping table to determine if there is a match; program instructions that compare, on a remote storage array, the content of the first data block with content of a second data block previously stored on the resources in response to determining that there is a match, comparing; program instructions that, in response to determining that the comparison of the data block contents results in a match, increment a reference count on the previously stored data block and cooperate with a file system executing on the storage system to provide the storage system with a physical block number of the second data block to the storage system rather than storing duplicate data block contents on the storage resources of the remote storage array; and wherein the remote storage array operates in parallel with one or more additional remote storage arrays to allow aggregation of resources among the remote storage arrays, the parallel operation performing content addressable storage computations associated with the write operations on each of the one or more remote storage arrays.
16. The computer readable medium of claim 15, further comprising: if one of the keys do not match and the data block contents do not match, one or more program instructions for: writing the first data block to a next available block location on the storage resources; and providing a block number of the next available location to the storage system.
17. The computer readable medium of claim 15, further comprising: computing the key using a hash function.
18. A method for managing storage resources of a storage system, the method comprising: receiving from a storage system a write request at a content addressable storage array element (CASAE), the CASAE coupled to a plurality of disks of a remote storage array, the remote storage array configured to store user data of a data container served by the storage system; performing, on the remote storage array at the logical unit level, a content addressable storage computation, the computation resulting in a key computed from content of a first data block; comparing, on the remote storage array, the computed key with a plurality of previously generated keys to determine if there is a match, the previously generated keys associated with previously stored data blocks; in response to determining that there is a match, comparing the content of the first data block with content of a second data block previously stored on the remote storage array and in response to determining that the comparison of the data block contents results in a match, incrementing a reference count on the previously stored data block, and cooperating with a file system executing on the storage system to provide the storage system with a physical block number of the second data block to the storage system rather than storing duplicate data block contents on the remote storage array; and wherein the remote storage array operates in parallel with one or more additional remote storage arrays to allow aggregation of resources among the remote storage arrays, the parallel operation performing content addressable storage computations associated with the write operations on each of the one or more remote storage arrays.
19. The method of claim 18, further comprising: if one of the previously generated keys do not match and the data block contents do not match: writing the first data block to a next available block location on the remote storage array; and providing a block number of the next available location to the storage system.
20. The method of claim 18, further comprising: computing the key using a hash function.
21. A method for managing a storage system, comprising: receiving a write request to write a first data block to a remote storage array; computing, on the remote storage array at the logical unit level, a hash key of the first data block; comparing, on the remote storage array, the hash key of the first data block with previously computed hash keys of stored data blocks, the stored data blocks stored in the remote storage array; in the event that the hash key of the first data block does not match any of the previously computed hash keys, storing the first data block to the remote storage array; in the event that the hash key of the first data block does match a previously computed hash key; comparing, on the remote storage array, the first data block with one or more stored data blocks associated with the previously computed hash key; in the event that the first data block matches one of the one or more data blocks associated with the previously computed hash key, cooperating with a file system executing on the storage system to provide the storage system with a physical block number of a stored data block associated with the previously computed hash key to the storage system; updating a pointer to a location of the stored data block; in the event that the first data block does not match the one or more stored data blocks associated with the previously computed hash key, storing the first data block to the storage array; and wherein the remote storage array operates in parallel with one or more additional remote storage arrays to allow aggregation of resources among the remote storage arrays, the parallel operation performing content addressable storage computations associated with the write operations on each of the one or more remote storage arrays.
22. A system for managing a storage system, comprising: a write request to write a first data block to a remote storage array; a processor on a content addressable storage array element, to compute, at the logical unit level, a hash key of the first data block; the processor to compare the hash key of the first data block with previously computed hash keys of stored data blocks, the stored data blocks stored in the remote storage array; in the event that the hash key of the first data block does not match any of the previously computed hash keys, the processor to store the first data block to the remote storage array; in the event that the hash key of the first data block does match a previously computed hash key; the processor to compare, on the remote storage array, the first data block with one or more stored data blocks associated with the previously computed hash key; in the event that the first data block matches one of the one or more data blocks associated with the previously computed hash key, the processor to cooperate with a file system executing on the storage system to provide the storage system with a physical block number of a stored data block associated with the previously computed hash key to the storage system; the processor to update a pointer to a location of the stored data block; in the event that the first data block does not match the one or more stored data blocks associated with the previously computed hash key, the processor to store the first data block to the remote storage array; and wherein the remote storage array operates in parallel with one or more additional remote storage arrays to allow aggregation of resources among the remote storage arrays, the parallel operation performing content addressable storage computations associated with the write operations on each of the one or more remote storage arrays.
23. A computer readable medium containing executable program instructions executed by a processor, comprising: program instructions that receive a write request to write a first data block to a remote storage array; program instructions that compute, on the remote storage array at the logical unit level, a hash key of the first data block; program instructions that compare, on the remote storage array, the hash key of the first data block with previously computed hash keys of stored data blocks, the stored data blocks stored in the storage array; program instructions that, in the event that the hash key of the first data block does not match any of the previously computed hash keys, store the first data block to the remote storage array; program instructions that, in the event that the hash key of the first data block does match a previously computed hash key; compare, on the remote storage array, the first data block with one or more stored data blocks associated with the previously computed hash key; in the event that the first data block matches one of the one or more data blocks associated with the previously computed hash key, cooperate with a file system executing on the storage system to provide the storage system with a physical block number of a stored data block associated with the previously computed hash key to the storage system; update a pointer to a location of the stored data block; in the event that the first data block does not match the one or more stored data blocks associated with the previously computed hash key, store the first data block to the remote storage array; and wherein the remote storage array operates in parallel with one or more additional remote storage arrays to allow aggregation of resources among the remote storage arrays, the parallel operation performing content addressable storage computations associated with the write operations on each of the one or more remote storage arrays.
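Claims 21 to 23 describe a variant in which a hash key may be associated with one or more stored blocks, and in which several remote storage arrays perform their own content addressable computations in parallel. The following Python sketch illustrates that flow under loud assumptions: the names `Array`, `key_fn`, and `write_all` are hypothetical, the round-robin placement of blocks across arrays is an assumption (the claims do not say how the storage system distributes writes), and a thread pool merely stands in for independently operating arrays.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def md5_key(data: bytes) -> bytes:
    """Default hash key; MD5 is one hash function named in the claims."""
    return hashlib.md5(data).digest()


class Array:
    """Toy remote storage array with its own key table (hypothetical)."""

    def __init__(self, key_fn=md5_key):
        self.key_fn = key_fn
        self.blocks = []    # index = physical block number
        self.refs = []      # reference count per physical block
        self.by_key = {}    # hash key -> block numbers sharing that key

    def write(self, data: bytes) -> int:
        candidates = self.by_key.setdefault(self.key_fn(data), [])
        for pbn in candidates:              # key matched: compare contents
            if self.blocks[pbn] == data:
                self.refs[pbn] += 1
                return pbn
        pbn = len(self.blocks)              # no key or no content match: store
        self.blocks.append(data)
        self.refs.append(1)
        candidates.append(pbn)
        return pbn


def write_all(arrays, blocks):
    """Assumed placement: block i goes to array i % len(arrays)."""
    n = len(arrays)

    def run(idx):
        # Each array is driven by exactly one worker, so no locking is needed.
        return [(i, arrays[idx].write(b))
                for i, b in enumerate(blocks) if i % n == idx]

    with ThreadPoolExecutor(max_workers=n) as pool:
        parts = list(pool.map(run, range(n)))
    placed = dict(pair for part in parts for pair in part)
    return [(i % n, placed[i]) for i in range(len(blocks))]
```

Each returned pair is (array index, physical block number). Because every array in this sketch deduplicates only against its own table, identical blocks that the assumed placement sends to different arrays are stored once per array.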