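IPC Classification Information
Country / Type | United States (US) Patent, Granted
International Patent Classification (IPC, 7th edition) |
Application No. | US-0950376 (2007-12-04)
Registration No. | US-8140786 (2012-03-20)
Inventors / Address |
- Bunte, Alan
- Prahlad, Anand
- Brockway, Brian
Applicant / Address |
Agent / Address |
Citation Information | Times cited: 181; Patents cited: 121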
Abstract
A system and method of creating archive copies of data sets is described. In some examples, the system creates an archive copy from an original data set. In some examples, the system creates an archive copy when creating a recovery copy for a data set. In some examples, the system creates a copy without redundant data, and then encrypts the data set.
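The last example in the abstract (create a copy without redundant data, then encrypt it) can be illustrated with a minimal sketch. Everything named here is an assumption for illustration and is not taken from the patent: `make_archive_copy`, the `(store, manifest)` layout, and `xor_placeholder_cipher`, which is a trivially reversible stand-in and not real encryption. The identifier is computed on the plaintext before encryption, so identical objects are detected regardless of how they are later encrypted.

```python
import hashlib
from typing import Callable, Dict, Iterable, List, Tuple


def xor_placeholder_cipher(data: bytes, key: bytes = b"demo-key") -> bytes:
    """Hypothetical stand-in for a real cipher. XOR is NOT secure."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def make_archive_copy(
    objects: Iterable[Tuple[str, bytes]],
    encrypt: Callable[[bytes], bytes] = xor_placeholder_cipher,
) -> Tuple[Dict[str, bytes], List[Tuple[str, str]]]:
    """Return (store, manifest).

    store    maps a SHA-512 digest to the encrypted bytes of one unique object.
    manifest lists (object name, digest) for every object, duplicates included.
    """
    store: Dict[str, bytes] = {}
    manifest: List[Tuple[str, str]] = []
    for name, content in objects:
        # Identifier is derived from the plaintext contents.
        digest = hashlib.sha512(content).hexdigest()
        if digest not in store:
            # Only the first instance of each object is encrypted and kept.
            store[digest] = encrypt(content)
        manifest.append((name, digest))
    return store, manifest
```

In a real system the placeholder would be replaced by an authenticated cipher from a vetted library; the point of the sketch is only the ordering of the two steps.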
Representative Claim
1. A system for rebuilding at least a portion of a signature database that reflects contents of an archive copy of a data set, comprising:
  a signature component, wherein the signature component generates a substantially unique identifier for all data objects within the data set and stores the substantially unique identifiers in a signature database, wherein the substantially unique identifier for a data object reflects contents of the data object;
  an encryption component, wherein the encryption component encrypts at least some of the data objects of the data set;
  a copy component, wherein the copy component:
    uses the generated substantially unique identifiers to identify redundant data objects in the data set and deduplicate the redundant data objects in order to create a deduplicated archive copy of the data set that comprises the encrypted data objects; wherein the archive copy is physically stored on sequential media; and
    stores the archive copy as one or more data chunks stored on the sequential media, wherein each chunk is stored with header information that includes at least one substantially unique identifier; and
    stores information related to locations of the encrypted data objects on the sequential media in a location database separate from the signature database; and
  a database rebuilding component, wherein the database rebuilding component:
    receives an indication that the signature database is unrecoverable or unavailable;
    accesses header information of at least one chunk in order to determine at least one substantially unique identifier within the header information; and
    uses the determined at least one substantially unique identifier from the header information in order to rebuild at least part of the signature database.

2. The system of claim 1, wherein the signature component uses a SHA-512 function to generate the substantially unique identifiers.

3. The system of claim 1, wherein the signature component scrambles the signature database.

4. The system of claim 1, wherein the copy component populates the location database when the archive copy that comprises the encrypted data objects is stored to locations on the sequential media.

5. The system of claim 1, wherein the copy component indexes contents of the data objects.

6. The system of claim 1, wherein the encryption component encrypts a data object after the signature component generates the substantially unique identifier for the data object.

7. A non-transitory computer-readable medium whose contents cause a data storage system to perform a method of rebuilding a deduplication index that reflects contents of an archive of data objects, the method comprising:
  identifying a data object to be stored in an archive of data objects that form a data set;
  creating a hash value for the data object, wherein creating the hash value includes calculating a hash value that represents contents of the data object;
  deduplicating the data set by:
    comparing the hash value with other hash values for data objects already stored in the archive of data objects;
    when the comparison determines that the hash value for the data object is different than the other hash values:
      encrypting a copy of the data object, and transferring the encrypted copy of the data object and the hash value to the archive of data objects, and
      storing in a file on sequential media, the transferred encrypted copy of the data object and the transferred hash value, wherein a header region of the file stores the hash value; or
    when the comparison determines that the hash value for the data object is identical to one or more of the other hash values:
      transferring the hash value that represents contents of the data object to the archive of data objects; and
      storing in a file on sequential media, the transferred hash value, wherein a header region of the file stores the hash value;
  updating an entry in a deduplication index to reflect the identification of the data object, wherein the entry is updated using the hash value;
  upon receiving an indication that the deduplication index is unavailable or unrecoverable, accessing the hash value from the header region of a data file stored on sequential media; and
  using the accessed hash value to rebuild a portion of a new, rebuilt version of the deduplication index.

8. The computer-readable medium of claim 7, wherein the data object is identified from a primary copy of a set of data objects.

9. The computer-readable medium of claim 7, wherein the data object is identified from one or more of secondary copies of a set of data objects.

10. The computer-readable medium of claim 7, wherein the data object is identified when the data storage system receives a request from a user to store a copy of the data object in the archive of data objects.

11. A method for rebuilding at least a portion of a single instancing index containing hash values that represent contents of a single instanced data set, comprising:
  single instancing a data set in order to create a single instanced data set organized as an archive file and physically stored on one or more magnetic tapes, wherein the single instancing further comprises:
    calculating substantially unique hash values that represent the data set,
    storing at least some of the calculated hash values that represent the data set in a single instancing index, and
    storing the calculated hash values within headers of one or more data files that form part of the archive file, wherein the one or more data files are separate from the single instancing index and also store at least a subset of the data set, and
    wherein the one or more data files are stored on the one or more tapes;
  receiving an indication that at least part of the single instancing index storing hash values that represent the data set is unrecoverable or unavailable;
  in response to receiving the indication, identifying at least one data file that forms part of the archive file on the one or more tapes;
  extracting stored hash value information from a header of the identified at least one data file that forms part of the archive file; and,
  adding the extracted hash value information to a new, rebuilt version of the single instancing index.

12. The method of claim 11, further comprising:
  encrypting the one or more data files that form part of the archive file;
  decrypting the identified at least one data file to gain access to the stored hash value information from the header of the identified at least one data file; and
  re-encrypting the decrypted at least one data file.

13. The method of claim 11, further comprising:
  receiving a request to restore a data object; and
  using the new, rebuilt version of the single instancing index to locate the data object within the archive file.
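
The rebuild step common to the independent claims (read the identifiers back out of chunk or file headers when the index is lost) can be sketched as follows. This is an illustrative model only, not the patented implementation: the `Chunk` class, the list standing in for sequential media, and the functions `write_chunks` and `rebuild_index` are all hypothetical names. As a simplification, the index here also records each object's position, whereas claim 1 keeps locations in a separate location database. Encryption is omitted to keep the sketch short.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Chunk:
    header_hashes: List[str]  # identifiers stored in the chunk header
    payload: bytes            # object data; empty for a redundant object


def write_chunks(objects: List[bytes]) -> Tuple[List[Chunk], Dict[str, int]]:
    """Single-instance objects onto a simulated tape; return (tape, index)."""
    tape: List[Chunk] = []
    index: Dict[str, int] = {}  # hash -> position of the chunk holding the data
    for content in objects:
        digest = hashlib.sha512(content).hexdigest()
        if digest in index:
            # Redundant object: store only its hash in a header, no payload.
            tape.append(Chunk([digest], b""))
        else:
            index[digest] = len(tape)
            tape.append(Chunk([digest], content))
    return tape, index


def rebuild_index(tape: List[Chunk]) -> Dict[str, int]:
    """Recreate the index by scanning headers in tape order."""
    rebuilt: Dict[str, int] = {}
    for position, chunk in enumerate(tape):
        for digest in chunk.header_hashes:
            # The first occurrence is the chunk that carries the payload.
            rebuilt.setdefault(digest, position)
    return rebuilt
```

Because the headers are read without touching the payloads, the rebuild in this sketch never has to rehash object data; it only needs one sequential pass over the media.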