IPC Classification Information
Country / Type | United States (US) Patent, Granted
International Patent Classification (IPC, 7th edition) |
Application No. | US-0724292 (2010-03-15)
Registration No. | US-8484162 (2013-07-09)
Inventor / Address |
- Prahlad, Anand
- Vijayan, Manoj Kumar
- Kottomtharayil, Rajiv
- Gokhale, Parag
Applicant / Address |
Agent / Address | Knobbe, Martens, Olson & Bear LLP
Citation Information | Times cited: 36; Patents cited: 100
Abstract
Content-aware systems and methods for improving de-duplication, or single instancing, in storage operations. In certain examples, backup agents on client devices parse application-specific data to identify data objects that are candidates for de-duplication. The backup agents can then insert markers or other indicators in the data that identify the location(s) of the particular data objects. Such markers can, in turn, assist a de-duplication manager to perform object-based de-duplication and increase the likelihood that like blocks within the data are identified and single instanced. In other examples, the agents can further determine if a data object of one file type can or should be single-instanced with a data object of a different file type. Such processing of data on the client side can provide for more efficient storage and back-end processing.
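
The effect described in the abstract, where markers raise the chance that like blocks are recognized, can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `Marker`, `wrap`, and `block_hashes`, the 4-byte block size, and the use of SHA-256 are all assumptions made for illustration. The sketch embeds the same payload behind headers of different lengths and compares fixed-size blocks cut from the start of the data with blocks cut from the marked object boundary.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Set, Tuple

BLOCK_SIZE = 4  # deliberately tiny so the effect is visible; an assumption


@dataclass(frozen=True)
class Marker:
    """Hypothetical marker: where a data object starts and stops (stop is exclusive)."""
    start: int
    stop: int


def wrap(header: bytes, payload: bytes) -> Tuple[bytes, Marker]:
    """Stand-in for the agent step: place a data object after application
    metadata and record a marker that identifies the object's location."""
    data = header + payload
    return data, Marker(len(header), len(data))


def block_hashes(data: bytes, start: int = 0, stop: Optional[int] = None) -> Set[str]:
    """Hash fixed-size blocks of data[start:stop], cutting blocks from `start`."""
    stop = len(data) if stop is None else stop
    return {
        hashlib.sha256(data[i:min(i + BLOCK_SIZE, stop)]).hexdigest()
        for i in range(start, stop, BLOCK_SIZE)
    }


# The same 16-byte object embedded behind headers of different lengths.
payload = b"ABCDEFGHIJKLMNOP"
data_a, marker_a = wrap(b"H1", payload)
data_b, marker_b = wrap(b"HDR", payload)

# Blocks cut from offset 0 are shifted by the headers and share nothing.
naive_shared = block_hashes(data_a) & block_hashes(data_b)

# Blocks cut from the marked object boundary line up exactly.
aligned_a = block_hashes(data_a, marker_a.start, marker_a.stop)
aligned_b = block_hashes(data_b, marker_b.start, marker_b.stop)
```

In this toy case `naive_shared` is empty, while `aligned_a` and `aligned_b` contain the same four block hashes, which is the alignment benefit the abstract attributes to client-side markers.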
Representative Claims
1. A system for managing application-generated data objects, the system comprising:
  a processor;
  a first de-duplication database associated with first application-specific data;
  a second de-duplication database associated with second application-specific data;
  a first backup agent executing in one or more computer processors on a first client device, the first backup agent being configured to, in response to a storage operation request:
    prior to performing block-level de-duplication, parse first and second application-specific data of the first client device that is the subject of the storage operation request, the first and second application-specific data comprising a plurality of first and second data objects having first and second formats; and
    prior to performing block-level de-duplication, insert de-duplication indicators in the first and second application-specific data, wherein the inserted de-duplication indicators identify portions within the first and second data objects where de-duplication should start and stop, and wherein the inserted de-duplication indicators further identify which of the first and second de-duplication databases to use in de-duplicating the first and second application-specific data; and
  a de-duplication module executing on one or more computer processors and that is configured to perform block-level de-duplication, the de-duplication module being in communication with the first backup agent to receive the first application-specific data and to:
    insert the de-duplication indicators by setting or clearing a bit in at least one header of the first application-specific data, wherein the at least one de-duplication indicator comprises an offset value identifying a beginning of the first data objects within the first application-specific data;
    use de-duplication indicators to identify where the de-duplication module should start and stop de-duplication of blocks in identified portions of the first and second application-specific data;
    based on said inserted de-duplication indicators, determine if a duplicate copy of any of the blocks in the identified portions of the first application-specific data exist in the first de-duplication database; and
    based on said inserted de-duplication indicators, determine if a duplicate copy of any of the blocks in the identified portions of the second application-specific data exists in the second de-duplication database.

2. The system of claim 1, further comprising a second backup agent executing on a second client device, the second backup agent being configured to, in response to a second storage operation request:
  parse second application-specific data of the second client device that is the subject of the second storage operation request, the second application-specific data comprising a plurality of second data objects;
  identify portions within the plurality of second data objects to be considered for de-duplication; and
  insert at least one second de-duplication indicator in the second application-specific data that identifies at least one location of the identified portions in the second data objects to be considered for de-duplication.

3. The system of claim 2, wherein the de-duplication module is configured to receive the second application-specific data from the second backup agent and to determine if a duplicate copy of the identified portions in the second data objects exists in the storage device.

4. The system of claim 2, wherein the first application-specific data is received from a first operating system and the second application-specific data is received from a second operating system different than the first operating system.

5. The system of claim 2, wherein:
  the first de-duplication database is configured to store unique blocks of the identified portions of the first data objects; and
  the second de-duplication database is configured to store unique blocks of the identified portions of the second data objects, wherein the first de-duplication database is separate and different from the second de-duplication database.

6. The system of claim 1, wherein the application comprises an electronic mail server application.

7. A method for managing application-generated data objects, the method comprising:
  storing a first de-duplication database associated with first application-specific data;
  storing a second de-duplication database associated with second application-specific data;
  receiving a first storage operation request for first data generated by a first application and second data generated by a second application executing on a first client device, the first and second data comprising a plurality of first and second data objects having first and second format;
  prior to performing block-level de-duplication, inserting de-duplication indicators in the first and second data that identify portions within of the first and second data objects where de-duplication should start and stop, and wherein the inserted de-duplication indicators further identify which of the first and second de-duplication databases to use in de-duplicating the first and second application-specific data; and
  with a de-duplication module executing on one or more computer processors that is configured to perform block-level de-duplication, the de-duplication module being in communication with a first backup agent to receive the first application-specific data:
    inserting the de-duplication indicators by setting or clearing a bit in at least one header of the first application-specific data; wherein the at least one de-duplication indicator comprises an offset value identifying a beginning of the first data objects within the first application-specific data;
    using the inserted de-duplication indicators to identify where the de-duplication module should start and stop de-duplication of blocks in identified portions within the first and second application-specific data;
    based on said inserted de-duplication indicators, determining if a duplicate copy of any of the blocks in the identified portions of the first application-specific data exists in the first de-duplication database; and
    based on said inserted de-duplication indicators, determining if a duplicate copy of any of the identified blocks in the portions of the second application-specific data exists in the second de-duplication database.

8. The method of claim 7, wherein said processing of each of the first data objects further comprises:
  generating a substantially unique identifier that represents the first data object; and
  accessing a database separate from the at least one storage device that stores a plurality of substantially unique identifiers of other data objects stored in the at least one storage device.

9. The method of claim 8, wherein said generating a substantially unique identifier comprises generating a hash value of the corresponding first data object.

10. The method of claim 7, wherein the plurality of first data objects comprises a body of an electronic mail message and an attachment.

11. The method of claim 7, wherein each of the first data objects comprises a file.

12. The method of claim 7, wherein the first data comprises a chunk file.

13. The method of claim 7, wherein said identifying portions within the first data objects is based at least in part on one or more file extensions associated with the first data objects.

14. The method of claim 7, wherein said inserting the de-duplication indicators comprises inserting the at least one de-duplication indicator in at least one header of the first data.
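
The flow recited in claims 1, 5, 7, and 9 (indicators that select start and stop offsets and a de-duplication database, followed by hash-based duplicate checks against separate databases) can be sketched in Python as follows. This is a simplified illustration and not the claimed system. `DedupDatabase`, `deduplicate`, the 4-byte block size, and SHA-256 are assumptions, and the indicators are passed as plain `(start, stop, db_name)` tuples rather than being encoded as header bits and offset values as the claims describe.

```python
import hashlib
from typing import Dict, List, Tuple

BLOCK_SIZE = 4  # illustrative block size; an assumption

# Hypothetical indicator form: (start offset, stop offset, database name).
Indicator = Tuple[int, int, str]


class DedupDatabase:
    """Stores each unique block once, keyed by its hash value."""

    def __init__(self) -> None:
        self.blocks: Dict[str, bytes] = {}

    def store(self, block: bytes) -> bool:
        """Return True if the block was new, False if a duplicate already exists."""
        digest = hashlib.sha256(block).hexdigest()
        if digest in self.blocks:
            return False
        self.blocks[digest] = block
        return True


def deduplicate(
    data: bytes,
    indicators: List[Indicator],
    databases: Dict[str, DedupDatabase],
) -> Dict[str, int]:
    """Block-level de-duplication limited to the marked portions.

    Each indicator says where to start and stop and which database to
    check. Returns the number of new blocks stored per database.
    """
    stored = {name: 0 for name in databases}
    for start, stop, db_name in indicators:
        db = databases[db_name]
        for i in range(start, stop, BLOCK_SIZE):
            if db.store(data[i:min(i + BLOCK_SIZE, stop)]):
                stored[db_name] += 1
    return stored


# Two marked portions of one stream, each routed to its own database.
databases = {"mail": DedupDatabase(), "docs": DedupDatabase()}
data = b"hhAAAABBBBxxAAAACCCC"
indicators = [(2, 10, "mail"), (12, 20, "docs")]

first_pass = deduplicate(data, indicators, databases)
second_pass = deduplicate(data, indicators, databases)
```

Because the two databases are separate, the block `AAAA` is stored once in each on the first pass; on the second pass every marked block is found to be a duplicate and nothing new is stored. The unmarked bytes (`hh`, `xx`) are never examined.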