A hash-optimized backup system and method takes data blocks and generates a probabilistically unique digital fingerprint of the content of each data block using a substantially collision-free algorithm. The process compares the generated fingerprint to a database of stored fingerprints and, if the generated fingerprint matches a stored fingerprint, the data block is determined to already have been backed up, and therefore does not need to be backed up again. Only if the generated fingerprint does not match a stored fingerprint is the data block backed up, at which point the generated fingerprint is added to the database of stored fingerprints. Because the algorithm is substantially collision-free, there is no need to compare actual data content if there is a hash-value match. The process can also be used to audit software license compliance, inventory software, and detect computer-file tampering such as viruses and malware.
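The backup flow described above can be illustrated with a minimal sketch. It assumes SHA-1 as the fingerprint algorithm (one of the two hash functions named in claim 15) and uses a hypothetical in-memory `BackupServer` class in place of a real server and fingerprint database. The class, its method names, and the location strings are illustrative assumptions, not taken from the patent.

```python
import hashlib


def fingerprint(block: bytes) -> str:
    """Return a hex digest that serves as the block's digital fingerprint."""
    # SHA-1 is an assumption here; any substantially collision-free hash fits.
    return hashlib.sha1(block).hexdigest()


class BackupServer:
    """Hypothetical in-memory stand-in for the backup server and its database."""

    def __init__(self):
        self.blocks = {}     # fingerprint -> backed-up block data
        self.locations = {}  # fingerprint -> list of block locations

    def has_fingerprint(self, fp):
        """Compare a generated fingerprint against the stored fingerprints."""
        return fp in self.locations

    def store(self, fp, location, block):
        """No match: back up the block and record fingerprint plus location."""
        self.blocks[fp] = block
        self.locations[fp] = [location]

    def add_location(self, fp, location):
        """Match: record only the additional location for the known fingerprint."""
        self.locations[fp].append(location)


def backup_block(server, location, block):
    """Back up one block; return True only if the block data had to be stored."""
    fp = fingerprint(block)
    if server.has_fingerprint(fp):
        # The content itself is never compared; the fingerprint match suffices.
        server.add_location(fp, location)
        return False
    server.store(fp, location, block)
    return True
```

In this sketch, backing up the same content from two different locations stores the data once and records both locations against the single fingerprint.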
Representative Claims
1. A system for backing up a data block comprising: a backup server having access to a digital fingerprint database; and a source computer connected to the backup server via a communication path, the source computer being configured to generate a digital fingerprint of a data block in a data block location in storage associated with the computer, using a substantially collision-free algorithm; wherein the backup server is configured to: compare the digital fingerprint to digital fingerprints stored in the database; and: if the digital fingerprint does not match one of the stored digital fingerprints, back up the data block and add to the database the digital fingerprint and the data block location in association with the added digital fingerprint; and if the digital fingerprint matches one of the stored digital fingerprints, add to the database the data block location in association with the stored digital fingerprint.
2. The system of claim 1, further comprising a storage device connected to the backup server for storing the database.
3. The system of claim 1, further comprising a storage device connected to the backup server for storing the backed-up data blocks.
4. The system of claim 3, wherein the network comprises a local area network, a wide area network, and/or the Internet.
5. The system of claim 1, wherein the communication path comprises a network.
6. A method for detecting file tampering on a computer, comprising: generating, by a processing device, first digital fingerprints for each of a plurality of files on the computer using a substantially collision-free algorithm at a first time; generating, by the processing device, a second digital fingerprint for one of the plurality of files on the computer using the substantially collision-free algorithm at a second time after the first time; comparing, by the processing device, the second digital fingerprint with the first digital fingerprint of the one of the plurality of files generated at the first time; and determining, by the processing device, whether tampering exists on the one file based on the comparison.
7. The method of claim 6, further comprising dividing each file into data blocks and generating the first and second digital fingerprints of at least one data block.
8. The method of claim 6, wherein the file tampering comprises a computer virus.
9. The method of claim 6, wherein the algorithm comprises a hash function.
10. A method for detecting a computer virus on a computer, comprising: generating, by a first processing device, a first digital fingerprint of a computer virus using a substantially collision-free algorithm; generating, by a second processing device, second digital fingerprints for each of a plurality of files on the computer using the substantially collision-free algorithm; comparing, by the second processing device, the second digital fingerprints of the computer files with the first digital fingerprint of the computer virus; and determining, by the second processing device, whether the computer virus exists on the computer based on the comparison.
11. The method of claim 10, further comprising generating the second digital fingerprints by dividing each of the plurality of files into respective data blocks and generating a second digital fingerprint of each respective data block.
12. The method of claim 10, wherein the algorithm comprises a hash function.
13. A method for backing up data, comprising: generating, by a computer using a substantially collision-free algorithm, a digital fingerprint of a data block stored in a data block location in a storage associated with the computer; sending, by the computer across a communication path, the digital fingerprint to a backup server; comparing, by the backup server, the digital fingerprint to digital fingerprints stored in a database; backing up the data block and adding to the database the digital fingerprint and the data block location in association with the added digital fingerprint, if the digital fingerprint does not match one of the stored digital fingerprints; and adding to the database the data block location in association with the stored digital fingerprint, if the digital fingerprint matches one of the stored digital fingerprints.
14. The method of claim 13, wherein the algorithm comprises a hash function.
15. The method of claim 14, wherein the hash function is MD5 or SHA-1.
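The tamper-detection and virus-detection methods of claims 6 and 10 can be sketched in the same way. The example below is an illustration under stated assumptions: files are represented as an in-memory mapping from file name to content bytes, fingerprints are taken over whole files with SHA-1, and the function names `snapshot`, `tampered`, and `infected` are hypothetical. Per-block fingerprinting as in claims 7 and 11 is not shown.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return a hex digest used as a digital fingerprint (SHA-1 assumed)."""
    return hashlib.sha1(data).hexdigest()


def snapshot(files):
    """Fingerprint every file; `files` maps a file name to its content bytes."""
    return {name: fingerprint(content) for name, content in files.items()}


def tampered(baseline, files):
    """Names whose current fingerprint differs from the earlier baseline."""
    current = snapshot(files)
    return sorted(
        name for name, fp in current.items()
        if name in baseline and baseline[name] != fp
    )


def infected(virus_fingerprints, files):
    """Names whose fingerprint equals a known virus fingerprint."""
    return sorted(
        name for name, fp in snapshot(files).items()
        if fp in virus_fingerprints
    )
```

Here `tampered` corresponds to comparing fingerprints taken at a first and a second time, while `infected` corresponds to comparing file fingerprints against a fingerprint generated from a known virus.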