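IPC Classification Information
Country / Type | United States (US) Patent, Granted
International Patent Classification (IPC 7th edition) |
Application number | US-0992024 (2004-11-18)
Registration number | US-7509423 (2009-03-24)
Inventors / Address |
- Douceur, John R.
- Theimer, Marvin M.
- Adya, Atul
- Bolosky, William J.
Applicant / Address |
Agent / Address |
Citation information | Times cited: 5; Patents cited: 103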
Abstract
Potentially identical objects (e.g., files) are located across multiple computers based on stochastic partitioning of workload. For each of a plurality of objects stored on a plurality of computers in a network, a portion of object information corresponding to the object is selected. The object information can be generated in a variety of manners (e.g., based on hashing the object, based on characteristics of the object, and so forth). Any of a variety of portions of the object information can be used (e.g., the least significant bits of the object information). A stochastic partitioning process is then used to identify which of the plurality of computers to communicate the object information to for identification of potentially identical objects on the plurality of computers.
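The routing step described in the abstract can be illustrated with a minimal sketch. It is not the patent's implementation: the use of SHA-256 as the hash, integer computer identifiers, comparison against the least significant bits of the computer identifier, and the function names `object_info`, `imprint`, and `recipients` are all assumptions made for illustration.

```python
import hashlib


def object_info(data: bytes) -> int:
    """Hash-based object information (SHA-256 is an assumed choice)."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def imprint(value: int, width: int) -> int:
    """Return the least significant `width` bits of `value`."""
    return value & ((1 << width) - 1)


def recipients(info: int, computer_ids: list[int], width: int) -> list[int]:
    """Computers whose ID bits match the imprint of the object information.

    Assumption: the compared portion of each computer ID is its
    least significant `width` bits.
    """
    target = imprint(info, width)
    return [cid for cid in computer_ids if imprint(cid, width) == target]


if __name__ == "__main__":
    ids = list(range(16))                      # hypothetical computer IDs
    info = object_info(b"example file contents")
    print(recipients(info, ids, width=2))      # every 4th ID is selected
```

Because identical objects produce identical object information, they are routed to the same set of computers, which is where potential duplicates can then be compared.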
Representative Claim
The invention claimed is:
1. A method of locating potentially identical objects across multiple computers based on stochastic partitioning of workload, the method comprising: generating object information by an information generation unit of a computing device, the object information being identified as an imprint by a pre-calculated number of bits of object information; selecting, by the information generation unit of a computing device, for each of a plurality of objects stored on a plurality of computers in a network, an imprint corresponding to the object; using a stochastic partitioning process by a forwarding location determination unit of a computing device to identify which of the plurality of computers to communicate the object information to for identification of potentially identical objects on the plurality of computers, the stochastic partitioning process comprising: comparing, for each of the plurality of computers, the imprint to a portion of a computer identifier associated with the computer; identifying which of the computer identifiers have portions matching the imprint; and communicating the object information to each of the computers associated with a computer identifier having a portion matching the imprint.
2. A method as recited in claim 1, wherein the object information comprises file information and wherein each of the plurality of objects comprises a file in a file system.
3. A method as recited in claim 1, wherein the stochastic partitioning process comprises a fully distributed stochastic partitioning process.
4. A method as recited in claim 1, wherein the stochastic partitioning process comprises a group-based system using directory services process including: accessing an object information portion to computer mapping on a remote computer; and identifying one or more computers to receive the object information based at least in part on the accessed mapping.
5. A method as recited in claim 1, wherein the stochastic partitioning process comprises a multi-level stochastic partitioning process including: grouping, into a plurality of groups, selected ones of the plurality of computers, wherein the grouping is based at least in part on the number of the plurality of computers in the network that the computer using the stochastic partitioning process is aware of; and identifying which of the selected ones of the plurality of computers to communicate the object information to, wherein the identifying is based at least in part on comparing the selected portion of the object information to a portion of a computer identifier of one or more of the selected ones of the plurality of computers.
6. One or more computer storage media having stored thereon a plurality of instructions that, when executed by one or more processors of one of a plurality of computers in a network, causes the one or more processors to perform the following acts: generating a portion of file information by an information generation unit of a computing device, the portion of file information comprising a pre-calculated number of bits of file information; selecting a portion of file information corresponding to a file, wherein the portion of file information corresponding to the file is utilized in comparing an imprint for computer mapping; identifying a mapping of the portion of file information to one or more computers by accessing the imprint for computer mapping, wherein the imprint for computer mapping utilizes selected bits of a computer ID of the one or more computers and mapping occurs when the selected bits of the computer ID match the portion of file information; and communicating the file information to each of the identified one or more computers for identification of potentially identical files on the one or more computers.
7. One or more computer storage media as recited in claim 6, wherein the file information is a semi-unique value based at least in part on data in the file.
8. One or more computer storage media as recited in claim 6, wherein the file information is based at least in part on one or more characteristics of the file.
9. One or more computer storage media as recited in claim 6, wherein the size of the portion of the file information is based at least in part on a count of the plurality of computers in the network that the one computer is aware of.
10. One or more computer storage media as recited in claim 6, wherein the size of the portion of the file information is based at least in part on an average number of computers in the network that a particular file identifier should be communicated to.
11. One or more computer storage media as recited in claim 6, wherein the identifying comprises identifying the mapping by accessing a locally stored imprint for computer mapping.
12. One or more computer storage media as recited in claim 6, wherein the identifying comprises identifying the mapping by accessing an imprint to computer mapping stored at another computer.
13. A system comprising: a processing unit; a memory, coupled to the processing unit facilitating operation of: an interface configured to allow the system to communicate with a plurality of other computers; an information generation unit configured to generate a portion of object information, the portion of object information comprising a pre-calculated number bits of file information; a file information comparison module configured to select a portion of file information corresponding to a file, wherein the portion of file information corresponding to the file is utilized in comparing an imprint for computer mapping; and a forwarding location determination module, coupled to the interface, configured to identify one or more of the plurality of other computers to communicate file information corresponding to a file to for identification of potentially identical files stored on the plurality of other computers by accessing a mapping of a portion of the file information to one or more of a plurality of other computers.
14. A system as recited in claim 13, wherein the file information is a semi-unique value based at least in part on the data in the file.
15. A system as recited in claim 13, wherein the size of the portion of the file information is based at least in part on a count of computers coupled to the system that the system is aware of.
16. A system as recited in claim 13, further comprising a locally stored imprint to computer mapping, and wherein the forwarding location determination module is configured to identify the one or more of a plurality of other computers to communicate the file information to by accessing the locally stored imprint to computer mapping.
17. A system as recited in claim 13, wherein the forwarding location determination module is configured to identify the one or more of a plurality of other computers to communicate the file information to by accessing an imprint for computer mapping stored at another computer.
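Claims 9, 10, and 15 state only that the size of the selected portion is based "at least in part" on the number of known computers and on the average number of computers a file identifier should reach; claims 11, 12, 16, and 17 refer to an imprint-to-computer mapping. The sketch below shows one possible reading and is not taken from the patent: the sizing rule (the largest bit width for which, with uniformly distributed computer IDs, the expected number of matches is still at least the target), the use of least significant ID bits, and the names `imprint_width` and `build_imprint_map` are assumptions.

```python
from collections import defaultdict


def imprint_width(known_computers: int, avg_recipients: int) -> int:
    """Assumed sizing rule: floor(log2(known_computers / avg_recipients)).

    With uniformly distributed IDs, about known_computers / 2**width
    computers match any given imprint, so this keeps the expected number
    of recipients at or above `avg_recipients` where possible.
    """
    if known_computers < 1 or avg_recipients < 1:
        raise ValueError("both arguments must be positive")
    ratio = known_computers // avg_recipients
    # bit_length() - 1 is floor(log2(ratio)); clamp to 0 when ratio is 0.
    return max(0, ratio.bit_length() - 1)


def build_imprint_map(computer_ids, width: int) -> dict[int, list[int]]:
    """Map each imprint value to the computer IDs whose low bits match it."""
    mask = (1 << width) - 1
    mapping = defaultdict(list)
    for cid in computer_ids:
        mapping[cid & mask].append(cid)
    return dict(mapping)


if __name__ == "__main__":
    width = imprint_width(known_computers=1024, avg_recipients=4)
    table = build_imprint_map(range(1024), width)
    print(width, len(table), len(table[0]))
```

Such a table could be held locally or fetched from another computer, as the dependent claims allow; a lookup by imprint then yields the computers that should receive the file information.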