IPC Classification Information
Country / Type |
United States(US) Patent
Granted
|
International Patent Classification (IPC, 7th edition) |
|
Application number |
US-0954639
(2013-07-30)
|
Registration number |
US-9092487
(2015-07-28)
|
Inventor / Address |
|
Applicant / Address |
|
Agent / Address |
Marsh Fischmann & Breyfogle LLP
|
Citation information |
Times cited: 0
Patents cited: 29 |
Abstract
A system and method (a “utility”) is provided for improving the accuracy of a content matching analysis that identifies a composition of an item of protectable content of a user. The item of protectable content may include a portion of source code or object code, individual or bundled source code or object code files, binary code files, directory structures and/or trees, open source software projects or packages, and/or proprietary software applications or packages. The utility involves storing a number of items of comparison content on a storage structure, receiving an item of user content at a computer-based content exchange, and comparing the item of user content to the items of comparison content to determine, from among the items of comparison content, one or more potential matches that each include a matched portion that is similar to a portion of the item of user content. The utility further includes selecting a noise reduction technique from a number of noise reduction techniques available to the content exchange and applying the noise reduction technique to eliminate noise and/or false positives (e.g., one or more redundant matches and/or erroneously identified matches) from the potential matches.
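The abstract describes selecting one of several noise reduction techniques and applying it to the potential matches. A minimal sketch of that selection step follows. The abstract does not name the techniques, so `Match`, `drop_redundant`, `drop_short`, and the `TECHNIQUES` registry are hypothetical names chosen for illustration, and the two filters are assumed examples of removing redundant matches and likely false positives, not the patented techniques.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Match:
    """A potential match (hypothetical record, not defined by the source)."""
    comparison_item: str  # identifier of the item of comparison content
    user_start: int       # first matched line in the user content
    user_end: int         # last matched line in the user content (inclusive)


def drop_redundant(matches: List[Match]) -> List[Match]:
    """Keep the first occurrence of each identical match, preserving order."""
    seen = set()
    kept = []
    for match in matches:
        if match not in seen:
            seen.add(match)
            kept.append(match)
    return kept


def drop_short(matches: List[Match], min_lines: int = 3) -> List[Match]:
    """Discard matches spanning fewer than min_lines lines (assumed heuristic)."""
    return [m for m in matches if m.user_end - m.user_start + 1 >= min_lines]


# Hypothetical registry of the techniques available to the content exchange.
TECHNIQUES: Dict[str, Callable[[List[Match]], List[Match]]] = {
    "redundant": drop_redundant,
    "short": drop_short,
}


def reduce_noise(matches: List[Match], technique: str) -> List[Match]:
    """Select a technique by name and apply it to the potential matches."""
    return TECHNIQUES[technique](matches)
```

For example, calling `reduce_noise(matches, "redundant")` on a list that contains the same match twice returns it once, while `"short"` drops any match covering fewer than three lines.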
Representative claims
1. A method for use in analyzing protectable content, comprising:
storing, in a memory structure, a plurality of items of comparison content in a first format and a second format;
receiving, from a user at a computer-based content exchange, an item of user content in said first format, at least one of said plurality of items of comparison content and said item of user content including open source content;
converting said item of user content from said first format to said second format, wherein said converting comprises abstracting one or more interchangeable elements of said item of user content, and wherein said interchangeable elements combine to comprise less than an entirety of said item of user content, said interchangeable elements indicative of a non-functional element associated with said item of user content;
comparing, using said computer-based content exchange, said item of user content in said second format to said items of comparison content in said second format; and
in response to said comparing, determining that said item of user content and at least one of said items of comparison content each include a substantially similar portion.

2. A method as set forth in claim 1, wherein each of said plurality of items of comparison content and said item of user content comprise open source or proprietary content and are formed of one or more portions of source code, one or more portions of binary code, one or more source code files or binary code files, one or more directory structures, software projects, software applications, or software packages.

3. A method as set forth in claim 1, wherein said converting step comprises applying a one-way hashing function to said item of user content in said first format.

4. A method as set forth in claim 1, wherein said first format comprises code text and said second format comprises one or more hashed signatures associated with said code text.

5. A method as set forth in claim 1, wherein said interchangeable elements include one or more of variable names, comments, new line characters, line ending characters, spaces, and tabs.

6. A method as set forth in claim 1, wherein said abstracting comprises removing said interchangeable elements.

7. A method as set forth in claim 1, wherein said abstracting comprises replacing each said interchangeable element with a generic element.

8. A method as set forth in claim 7, wherein said generic element is a wildcard.

9. A method as set forth in claim 1, wherein said first format for said item of user content is incompatible with said first format for one or more of said items of comparison content.

10. A method as set forth in claim 9, wherein said item of user content in said first format exists in a first file type and said one or more of said items of comparison content in said first format exist in a second file type.

11. A method as set forth in claim 1, wherein said converting step comprises dividing said item of user content into portions prior to said abstracting step.

12. A method as set forth in claim 11, wherein each said portion comprises a file, a block of content within said file, or a bundle of files.

13. A method as set forth in claim 12, wherein said block of content comprises a defined number of lines of code or a defined number of code characters.

14. A method as set forth in claim 1, further comprising:
using metadata associated with said item of user content and said items of comparison content, operating said computer-based content exchange for:
locating said substantially similar portion within said first format of said item of user content and said first format of at least one of said items of comparison content; and
comparing said substantially similar portion within said item of user content to said substantially similar portion within said at least one of said items of comparison content.

15. A method as set forth in claim 14, wherein said comparing said substantially similar portions comprises comparing code text.

16. A method as set forth in claim 15, wherein said comparing said substantially similar portions occurs on a private network maintained behind a firewall associated with said user.

17. A system for analyzing user content, comprising:
a memory structure for storing a plurality of items of comparison content and one or more comparison signatures corresponding to each said item of comparison content, said one or more comparison signatures having previously been abstracted of one or more interchangeable elements of at least one of said plurality of items of comparison content;
a scanner for receiving an item of user content from a user and creating one or more user signatures associated with said item of user content, wherein each said user signature abstracts one or more interchangeable elements of said item of user content, said one or more interchangeable elements indicative of a non-functional element associated with said item of user content; and
a processor for comparing said user signatures to said comparison signatures and determining whether at least one of said items of comparison content includes a matched portion that is similar to a portion of said item of user content.

18. A system as set forth in claim 17, wherein each said item of comparison content and said item of user content comprise open source or proprietary software and include portions of source code, portions of binary code, source code files, binary code files, directory structures, directory trees, software projects, software applications, or software packages.

19. A system as set forth in claim 17, wherein each said comparison signature and each said user signature comprises a hash value.

20. A system as set forth in claim 17, wherein said interchangeable elements include variable names, comments, new line characters, line ending characters, spaces, and tabs.

21. A system as set forth in claim 17, wherein said scanner removes said interchangeable elements.

22. A system as set forth in claim 17, wherein said scanner replaces each said interchangeable element with a generic element.

23. A system as set forth in claim 22, wherein said generic element is a wildcard.

24. A system as set forth in claim 17, wherein said comparison signatures and said user signatures are created from different file types.

25. A system as set forth in claim 17, wherein each said user signature represents a portion of said item of user content.

26. A system as set forth in claim 25, wherein each said portion comprises a file, a block of content within said file, or a bundle of files, and wherein said block of content comprises a defined number of lines of code or a defined number of code characters.

27. A system as set forth in claim 17, further comprising an interface structure for outputting an identification of said items of comparison content that include said matched portion.

28. A system as set forth in claim 17, wherein said scanner resides on a private network behind a firewall.

29. A system as set forth in claim 28, wherein said processor uses metadata associated with said comparison signatures to locate said matched portion within said item of comparison content and metadata associated with said user signatures to locate said portion within said item of user content, allowing said matched portion of said item of comparison content and said portion of said item of user content to be loaded for direct comparison by said user.

30. A system as set forth in claim 29, wherein said item of user content remains behind said firewall, and wherein said portion of said item of user content is loaded for said direct comparison behind said firewall.

31. A system as set forth in claim 29, wherein said direct comparison comprises comparing code text of said portion of said item of user content directly against code text of said matched portion of said item of comparison content.

32. A method of analyzing user content, comprising:
storing, in a memory structure, a plurality of comparison signatures, wherein each said comparison signature is associated with an item of comparison content, at least one of said plurality of comparison signatures having previously been abstracted of one or more interchangeable elements of said item of comparison content;
receiving, at a scanner located at a user node, an item of user content;
dividing said item of user content into one or more portions;
creating, using said scanner, a user signature associated with each said portion, wherein each said user signature abstracts one or more interchangeable elements of said portion;
comparing, using a computer-based content exchange, said user signatures to said comparison signatures; and
from said comparing, determining which of said items of comparison content include a matched portion that is similar to one of said portions of said item of user content.
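The signature steps recited in the claims (abstracting interchangeable elements, dividing content into blocks of a defined number of lines, hashing each block, and comparing signatures) can be illustrated with a short sketch. This is an assumed illustration and not the patented implementation: the function names, the choice of SHA-256 as the one-way hash, the `*` wildcard, the `#`-only comment syntax, and the small `KEYWORDS` set are all hypothetical. For simplicity, the sketch removes comments and whitespace first, replaces every non-keyword identifier (not only variable names) with the wildcard, and only then groups the remaining lines into blocks.

```python
import hashlib
import re
from typing import List

COMMENT = re.compile(r"#.*")                   # assumes '#' line comments only
IDENTIFIER = re.compile(r"\b[A-Za-z_]\w*\b")   # names and keywords alike
KEYWORDS = {"def", "return", "if", "else", "for", "while", "in"}  # partial list


def abstract_line(line: str) -> str:
    """Remove comments and whitespace; replace non-keyword names with '*'."""
    line = COMMENT.sub("", line)
    line = IDENTIFIER.sub(
        lambda m: m.group(0) if m.group(0) in KEYWORDS else "*", line
    )
    return re.sub(r"\s+", "", line)


def signatures(text: str, block_lines: int = 2) -> List[str]:
    """Hash each block of block_lines abstracted, non-empty lines."""
    lines = [a for a in map(abstract_line, text.splitlines()) if a]
    blocks = [
        "\n".join(lines[i:i + block_lines])
        for i in range(0, len(lines), block_lines)
    ]
    return [hashlib.sha256(b.encode("utf-8")).hexdigest() for b in blocks]


def matched_blocks(user_text: str, comparison_text: str,
                   block_lines: int = 2) -> List[int]:
    """Return indices of user blocks whose signature appears in the comparison."""
    known = set(signatures(comparison_text, block_lines))
    return [
        i for i, sig in enumerate(signatures(user_text, block_lines))
        if sig in known
    ]
```

With this sketch, `def total(a, b):  # add` followed by `return a + b` produces the same signature as `def sum_two(x, y):` followed by `return x+y`, because names, comments, and spacing are abstracted away before hashing. The returned block indices play the role of minimal metadata for locating the matched portion in the original text. Fixed-offset blocks are a simplification: an inserted line shifts every later block, which a production matcher would need to handle.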