IPC Classification Information
Country / Type |
United States(US) Patent
Granted
|
International Patent Classification (IPC, 7th edition) |
|
Application number |
US-0938732
(2007-11-12)
|
Registration number |
US-8326819
(2012-12-04)
|
Inventor / Address |
- Indeck, Ronald S.
- Singla, Naveen
- Taylor, David E.
|
Applicant / Address |
|
Agent / Address |
|
Citation information |
Times cited: 26
Cited patents: 214 |
Abstract
Disclosed herein is a method and system for hardware-accelerating the generation of metadata for a data stream using a coprocessor. Using these techniques, data can be richly indexed, classified, and clustered at high speeds. Reconfigurable logic such as field programmable gate arrays (FPGAs) can be used by the coprocessor for this hardware acceleration. Techniques such as exact matching, approximate matching, and regular expression pattern matching can be employed by the coprocessor to generate desired metadata for the data stream.
Representative claims
1. An apparatus comprising: a reconfigurable logic device for communication with a processor, the reconfigurable logic device having a firmware pipeline deployed thereon, the firmware pipeline comprising a plurality of firmware application modules, at least one of the firmware application modules comprising a regular expression pattern matching module, and at least another of the firmware application modules comprising an exact or approximate matching module, the firmware pipeline configured to, at a hardware rate, (1) receive a stream of data objects, (2) tag each data object with a data object identifier, (3) parse the data objects into a plurality of words, (4) tag the words with a plurality of position identifiers, (5) perform a regular expression pattern matching operation via the regular expression pattern matching module to detect whether any pattern matches exist between the streaming data objects and at least one pattern, (6) perform an exact or approximate matching operation via the exact or approximate matching module to detect whether any exact or approximate matches exist between the streaming data objects and a plurality of words in a dictionary, (7) concurrently generate multiple types of indexes for the streaming data objects, the multiple indexes comprising a general index, at least one pattern index based on the detected pattern matches, and at least one dictionary index based on the detected exact or approximate matches, the indexes comprising a plurality of terms and pointers, wherein each pointer is associated with a data object and a term, wherein each pointer comprises a data object identifier and a position identifier for a word within a data object that matches a term in an index, and (8) as data object identifiers, position identifiers, pattern matches, and exact or approximate matches continue to be generated for the streaming data objects, update the indexes based thereon such that a new pointer to a given data object is added to an index when the term associated with the new pointer is found within the given data object.
2. The apparatus of claim 1 further comprising the processor, the processor being in communication with the reconfigurable logic device via a bus, wherein the processor is configured to control delivery to the reconfigurable logic device of the data objects and commands for configuring the reconfigurable logic device.
3. The apparatus of claim 1 wherein the firmware pipeline is further configured to generate and update the indexes by performing a hashing operation on the data object identifiers and position identifiers corresponding to the detected pattern matches and the detected exact or approximate matches.
4. The apparatus of claim 1 wherein the reconfigurable logic device comprises resident memory for storing the indexes.
5. The apparatus of claim 4 further comprising an RDBMS in communication with the reconfigurable logic device via a bus, and wherein the reconfigurable logic device is further configured to merge the indexes into a plurality of operational indexes maintained by the RDBMS.
6. The apparatus of claim 1 wherein the exact or approximate matching operation comprises an approximate matching operation.
7. The apparatus of claim 1 wherein the firmware pipeline further comprises: a plurality of the regular expression pattern matching modules in parallel, each of the regular expression pattern matching modules being configured to detect a different pattern within the streaming data objects; a plurality of the exact or approximate matching modules in parallel, each of the exact or approximate matching modules being configured to detect an exact or approximate match within the streaming data objects with respect to a plurality of words in a plurality of different dictionaries; and wherein the reconfigurable logic device is further configured to (1) process the streaming data objects through the regular expression pattern matching modules and the exact or approximate matching modules in parallel, (2) generate the general index based on the data object identifiers and the position identifiers for the streaming data objects, (3) generate and update a plurality of the pattern indexes based on the pattern matches found by the parallel regular expression pattern matching modules such that each pattern index corresponds to a different one of the patterns, and (4) generate and update a plurality of the dictionary indexes based on the exact or approximate matches found by the parallel exact or approximate matching modules such that each dictionary index corresponds to a different one of the dictionaries.
8. The apparatus of claim 1 wherein the reconfigurable logic device comprises a field programmable gate array (FPGA).
9. The apparatus of claim 1 further comprising the processor, wherein the processor comprises a general purpose processor (GPP).
10. The apparatus of claim 1 wherein at least one of the patterns comprises a credit card number.
11. The apparatus of claim 1 wherein at least one of the patterns comprises a social security number.
12. The apparatus of claim 1 wherein at least one of the patterns comprises an email address.
13. The apparatus of claim 1 wherein at least one of the patterns comprises a telephone number.
14. The apparatus of claim 1 wherein at least one of the patterns comprises an Internet uniform resource locator (URL).
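The indexing steps recited in claim 1 can be loosely illustrated in software. The following is only a minimal sequential Python sketch, not the claimed firmware pipeline: it runs on a general purpose processor rather than reconfigurable logic, it does exact dictionary matching only (no approximate matching), and the names `PATTERNS`, `DICTIONARY`, and `build_indexes`, along with the two simplified regular expressions, are hypothetical choices made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical, simplified patterns and dictionary (assumptions, not from the patent).
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
}
DICTIONARY = {"fpga", "coprocessor"}


def build_indexes(data_objects):
    """Build general, pattern, and dictionary indexes for a stream of strings.

    Each index maps a term to a list of pointers; a pointer is a
    (data_object_id, position_id) tuple, loosely mirroring claim 1.
    """
    general = defaultdict(list)
    pattern = defaultdict(list)
    dictionary = defaultdict(list)

    for doc_id, text in enumerate(data_objects):      # tag each data object
        for pos, raw in enumerate(text.split()):      # parse into words, tag positions
            word = raw.strip(".,;:!?")                # drop surrounding punctuation
            if not word:
                continue
            pointer = (doc_id, pos)
            term = word.lower()
            general[term].append(pointer)             # general index: every word
            for name, regex in PATTERNS.items():      # pattern index: regex matches
                if regex.fullmatch(word):
                    pattern[name].append(pointer)
            if term in DICTIONARY:                    # dictionary index: exact matches
                dictionary[term].append(pointer)
    return general, pattern, dictionary


if __name__ == "__main__":
    docs = [
        "Contact bob@example.com about the FPGA.",
        "SSN 123-45-6789 on coprocessor",
    ]
    general, pattern, dictionary = build_indexes(docs)
    print(dict(pattern))
    print(dict(dictionary))
```

Each new pointer is appended as soon as its term is seen in a data object, which corresponds to the incremental index update in step (8). The claims describe the matching modules and index generation operating concurrently at hardware rates; the loop above serializes that work purely for readability.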