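Country / Type | United States (US) Patent, Registered |
---|---|
International Patent Classification (IPC, 7th edition) | |
Application number | US-0060339 (2013-10-22) |
Registration number | US-10146845 (2018-12-04) |
Inventor / Address | |
Applicant / Address | |
Agent / Address | |
Citation information | Times cited: 0; Patents cited: 169 |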
Various methods and apparatuses are described for performing high speed format translations of incoming data, where the incoming data is arranged in a delimited data format. As an example, the data in the delimited data format can be translated to a mapped variable field format using pipelined operations. A reconfigurable logic device can be used in exemplary embodiments as a platform for the format translation.
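To make the idea of a mapped variable field format concrete, the following Python sketch translates one comma-delimited record into a header of byte offsets followed by the field data, and then reads a single field back using only the header. The binary layout (a 16-bit field count followed by 32-bit end offsets) and the function names `to_mapped_variable_field` and `read_field` are assumptions made for illustration; the record above does not specify a layout, and this software sketch ignores shield characters and the pipelined hardware execution the abstract describes.

```python
import struct


def to_mapped_variable_field(record: bytes, delimiter: bytes = b",") -> bytes:
    """Translate one delimited record into header offsets plus field data.

    Assumed layout (hypothetical, not taken from the patent):
      uint16 field count, one uint32 end offset per field, then the field bytes.
    """
    fields = record.split(delimiter)
    offsets = []
    position = 0
    for field in fields:
        position += len(field)
        offsets.append(position)  # end offset of this field in the data area
    header = struct.pack(f"<H{len(offsets)}I", len(fields), *offsets)
    return header + b"".join(fields)  # delimiters are not copied to the output


def read_field(mapped: bytes, index: int) -> bytes:
    """Fetch one field from the header offsets alone, without scanning the data."""
    (count,) = struct.unpack_from("<H", mapped, 0)
    offsets = struct.unpack_from(f"<{count}I", mapped, 2)
    data_start = 2 + 4 * count
    start = offsets[index - 1] if index > 0 else 0
    return mapped[data_start + start : data_start + offsets[index]]


mapped = to_mapped_variable_field(b"alice,,42")
third_field = read_field(mapped, 2)
```

Because the offsets live in the header, a consumer can jump straight to the third field without inspecting the bytes of the first two, which is the property the mapped variable field format is meant to provide.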
1. A computer-implemented method comprising: at least one member of a group consisting of (1) a reconfigurable logic device, (2) a graphics processor unit (GPU), (3) an application-specific integrated circuit (ASIC), and (4) a chip multi-processor (CMP) receiving an incoming stream comprising a plurality of bytes arranged in a delimited data format, the incoming byte stream being representative of data arranged in a plurality of fields, the incoming byte stream comprising a plurality of data characters and a plurality of field delimiter characters, the field delimiter characters defining a plurality of boundaries between the fields; the at least one member processing the received byte stream to identify the field delimiter characters that are present in the received byte stream; and the at least one member translating the received byte stream to an outgoing byte stream arranged in a mapped variable field format based on the identified field delimiter characters, the outgoing byte stream comprising (1) a plurality of the data characters of the received byte stream arranged in a plurality of variable-size fields, and (2) header information, wherein the header information comprises a plurality of byte offset values that identify where a plurality of subsequent variable-size fields are located in the outgoing byte stream.
2. The method of claim 1 wherein the at least one member comprises the reconfigurable logic device.
3. The method of claim 2 wherein the incoming byte stream further comprises a plurality of shield characters; wherein the processing step further comprises the reconfigurable logic device identifying the shield characters that are present in the received byte stream; and wherein the translating step further comprises the reconfigurable logic device translating the received byte stream to the outgoing byte stream having the mapped variable field format based on the identified field delimiter characters and the identified shield characters.
4. The method of claim 3 wherein the translating step comprises the reconfigurable logic device removing the identified field delimiter characters from the outgoing byte stream.
5. The method of claim 4 wherein the translating step further comprises the reconfigurable logic device removing the identified shield characters from the outgoing byte stream.
6. The method of claim 3 further comprising the reconfigurable logic device converting the received byte stream to an internal format tagged with associated control data that identifies the boundaries between the fields.
7. The method of claim 6 wherein the converting step further comprises the reconfigurable logic device generating a shield character mask associated with the received byte stream to identify the bytes in the received byte stream that are eligible for consideration as to whether they contain a field delimiter character.
8. The method of claim 7 wherein the converting step further comprises the reconfigurable logic device processing the bytes of the received byte stream and the generated shield character mask to generate field delimiter flag data associated with the received byte stream, the field delimiter flag data being indicative of whether an associated byte corresponds to a field delimiter character.
9. The method of claim 8 wherein the incoming byte stream is further representative of a plurality of records, at least one of the records comprising at least one of the fields, the incoming byte stream further comprising a plurality of record delimiter characters, the record delimiter characters defining a plurality of boundaries between the records, and wherein the converting step further comprises the reconfigurable logic device processing the bytes of the received byte stream and the generated shield character mask to generate record delimiter flag data associated with the received byte stream, the record delimiter flag data being indicative of whether an associated byte corresponds to a record delimiter character.
10. The method of claim 9 wherein the converting step further comprises the reconfigurable logic device identifying any empty fields that exist within the received byte stream based on the field delimiter flag data and the record delimiter flag data.
11. The method of claim 10 wherein the converting step further comprises the reconfigurable logic device removing the field delimiter characters and the record delimiter characters from the internally formatted byte stream based on the field delimiter flag data and the record delimiter flag data.
12. The method of claim 11 wherein the converting step further comprises the reconfigurable logic device generating control data associated with the internally formatted byte stream, the control data comprising (1) a start of field flag, (2) an end of field flag, (3) a start of record flag, (4) an end of record flag, and (5) a field identifier.
13. The method of claim 6 wherein the shield character identifying step further comprises the reconfigurable logic device performing a shield character removal operation on the bytes of the received byte stream.
14. The method of claim 13 wherein the shield character removal performing step comprises the reconfigurable logic device (1) distinguishing between the data characters that match the shield character and the shield characters, and (2) removing the identified shield characters.
15. The method of claim 6 further comprising the reconfigurable logic device generating the outgoing byte stream in the mapped variable field format from the internally formatted byte stream and the associated control data.
16. The method of claim 15 wherein the generating step further comprises the reconfigurable logic device determining byte lengths for the fields that are present in the internally formatted data based on the associated control data and generating field header data for the header information, wherein the field header data is indicative of the determined byte lengths for the fields.
17. The method of claim 16 wherein the field header data generating step comprises the reconfigurable logic device computing an array of byte offset values indicative of boundaries for a plurality of fields of a record in the outgoing byte stream, wherein the byte offset values in the header information include the array of byte offset values.
18. The method of claim 13 further comprising: the reconfigurable logic device providing the outgoing byte stream to a data processing component for processing thereby; and the data processing component selectively targeting a field of the outgoing byte stream for processing without analyzing the data characters of the outgoing byte stream.
19. The method of claim 18 wherein the header information in the outgoing byte stream includes a plurality of record headers and a plurality of field headers, the record headers comprising data indicative of where boundaries exist between a plurality of records in the outgoing byte stream, the field headers comprising data indicative of where boundaries exist between a plurality of fields in the records, and wherein the selectively targeting step comprises the data processing component selectively targeting the field based on the data in the field headers.
20. The method of claim 18 further comprising: the reconfigurable logic device receiving processed data representative of the outgoing byte stream from the data processing component; and the reconfigurable logic device translating the processed data back to the delimited data format.
21. The method of claim 3 further comprising: the reconfigurable logic device converting the received byte stream to an internal format tagged with associated control data that identifies the boundaries between the fields; the reconfigurable logic device performing a shield character removal operation on the bytes of the received byte stream; and the reconfigurable logic device generating the outgoing byte stream in the mapped variable field format from the internally formatted byte stream and the associated control data; and wherein the reconfigurable logic device performs the converting step, the shield character removal performing step, and the generating step simultaneously with respect to each other in a pipelined fashion.
22. The method of claim 2 wherein the reconfigurable logic device performs the processing and translating steps for a plurality of characters in the byte stream per clock cycle.
23. The method of claim 1 wherein the header information in the outgoing byte stream includes a plurality of record headers and a plurality of field headers, the record headers comprising data indicative of where boundaries exist between a plurality of records in the outgoing byte stream, the field headers comprising data indicative of where boundaries exist between a plurality of fields in the records.
24. The method of claim 1 wherein the delimited data format comprises a comma separated value (CSV) format.
25. An apparatus comprising: at least one member of a group consisting of (1) a reconfigurable logic device, (2) a graphics processor unit (GPU), (3) an application-specific integrated circuit (ASIC), and (4) a chip multi-processor (CMP), the at least one member configured to (1) receive an incoming stream comprising a plurality of bytes arranged in a delimited data format, the incoming byte stream being representative of data arranged in a plurality of fields, the incoming byte stream comprising a plurality of data characters and a plurality of field delimiter characters, the field delimiter characters defining a plurality of boundaries between the fields, (2) process the received byte stream to identify the field delimiter characters that are present in the received byte stream, and (3) translate the received byte stream to an outgoing byte stream arranged in a mapped variable field format based on the identified field delimiter characters, the outgoing byte stream comprising (1) a plurality of the data characters of the received byte stream arranged in a plurality of variable-size fields, and (2) header information, wherein the header information comprises a plurality of byte offset values that identify where a plurality of subsequent variable-size fields are located in the outgoing byte stream.
26. A computer-implemented method comprising: at least one member of a group consisting of (1) a reconfigurable logic device, (2) a graphics processor unit (GPU), (3) an application-specific integrated circuit (ASIC), and (4) a chip multi-processor (CMP) receiving an incoming stream comprising a plurality of bytes arranged in a delimited data format, the incoming byte stream being representative of data arranged in a plurality of fields, the incoming byte stream comprising a plurality of data characters and a plurality of field delimiter characters, the field delimiter characters defining a plurality of boundaries between the fields; the at least one member processing the received byte stream to identify the field delimiter characters that are present in the received byte stream; and the at least one member translating the received byte stream to an outgoing byte stream based on the identified field delimiter characters, the outgoing byte stream arranged in a structured format and being representative of the data in the fields of the received byte stream, the outgoing byte stream comprising (1) a plurality of the data characters of the received byte stream, and (2) header information indicative of where boundaries exist between a plurality of fields in the outgoing byte stream, the structured format comprising a mapped variable field format that is configured to permit a downstream processing component to jump from field to field in the outgoing byte stream based on the header information without analyzing the data characters of the outgoing byte stream.
27. The method of claim 26 wherein the at least one member comprises the reconfigurable logic device.
28. The method of claim 27 wherein the incoming byte stream is further representative of a plurality of records, at least one of the records comprising at least one of the fields, the incoming byte stream further comprising a plurality of record delimiter characters, the record delimiter characters defining a plurality of boundaries between the records; wherein the processing step further comprises the reconfigurable logic device identifying the record delimiter characters that are present in the received byte stream; and wherein the translating step further comprises the reconfigurable logic device translating the received byte stream to the outgoing byte stream having the structured format based on the identified field delimiter characters and the identified record delimiter characters.
29. The method of claim 28 wherein the structured format is further configured to permit the downstream processing component to jump from record to record in the outgoing byte stream without analyzing the data characters of the outgoing byte stream.
30. The method of claim 28 wherein the translating step further comprises the reconfigurable logic device removing the identified record delimiter characters from the outgoing byte stream.
31. The method of claim 27 wherein the incoming byte stream further comprises a plurality of shield characters; wherein the processing step further comprises the reconfigurable logic device identifying the shield characters that are present in the received byte stream; and wherein the translating step further comprises the reconfigurable logic device translating the received byte stream to the outgoing byte stream having the structured format based on the identified field delimiter characters and the identified shield characters.
32. The method of claim 31 wherein the translating step further comprises the reconfigurable logic device removing the identified shield characters from the outgoing byte stream.
33. The method of claim 27 wherein the translating step further comprises removing the identified field delimiter characters from the outgoing byte stream.
34. The method of claim 27 further comprising the reconfigurable logic device providing the outgoing byte stream to the data processing component for processing thereby.
35. The method of claim 34 further comprising the data processing component performing a plurality of processing operations on the outgoing byte stream to generate processed data from the outgoing byte stream.
36. The method of claim 35 wherein the processing operations include a plurality of extract, transfer, and load (ETL) database operations.
37. The method of claim 35 wherein the processing operations comprise a plurality of data validation operations.
38. The method of claim 27 further comprising the reconfigurable logic device translating the processed data back to the delimited data format of the received byte stream.
39. The method of claim 27 wherein the data processing component is implemented on the reconfigurable logic device.
40. The method of claim 27 wherein the data processing component is implemented in software on a processor.
41. The method of claim 26 wherein the delimited data format comprises a comma separated value (CSV) format.
42. The method of claim 27 wherein the reconfigurable logic device performs the processing and translating steps for a plurality of characters in the byte stream per clock cycle.
43. A computer-implemented method comprising: receiving data in a delimited data format; converting the received data to a mapped variable field format, wherein the converted data includes (1) a plurality of records, wherein each of a plurality of the records includes a plurality variable-size fields, and (2) header data indicative of where boundaries exist between a plurality of records in the converted data and where boundaries exist between a plurality of fields in the converted data, and wherein the header data includes a plurality of byte offset values that identify where boundaries exist between a plurality of subsequent variable-size fields in the converted data; and performing a plurality of processing operations on the converted data to generate processed data in the mapped variable field format; and loading the processed data into a database; and wherein the converting step is performed by at least one member of a group consisting of (1) a reconfigurable logic device, (2) a graphics processor unit (GPU), (3) an application-specific integrated circuit (ASIC), and (4) a chip multi-processor (CMP).
44. The method of claim 43 wherein the converted data comprises a plurality of data fields, the data fields having a variable lengths, wherein the processing operations comprise a plurality of field-specific data processing operations, and wherein the performing step comprises targeting a specific field of the converted data for a field-specific processing operation based on the header data without analyzing the data content of the data fields.
45. The method of claim 43 wherein the data processing operations comprise data quality checking operations as part of an extract, transfer, load (ETL) procedure.
46. The method of claim 43 wherein the at least one of the processing operations is performed by software executed by a processor.
47. The method of claim 43 wherein the converting step comprises converting a plurality of characters of the received data to the mapped variable field format per clock cycle.
48. The method of claim 43 wherein the at least one of the processing operations is performed by the at least one member.
49. An apparatus comprising: a reconfigurable logic device comprising a data translation pipeline, the pipeline comprising (1) a first hardware logic circuit configured to convert incoming data arranged in a delimited data format to an internal format, the incoming data in the delimited data format comprising a plurality of data characters, a plurality of field delimiter characters, a plurality of record delimiter characters, and a plurality of shield characters, the converted data having the internal format being stripped of field delimiter characters and record delimiter characters while preserving data characters of incoming fields, and wherein the converted data having the internal format includes associated control data indicative of where boundaries exist between a plurality of records in the converted data and where boundaries exist between a plurality of fields in the converted data, and (2) a second hardware logic circuit downstream from the first hardware logic circuit, the second hardware logic circuit configured to remove shield characters from the converted data having the internal format; a hardware-accelerated data processing stage configured to perform a data processing operation on output from the second hardware logic circuit to thereby generate processed data.
50. The apparatus of claim 49 wherein the first hardware logic circuit is further configured to simultaneously test the same characters of the incoming data to determine whether the tested characters are record delimiter characters or field delimiter characters.
51. A computer-implemented method comprising: converting incoming data arranged in a delimited data format to an internal format, the incoming data in the delimited data format comprising a plurality of data characters, a plurality of field delimiter characters, a plurality of record delimiter characters, and a plurality of shield characters, the converted data having the internal format being stripped of field delimiter characters and record delimiter characters while preserving data characters of incoming fields, and wherein the converted data having the internal format includes associated control data indicative of where boundaries exist between a plurality of records in the converted data and where boundaries exist between a plurality of fields in the converted data; removing shield characters from the converted data; performing at least one hardware-accelerated processing operation on at least a portion of the converted data to generate processed data; loading the processed data into a database; and wherein the converting step is performed by at least one member of a group consisting of (1) a reconfigurable logic device, (2) a graphics processor unit (GPU), (3) an application-specific integrated circuit (ASIC), and (4) a chip multi-processor (CMP).
52. The method of claim 51 wherein the at least one hardware-accelerated data processing operation comprises a plurality of field-specific hardware-accelerated data processing operations, and wherein the performing step comprises targeting a specific field of the converted data for a field-specific hardware-accelerated processing operation based on the associated control data without analyzing the data content of the data fields.
53. The method of claim 51 wherein the data processing operations comprise data quality checking operations as part of an extract, transfer, load (ETL) procedure.
54. The method of claim 51 further comprising: converting the shield-removed converted data into data in a fixed field format; and performing, by software executed by a processor, at least one processing operation on at least a portion of the data in the fixed field format to generate additional processed data; and loading the additional processed data into the database.
55. The method of claim 51 further comprising: converting the shield-removed converted data into data in a mapped variable field format; and performing, by software executed by a processor, at least one processing operation on at least a portion of the data in the mapped variable field format to generate additional processed data; and loading the additional processed data into the database.
56. The method of claim 51 wherein the converting step comprises converting a plurality of characters of the received data to the internal format per clock cycle.
57. The apparatus of claim 49 wherein the hardware-accelerated data processing stage is deployed on the reconfigurable logic device.
58. The method of claim 1 wherein the at least one member comprises the GPU.
59. The method of claim 1 wherein the at least one member comprises the ASIC.
60. The method of claim 1 wherein the at least one member comprises the CMP.
61. The apparatus of claim 25 wherein the at least one member comprises the reconfigurable logic device.
62. The apparatus of claim 25 wherein the at least one member comprises the GPU.
63. The apparatus of claim 25 wherein the at least one member comprises the ASIC.
64. The apparatus of claim 25 wherein the at least one member comprises the CMP.
65. The method of claim 26 wherein the at least one member comprises the GPU.
66. The method of claim 26 wherein the at least one member comprises the ASIC.
67. The method of claim 26 wherein the at least one member comprises the CMP.
68. The method of claim 51 wherein the at least one member comprises the reconfigurable logic device.
69. The method of claim 51 wherein the at least one member comprises the GPU.
70. The method of claim 51 wherein the at least one member comprises the ASIC.
71. The method of claim 51 wherein the at least one member comprises the CMP.
72. The method of claim 26 wherein the mapped variable field format comprises (1) a plurality of the data characters of the received byte stream arranged in variable-size fields, and (2) header information comprising a plurality of byte offset values that identify boundaries between a plurality of subsequent variable-size fields in the outgoing byte stream.
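Claims 7 through 9 describe a shield character mask that marks which bytes are eligible to be treated as delimiters, and per-byte flags for field and record delimiters. The Python sketch below illustrates that idea in software only. The function name `delimiter_flags`, the default characters (comma, newline, double quote), and the simple toggle rule for shielded regions are assumptions for illustration; the claims do not prescribe them, and the claimed devices process several bytes per clock cycle rather than one byte per loop iteration.

```python
def delimiter_flags(data: bytes,
                    field_delim: int = ord(","),
                    record_delim: int = ord("\n"),
                    shield: int = ord('"')):
    """Return (mask, field_flags, record_flags), one boolean per input byte.

    mask[i] is True when byte i lies outside any shielded region and is
    therefore eligible to be considered a delimiter (hypothetical toggle rule).
    """
    mask, field_flags, record_flags = [], [], []
    inside_shield = False
    for byte in data:
        if byte == shield:
            inside_shield = not inside_shield  # entering or leaving a shielded run
            eligible = False                   # the shield byte is never a delimiter
        else:
            eligible = not inside_shield
        mask.append(eligible)
        field_flags.append(eligible and byte == field_delim)
        record_flags.append(eligible and byte == record_delim)
    return mask, field_flags, record_flags


mask, field_flags, record_flags = delimiter_flags(b'a,"b,c"\nd')
field_positions = [i for i, flag in enumerate(field_flags) if flag]
record_positions = [i for i, flag in enumerate(record_flags) if flag]
```

In this input the comma inside the quoted region is masked out, so only the first comma is flagged as a field delimiter and the newline after the closing quote is flagged as a record delimiter.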