IPC Classification Information
Country / Type | United States (US) Patent, Granted
International Patent Classification (IPC, 7th edition) |
Application number | UP-0418659 (2003-04-18)
Registration number | US-7716148 (2010-06-03)
Inventors / Address | Meng, Zhuo; Duan, Baofu; Pao, Yoh-Han; Cass, Ronald J
Applicant / Address | Computer Associates Think, Inc.
Agent / Address |
Citation information | Times cited: 2; Patents cited: 68
Abstract
An apparatus and method for processing mixed data for a selected task is provided. An input transformation module converts mixed data into converted data. A functional mapping module processes the converted data to provide a functional output for the selected task. The selected task may be one or a combination of a variety of possible tasks, including search, recall, prediction, classification, etc. For example, the selected task may be for data mining, database search, targeted marketing, computer virus detection, etc.
Representative Claim
What is claimed is:
1. An apparatus for processing mixed data for a selected task, comprising: a data collection agent comprising logic encoded in a computer-readable medium that when executed is operable to collect mixed data implemented in a plurality of encoding schemes, the mixed data collected from a plurality of data sources wherein the plurality of data sources comprises at least two unique data sources comprising at least two unique encodings; a neural network comprising logic encoded in a computer-readable medium that when executed is operable to implement an input transformation module, the input transformation module adapted to: determine a complexity associated with the mixed data, the complexity based on a number of dimensions associated with the encoded data and a desired functional output for the selected task; upon determining the complexity is below a first threshold, transform the encoding into a first numerical encoding based on a pattern in the encoding; upon determining the complexity is above the first threshold: determine a distance metric for determining a distance between any two data points within a particular dimension of the encoding; scale at one distance of at least one dimension; cluster the mixed data based on the determined distance metric; transform the encoding into a second numerical encoding using a signpost transformation that dynamically adjusts a level of detail based on the desired functional output for the selected task; a functional link network comprising logic encoded in a computer readable medium that when executed is operable to process the transformed encoding to provide the functional output for the selected task; and a memory module operable to store the functional output.
2. The apparatus of claim 1, wherein the input transformation module uses a signpost transformation that self adapts by adding nodes until the transformed data meets a criteria.
3. The apparatus of claim 2, wherein cluster centers are set as reference points and distances from a mixed data to the respective reference points correspond to dimensions of the converted data space.
4. The apparatus of claim 2, wherein the input transformation module is trained through clustering of a mixed data training set.
5. The apparatus of claim 4, wherein the input transformation module uses a supervised learning methodology.
6. The apparatus of claim 4, wherein the input transformation module uses a k-means methodology for determining cluster centers.
7. The apparatus of claim 4, wherein the input transformation module uses a k-medoids methodology for determining cluster centers.
8. The apparatus of claim 1, wherein the mixed data includes consumer profile information.
9. The apparatus of claim 1, wherein the converted data is in a numerical representation.
10. The apparatus of claim 1, wherein the mixed data corresponds to text.
11. The apparatus of claim 1, wherein the input transformation module learns to organize mixed data patterns into sets corresponding to a plurality of nodes, and respective outputs of the nodes correspond to said converted data.
12. The apparatus of claim 11, wherein each node has an associated cluster annotation function.
13. The apparatus of claim 11, wherein the learning is unsupervised.
14. The apparatus of claim 1, wherein the functional link network includes a computational model with at least one basis function, and parameters of the at least one basis function are adjusted as the functional link network learns a training set of sample patterns associated with the selected task.
15. The apparatus of claim 14, wherein the functional link network includes an orthogonal functional link net.
16. The apparatus of claim 14, wherein the functional link network uses a regression technique for adjusting the parameters of the at least one basis function.
17. The apparatus of claim 16, wherein the at least one basis function includes a sigmoid.
18. The apparatus of claim 16, wherein the at least one basis function includes a wavelet.
19. The apparatus of claim 16, wherein the at least one basis function includes a radial basis function.
20. The apparatus of claim 16, wherein the at least one basis function includes a polynomial.
21. The apparatus of claim 14, wherein the learning by the functional link network is by a supervised, recursive least squares estimation method.
22. The apparatus of claim 14, wherein the functional link network includes a feed-forward net.
23. The apparatus of claim 22, wherein the feed-forward net is non-linear.
24. The apparatus of claim 22, wherein the feed-forward net learns by back-propagation of error.
25. The apparatus of claim 1, wherein the input transformation module and the functional link network comprise respective layers of a neural network.
26. The apparatus of claim 1, wherein the selected task is data mining.
27. The apparatus of claim 1, wherein the selected task is database searching.
28. The apparatus of claim 1, wherein the selected task is targeted marketing.
29. The apparatus of claim 1, wherein the selected task is computer virus detection.
30. The apparatus of claim 1, wherein the selected task is one of visualization, search, recall, prediction and classification.
31. The apparatus of claim 1: wherein the plurality of data sources comprises at least one local machine and at least one additional source selected from among an external data source and a proxied data source; and further comprising a historical database operable to store mixed data collected by the data collection agent for subsequent transformation by the neural network.
32. The apparatus of claim 1, wherein the at least two unique types of mixed data encodings comprise at least two types of mixed data selected from the group consisting of a webpage, an email, a customer profile, purchase data, medical information, speech samples, and handwriting samples.
33. A computer-implemented method of processing mixed data for a selected task, comprising: collecting mixed data implemented in a plurality of encoding schemes, the mixed data collected from a plurality of data sources wherein the plurality of data sources comprises at least two unique data sources comprising at least two unique types; determining a complexity associated with the mixed data, the complexity based on a number of dimensions associated with the encoded data and a desired functional output for the selected task; upon determining the complexity is below a first threshold, transforming the encoding into a first numerical encoding based on a pattern in the encoding; upon determining the complexity is above the first threshold: determining a distance metric for determining a distance between any two data points within a particular dimension of the encoding; scaling at one distance of at least one dimension; clustering the mixed data based on the determined distance metric; transforming the encoding into a second numerical encoding using a signpost transformation that dynamically adjusts a level of detail based on the desired functional output for the selected task; processing the transformed encoding to provide the functional output for the selected task; and storing the functional output.
34. The method of claim 33, wherein the mixed data is transformed into converted data through a signpost transformation that self-adapts by adding nodes until the transformed data meets a criteria.
35. The method of claim 34, wherein cluster centers are set as reference points and distances from a mixed data to the respective reference points correspond to dimensions of the converted data space.
36. The method of claim 33, wherein the mixed data is transformed into converted data through an encoding methodology.
37. The method of claim 36, wherein the mixed data includes consumer profile information.
38. A program storage device readable by a machine, tangibly embodying a program of instructions executable by the machine, when executed by the machine the program of instruction operable to: collect mixed data implemented in a plurality of encoding schemes, the mixed data collected from a plurality of data sources wherein the plurality of data sources comprises at least two unique data sources comprising at least two unique encodings; determine a complexity associated with the mixed data, the complexity based on a number of dimensions associated with the encoded data and a desired functional output for the selected task; upon determining the complexity is below a first threshold, transform the encoding into a first numerical encoding based on a pattern in the encoding; upon determining the complexity is above the first threshold: determine a distance metric for determining a distance between any two data points within a particular dimension of the encoding; scale at one distance of at least one dimension; cluster the mixed data based on the determined distance metric; transform the encoding into a second numerical encoding using a signpost transformation that dynamically adjusts a level of detail based on the desired functional output for the selected task; process the transformed encoding to provide the functional output for the selected task; and store the functional output.
39. A computing system, comprising: a processor; and a program storage device readable by the computer system, tangibly embodying a program of instructions executable by the processor to: collect mixed data implemented in a plurality of encoding schemes, the mixed data collected from a plurality of data sources wherein the plurality of data sources comprises at least two unique data sources comprising at least two unique types of mixed data; determine a complexity associated with the mixed data, the complexity based on a number of dimensions associated with the encoded data and a desired functional output for the selected task; upon determining the complexity is below a first threshold, transform the encoding into a first numerical encoding based on a pattern in the encoding; upon determining the complexity is above the first threshold: determine a distance metric for determining a distance between any two data points within a particular dimension of the encoding; scale at one distance of at least one dimension; cluster the mixed data based on the determined distance metric; transform the encoding into a second numerical encoding using a signpost transformation that dynamically adjusts a level of detail based on the desired functional output for the selected task; process the transformed data to provide the functional output for the selected task; and store the functional output.
40. The apparatus of claim 1, wherein: the functional link network is updated based on transforming the mixed data; and the memory module is further operable to store the updated functional link network.
41. The apparatus of claim 1, wherein the distance metric comprises:
$$d = \frac{\sum_{i \in A} w_{Ai} + \sum_{j \in B} w_{Bj} - \sum_{k \in A \cap B} \left( w_{Ak} + w_{Bk} \right)}{\sum_{i \in A} w_{Ai} + \sum_{j \in B} w_{Bj} - \frac{1}{2} \sum_{k \in A \cap B} \left( w_{Ak} + w_{Bk} \right)}.$$
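The distance metric of claim 41 and the reference-point idea of claims 3 and 35 can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the patented implementation: it assumes each data point is a set of items with per-item weights, stored as a dictionary from item name to weight, and it treats the signposts (reference points) as already chosen rather than learned by clustering. The function names `weighted_set_distance` and `signpost_transform`, and the sample items, are hypothetical.

```python
from typing import List, Mapping, Sequence


def weighted_set_distance(a: Mapping[str, float], b: Mapping[str, float]) -> float:
    """Distance between two weighted sets A and B, following the claim 41 formula.

    Assumption: each set is a dict mapping an item to its weight in that set
    (w_Ai for set A, w_Bj for set B).
    """
    total = sum(a.values()) + sum(b.values())               # sum over A plus sum over B
    shared = sum(a[k] + b[k] for k in a.keys() & b.keys())  # sum over the intersection
    denominator = total - 0.5 * shared
    if denominator == 0:
        # Assumption: two empty (or zero-weight) sets are treated as identical.
        return 0.0
    return (total - shared) / denominator


def signpost_transform(
    item: Mapping[str, float], signposts: Sequence[Mapping[str, float]]
) -> List[float]:
    """Map one mixed-data item to a numeric vector of distances to reference points.

    Each signpost contributes one dimension of the converted data space.
    Assumption: the signposts are given; the claims obtain them as cluster centers.
    """
    return [weighted_set_distance(item, s) for s in signposts]


# Hypothetical usage: a text-like item described by weighted keywords.
signposts = [
    {"invoice": 1.0, "payment": 1.0},
    {"virus": 1.0, "attachment": 0.5},
]
item = {"payment": 1.0, "attachment": 0.5, "reminder": 0.2}
print(signpost_transform(item, signposts))
```

With all weights equal to 1 the formula reduces to the Jaccard distance between the two sets, so identical sets give 0 and disjoint sets give 1.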