IPC Classification Information
Country / Type |
United States (US) Patent
Granted
|
International Patent Classification (IPC, 7th edition) |
|
Application Number |
US-0062069
(2002-01-31)
|
Inventors / Address |
- Chau, Hoang K.
- Cheng, Isaac Kam-Chak
- Cheng, Josephine Miu
- Chiu, Suet Mui
- Chow, Jyh-Herng
- Pauser, Michael Leon
- Xu, Jian
|
Applicant / Address |
- International Business Machines Corporation
|
Agent / Address |
|
Citation Information |
Times cited: 244
Patents cited: 4 |
Abstract
A technique is provided to store fragmented XML data into a relational database by decomposing XML documents with application specific mappings. Data stored on a data store that is connected to a computer is transformed. Initially, an XML document containing XML data is received. A document access definition that identifies one or more relational tables and columns is received. The XML data is mapped from the application DTD to the relational tables and columns using the document access definition by generating a first document object model tree using the XML document, generating a second document object model tree using the document access definition, and mapping the data from the first document object model tree into columns in one or more relational tables using the second document object model tree.
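The decomposition flow described in the abstract can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the `<DAD>`, `<table>`, and `<column>` mapping format, the sample order document, the table names, and the `decompose` and `text_of` helpers are all hypothetical stand-ins invented for this example. It assumes Python's standard `xml.dom.minidom` and `sqlite3` modules and a trusted mapping file, since table and column names are interpolated directly into SQL.

```python
import sqlite3
from xml.dom import minidom

# Hypothetical XML document (invented for illustration).
ORDER_XML = """<Order key="1">
  <Customer><Name>Ada</Name></Customer>
  <Part><Sku>p-1</Sku><Qty>2</Qty></Part>
  <Part><Sku>p-2</Sku><Qty>5</Qty></Part>
</Order>"""

# Hypothetical, heavily simplified stand-in for a document access definition:
# each <table> names the element that yields one row, each <column> names
# the child element whose text fills that column.
DAD_XML = """<DAD>
  <table name="customer_tab" row="Customer">
    <column name="name" path="Name" type="TEXT"/>
  </table>
  <table name="part_tab" row="Part">
    <column name="sku" path="Sku" type="TEXT"/>
    <column name="qty" path="Qty" type="INTEGER"/>
  </table>
</DAD>"""


def text_of(element):
    """Return the concatenated text-node children of a DOM element."""
    parts = [c.data for c in element.childNodes if c.nodeType == c.TEXT_NODE]
    return "".join(parts).strip()


def decompose(xml_text, dad_text, conn):
    """Map element text from the document tree into tables named by the mapping tree."""
    doc_tree = minidom.parseString(xml_text)  # first DOM tree: the XML document
    dad_tree = minidom.parseString(dad_text)  # second DOM tree: the mapping
    for table in dad_tree.getElementsByTagName("table"):
        name = table.getAttribute("name")
        cols = [
            (c.getAttribute("name"), c.getAttribute("path"), c.getAttribute("type"))
            for c in table.getElementsByTagName("column")
        ]
        # Names come from the (assumed trusted) mapping, so they are interpolated.
        col_defs = ", ".join(f"{col} {typ}" for col, _, typ in cols)
        conn.execute(f"CREATE TABLE IF NOT EXISTS {name} ({col_defs})")
        placeholders = ", ".join("?" for _ in cols)
        # One row per matching element in the document tree.
        for row_el in doc_tree.getElementsByTagName(table.getAttribute("row")):
            values = []
            for _, path, _ in cols:
                hits = row_el.getElementsByTagName(path)
                values.append(text_of(hits[0]) if hits else None)
            conn.execute(f"INSERT INTO {name} VALUES ({placeholders})", values)
    conn.commit()


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    decompose(ORDER_XML, DAD_XML, db)
    print(db.execute("SELECT sku, qty FROM part_tab ORDER BY sku").fetchall())
```

Walking the mapping tree in the outer loop and the document tree in the inner loop keeps the application-specific knowledge in the mapping, so the same routine could serve a different document type by swapping only the mapping text.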
Representative Claim
ata point p.
2. The method according to claim 1, wherein the partitioning step of the plurality of data points in the data set are partitioned using a clustering algorithm.
3. The method according to claim 1, wherein the computing step includes: for each one of the plurality of partitions, calculating a distance of at least one neighboring data point from the plurality of data points in the partition, the lower bound being the smallest distance from the at least one neighboring data point to a first one of the plurality of data points in the partition and the upper bound being the largest distance from the at least one neighboring data point to a second one of the plurality of data points in the partition.
4. The method according to claim 3, wherein: for the predetermined number of outliers of interest, a number of partitions having the largest lower bound values are selected such that the number of data points residing in such partitions is at least equal to the predetermined number of outliers of interest; the identifying of the plurality of candidate partitions includes identifying which of the candidate partitions are comprised of those partitions having upper bound values that are greater than or equal to the smallest lower bound value of the number of partitions; and the non-candidate partitions are comprised of those partitions having upper bound values that are less than the smallest lower bound value of the number of partitions, the non-candidate partitions being eliminated from consideration because they do not contain the at least one of the predetermined number of outliers of interest.
5. The method according to claim 1, wherein the candidate partitions are smaller than a main memory, and further comprising the step of: storing all of the data points in the candidate partitions in a main memory spatial index; and identifying the predetermined number of outliers of interest using an index-based algorithm which probes the main memory spatial index.
6. The method according to claim 1, wherein the candidate partitions are larger than a main memory, then the partitions are processed in batches such that the overlap between each one of the partitions in a batch is as large as possible so that as many points as possible are processed in each batch.
7. The method according to claim 6, wherein each one of the batches is comprised of a subset of the plurality of candidate partitions, the subset being smaller than the main memory.
8. The method according to claim 6, wherein the predetermined number of outliers of interest are selected from the batch processed candidate partitions, the predetermined number of outliers of interest being those data points residing the farthest from their at least one neighboring data point.
9. The method according to claim 8, wherein the step of identifying the predetermined number of outliers of interest uses an index-based algorithm.
10. The method according to claim 8, wherein the step of identifying the predetermined number of outliers of interest uses a block nested-loop algorithm.
11. The method according to claim 1, wherein: the partitioning step includes calculating a minimum bounding rectangle MBR for each one of the plurality of data points in the data set; and the computing step of the lower and upper bounds being computed for each one of the data points in the MBR.
12. The method according to claim 11, wherein the minimum distance between a point p and an MBR R is denoted by MINDIST(p, R) defined as $\mathrm{MINDIST}(p, R) = \sum_{i=1}^{\delta} x_i^2$, wherein [...] and wherein every point in MBR R is at a distance of at least MINDIST(p, R) from point p; the point p in δ-dimensional space is denoted by $[p_1, p_2, \ldots, p_\delta]$; and the MBR R is a δ-dimensional rectangle R denoted by the two endpoints of its major diagonal: $r = [r_1, r_2, \ldots, r_\delta]$ and $r' = [r'_1, r'_2,$
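
As a rough illustration of the MINDIST quantity in claim 12, the sketch below reads the claim's formula as a sum of squared per-dimension terms. The definition of each term does not survive in the text above, so the code assumes the conventional one: the gap between the point and the rectangle along that dimension, or zero when the coordinate lies within the rectangle's extent. The function name `mindist_sq` and its parameter names are hypothetical, and the result is a squared distance because the sum is not square-rooted.

```python
from typing import Sequence


def mindist_sq(p: Sequence[float], r_lo: Sequence[float], r_hi: Sequence[float]) -> float:
    """Squared minimum distance from point p to the rectangle spanned by r_lo and r_hi.

    Assumption (not stated in the surviving claim text): the per-dimension
    term is r_lo[i] - p[i] when p[i] is below the rectangle, p[i] - r_hi[i]
    when it is above, and 0 otherwise.
    """
    if not (len(p) == len(r_lo) == len(r_hi)):
        raise ValueError("point and rectangle corners must share one dimensionality")
    total = 0.0
    for p_i, lo, hi in zip(p, r_lo, r_hi):
        if p_i < lo:
            x_i = lo - p_i   # point lies below the rectangle in this dimension
        elif p_i > hi:
            x_i = p_i - hi   # point lies above the rectangle in this dimension
        else:
            x_i = 0.0        # coordinate falls inside the rectangle's extent
        total += x_i * x_i
    return total


if __name__ == "__main__":
    # Point at the origin, rectangle with corners (1, 1) and (2, 2).
    print(mindist_sq([0.0, 0.0], [1.0, 1.0], [2.0, 2.0]))
```

Under this reading, every point inside the rectangle is at least this (squared) distance from p, which is what lets a partition's bounding rectangle stand in for its points when bounds are computed.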