IPC classification information
Country / Type |
United States(US) Patent
Granted
|
International Patent Classification (IPC 7th edition) |
|
Application number |
UP-0716505
(2007-03-08)
|
Registration number |
US-7797310
(2010-10-04)
|
Inventor / Address |
- Idicula, Sam
- Murthy, Ravi
- Chandrasekar, Sivasankaran
- Agarwal, Nipun
|
Applicant / Address |
- Oracle International Corporation
|
Agent / Address |
Hickman Palermo Truong & Becker LLP
|
Citation information |
Times cited: 1
Patents cited: 201 |
Abstract
A method and apparatus for estimating the cost of streaming evaluation of XPaths is provided. Aggregate statistics are maintained by the database server upon initiation of a database function by the database administrator about the nodes of the XML document. Based upon these statistics and the complexity of the particular XPath query, an estimate of the cost of the query, in time and computing resources required, is computed.
Representative claim
What is claimed is:

1. A method to estimate a cost for computing a query on XML documents stored in a database, the method comprising the steps of: maintaining a plurality of statistics about nodes in said XML documents; based upon said plurality of statistics, estimating a cost for computing at least one path expression in said query on said XML documents, said cost comprising an estimated CPU cost and an estimated I/O cost; wherein the cost of computing the at least one path expression is determined based on a mathematical function of the estimated CPU cost and the estimated I/O cost; wherein computing said at least one path expression is performed using streaming evaluation; wherein estimating a cost for computing a path expression of the at least one path expression includes: estimating an input-size of said XML documents, said input-size being based on units of bytes; based on a portion of said plurality of statistics about said nodes, estimating an output-size associated with said path expression; wherein the steps are performed by one or more computing devices.

2. The method of claim 1 wherein said statistics are maintained upon receipt of a command to gather statistics for the database system.

3. The method of claim 1, wherein said statistics are stored in an XML structural summary of said XML documents, with annotations that contain statistics about each node in said XML structural summary.

4. The method of claim 1 wherein the cost of computing the query is the weighted sum of an estimated CPU cost and an estimated I/O cost.

5. The method of claim 4 wherein said estimated CPU cost is computed with an input size of data to be queried, a size of the output from the query and a plurality of factors specific to an implementation of the database system.

6. The method of claim 5, wherein said input size of data to be queried consists of a size of the XML document to be queried or an output size of an evaluated query containing an XPath expression.

7. The method of claim 5 wherein CPU cost is a sum of a first product and a second product, wherein the first product is a product of a factor specific to the database system implementation and the input size of data to be queried; and the second product is a product of a second factor specific to the database system implementation and the size of output from the query.

8. The method of claim 5, wherein CPU cost in a query with multiple XPath expressions is the sum of a first product, a second product, a third product, and an Nth product, wherein the first product is a product of a first factor specific to the database system implementation and the input size of data to be queried; the second product is a product of a second factor specific to the database system implementation and the size of output from the first query; the third product is a product of a third factor specific to the database system implementation and the size of output from the second query; and the Nth product is a product of an Nth factor specific to the database system implementation and the size of output from the (N-1) query.

9. The method of claim 4 wherein the I/O cost is determined by computing an input size of data to be queried divided by a size of the data block used by the database system to read and write data.

10. The method of claim 1, wherein the XML documents are stored in binary form in the database.

11. The method of claim 1, wherein the XML documents are stored in text form in the database.

12. The method of claim 1, wherein the XML documents are stored in object relational form in the database.

13. The method of claim 4 wherein Total cost is a sum of a first product and a second product, wherein the first product is a product of a factor specific to the database system implementation and the estimated CPU cost; and the second product is a product of a second factor specific to the database system implementation and the estimated I/O cost.

14. A computer-readable volatile or non-volatile storage medium storing one or more sequences of instructions which, when executed by one or more processors, causes the one or more processors to perform: maintaining a plurality of statistics about nodes in said XML documents; based upon said plurality of statistics, estimating a cost for computing at least one path expression in said query on said XML documents, said cost comprising an estimated CPU cost and an estimated I/O cost; wherein the cost of computing the at least one path expression is determined based on a mathematical function of the estimated CPU cost and the estimated I/O cost; wherein computing said at least one path expression is performed using streaming evaluation; wherein estimating a cost for computing a path expression of the at least one path expression includes: estimating an input-size of said XML documents, said input-size being based on units of bytes; based on a portion of said plurality of statistics about said nodes, estimating an output-size associated with said path expression.

15. The computer-readable volatile or non-volatile storage medium of claim 14, wherein said statistics are maintained upon receipt of a command to gather statistics for the database system.

16. The computer-readable volatile or non-volatile storage medium of claim 14, wherein said statistics are stored in an XML structural summary of said XML documents, with annotations that contain statistics about each node in said XML structural summary.

17. The computer-readable volatile or non-volatile storage medium of claim 14, wherein the cost of computing the path expression is the weighted sum of the estimated CPU cost and the estimated I/O cost.

18. The computer-readable volatile or non-volatile storage medium of claim 17, wherein said estimated CPU cost is computed with an input size of data to be queried, a size of the output from the query and a plurality of factors specific to an implementation of the database system.

19. The computer-readable volatile or non-volatile storage medium of claim 18, wherein said input size of data to be queried consists of a size of the XML document to be queried or an output size of an evaluated query containing an XPath expression.

20. The computer-readable volatile or non-volatile storage medium of claim 18, wherein CPU cost is a sum of a first product and a second product, wherein the first product is a product of a factor specific to the database system implementation and the input size of data to be queried; and the second product is a product of a second factor specific to the database system implementation and the size of output from the query.

21. The computer-readable volatile or non-volatile storage medium of claim 18, wherein CPU cost in a query with multiple XPath expressions is the sum of a first product, a second product, a third product, and an Nth product, wherein the first product is a product of a first factor specific to the database system implementation and the input size of data to be queried; the second product is a product of a second factor specific to the database system implementation and the size of output from the first query; the third product is a product of a third factor specific to the database system implementation and the size of output from the second query; and the Nth product is a product of an Nth factor specific to the database system implementation and the size of output from the (N-1) query.

22. The computer-readable volatile or non-volatile storage medium of claim 17, wherein the I/O cost is determined by computing an input size of data to be queried divided by a size of the data block used by the database system to read and write data.

23. The computer-readable volatile or non-volatile storage medium of claim 14, wherein the XML documents are stored in binary form in the database.

24. The computer-readable volatile or non-volatile storage medium of claim 14, wherein the XML documents are stored in text form in the database.

25. The computer-readable volatile or non-volatile storage medium of claim 14, wherein the XML documents are stored in object relational form in the database.

26. The computer-readable volatile or non-volatile storage medium of claim 17, wherein Total cost is a sum of a first product and a second product, wherein the first product is a product of a factor specific to the database system implementation and the estimated CPU cost; and the second product is a product of a second factor specific to the database system implementation and the estimated I/O cost.
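
The arithmetic recited in claims 7, 8, 9 and 13 can be illustrated with a minimal Python sketch. This is not code from the patent or from any Oracle product: the class name `CostFactors`, the function names, and every numeric factor below are hypothetical placeholders chosen only to make the formulas concrete. The claims say only that the factors are "specific to the database system implementation" and give no values.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CostFactors:
    """Hypothetical implementation-specific factors (placeholder values)."""
    cpu_per_input_byte: float = 1.0   # first factor in claim 7
    cpu_per_output_byte: float = 2.0  # second factor in claim 7
    cpu_weight: float = 1.0           # first factor in claim 13
    io_weight: float = 10.0           # second factor in claim 13
    block_size_bytes: int = 8192      # data block size used in claim 9


def cpu_cost(input_bytes: float, output_bytes: float, f: CostFactors) -> float:
    """Claim 7: factor * input size + second factor * output size."""
    return f.cpu_per_input_byte * input_bytes + f.cpu_per_output_byte * output_bytes


def chained_cpu_cost(input_bytes: float,
                     output_sizes: Sequence[float],
                     factors: Sequence[float]) -> float:
    """Claim 8: the first product uses the input size; the k-th product
    (k >= 2) uses the output size of the (k-1)-th path expression."""
    if len(factors) != len(output_sizes) + 1:
        raise ValueError("need exactly one more factor than output sizes")
    sizes = [input_bytes, *output_sizes]
    return sum(factor * size for factor, size in zip(factors, sizes))


def io_cost(input_bytes: float, f: CostFactors) -> float:
    """Claim 9: input size divided by the data block size."""
    return input_bytes / f.block_size_bytes


def total_cost(input_bytes: float, output_bytes: float, f: CostFactors) -> float:
    """Claim 13: weighted sum of the estimated CPU cost and I/O cost."""
    return (f.cpu_weight * cpu_cost(input_bytes, output_bytes, f)
            + f.io_weight * io_cost(input_bytes, f))


if __name__ == "__main__":
    factors = CostFactors()
    # An 80 KiB document whose path expression is estimated to emit 4 KiB.
    print(total_cost(81920, 4096, factors))
```

In this sketch the input size is supplied directly in bytes. Per claims 1 and 3, the output size would instead be estimated from the maintained node statistics, a step the sketch leaves out because the claims do not spell out how it is computed.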