IPC classification information
Country / Type |
United States(US) Patent
Granted
|
International Patent Classification (IPC, 7th edition) |
|
Application number |
US-0138068
(2002-05-01)
|
Registration number |
US-7392199
(2008-06-24)
|
Inventors / Address |
- Karlov,Valeri I.
- Kasten,Bernard
- Padilla,Carlos E.
- Maggio,Edward T.
- Billingsley,Frank
|
Applicants / Address |
- Quest Diagnostics Investments Incorporated
- Perseus Soros Pharmaceutical Fund, LLP
|
Agent / Address |
|
Citation information |
Times cited: 31
Cited patents: 7
|
Abstract
A system and method of diagnosing diseases from biological data is disclosed. A system for automated disease diagnostics prediction can be generated using a database of clinical test data. The diagnostics prediction can also be used to develop screening tests to screen for one or more inapparent diseases. The prediction method can be implemented with Bayesian probability estimation techniques. The system and method permit clinical test data to be analyzed and mined for improved disease diagnosis.
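The Bayesian step named in the abstract can be illustrated with a minimal sketch. It assumes one-dimensional Gaussian densities as a stand-in for the hypothesis-conditional pdfs p(x|Hk); the patent describes its own estimators, so the function names `gaussian_pdf` and `posteriors`, the priors, and the test value below are all hypothetical and chosen only for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution (illustrative stand-in for p(x|Hk))."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posteriors(likelihoods, priors):
    """Bayes' rule: p(Hk|x) = p(x|Hk) p(Hk) / sum_j p(x|Hj) p(Hj)."""
    joint = [lik * prior for lik, prior in zip(likelihoods, priors)]
    evidence = sum(joint)
    if evidence == 0.0:
        raise ValueError("x has zero density under every hypothesis")
    return [j / evidence for j in joint]


# Hypothetical numbers: H1 = condition present, H2 = condition absent.
priors = [0.1, 0.9]
x = 7.0  # a single made-up test value
likelihoods = [gaussian_pdf(x, 8.0, 1.0), gaussian_pdf(x, 5.0, 1.0)]
post = posteriors(likelihoods, priors)
```

The posteriors always sum to one, and in this made-up case the test value raises the probability of H1 above its prior because x lies closer to the assumed H1 mean.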
Representative claims
What is claimed is:
1. A method of processing test data, comprising: determining an estimate for one or more hypothesis-conditional probability density functions p(x|Hk) for a set X of the test data conditioned on a set H of hypotheses relating to the test data; determining a set of prior probability density functions p(Hk) for each hypothesis of the set H; determining a set of posterior test-conditional probability density functions p(Hk|x) for the hypotheses conditioned on a new data x; and outputting a diagnostic result based on the set of posterior test-conditional probability density functions; wherein the p(x|Hi) estimates include a global estimate produced in accordance with uncertainties due to finite samples in the statistical characteristics of the test data relating to each hypothesis-conditional pdf p(x|Hk).
2. A method as defined in claim 1, wherein the uncertainties in the statistical characteristics are specified as an ellipsoid about the test data for each hypothesis and each ellipsoid is defined by an m-dimensional ellipsoid $E_{q,k}$ for each hypothesis $H_k$ and is specified by: where the m×1 vector x is the argument in the space of test data, the m×1 vector $m_{x,k}$ is the mean (center) of each ellipsoid, the m×m matrix $P_{x,k}$ is a covariance matrix of the ellipsoid, and the scalar $\mu_{q,k}^2$ defines the size of the q-th ellipsoid, such that the global estimate of the hypothesis-conditional pdf is specified by: $\hat{P}_{glob}(x|H_k) = \alpha_{q,k}$ if $x \in E_{q,k} \cap E_{q-1,k}$ $(E_{0,k} = E_{1,k})$, $k = 1, \ldots, N$, for a selected confidence interval parameter $\alpha_{q,k}$.
3. A method as defined in claim 1, wherein the hypothesis-conditional p(x|Hk) estimates further include a local estimate produced in accordance with a discrete neighbor counting process for a test data relative to the global estimate for the corresponding hypothesis-conditional pdf.
4. A method as defined in claim 3, wherein the local estimate for a hypothesis is specified as a probability that an observed vector of tests x and an associated discrete neighbor counting pattern {Cl,k(x)}, l=1, . . . ,Lk, k=1, . . . ,N might actually be observed, wherein the neighbor counting pattern comprises counting neighbors in the distance layers for each class: {Cl,k}, l=1, . . . ,Lk, wherein the integer Cl,k is the number of neighbors associated with the k-th hypothesis whose test values are distanced from a next test value within the l-th globally-transformed distance layer for the k-th class: where nk is the total number of data records in a selected k-th class and the index i runs over all these data records.
5. A method as defined in claim 4, wherein the selected k-th class of the test data corresponds to a selected training subset class of the test data.
6. A method as defined in claim 1, further including: performing a training mode in which a training subset class of the test data is used to produce the hypothesis-conditional probability density functions p(x|Hk); and performing a prediction mode in which a set of posterior probabilities is determined for the set H of hypotheses, wherein the hypothesis-conditional probability density functions p(x|Hk) are produced from the global estimates and from local estimates produced in accordance with a discrete neighbor counting process for a test data relative to the global estimate for the corresponding hypothesis-conditional pdf.
7. A method as defined in claim 6, wherein the local estimate for a hypothesis is specified as a probability that an observed vector of tests x and an associated discrete neighbor counting pattern {Cl,k(x)}, l=1, . . . ,Lk, k=1, . . . ,N might actually be observed, wherein the neighbor counting pattern comprises counting neighbors in the distance layers for each class: {Cl,k}, l=1, . . . ,Lk, wherein the integer Cl,k is the number of test elements associated with the k-th hypothesis whose test values are distanced from a next test value within the l-th globally-transformed distance layer for the k-th class: where nk is the total number of data records in a selected k-th class and the index i runs over all these data records.
8. A method as defined in claim 7, wherein the selected k-th class of the test data corresponds to the training subset class of the test data.
9. The method of claim 6, wherein the data comprises biochemical data from a subject.
10. The method of claim 6, wherein the data comprises medical history data from a subject.
11. The method of claim 6, wherein the data comprises physiological data from a subject.
12. The method of claim 6, wherein the data comprises clinical data from a subject.
13. The method of claim 1, wherein the posterior test-condition probabilities provide a diagnosis or risk of developing a disease or diseases.
14. The method of claim 1, wherein the data comprises biochemical data from a subject.
15. The method of claim 1, wherein the data comprises medical history data from a subject.
16. The method of claim 1, wherein the data comprises physiological data from a subject.
17. The method of claim 1, wherein the data comprises clinical data from a subject.
18. The method of claim 1, wherein the diseases are selected from the group consisting of cardiovascular diseases, diabetes, neurodegenerative diseases, malignancies, ophthalmic diseases, blood diseases, respiratory diseases, endocrine diseases, bacterial, parasitic, fungal or viral infections, inflammatory diseases, autoimmune diseases, reproductive diseases.
19. A method for generating an a posteriori tree of possible diagnoses for a subject, the method comprising: performing an analysis of test data for a population of individuals to whom a set of tests were administered comprising a matrix of pair-wise discriminations between diagnoses from a predetermined list of diagnoses; performing a Bayesian statistical analysis to estimate a series of hypothesis-conditional probability density functions p(x|Hi) where a hypothesis Hi is one of a set H of the possible diagnoses; determining a prior probability density function p(Hi) for each of the disease hypotheses Hi; determining a posterior test-conditional probability density function p(Hi|x) for each of the hypotheses Hi test data records; and generating a posterior tree of possible diagnoses for a test subject in accordance with test results for the test subject; and outputting the posterior tree for the test subject.
20. A method of diagnosing a disease condition of a patient, the method comprising: receiving a set of population test data comprising test results for one or more patient tests performed on a population X of individuals; estimating a hypothesis-conditional probability density function p(x|H1) where the hypothesis H1 relates to a diagnosis condition for a test patient x, and estimating a hypothesis-conditional probability density function p(x|H2) where the hypothesis H2 relates to a non-diagnosis condition for a test patient; determining a prior probability density function p(H) for the each of the hypotheses H1 and H2; determining a posterior test-conditional probability density function p(H|x) for each of the hypotheses H1 and H2 on the test data x; and providing a diagnosis probability of a new patient for the H disease condition, based on the determined posterior test-conditional probability density function p(H1|x) as compared to the posterior test-conditional probability density function p(H2|x) and one or more test results of the new patient; and outputting the diagnosis probability of the new patient.
21. The method of claim 20, wherein the data comprises biochemical data from a subject.
22. The method of claim 20, wherein the data comprises medical history data from a subject.
23. The method of claim 20, wherein the data comprises physiological data from a subject.
24. The method of claim 20, wherein the data comprises clinical data from a subject.
25. The method of claim 20, wherein the diseases are selected from the group consisting of cardiovascular diseases, diabetes, neurodegenerative diseases, malignancies, ophthalmic diseases, blood diseases, respiratory diseases, endocrine diseases, bacterial, parasitic, fungal or viral infections, inflammatory diseases, autoimmune diseases, reproductive diseases.
26. The method of claim 20, wherein the diseases are selected from the group consisting of cancers.
27. A program product for use in a computer that executes program steps recorded in a computer-readable media to perform a method of processing test data, the program product comprising: a recordable media; a plurality of computer-readable instructions executable by the computer to perform a method comprising: determining an estimate for one or more hypothesis-conditional probability density functions p(x|Hk) for a set X of the test data conditioned on a set H of hypotheses relating to the test data; determining a set of prior probability density functions p(Hk) for each hypothesis of the set H; and outputting a diagnostic result based on the set of posterior test-conditional probability density functions determining a set of posterior test-conditional probability density functions p(Hk|x) for the hypotheses conditioned on a new data x; wherein the p(x|Hi) estimates include a global estimate produced in accordance with uncertainties due to finite samples in the statistical characteristics of the test data relating to each hypothesis-conditional pdf p(x|Hk).
28. A program product as defined in claim 27, wherein the uncertainties in the statistical characteristics are specified as an ellipsoid about the test data for each hypothesis and each ellipsoid is defined by an m-dimensional ellipsoid $E_{q,k}$ for each hypothesis $H_k$ and is specified by: where the m×1 vector x is the argument in the space of test data, the m×1 vector $m_{x,k}$ is the mean (center) of each ellipsoid, the m×m matrix $P_{x,k}$ is a covariance matrix of the ellipsoid, and the scalar $\mu_{q,k}^2$ defines the size of the q-th ellipsoid, such that the global estimate of the hypothesis-conditional pdf is specified by: $\hat{P}_{glob}(x|H_k) = \alpha_{q,k}$ if $x \in E_{q,k} \cap E_{q-1,k}$ $(E_{0,k} = E_{1,k})$, $k = 1, \ldots, N$, for a selected confidence interval parameter $\alpha_{q,k}$.
29. A program product as defined in claim 27, wherein the hypothesis-conditional p(x|Hk) estimates further include a local estimate produced in accordance with a discrete neighbor counting process for a test data relative to the global estimate for the corresponding hypothesis-conditional pdf.
30. A program product as defined in claim 29, wherein the local estimate for a hypothesis is specified as a probability that an observed vector of tests x and an associated discrete neighbor counting pattern {Cl,k(x)}, l=1, . . . ,Lk, k=1, . . . ,N might actually be observed, wherein the neighbor counting pattern comprises counting neighbors in the distance layers for each class: {Cl,k}, l=1, . . . ,Lk, wherein the integer Cl,k is the number of neighbors associated with the k-th hypothesis whose test values are distanced from a next test value within the l-th globally-transformed distance layer for the k-th class: where nk is the total number of data records in a selected k-th class and the index i runs over all these data records.
31. A program product as defined in claim 30, wherein the selected k-th class of the test data corresponds to a selected training subset class of the test data.
32. A program product as defined in claim 27, further including: performing a training mode in which a training subset class of the test data is used to produce the hypothesis-conditional probability density functions p(x|Hk); and performing a prediction mode in which a set of posterior probabilities is determined for the set H of hypotheses, wherein the hypothesis-conditional probability density functions p(x|Hk) are produced from the global estimates and from local estimates produced in accordance with a discrete neighbor counting process for a test data relative to the global estimate for the corresponding hypothesis-conditional pdf.
33. A program product as defined in claim 32, wherein the local estimate for a hypothesis is specified as a probability that an observed vector of tests x and an associated discrete neighbor counting pattern {Cl,k(x)}, l=1, . . . ,Lk, k=1, . . . ,N might actually be observed, wherein the neighbor counting pattern comprises counting neighbors in the distance layers for each class: {Cl,k}, l=1, . . . ,Lk, wherein the integer Cl,k is the number of test elements associated with the k-th hypothesis whose test values are distanced from a next test value within the l-th globally-transformed distance layer for the k-th class: where nk is the total number of data records in a selected k-th class and the index i runs over all these data records.
34. A program product as defined in claim 33, wherein the selected k-th class of the test data corresponds to the training subset class of the test data.
35. The program product of claim 32, wherein the data comprise biochemical data from a subject.
36. The program product of claim 32, wherein the data comprise medical history data from a subject.
37. The program product of claim 32, wherein the data comprise physiological data from a subject.
38. The program product of claim 32, wherein the data comprise clinical data from a subject.
39. The program product of claim 27, wherein the diseases are selected from the group consisting of cardiovascular diseases, diabetes, neurodegenerative diseases, malignancies, ophthalmic diseases, blood diseases, respiratory diseases, endocrine diseases, bacterial, parasitic, fungal or viral infections, inflammatory diseases, autoimmune diseases, and reproductive diseases.
40. The program product of claim 27, wherein the posterior test-condition probabilities provide a diagnosis or risk of developing a disease or diseases.
41. The program product of claim 27, wherein the data comprise biochemical data from a subject.
42. The program product of claim 27, wherein the data comprise medical history data from a subject.
43. The program product of claim 27, wherein the data comprise physiological data from a subject.
44. The program product of claim 27, wherein the data comprise clinical data from a subject.
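The piecewise-constant global estimate over nested ellipsoids described in claims 2 and 28 can be sketched as follows. The equation defining each ellipsoid is not reproduced in this record, so the sketch assumes the standard quadratic form (x - m)^T P^{-1} (x - m) <= mu^2 built from the mean, covariance matrix, and size scalar that the claims name. It also assumes the layer rule means "return the alpha of the innermost ellipsoid containing x". The function names, the 2-D restriction, and all numbers are hypothetical.

```python
def mahalanobis_sq(x, mean, cov):
    """Squared distance (x - m)^T P^{-1} (x - m) for 2-D data."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0.0:
        # Weak sanity check only; a full positive-definiteness test is omitted.
        raise ValueError("covariance must be positive definite")
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Closed-form 2x2 inverse: P^{-1} = [[d, -b], [-c, a]] / det.
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def global_estimate(x, mean, cov, sizes_sq, alphas):
    """Return alpha_q for the innermost ellipsoid layer containing x, else 0.0.

    sizes_sq must be sorted in increasing order (nested ellipsoids).
    """
    d2 = mahalanobis_sq(x, mean, cov)
    for mu_sq, alpha in zip(sizes_sq, alphas):
        if d2 <= mu_sq:
            return alpha
    return 0.0


# Made-up parameters for one hypothesis: three nested ellipsoids.
mean = (0.0, 0.0)
cov = ((1.0, 0.0), (0.0, 1.0))
sizes_sq = [1.0, 4.0, 9.0]
alphas = [0.3, 0.1, 0.02]
inner = global_estimate((0.5, 0.5), mean, cov, sizes_sq, alphas)
middle = global_estimate((1.5, 0.0), mean, cov, sizes_sq, alphas)
outside = global_estimate((5.0, 0.0), mean, cov, sizes_sq, alphas)
```

With these parameters the three sample points fall in the first layer, the second layer, and outside every ellipsoid, respectively. Returning 0.0 outside the largest ellipsoid is a choice made for this sketch, not something the claims specify.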