IPC Classification Information
Country / Type | United States (US) Patent, Registered
International Patent Classification (IPC, 7th edition) |
Application No. | US-0429370 (2009-04-24)
Registration No. | US-8250078 (2012-08-21)
Inventor / Address |
Applicant / Address | Lexisnexis Risk & Information Analytics Group Inc.
Agent / Address |
Citation information | Times cited: 4; Patents cited: 80
Abstract
Disclosed is a system for, and method of, calculating parameters used to determine whether records and entity representations should be linked. The system and method take into consideration interdependent fields, e.g., fields whose constituent field values may be positively or negatively correlated. The system and method apply iterative techniques such that parameters from each linking iteration are used in the next linking iteration. The system and method need no human interaction in order to calibrate and utilize record matching formulas used for the linking decisions.
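The point about interdependent fields can be illustrated with a small sketch. It assumes, for illustration only, that a field value weight is the negative base-2 logarithm of the value's relative frequency; the record layout, the field names `first` and `gender`, and the helper `value_weights` are hypothetical and do not come from the patent. When two fields are positively correlated, adding their separate weights overstates how much evidence an agreement carries, whereas a weight computed on the combined value does not.

```python
import math
from collections import Counter

# Toy records; field names and values are invented for illustration.
records = [
    {"first": "Mary", "gender": "F"},
    {"first": "Mary", "gender": "F"},
    {"first": "Mary", "gender": "F"},
    {"first": "John", "gender": "M"},
    {"first": "John", "gender": "M"},
    {"first": "John", "gender": "M"},
    {"first": "Pat", "gender": "F"},
    {"first": "Pat", "gender": "M"},
]

def value_weights(values):
    """Map each value to -log2 of its relative frequency (an assumed weight form)."""
    counts = Counter(values)
    total = len(values)
    return {v: -math.log2(c / total) for v, c in counts.items()}

first_w = value_weights([r["first"] for r in records])
gender_w = value_weights([r["gender"] for r in records])
# Weight of the combined value, treating the two fields as one.
joint_w = value_weights([(r["first"], r["gender"]) for r in records])

# Treating the fields as independent adds the two weights.
independent = first_w["Mary"] + gender_w["F"]
joint = joint_w[("Mary", "F")]

print(f"independent: {independent:.3f}  joint: {joint:.3f}")
```

In this toy data every "Mary" is recorded as "F", so agreeing on gender adds no information beyond agreeing on the first name, and the joint weight equals the first-name weight alone.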
Representative Claim
1. A computer implemented iterative process for generating entity representations in a computer implemented database using a record matching formula and for generating parameters for the record matching formula, the database comprising a plurality of records, each record comprising a plurality of fields, each field capable of containing a field value, wherein at least a portion of the parameters for the record matching formula are specific to a particular plurality of field values associated with a particular plurality of fields, the process comprising:
adding, in the database, a supplemental field to each of the plurality of records;
populating each supplemental field of each one of the plurality of records with a supplemental field value, each supplemental field value representative of multiple field values from the particular plurality of fields of that record;
calculating a plurality of supplemental field value weights, each supplemental field value weight associated with a supplemental field value, each supplemental field value weight reflecting a likelihood that an arbitrary record in the database comprises an associated supplemental field value;
forming a plurality of entity representations in the database, at least one entity representation comprising at least two records linked using a first instance of the record matching formula comprising a supplemental field value weight associated with a field value appearing in the supplemental field of at least one of the at least two records;
calculating a plurality of revised supplemental field value weights, each revised supplemental field value weight associated with a particular supplemental field value, each revised supplemental field value weight reflecting a likelihood that an arbitrary entity representation in the database comprises an associated supplemental field value;
linking at least two entity representations in the database based on a second instance of the record matching formula, wherein the second instance of the record matching formula comprises a revised supplemental field value weight associated with a field value appearing in the supplemental field of at least one of the at least two entity representations, whereby a number of entity representations in the database is reduced by the forming a plurality of linked entity representations; and
retrieving information from at least one record in the database.

2. The process of claim 1, further comprising repeating the calculating a plurality of revised supplemental field value weights and the linking at least two entity representations at least once prior to the retrieving.

3. The process of claim 1, wherein the record matching formula comprises a weighted sum of probabilities that two records match.

4. The process of claim 1, wherein each supplemental field value weight comprises a logarithm of a probability and wherein each revised supplemental field value weight comprises a logarithm of a probability.

5. A computer implemented iterative process for generating entity representations in a computer implemented database using a record matching formula and for generating parameters for the record matching formula, the database comprising a plurality of records, each record comprising a plurality of fields, each field capable of containing a field value, wherein at least a portion of the parameters for the record matching formula are specific to a particular plurality of fields, the process comprising:
adding, in the database, a supplemental field to each of the plurality of records;
populating each supplemental field of each one of the plurality of records with a supplemental field value, each supplemental field value representative of multiple field values from the particular plurality of fields of that record;
calculating a plurality of supplemental field value weights, each supplemental field value weight associated with a supplemental field value, each supplemental field value weight reflecting a likelihood that an arbitrary record in the database comprises an associated supplemental field value;
calculating a supplemental field weight, the supplemental field weight derived from each of the plurality of supplemental field value weights;
forming a plurality of entity representations in the database, at least one entity representation comprising at least two records linked using a first instance of the record matching formula comprising the supplemental field weight;
calculating a plurality of revised supplemental field value weights, each revised supplemental field value weight associated with a particular supplemental field value, each revised supplemental field value weight reflecting a likelihood that an arbitrary entity representation in the database comprises an associated supplemental field value;
calculating a revised supplemental field weight, the revised supplemental field weight derived from each of the plurality of revised supplemental field value weights;
linking at least two entity representations in the database based on a second instance of the record matching formula, wherein the second instance of the record matching formula comprises the revised supplemental field weight, whereby a number of entity representations in the database is reduced by the forming a plurality of linked entity representations; and
retrieving information from at least one record in the database.

6. The process of claim 5, further comprising repeating the calculating a plurality of revised supplemental field value weights, the calculating a revised supplemental field weight, and the linking at least two entity representations at least once prior to the retrieving.

7. The process of claim 5, wherein the record matching formula comprises a weighted sum of probabilities that two records match.

8. The process of claim 5, wherein each supplemental field value weight comprises a logarithm of a probability and wherein each revised supplemental field value weight comprises a logarithm of a probability.

9. A computer system for iteratively generating entity representations in a computer implemented database using a record matching formula and for generating parameters for the record matching formula, the database comprising a plurality of records, each record comprising a plurality of fields, each field capable of containing a field value, wherein at least a portion of the parameters for the record matching formula are specific to a particular plurality of field values associated with a particular plurality of fields, the system comprising:
a database comprising a plurality of records, each record comprising a plurality of fields, each field capable of containing a field value;
a processor programmed to add, in the database, a supplemental field to each of the plurality of records;
a processor programmed to populate each supplemental field of each one of the plurality of records with a supplemental field value, each supplemental field value representative of multiple field values from the particular plurality of fields of that record;
a processor programmed to calculate a plurality of supplemental field value weights, each supplemental field value weight associated with a supplemental field value, each supplemental field value weight reflecting a likelihood that an arbitrary record in the database comprises an associated supplemental field value;
a processor programmed to form and store a plurality of entity representations in the database, at least one entity representation comprising at least two records linked using a first instance of the record matching formula comprising a supplemental field value weight associated with a field value appearing in the supplemental field of at least one of the at least two records;
a processor programmed to calculate a plurality of revised supplemental field value weights, each revised supplemental field value weight associated with a particular supplemental field value, each revised supplemental field value weight reflecting a likelihood that an arbitrary entity representation in the database comprises an associated supplemental field value; and
a processor programmed to link and store at least two entity representations in the database based on a second instance of the record matching formula, wherein the second instance of the record matching formula comprises a revised supplemental field value weight associated with a field value appearing in the supplemental field of at least one of the at least two entity representations, whereby a number of entity representations in the database is reduced by the forming a plurality of linked entity representations.

10. The system of claim 9, further comprising program logic configured to repeat calculating a plurality of revised supplemental field value weights and linking and storing at least two entity representations at least once prior to retrieving information from at least one record in the database.

11. The system of claim 9, wherein the record matching formula comprises a weighted sum of probabilities that two records match.

12. The system of claim 9, wherein each supplemental field value weight comprises a logarithm of a probability and wherein each revised supplemental field value weight comprises a logarithm of a probability.

13. A computer system for iteratively generating entity representations in a computer implemented database using a record matching formula and for generating parameters for the record matching formula, the database comprising a plurality of records, each record comprising a plurality of fields, each field capable of containing a field value, wherein at least a portion of the parameters for the record matching formula are specific to a particular plurality of fields, the system comprising:
a database comprising a plurality of records, each record comprising a plurality of fields, each field capable of containing a field value;
a processor programmed to add, in the database, a supplemental field to each of the plurality of records;
a processor programmed to populate each supplemental field of each one of the plurality of records with a supplemental field value, each supplemental field value representative of multiple field values from the particular plurality of fields of that record;
a processor programmed to calculate a plurality of supplemental field value weights, each supplemental field value weight associated with a supplemental field value, each supplemental field value weight reflecting a likelihood that an arbitrary record in the database comprises an associated supplemental field value;
a processor programmed to calculate a supplemental field weight, the supplemental field weight derived from each of the plurality of supplemental field value weights;
a processor programmed to form and store a plurality of entity representations in the database, at least one entity representation comprising at least two records linked using a first instance of the record matching formula comprising the supplemental field weight;
a processor programmed to calculate a plurality of revised supplemental field value weights, each revised supplemental field value weight associated with a particular supplemental field value, each revised supplemental field value weight reflecting a likelihood that an arbitrary entity representation in the database comprises an associated supplemental field value;
a processor programmed to calculate a revised supplemental field weight, the revised supplemental field weight derived from each of the plurality of revised supplemental field value weights; and
a processor programmed to link and store at least two entity representations in the database based on a second instance of the record matching formula, wherein the second instance of the record matching formula comprises the revised supplemental field weight, whereby a number of entity representations in the database is reduced by the forming a plurality of linked entity representations.

14. The system of claim 13, further comprising program logic configured to repeat calculating a plurality of revised supplemental field value weights, calculating a revised supplemental field weight, and linking and storing at least two entity representations at least once.

15. The system of claim 13, wherein the record matching formula comprises a weighted sum of probabilities that two records match.

16. The system of claim 13, wherein each supplemental field value weight comprises a logarithm of a probability and wherein each revised supplemental field value weight comprises a logarithm of a probability.
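
The sequence of steps recited in claim 1 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation. The `id` field, the `first` and `city` fields, the threshold of 1.5, the scoring rule (sum of the weights of shared values), the weight form (negative base-2 logarithm of the share of records or entities containing a value), and all function names are hypothetical choices made for the example.

```python
import math
from collections import Counter
from itertools import combinations

KEYS = ("id", "supp")  # fields used by the assumed matching formula

def make_groups(records, fields):
    """One group per record; 'supp' plays the role of the added supplemental field."""
    groups = []
    for i, r in enumerate(records):
        supp = "|".join(r[f] for f in fields)  # combined value of the chosen fields
        ids = {r["id"]} if r["id"] else set()  # an empty id never agrees with anything
        groups.append({"members": {i}, "id": ids, "supp": {supp}})
    return groups

def value_weights(groups, key):
    """-log2 of the share of groups (records or entities) containing each value."""
    counts = Counter(v for g in groups for v in g[key])
    return {v: -math.log2(c / len(groups)) for v, c in counts.items()}

def score(a, b, w):
    """Assumed matching formula: sum, over fields, of the best shared-value weight."""
    return sum(max((w[k][v] for v in a[k] & b[k]), default=0.0) for k in KEYS)

def link_pass(groups, threshold):
    """Recompute weights from the current groups, then merge pairs above threshold."""
    w = {k: value_weights(groups, k) for k in KEYS}
    parent = list(range(len(groups)))

    def find(i):
        while parent[i] != i:
            i = parent[i]
        return i

    for i, j in combinations(range(len(groups)), 2):
        if score(groups[i], groups[j], w) >= threshold:
            parent[find(j)] = find(i)

    merged = {}
    for i, g in enumerate(groups):
        m = merged.setdefault(find(i), {"members": set(), "id": set(), "supp": set()})
        for k in m:
            m[k] |= g[k]
    return list(merged.values())

# Invented toy data: record 3 lacks an id but shares first name and city with 0-2.
records = [
    {"id": "A1", "first": "Ann", "city": "Dayton"},
    {"id": "A1", "first": "Ann", "city": "Dayton"},
    {"id": "A1", "first": "Ann", "city": "Dayton"},
    {"id": "",   "first": "Ann", "city": "Dayton"},
    {"id": "B2", "first": "Bob", "city": "Akron"},
    {"id": "C3", "first": "Cy",  "city": "Miami"},
    {"id": "D4", "first": "Di",  "city": "Reno"},
    {"id": "E5", "first": "Ed",  "city": "Boise"},
]

groups = make_groups(records, ("first", "city"))
entities1 = link_pass(groups, threshold=1.5)     # weights from record frequencies
entities2 = link_pass(entities1, threshold=1.5)  # revised weights from entity frequencies
print(len(groups), len(entities1), len(entities2))
```

In this toy data the first pass links records 0 to 2 through the shared `id`, while record 3 stays separate because "Ann|Dayton" appears in half of all records and therefore carries little weight. Once weights are recomputed over entities, the same value appears in only two of six entities, its weight rises above the threshold, and the second pass merges record 3 into the first entity.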