System and method for the automatic mining of new relationships
IPC classification information
Country / Type
United States (US) Patent
Granted
International Patent Classification (IPC 7th edition)
G06F-017/30
G06F-007/00
Application number
US-0440626
(1999-11-15)
Inventor / Address
Sundaresan, Neelakantan
Yi, Jeonghee
Applicant / Address
International Business Machines Corporation
Agent / Address
Kassatly, Samuel A.
Citation information
Times cited: 96
Cited patents: 14
Abstract
An automatic mining system identifies, with a high degree of confidence, a set of relevant terms from a large text database of unstructured information, such as the World Wide Web. The automatic mining system includes a software program that enables the discovery of new relationships by association mining and refinement of co-occurrences: it automatically and iteratively recognizes new binary relations through phrases that embody related pairs, applies lexicographic and statistical techniques to classify the relations, and applies a minimal amount of domain knowledge about the relevance of the terms and relations. The automatic mining system includes a knowledge module and a statistics module. The knowledge module comprises a stemming unit, a synonym check unit, and a domain knowledge check unit. The stemming unit determines whether the relation being analyzed shares a common root with a previously mined relation. The synonym check unit identifies the synonyms of the relation, and the domain knowledge check unit considers extrinsic factors for indications that would further clarify the relationship being mined. The statistics module optimizes the confidence level in the relationship.
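The stemming check, the synonym check, and a confidence measure described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the record does not say which stemmer, thesaurus, or statistic the patent uses. The suffix-stripping `stem` function is a crude stand-in for a real stemmer, `SYNONYM_ROOTS` is a hypothetical two-entry thesaurus, and `confidence` is the standard association-rule confidence (the share of documents containing one term that also contain the other). The domain knowledge check is omitted because the abstract gives no detail about it.

```python
# Illustrative sketch only; names and rules below are assumptions,
# not the patented implementation.

# Suffixes tried in order by the naive stemmer (a stand-in for a real one).
SUFFIXES = ("ing", "ed", "es", "e", "s")

# Hypothetical thesaurus mapping a stemmed relation to a canonical root.
SYNONYM_ROOTS = {"purchas": "acquir", "buy": "acquir"}


def stem(word):
    """Strip one common English suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def canonical(relation):
    """Stem the relation, then map it through the synonym table."""
    root = stem(relation)
    return SYNONYM_ROOTS.get(root, root)


def already_mined(relation, mined_relations):
    """True if the relation shares a root or synonym with a mined one."""
    target = canonical(relation)
    return any(canonical(r) == target for r in mined_relations)


def confidence(documents, term_a, term_b):
    """Fraction of documents containing term_a that also contain term_b."""
    with_a = [doc for doc in documents if term_a in doc]
    if not with_a:
        return 0.0
    return sum(1 for doc in with_a if term_b in doc) / len(with_a)


# Each document is modeled as a set of terms (toy data).
docs = [
    {"ibm", "lotus"},
    {"ibm", "lotus", "notes"},
    {"ibm", "thinkpad"},
    {"lotus"},
]
print(already_mined("purchased", ["acquire"]))
print(confidence(docs, "ibm", "lotus"))
```

In this toy setup, "purchased" and "acquire" reduce to the same canonical root, so the new relation would be treated as a variant of one already mined rather than as a new discovery.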
Representative claim
... retrieved record.
2. A system of contextual searching, comprising: a search engine for searching a set of documents according to a first criterion, to obtain a first set of results; a category lookup engine for defining a subset of the set of documents according to a parameter, to obtain a second set of results; and an intersection engine for defining a third set of results comprising an intersection of the first set of results and the second set of results, wherein the intersection engine comprises a document identifier for, for each document in the first set of results, identifying whether the document exists in the second set of results.
3. A system of contextual searching, comprising: a search engine for searching a set of documents according to a first criterion, to obtain a first set of results; a category lookup engine for defining a subset of the set of documents according to a parameter, to obtain a second set of results; and an intersection engine for defining a third set of results comprising an intersection of the first set of results and the second set of results, the intersection engine comprising a document identifier for, for each document in the first set of results, identifying whether the document exists in the second set of results, the document identifier comprising: a primary hash function application module, for applying a primary hash function to an identifier for the document to obtain a primary hash key for the identifier; a hash bucket identification module, coupled to the primary hash function application module, for identifying a hash bucket having a primary hash key corresponding to the obtained primary hash key, the hash bucket comprising at least one hash entry, each hash entry comprising a secondary hash key and a pointer to a record location in the second set of results; a secondary hash function application module, for applying a secondary hash function to the identifier to obtain a secondary hash key for the identifier; a comparator, coupled to the secondary hash function application module and to the hash bucket identification module, for comparing the secondary hash key for the identifier with the secondary hash key for at least one hash entry in the identified hash bucket; a retrieval module, coupled to the comparison means, for, responsive to the comparison module indicating at least one match, retrieving a record in the second set of results having a location corresponding to the value in the matching hash entry; and a second comparator, coupled to the retrieval module, for comparing the identifier with the retrieved record.
4. A system of contextual searching, comprising: search means, for searching a set of documents according to a first criterion to obtain a first set of results; subset definition means, coupled to the search means, for defining a subset of the set of documents according to a parameter to obtain a second set of results; and result definition means, coupled to the search means and to the subset definition means, for defining a third set of results comprising an intersection of the first set of results and the second set of results.
5. The system of claim 4, wherein: the search means comprises means for performing a text search on the set of documents; and the subset definition means comprises means for defining the subset of the set of documents according to a specified category.
6. The system of claim 4, wherein the result definition means comprises: identifying means for, for each document in the first set of results, identifying whether the document exists in the second set of results.
7. A system of contextual searching, comprising: search means, for searching a set of documents according to a first criterion to obtain a first set of results; subset definition means, coupled to the search means, for defining a subset of the set of documents according to a parameter to obtain a second set of results; and result definitio
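The two-level hash lookup recited in claim 3 can be sketched roughly as follows. This is a hedged illustration, not the claimed implementation: the bucket count `NUM_BUCKETS`, the choice of `zlib.crc32` and `zlib.adler32` as the primary and secondary hash functions, and all function names are assumptions. The sketch keeps the claimed sequence: the primary hash selects a bucket, each bucket entry holds a secondary hash key and a pointer to a record location, and a final comparison of the identifier against the retrieved record guards against hash collisions.

```python
import zlib

NUM_BUCKETS = 8  # assumed bucket count for the primary hash


def primary_hash(doc_id):
    """Primary hash key: selects the bucket (crc32 is an assumed choice)."""
    return zlib.crc32(doc_id.encode("utf-8")) % NUM_BUCKETS


def secondary_hash(doc_id):
    """Secondary hash key stored in each entry (adler32 is an assumed choice)."""
    return zlib.adler32(doc_id.encode("utf-8"))


def build_index(records):
    """Map each primary key to a list of (secondary key, record location)."""
    buckets = {}
    for location, doc_id in enumerate(records):
        entry = (secondary_hash(doc_id), location)
        buckets.setdefault(primary_hash(doc_id), []).append(entry)
    return buckets


def contains(doc_id, buckets, records):
    """Check whether doc_id exists in records using the two-level lookup."""
    key = secondary_hash(doc_id)
    for entry_key, location in buckets.get(primary_hash(doc_id), []):
        # First comparison: secondary keys. Second comparison: the
        # identifier against the record retrieved via the pointer.
        if entry_key == key and records[location] == doc_id:
            return True
    return False


def intersect(first_results, second_results):
    """Third result set: documents present in both result sets."""
    buckets = build_index(second_results)
    return [d for d in first_results if contains(d, buckets, second_results)]


first = ["doc1", "doc2", "doc3", "doc4"]   # e.g. text-search hits
second = ["doc4", "doc2", "doc9"]          # e.g. documents in a category
print(intersect(first, second))  # prints ['doc2', 'doc4']
```

Because the last step compares the full identifier with the retrieved record, the result is correct even when two identifiers collide on both hash keys; the hashes only narrow down which records need to be fetched.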
Patents cited by this patent (14)
Marques Joaquin M., Category processing of query topics and electronic document content topics.
Vaithyanathan Shivakumar ; Adler Mark R. ; Hill Christopher G., Computer method and apparatus for clustering documents and automatic generation of cluster keywords.
Paik Woojin ; Liddy Elizabeth D. ; Liddy Jennifer Heverin ; Niles Ian Harcourt ; Allen Eileen E., Information extraction system and method using concept relation concept (CRC) triples.
Paik Woojin ; Liddy Elizabeth D. ; Liddy Jennifer Heverin ; Niles Ian Harcourt ; Allen Eileen E., Information extraction system and method using concept-relation-concept (CRC) triples.
Snow William A. ; Mocker Joseph D., Method and apparatus for classifying documents within a class hierarchy creating term vector, term file and relevance ranking.
Frauenhofer Thomas Valentine ; Marques Joaquin Manuel ; Moran Michael Edward ; Palchowdhury Subhas ; Schaffer Jeffrey Stephen, Method and system for providing access for categorized information from online internet and intranet sources.
Hedin Erik B. (Lidingo SEX) ; Jonsson Gregor I. (Lidingo SEX) ; Olsson Lars E. (Kista SEX) ; Sanamrad Mohammad A. (Lidingo SEX) ; Westling Sven O. G. (Stockholm SEX), Natural language analyzing apparatus and method.
Ferrari, Adam J.; Gourley, David J.; Johnson, Keith A.; Knabe, Frederick C.; Mohta, Vinay B.; Tunkelang, Daniel; Walter, John S., Hierarchical data-driven search and navigation system and method for information retrieval.
Ferrari, Adam J.; Lau, Andrew M.; Mohta, Vinay B.; Tunkelang, Daniel; Walter, John S., Integrated application for manipulating content in a hierarchical data-driven search and navigation system.
Jung, Edward K. Y.; Levien, Royce A.; Lord, Robert W.; Malamud, Mark A.; Mangione-Smith, William Henry; Rinaldo, Jr., John D., Layering destination-dependent content handling guidance.
Dengler, Patrick M.; Krishnan, Arvind K.; Singh, Jagdish; Sanchez, Lawrence M.; Shankar, Sai; Chittamuru, Satish Kumar; Pekic, Zoltan; Mondal, Nabarun; Kumar, Namendra; i Dalfó, Ricard Roma, Metadata driven user interface.
Zelevinsky, Vladimir V.; Tunkelang, Daniel; Knabe, Frederick C.; Saji, Michael Y.; Tzanov, Velin Krassimirov, Method and system for information retrieval with clustering.
Sweeney, Peter; Good, Robert; Barlow-Busch, Robert; Black, Alexander David, Method, system, and computer program for user-driven dynamic generation of semantic networks and media synthesis.
Mohan, Rengaswamy; Mohan, Usha; Sha, David D., Method, system, and computer program product for concept-based multi-dimensional analysis of unstructured information.
Mohan, Rengaswamy; Mohan, Usha; Sha, David, Method, system, and computer program product for concept-based multi-dimensional analysis of unstructured information.
Sweeney, Peter Joseph; Ilyas, Ihab Francis; Dupuis, Jean-Paul; Yampolska, Nadiya, Methods and apparatus for providing information of interest to one or more users.
Sweeney, Peter Joseph; Ilyas, Ihab Francis; Dupuis, Jean-Paul; Yampolska, Nadiya, Methods and apparatus for providing information of interest to one or more users.
Sweeney, Peter Joseph; Ilyas, Ihab Francis; Dupuis, Jean-Paul; Yampolska, Nadiya, Methods and apparatus for providing information of interest to one or more users.
Gluzman Peregrine, Vladimir; Rosen, Alexander D.; Scarlet, Benjamin S.; Volpe, Andrew, System and method for filtering rules for manipulating search results in a hierarchical search and navigation system.
Ferrari, Adam J.; Knabe, Frederick C.; Mohta, Vinay Seth; Myatt, Jason Paul; Scarlet, Benjamin S.; Tunkelang, Daniel; Walter, John S.; Wang, Joyce; Tucker, Michael, System and method for information retrieval from object collections with complex interrelationships.
Ferrari, Adam J.; Gourley, David J.; Johnson, Keith A.; Knabe, Frederick C.; Mohta, Vinay B.; Tunkelang, Daniel; Walter, John S.; Lau, Andrew, System and method for manipulating content in a hierarchical data-driven search and navigation system.
Menditto, Louis F.; Housel, Barron C.; Tsang, Tzu Ming; Zallocco, Mauro; Shah, Gaurang K.; Vilhuber, Jan; Bhargava, Anurag; Tiwari, Pranav K.; Batz, Robert M.; Brim, Scott W., System and method for processing a request for information in a network.
Sweeney, Peter Joseph; Goodwin, David; Janik-Jones, David, System, method and computer program for creating and manipulating data structures using an interactive graphical interface.
Sweeney, Peter Joseph; Janik-Jones, David; Goodwin, David, System, method and computer program for creating and manipulating data structures using an interactive graphical interface.
Carlson, Robert John; Clarkson, Charles Andrew; Elfayoumy, Sherif A.; Mohan, Rengaswamy; Mohan, Usha; Mukundan, Sripriya, System, method and computer program product for concept-based searching and analysis.
Hunt, Anne Jude; Black, Alexander David; Sweeney, Peter Joseph; Ilyas, Ihab Francis, Systems and methods for analyzing and synthesizing complex knowledge representations.
Hunt, Anne Jude; Black, Alexander David; Sweeney, Peter Joseph; Ilyas, Ihab Francis, Systems and methods for analyzing and synthesizing complex knowledge representations.
Sweeney, Peter; Black, Alexander David, Systems and methods for semantic concept definition and semantic concept relationship synthesis utilizing existing domain definitions.
Sweeney, Peter; Black, Alexander David, Systems and methods for semantic concept definition and semantic concept relationship synthesis utilizing existing domain definitions.
Sweeney, Peter; Black, Alexander David, Systems and methods for semantic concept definition and semantic concept relationship synthesis utilizing existing domain definitions.
Hernandez-Sherrington, Mauricio Antonio; Ho, Ching-Tien; Roth, Mary Ann; Yan, Lingling, Tolerant and extensible discovery of relationships in data using structural information and data analysis.
Jung, Edward K. Y.; Levien, Royce A.; Lord, Robert W.; Malamud, Mark A.; Mangione-Smith, William Henry; Rinaldo, Jr., John D., Using evaluations of tentative message content.