[Patent] Method and system of ranking and clustering for document indexing and retrieval

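IPC classification information
Country / Type	United States (US) Patent, Granted
International Patent Classification (IPC 7th edition)	G06F-007/00
Application number	US-0724170 (2003-12-01)
Registration number	US-7496561 (2009-02-24)
Inventor / Address	Caudill, Maureen; Tseng, Jason Chun Ming; Wang, Lei
Applicant / Address	Science Applications International Corporation
Agent / Address	Banner & Witcoff, Ltd.
Citation information	Times cited: 22; Patents cited: 62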

Abstract

A relevancy ranking and clustering method and system that determines the relevance of a document relative to a user's query using a similarity comparison process. Input queries are parsed into one or more query predicate structures using an ontological parser. The ontological parser parses a set of known documents to generate one or more document predicate structures. A comparison of each query predicate structure with each document predicate structure is performed to determine a matching degree, represented by a real number. A multilevel modifier strategy is implemented to assign different relevance values to the different parts of each predicate structure match to calculate the predicate structure's matching degree. The relevance of a document to a user's query is determined by calculating a similarity coefficient, based on the structures of each pair of query predicates and document predicates. Documents are autonomously clustered using a self-organizing neural network that provides a coordinate system that makes judgments in a non-subjective fashion.
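
The matching step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the modifier values or the exact form of the similarity coefficient, so everything specific below is an assumption made for illustration only: the weights `PREDICATE_WEIGHT` and `ARGUMENT_WEIGHT`, the representation of a predicate structure as a `(predicate, arguments)` tuple of strings, and the choice to average each query structure's best match.

```python
# Hypothetical modifier values; the abstract does not specify them.
PREDICATE_WEIGHT = 0.5  # relevance assigned to a matching predicate
ARGUMENT_WEIGHT = 0.5   # relevance shared across the query's arguments


def matching_degree(query, doc):
    """Return a real-valued matching degree for one query/document pair.

    Each structure is a (predicate, arguments) tuple. Different parts of
    the match contribute different relevance values, which is the idea
    behind a multilevel modifier strategy (weights here are assumed).
    """
    q_pred, q_args = query
    d_pred, d_args = doc
    score = PREDICATE_WEIGHT if q_pred == d_pred else 0.0
    if q_args:
        # Compare arguments position by position.
        matched = sum(1 for a, b in zip(q_args, d_args) if a == b)
        score += ARGUMENT_WEIGHT * matched / len(q_args)
    return score


def similarity_coefficient(query_structs, doc_structs):
    """Assumed aggregate: mean of each query structure's best match."""
    if not query_structs or not doc_structs:
        return 0.0
    best = [
        max(matching_degree(q, d) for d in doc_structs)
        for q in query_structs
    ]
    return sum(best) / len(best)
```

With these assumed weights, a document structure that shares the predicate and one of two arguments with the query scores higher than one that shares both arguments but not the predicate. Documents would then be ranked by their similarity coefficient.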

Representative claims

What is claimed is:
1. One or more computer readable media storing computer executable instructions to perform a method for vectorizing a set of document predicate structures, the method comprising: identifying at least one predicate and argument in said set of document predicate structures by a predicate key that is an integer representation; estimating conceptual nearness of two of said document predicate structures in said set of document predicate structures by subtracting corresponding ones of said predicate keys; and outputting at least one document based upon the estimated conceptual nearness.
2. The computer readable media of claim 1, the method further comprising constructing multi-dimensional vectors using said integer representation.
3. The computer readable media of claim 2, the method further comprising normalizing said multi-dimensional vectors.
4. The computer readable media of claim 3, the method further comprising identifying at least one query predicate structure by a second predicate key that is a second integer representation, and constructing second multi-dimensional vectors, for said at least one query predicate structure, using said second integer representation.
5. The computer readable media of claim 1, the method further comprising identifying at least one query predicate structure by a second predicate key that is a second integer representation, and constructing second multi-dimensional vectors, for said at least one query predicate structure, using said second integer representation.
6. The computer readable media of claim 1, wherein said set of document predicate structures are representations of logical relationships between words in a sentence.
7. The computer readable media of claim 1, wherein each of said document predicate structures in said set includes a predicate and a set of arguments, wherein the predicate is one of a verb and a preposition.
8. One or more computer readable media storing computer executable instructions to perform a method for vectorizing a set of document predicate structures, the method comprising: identifying at least one predicate in said set of document predicate structures by a predicate key that is an integer representation; estimating conceptual nearness of two of said document predicate structures in said set of document predicate structures by subtracting corresponding ones of said predicate keys; and outputting at least one document based upon the estimated conceptual nearness.
9. The computer readable media of claim 8, the method further comprising constructing multi-dimensional vectors using said integer representation.
10. The computer readable media of claim 9, the method further comprising normalizing said multi-dimensional vectors.
11. The computer readable media of claim 10, the method further comprising identifying at least one query predicate structure by a second predicate key that is a second integer representation, and constructing second multi-dimensional vectors, for said at least one query predicate structure, using said second integer representation.
12. The computer readable media of claim 8, the method further comprising identifying at least one query predicate structure by a second predicate key that is a second integer representation, and constructing second multi-dimensional vectors, for said at least one query predicate structure, using said second integer representation.
13. The computer readable media of claim 8, wherein said set of document predicate structures are representations of logical relationships between words in a sentence.
14. One or more computer readable media storing computer executable instructions to perform a method for constructing multi-dimensional vector representations for each document of a set of documents, the method comprising: determining each predicate structure of one or more predicate structures M in each document of the set of documents, said M predicate structures including a predicate and at least one argument; identifying the predicate and the at least one argument in each of said M predicate structures by a predicate key that is an integer representation; determining a fixed number of arguments q for vector construction; constructing an N-dimensional vector representation of each document based upon the predicate and q arguments; and outputting at least one document of the set of documents based upon the constructed N-dimensional vector representation of the at least one document, wherein any predicate structure of said M predicate structures that includes less than q arguments fills unfilled argument positions with a numerical zero.
15. The computer readable media of claim 14, wherein any predicate structure of said M predicate structures that includes more than q arguments omits remaining arguments after q argument positions are filled.
16. The computer readable media of claim 15, wherein conceptual nearness of two of said N-dimensional vector representations is estimated by subtracting corresponding ones of said predicate keys.
17. The computer readable media of claim 15, the method further comprising normalizing said N-dimensional vector representations.
18. The computer readable media of claim 14, wherein conceptual nearness of two of said N-dimensional vector representations is estimated by subtracting corresponding ones of said predicate keys.
19. The computer readable media of claim 14, the method further comprising normalizing said N-dimensional vector representations.
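
The vector construction recited in claims 14, 15, and 17, and the subtraction of keys recited in claim 1, can be sketched as follows. This is an illustrative reading, not the patented implementation. The function names, the sample integer keys, the slot layout (one predicate slot followed by q argument slots per structure, concatenated across structures), and the use of Euclidean normalization are all assumptions; the claims do not fix them.

```python
from math import sqrt


def structure_to_slots(predicate_key, argument_keys, q):
    """Lay out one predicate structure as [predicate, arg_1, ..., arg_q].

    Fewer than q arguments: unfilled positions become 0 (claim 14).
    More than q arguments: the extras are omitted (claim 15).
    """
    args = list(argument_keys[:q])
    args += [0] * (q - len(args))
    return [predicate_key] + args


def document_vector(structures, q):
    """Concatenate the slots of every structure in a document.

    Assumed layout: a document with M structures yields M * (q + 1)
    dimensions. The claims only say the vector is N-dimensional.
    """
    vec = []
    for predicate_key, argument_keys in structures:
        vec.extend(structure_to_slots(predicate_key, argument_keys, q))
    return vec


def normalize(vec):
    """Scale to unit Euclidean length (the choice of norm is assumed)."""
    norm = sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def key_differences(slots_a, slots_b):
    """Subtract corresponding keys; small magnitudes suggest nearness."""
    return [a - b for a, b in zip(slots_a, slots_b)]
```

For example, with hypothetical keys, `structure_to_slots(120, [45], 3)` pads to `[120, 45, 0, 0]`, and `structure_to_slots(120, [45, 46, 47, 48], 3)` truncates to `[120, 45, 46, 47]`. Subtracting keys is only meaningful if nearby integers were assigned to conceptually related terms, which is the role the abstract gives to the ontological parser.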

Patents cited by this patent (62)

Chang, Shih-Chio; Chow, Anita; Du, Min-Wen, Adaptive ranking system for information retrieval.
Braden-Harder Lisa ; Corston Simon H. ; Dolan William B. ; Vanderwende Lucy H., Apparatus and methods for an information retrieval system that employs natural language processing of search results to.
Asija Satya P. (1641 Cumberland St. ; Apt. 21 St. Paul MN 55117), Automated information input, storage, and retrieval system.
Hemphill Charles T. (Coppell TX) Picone Joseph W. (Plano TX), Chart parser for stochastic unification grammar.
Jensen Karen (Rockville MD), Computer method for identifying predicate-argument structures in natural language text.
Pant Sangam ; Andre David L. ; Watson Gray ; Green Richard M. ; Schiegg Michael J., Computer system with user-controlled relevance ranking of search results.
Wical Kelly, Concept knowledge base search and retrieval system.
Turtle Howard R. (Woodbury MN), Concept matching of natural language queries with a database of document concepts.
Wical Kelly (Redwood Shores CA), Content processing system for discourse.
Wachtel Thomas Juliusz,GBX, Control signal processing method and apparatus having natural language interfacing capabilities.
Maeda Akira,JPX ; Ashida Hitoshi,JPX ; Taniguchi Yoji,JPX ; Ito Yukiyasu,JPX ; Takahashi Yori,JPX, Data analyzing method and system.
Mangat Satwinder S. ; Taylor Wayne ; Mahlum Steven, Document link management using directory services.
Ogawa Yasushi (Yokohama JPX), Document retrieval system involving ranking of documents in accordance with a degree to which the documents fulfill a re.
Takahashi Kousuke,JPX ; Thornber Karvel K., Document retrieval using fuzzy-logic inference.
Bennett James D. ; Jarvis Lawrence M., Down-line transcription system having context sensitive searching capability.
Wical Kelly, Information presentation in a knowledge base search and retrieval system.
Messerly John J. ; Heidorn George E. ; Richardson Stephen D. ; Dolan William B. ; Jensen Karen, Information retrieval utilizing semantic representation of text.
Hazlehurst Brian L. ; Burke Scott M. ; Nybakken Kristopher E., Intelligent query system for automatically indexing information in a database and automatically categorizing users.
Ausborn Carolyn (1904 Bluebird Ave. Huntsville AL 35816), Method and apparatus for abstracting concepts from natural language.
Anglea Billy W. (Round Rock TX) Cox Robert Charles (Round Rock TX), Method and apparatus for character preprocessing which translates textual description into numeric form for input to a n.
Katz Boris (24A Garden St. Cambridge MA 02138) Winston Patrick H. (258 Sudbury Rd. Concord MA 01742), Method and apparatus for generating and utlizing annotations to facilitate computer text retrieval.
White Brian F. (Yorktown NY) Bretan Ivan P. (Lidingo SEX) Sanamrad Mohammad A. (Lidingo SEX), Method and apparatus for paraphrasing information contained in logical forms.
Lewis David Dolan (Summit NJ), Method and apparatus for training a text classifier.
Katz Boris (24A Garden St. Cambridge MA 02138) Winston Patrick H. (88 Monument St. Concord MA 01742), Method and apparatus for utilizing annotations to facilitate computer retrieval of database material.
Stuckey Barbara K., Method and device for parsing and analyzing natural language sentences and text.
Brash Douglas E., Method and device for parsing natural language sentences and other sequential symbolic expressions.
Gallant Stephen I. (49 Fenno St. Cambridge MA 02138), Method for document retrieval and for word sense disambiguation using neural networks.
Black ; Jr. James E. (Schenectady NY) Zernik Uri (Schenectady NY), Method for natural language data processing using morphological and part-of-speech information.
Burrows Michael, Method for parsing, indexing and searching world-wide-web pages.
Agrawal Rakesh ; Chakrabarti Soumen ; Dom Byron Edward ; Raghavan Prabhakar, Multilevel taxonomy based on features derived from training documents classification using fisher values as discrimination values.
Liddy Elizabeth D. ; Paik Woojin ; Yu Edmund S. ; Li Ming, Multilingual document retrieval system and method using semantic vector matching.
Hedin Erik B. (Lidingo SEX) Jonsson Gregor I. (Lidingo SEX) Olsson Lars E. (Kista SEX) Sanamrad Mohammad A. (Lidingo SEX) Westling Sven O. G. (Stockholm SEX), Natural language analyzing apparatus and method.
Liddy Elizabeth D. ; Paik Woojin ; McKenna Mary E. ; Li Ming, Natural language information retrieval system and method.
Tokuume Yoshihiro (Machida JPX) Shibata Shogo (Tokyo JPX) Masegi Koichi (Machida JPX), Natural language processing system.
Liddy Elizabeth D. ; Paik Woojin ; Yu Edmund Szu-li, Natural language processing system for semantic vector representation which accounts for lexical ambiguity.
Dahlgren Kathleen ; Stabler Edward, Natural language understanding system.
Loatman Robert B. (Vienna VA) Post Stephen D. (McLean VA) Yang Chih-King (Rockville MD) Hermansen John C. (Catharpin VA), Natural language understanding system.
Ruocco Anthony S. ; Frieder Ophir, Parallel document clustering process.
Zamora Antonio (Chevy Chase MD) Gunther Michael D. (Gaithersburg MD) Zamora Elena M. (Chevy Chase MD), Parser for natural language text.
Ting Hian Ann,SGX, Parsing and translating natural language sentences automatically.
Nagase Tomoki (Kawasaki JPX), Parsing system.
Driscoll Jim (Orlando FL), Process for determination of text relevancy.
Driscoll Jim (Orlando FL), Process for determination of text relevancy.
Adar Eytan ; Charity Mitchell N., Randomized query generation and document relevance ranking for robust information retrieval from a database.
Kirsch Steven T. ; Chang William I. ; Miller Ed R., Real-time document collection search engine with phrase indexing.
Driscoll James R., Relevancy ranking using statistical ranking, semantics, relevancy feedback and small pieces of text.
Wical Kelly, Research mode for a knowledge base search and retrieval system.
Schultz John Michael, Restricted expansion of query terms using part of speech tagging.
De Bellis, Joseph L., Search-on-the-fly/sort-on-the-fly search engine for searching databases.
Parry Michael H. ; Aspnes Richard K., Self-organizing neural network for plain text categorization.
Kucera Henry (Providence RI) Carus Alwin B. (Newton MA), Sentence analyzer.
Mozer Forrest S. ; Mozer Michael C. ; Mozer Todd F., Speech recognition apparatus for consumer electronic applications.
Mozer Forrest S. ; Mozer Michael C. ; Mozer Todd F., Speech recognition apparatus for consumer electronic applications.
Spencer Graham, System and method for accelerated query evaluation of very large full-text databases.
Rose Daniel E. ; Cutting Douglass R., System and method for improving the ranking of information retrieval results for short queries.
Caid William R. (San Diego CA) Oing Pu (La Costa CA), System and method of context vector generation and retrieval.
Monier Louis M., System for adding new entry to web page table upon receiving web page including link to another web page not having cor.
Kaplan Craig A. (Santa Cruz CA) Chen James R. (Saratoga CA) Fallside David C. (San Jose CA) Fenwick Justine R. (Santa Cruz CA) Forcier Mitchell D. (Walnut Creek CA) Wolff Gregory J. (Mountain View CA, System for adjusting hypertext links with weighed user goals and activities.
Schabes Yves (Luxembourg MA LUX) Waters Richard C. (Concord MA), System for decreasing the time required to parse a sentence.
Herz Frederick S. M. ; Eisner Jason M. ; Ungar Lyle H., System for generation of object profiles for a system for customized electronic identification of desirable objects.
Kanaegami Atsushi (Kamakura JPX) Koike Kazuhiro (Kamakura JPX) Taki Hirokazu (Kamakura JPX) Ohgashi Hitoshi (Kamakura JPX), Text search system for locating on the basis of keyword matching and keyword relationship matching.
Liddy Elizabeth D. ; Paik Woojin ; McKenna Mary E. ; Weiner Michael L. ; Yu Edmund S. ; Diamond Theodore G. ; Balakrishnan Bhaskaran ; Snyder David L., User interface and other enhancements for natural language information retrieval system and method.

Patents citing this patent (22)

Takaai, Motoyuki; Sayuda, Hiroyuki, Access right estimation apparatus and non-transitory computer readable medium.
De, Sushovan; Singh, Amit K.; Visweswariah, Karthik, Annotating entities using cross-document signals.
De, Sushovan; Singh, Amit K.; Visweswariah, Karthik, Annotating entities using cross-document signals.
Petriuc, Mihai, Click distance determination.
Rennison, Earl, Constructing a search query to execute a contextual personalized search of a knowledge base.
Connor, Robert A., Context-driven search.
Tankovich, Vladimir; Meyerzon, Dmitriy; Poznanski, Victor, Detection of junk in search result ranking.
Tankovich, Vladimir; Meyerzon, Dmitriy; Taylor, Michael James, Document length as a static relevance feature for ranking search results.
Meyerzon, Dmitriy; Shnitko, Yauhen; Burges, Chris J. C.; Taylor, Michael James, Enterprise relevancy ranking using a neural network.
Aoyama, Kazumi; Sabe, Kohtaro; Shimomura, Hideki, Identifying temporal sequences using a recurrent self organizing map.
Rennison, Earl, Learning based on feedback for contextual personalized information retrieval.
Rennison, Earl, Learning based on feedback for contextual personalized information retrieval.
Lorge, Freddy; Rojahn, Tom O., Ordering search-engine results.
Lorge, Freddy; Rojahn, Tom O., Ranking answers to a conceptual query.
Poznanski, Victor; Wang, Oivind; Holm, Fredrik; Bodd, Nicolai; Tankovich, Vladimir; Meyerzon, Dmitriy, Re-ranking search results.
Yamasaki, Tomohiro; Suzuki, Masaru, Relevancy presentation apparatus, method, and program.
Rennison, Earl, Scoring concepts for contextual personalized information retrieval.
Tankovich, Vladimir; Li, Hang; Meyerzon, Dmitriy; Xu, Jun, Search results ranking using editing distance and document information.
Benson, Gregory P., System and method for context-rich database optimized for processing of concepts.
Cormode, Graham; Korn, Philip Russell; Muthukrishnan, Shanmugavelayutham; Srivastava, Divesh, System and method for generating statistical descriptors for a data stream.
Merrigan, Chadd Creighton; Peltonen, Kyle G.; Meyerzon, Dmitriy; Lee, David J., System and method for scoping searches using index keys.
Rennison, Earl, Using inverted indexes for contextual personalized information retrieval.
