Method and system for fast, generic, online and offline, multi-source text analysis and visualization
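IPC Classification Information
Country / Type: United States (US) Patent, Granted
International Patent Classification (IPC, 7th edition): G06F-007/00
Application number: UP-0023693 (2008-01-31)
Registration number: US-7792816 (2010-09-27)
Inventors / Address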
Funes, Pablo
Popovici, Elena
Gaudiano, Paolo
Buchsbaum, Daphna
Garagic, Denis
Ecemis, M. Ihsan
Bingham, Chris
Bonabeau, Eric
Applicant / Address
Icosystem Corporation
Agent / Address
Foley Hoag LLP
Citation information
Times cited: 10
Patents cited: 119
Abstract
Methods and systems for text data analysis and visualization enable a user to specify a set of text data sources and visualize the content of the text data sources in an overview of salient features in the form of a network of words. A user may focus on one or more words to provide a visualization of connections specific to the focused word(s). The visualization may include clustering of relevant concepts within the network of words. Upon selection of a word, the context thereof, e.g., links to articles where the word appears, may be provided to the user. Analyzing may include textual statistical correlation models for assigning weights to words and links between words. Displaying the network of words may include a force-based network layout algorithm. Extracting clusters for display may include identifying “communities of words” as if the network of words was a social network.
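The abstract names a force-based network layout algorithm without specifying one. As an illustration only, the following is a minimal Python sketch of one common variant, in which every pair of words repels and linked words are pulled together by a spring whose stiffness is the link weight. The function name `force_layout`, its parameters, the force formulas, and the sample data are assumptions made for this sketch and are not taken from the patent.

```python
import math
import random


def force_layout(words, link_weight, iterations=200, step=0.05, max_move=0.1, seed=0):
    """Place words in 2D so that strongly linked words end up close together.

    Assumed model (not from the patent): repulsion 1/d between every pair,
    attraction lw * d along each link with weight lw.
    """
    rng = random.Random(seed)
    pos = {name: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for name in words}
    for _ in range(iterations):
        force = {name: [0.0, 0.0] for name in words}
        for i, a in enumerate(words):
            for b in words[i + 1:]:
                dx = pos[b][0] - pos[a][0]
                dy = pos[b][1] - pos[a][1]
                dist = math.hypot(dx, dy) or 1e-9
                lw = link_weight.get((a, b), link_weight.get((b, a), 0.0))
                # Positive magnitude pulls the pair together, negative pushes apart.
                magnitude = lw * dist - 1.0 / dist
                fx, fy = magnitude * dx / dist, magnitude * dy / dist
                force[a][0] += fx
                force[a][1] += fy
                force[b][0] -= fx
                force[b][1] -= fy
        for name in words:
            fx, fy = force[name]
            length = math.hypot(fx, fy)
            # Cap the displacement per iteration to keep the simulation stable.
            scale = step if step * length <= max_move else max_move / length
            pos[name][0] += scale * fx
            pos[name][1] += scale * fy
    return {name: tuple(p) for name, p in pos.items()}


# Hypothetical sample data: "weather" is only weakly linked to "ants".
sample_words = ["ants", "food", "trails", "weather"]
sample_links = {
    ("ants", "food"): 0.8,
    ("ants", "trails"): 0.8,
    ("food", "trails"): 0.5,
    ("ants", "weather"): 0.05,
}
for name, (x, y) in force_layout(sample_words, sample_links).items():
    print(f"{name}: ({x:.2f}, {y:.2f})")
```

In this sketch a larger link weight gives a shorter rest distance between two words, so weakly related words drift toward the periphery of the display.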
Representative claims
The invention claimed is:

1. In a computer system having at least one user interface including at least one output device and at least one input device, a method comprising:
a) receiving from a user through at least one input device an identification of at least one text source;
b) from each said identified text source, retrieving at least one text passage;
c) for each said retrieved text passage, parsing the said passage into words, identifying multi-word expressions in the said passage and applying a stemming algorithm to the said passage;
d) for each word from the said text passages, determining a number of times the said word appears in the said passages; and
e) causing to be displayed on an output device a predetermined number of words from the said text passages, wherein distances between the said predetermined number of words in a display on the said output device are determined at least in part by a word weight for each said displayed word and by a link weight for each pair of said displayed words, and wherein the word weight for each said displayed word is determined at least in part by a number of times the said word appears in the said passages; and wherein the link weight for each said pair of said displayed words is determined at least in part by the number of times each said word appears in the said passages and by a number of times the said word pair appears in a same window in the said passages; and wherein the method, further comprising receiving the said predetermined number from a user through at least one input device; and wherein the method further comprising
f) receiving from a user an instruction to delete at least one word from the said display; and
g) causing to be displayed on an output device the predetermined number of words from the said text passages without the at least one word which the said user instructed to be deleted; wherein distances between the said displayed words are determined at least in part by the word weight for each said displayed word and by the link weight for each pair of said displayed words, and wherein the word weight for each said displayed word is determined at least in part by a number of times the said word appears in the said passages; and wherein the link weight for each said pair of said displayed words is determined at least in part by the number of times each said word appears in the said passages and by a number of times the said word pair appears in a same window in the said passages.

2. In the computer system of claim 1, the method, further comprising, for each said retrieved text passage, removing at least one stop word from the said passage.

3. In the computer system of claim 1, the method, wherein the said same window is a sentence.

4. In the computer system of claim 1, the method, wherein the said same window is a paragraph.

5. A computer-readable medium having computer-readable instructions stored thereon which, as a result of being executed in a computer system having at least one user interface including at least one output device and at least one input device, instruct the computer system to perform a method, comprising:
a) receiving from a user through at least one input device an identification of at least one text source;
b) from each said identified text source, retrieving at least one text passage;
c) for each said retrieved text passage, parsing the said passage into words, identifying multi-word expressions in the said passage and applying a stemming algorithm to the said passage;
d) for each word from the said text passages, determining a number of times the said word appears in the said passages; and
e) causing to be displayed on an output device a predetermined number of words from the said text passages, wherein distances between the said predetermined number of words in a display on the said output device are determined at least in part by a word weight for each said displayed word and by a link weight for each pair of said displayed words, and wherein the word weight for each said displayed word is determined at least in part by a number of times the said word appears in the said passages; and wherein the link weight for each said pair of said displayed words is determined at least in part by the number of times each said word appears in the said passages and by a number of times the said word pair appears in a same window in the said passages; and wherein the said instructions instruct the said computer system to perform the said method, further comprising, receiving the said predetermined number from a user through at least one input device; and wherein the said instructions instruct the said computer system to perform the said method, further comprising,
f) receiving from a user an instruction to delete at least one word from the said display; and
g) causing to be displayed on an output device the predetermined number of words from the said text passages without the at least one word which the said user instructed to be deleted; wherein distances between the said displayed words are determined at least in part by the word weight for each said displayed word and by the link weight for each pair of said displayed words, and wherein the word weight for each said displayed word is determined at least in part by a number of times the said word appears in the said passages; and wherein the link weight for each said pair of said displayed words is determined at least in part by the number of times each said word appears in the said passages and by a number of times the said word pair appears in a same window in the said passages.

6. The computer-readable medium of claim 5, wherein the said instructions instruct the said computer system to perform the said method, further comprising, for each said retrieved text passage, removing at least one stop word from the said passage.

7. The computer-readable medium of claim 5, wherein the said same window is a sentence.

8. The computer-readable medium of claim 5, wherein the said same window is a paragraph.
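Steps c) through g) of the claims can be illustrated with a short Python sketch. The claims state only that the word weight depends on how often a word appears and that the link weight depends on the individual word counts and on how often the pair shares a window; they give no formula. The sketch below therefore assumes the raw count as the word weight and a Dice-style coefficient as the link weight, uses a sentence as the window (as in claims 3 and 7), and omits stemming and multi-word expression detection. All names (`analyze`, `STOP_WORDS`, `exclude`) are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Illustrative stop-word list only (claims 2 and 6 mention stop-word removal).
STOP_WORDS = {"the", "a", "an", "of", "and", "in", "to", "is"}


def sentences(text):
    """Split a passage into sentence windows on '.', '!' or '?'."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase, keep alphabetic tokens, drop stop words (no stemming here)."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOP_WORDS]


def analyze(passages, top_n, exclude=frozenset()):
    """Return (word_weight, link_weight) for the top_n most frequent words.

    `exclude` models steps f) and g): words the user asked to delete.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for passage in passages:
        for sent in sentences(passage):
            tokens = [t for t in tokenize(sent) if t not in exclude]
            word_counts.update(tokens)
            # Count each unordered pair once per window in which both words occur.
            for a, b in combinations(sorted(set(tokens)), 2):
                pair_counts[(a, b)] += 1
    top = [w for w, _ in word_counts.most_common(top_n)]
    top_set = set(top)
    word_weight = {w: word_counts[w] for w in top}
    # Assumed formula: Dice-style ratio of shared windows to individual counts.
    link_weight = {
        (a, b): 2 * c / (word_counts[a] + word_counts[b])
        for (a, b), c in pair_counts.items()
        if a in top_set and b in top_set
    }
    return word_weight, link_weight


passages = ["Ants find food. Ants follow trails.", "Food trails attract ants."]
weights, links = analyze(passages, top_n=3)
print(weights)
print(links)
```

Calling `analyze(passages, top_n=3, exclude={"ants"})` recomputes the display without the deleted word, so the next most frequent word takes its place among the displayed words.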
Patents cited by this patent (119)
Lyle, Ryan T.; Waugh, David C.; Dulaney, James W.; Andress, Jeffrey D., Adaptive index reference position qualification.
Paulo S. Tubel ; Lynn B. Hales ; Randy A. Ynchausti ; Donald G. Foot, Jr., Application of adaptive object-oriented optimization software to an automatic optimization oilfield hydrocarbon production management system.
Merat Francis L. (University Heights OH) Roumina Kavous (Westlake OH) Ruegsegger Steven M. (Centerville OH) Delvalle Robert B. (Cleveland Heights OH), Automated process planning for quality control inspection.
Datta,Deepshikha; Wang,Pin; Carrico,Isaac; Mayo,Stephen L.; Tirrell,David, Computational method for designing enzymes for incorporation of non natural amino acids into proteins.
Choi, Lawrence J.; Kuenne, Christopher B.; Holstein, II, Kurt E., Computer-assisted systems and methods for determining effectiveness of survey question.
Kaji, Hirotaka; Yamaguchi, Masashi; Harada, Hiroshi; Matsushita, Yukio, Control system of optimizing the function of machine assembly using GA-Fuzzy inference.
Jeffrey J. Garside ; Stephen Monfre ; Barry C. Elliott ; Timothy L. Ruchti ; Glenn Aaron Kees ; Frank S. Grochocki, Fiber optic illumination and detection patterns, shapes, and locations for use in spectroscopic analysis.
Shackleford J. Barry,JPX ; Okushi Etsuko,JPX ; Yasuda Mitsuhiro,JPX ; Iwamoto Takashi,JPX, Genetic algorithm machine and its production method, and method for executing a genetic algorithm.
Gounares Alexander G. ; Spady Stephen W., Method and apparatus for adaptively solving sequential problems in a target system utilizing evolutionary computation techniques.
McCann Paul H. ; Alose Gary L. ; Chavez Javier E. ; Dawson Scott M. ; Brayton Robert S. ; Hiles Paul E., Method and apparatus for an incremental editor technology.
Choi, Lawrence J.; Kuenne, Christopher B.; Holstein, II, Kurt E.; Cross, Henry Andrew; Tang, George; Bansal, Chetna; Whitney, Jason R.; Babbitt, Joshua D., Method and system for clustering optimization and applications.
Steitz, Thomas A.; Moore, Peter B.; Ippolito, Joseph A.; Ban, Nenad; Nissen, Poul; Hansen, Jeffrey L., Method of identifying molecules that bind to the large ribosomal subunit.
Bonabeau,Eric; Anderson,Carl; Scott,John M.; Budynek,Julien; Malinchik,Sergey, Methods and systems for applying genetic operators to determine system conditions.
Steitz, Thomas A.; Moore, Peter B.; Ban, Nenad; Nissen, Poul; Hansen, Jeffrey; Ippolito, Joseph A., Modulators of ribosomal function and identification thereof.
Koza John R. (25372 La Rena La. Los Altos Hills CA 94022), Non-linear genetic algorithms for solving problems by finding a fit composition of functions.
Koza John R. (25372 La Rena La. Los Altos Hills CA 94022) Rice James P. (Redwood City CA), Non-linear genetic process for use with plural co-evolving populations.
Linda Nolan Keyes ; Stephen Kohl Doberstein ; Andrew Roy Buchman ; Bindu Priya Reddy ; David Andrew Ruddy, Nucleic acids and proteins of D. melanogaster insulin-like genes and uses thereof.
Katsof Barry (176 Highfield Ave. Town of Mount Royal ; Quebec CAX H3P 1C8) Waxman Ronald G. (73 Manuel Drive Dollard-des-Ormeaux ; Quebec CAX) Matlin Joel (3922 Chesswood Drive Downsview ; Ontario CA, System and method for forecasting bank traffic and scheduling work assignments for bank personnel.
Swathibabu Gabbita ; Brandon Goldfedder ; Casey K. Hopson ; Robert E. Park ; Dennis Troup, System and method for managing the workflow for processing service orders among a variety of organizations within a telecommunications company.
Ulyanov, Sergei V.; Panfilov, Sergei; Takahashi, Kazuki, System and method for nonlinear dynamic control based on soft computing with discrete constraints.
Gabbita, Swathibabu; Goldfedder, Brandon; Hopson, Casey K.; Troup, Dennis; Park, Robert E., System and method for processing and tracking telecommunications service orders.
Elad Joseph B. (Claymont DE) Johnson Apperson H. (Wilmington DE) Kramer Laurence A. (North East MD) Kirk Jeffrey C. (Newtown Square PA) Philips Irene H. (New Castle DE) Zickus Susan M. (Wilmington DE, System and method for representing and solving numeric and symbolic problems.
Elad Joseph B. (Claymont DE) Johnson Apperson H. (Wilmington DE) Kramer Laurence A. (North East MD) Kirk Jeffrey C. (Newtown Square PA) Philips Irene H. (New Castle DE) Zickus Susan M. (Wilmington DE, System and method for representing and solving numeric and symbolic problems.
Subbu, Raj; Sanderson, Arthur; Graves, Robert, System and method for time-efficient distributed search and decision-making using cooperative co-evolutionary algorithms executing in a distributed multi-agent architecture.
Fertik, Michael Benjamin Selkowe; Scott, Tony; Dignan, Thomas, Identifying information related to a particular entity from electronic sources, using dimensional reduction and quantum clustering.
Fertik, Michael Benjamin Selkowe; Scott, Tony; Dignan, Thomas, Identifying information related to a particular entity from electronic sources, using dimensional reduction and quantum clustering.