[Patent] Document processing apparatus, document processing method, document processing program and recording medium

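IPC classification information
Country / Type	United States (US) Patent, Granted
International Patent Classification (IPC 7th edition)	G06F-017/30
Application number	US-0143279 (2002-05-10)
Priority information	JP-P2001-140778 (2001-05-10)
Inventor / Address	Kobayashi, Kenichiro; Akabane, Makoto; Nitta, Tomoaki; Yamazaki, Nobuhide; Kobayashi, Erika
Applicant / Address	Sony Corporation
Agent / Address	Wolf, Greenfield &
Citation information	Times cited: 31; Patents cited: 9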

Abstract

The text format of input data is checked, and is converted into a system-manipulated format. It is further determined if the input data is in an HTML or e-mail format using tags, heading information, and the like. The converted data is divided into blocks in a simple manner such that elements in the blocks can be checked based on repetition of predetermined character patterns. Each block section is tagged with a tag indicating a block. The data divided into blocks is parsed based on tags, character patterns, etc., and is structured. A table in text is also parsed, and is segmented into cells. Finally, tree-structured data having a hierarchical structure is generated based on the sentence-structured data. A sentence-extraction template paired with the tree-structured data is used to extract sentences.
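The block-division step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the separator pattern (blank lines, or lines made of one repeated punctuation character), the function names `divide_into_blocks` and `tag_blocks`, and the `<block>` tag name are all assumptions chosen for the example.

```python
import re

# Hypothetical separator: a blank line, or a line consisting of one
# punctuation character repeated three or more times (e.g. "-----").
SEPARATOR = re.compile(r"^\s*$|^\s*([-=*_#])\1{2,}\s*$")


def divide_into_blocks(text):
    """Split text into blocks (lists of lines) at separator lines."""
    blocks, current = [], []
    for line in text.splitlines():
        if SEPARATOR.match(line):
            # A separator closes the block collected so far, if any.
            if current:
                blocks.append(current)
                current = []
        else:
            current.append(line)
    if current:
        blocks.append(current)
    return blocks


def tag_blocks(blocks):
    """Wrap each block in an assumed <block> tag marking the section."""
    return ["<block>\n" + "\n".join(b) + "\n</block>" for b in blocks]


sample = "Title\n=====\nline a\nline b\n\nline c"
tagged = tag_blocks(divide_into_blocks(sample))
```

Here `sample` yields three blocks: the title, the two-line paragraph, and the final line. A later structuring pass would then parse each tagged block separately.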

Representative claims

What is claimed is:

1. A document processing apparatus comprising: block dividing means for dividing input document data into blocks in a predetermined manner according to a structure of the document data; document structuring means for structuring the document data, thereby generating structured data, by parsing a block into which the document data is divided by said block dividing means according to the document structure of the block, and by adding tag information to text data constituting the block, said tag information indicating an attribute of the text data; and sentence extraction means for controlling an extraction of the text data according to the structured data and a predetermined condition, wherein the predetermined condition provides an indication of a method to be utilized to perform the extraction of the text data; wherein said document structuring means includes regular-expression determining means which refers to pattern information containing a two-dimensional regular expression for a two-dimensional character string and tag information associated with the regular expression, and which adds the tag information associated with the regular expression in pattern information to a character string in the block that matches the regular expression in the pattern information, before a sentence is extracted; and wherein the two-dimensional regular expression is expressed by two regular expressions, one regular expression indicating a head of a block using a one-dimensional regular expression, and another regular expression indicating a tail of the block using a one-dimensional regular expression, and by a number of lines which is permitted between the two regular expressions.

2. A document processing apparatus according to claim 1, further comprising regular-expression registering means for registering an arbitrary character string as the pattern information containing a two-dimensional regular expression and tag information associated with the two-dimensional regular expression which is used by said regular-expression determining means.

3. A document processing apparatus according to claim 1, wherein the two-dimensional regular expression is expressed by two regular expressions, one regular expression indicating a head of a block using a one-dimensional regular expression, and another regular expression indicating a tail of the block using a one-dimensional regular expression.

4. A document processing apparatus comprising: block dividing means for dividing input document data into blocks in a predetermined manner according to a structure of the document data; document structuring means for structuring the document data, thereby generating structured data, by parsing a block into which the document data is divided by said block dividing means according to the document structure of the block, and by adding tag information to text data constituting the block, said tag information indicating an attribute of the text data; and sentence extraction means for controlling an extraction of the text data according to the structured data and a predetermined condition, wherein the predetermined condition provides an indication of a method to be utilized to perform the extraction of the text data; wherein said document structuring means includes regular-expression determining means which refers to pattern information containing a two-dimensional regular expression for a two-dimensional character string and tag information associated with the regular expression, and which adds the tag information associated with the regular expression in pattern information to a character string in the block that matches the regular expression in the pattern information, before a sentence is extracted; and, wherein said sentence extraction means expresses the document data, which is structured according to the tag information generated by said document structuring means, as tree-structured data, and includes a sentence-extraction template which is paired with the tree-structured data in which each node is associated with an extraction control flag, and if the extraction control flag prohibits extraction of the text data, said sentence extraction means does not extract the text data flagged with the extraction control flag; template registering means for allowing a user to register an extraction control using the extraction control flag in the sentence-extraction template; and template search means for searching the sentence-extraction template registered by said template registering means using a fuzzy search by which a template that does not completely match a search condition meets the search condition, wherein the sentence-extraction template searched by said template search means is adapted to the tree-structured data.
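One way to read the two-dimensional regular expression of claim 1 is as a head pattern, a tail pattern, and a limit on the number of lines between them. The sketch below follows that reading. The names `Pattern2D` and `tag_matches`, the angle-bracket tag syntax, and the assumption that head and tail fall on different lines are illustrative choices, not details taken from the patent.

```python
import re
from dataclasses import dataclass


@dataclass
class Pattern2D:
    head: str         # one-dimensional regex for the first line of a match
    tail: str         # one-dimensional regex for the last line of a match
    max_between: int  # number of lines permitted between head and tail
    tag: str          # tag information associated with the pattern


def tag_matches(lines, pattern):
    """Return a copy of lines with each matching span wrapped in tags."""
    head = re.compile(pattern.head)
    tail = re.compile(pattern.tail)
    out, i = [], 0
    while i < len(lines):
        end = None
        if head.search(lines[i]):
            # Search for the tail on later lines, within the allowed gap.
            last = min(len(lines) - 1, i + 1 + pattern.max_between)
            for j in range(i + 1, last + 1):
                if tail.search(lines[j]):
                    end = j
                    break
        if end is None:
            out.append(lines[i])
            i += 1
        else:
            out.append(f"<{pattern.tag}>")
            out.extend(lines[i:end + 1])
            out.append(f"</{pattern.tag}>")
            i = end + 1
    return out


mail = ["From: a@example.com", "To: b@example.com", "Subject: hi", "", "Body text"]
header = Pattern2D(head=r"^From:", tail=r"^Subject:", max_between=2, tag="header")
tagged_mail = tag_matches(mail, header)
```

With `max_between=2` the three header lines of `mail` are wrapped in the assumed `<header>` tag. With `max_between=0` the same input is returned unchanged, because one line sits between the head and the tail.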

Patents cited by this patent (9)

Nakao, Yoshio, Apparatus and method for generating a summary according to hierarchical structure of topic.
Tateno Masakazu, JPX, Apparatus and method for searching through compressed, structured documents.
Tateno Masakazu, JPX, Apparatus and method for storing, searching for and retrieving text of a structured document provided with tags.
Nakatani Eisaku (Oome JPX), Document processing apparatus for extracting a format from one document and using the extracted format to automatically.
Iwai Isamu (Kawasaki JPX) Okamoto Toshio (Bunkyo JPX) Doi Miwako (Kawasaki JPX), Document processing system deciding apparatus provided with selection functions.
Green, Robin A. R., Document retrieval system and search method using word set and character look-up tables.
Ravi Kumar; Paul William Weschler, Jr., External data store link for a profile service.
Rheaume Gary P., Method for processing a file to generate a database.
Claude Vogel, System and method for parsing a document using one or more break characters.

Patents citing this patent (31)

Dinh, Thu-Tram T.; Ho, Shyh-Mei F.; Hung, Jenny ChengYin; Lo, Kevin Yu Chang, Apparatus for facilitating transactions between thin-clients and message format service (MFS)-based information management systems (IMS) applications.
Ho, Shyh Mei F.; Tsai, Tony Y., Apparatus, system, and method for automatically generating a web interface for an MFS-based IMS application.
Knoblach, Gerald Mark; Frische, Eric A, Breaking apart a platform upon pending collision.
Fanfant, Robert E.; Kaminsky, Ryan S.; Grosenick, Scott S., Data processing through use of a context.
Simmons, Alex J.; Nickolov, Radoslav P.; Baer, Peter; Lascaux, Vincent; Kofman, Igor, Image text to character information conversion.
Fukuda, Kentaro, Information processing apparatus, program, and recording medium.
Hancsarik, Robert M.; Humphrey, Douglas E., Method and data structure for exchanging data.
Dinh, Thu-Tram T.; Ho, Shyh-Mei F.; Hung, Jenny ChengYin; Yo, Kevin Yu Chang, Method for facilitating transactions between thin-clients and message format service (MFS)-based information management system (IMS) applications.
Knoblach, Gerald Mark, Multifunctional balloon membrane.
Schultz, Dale M.; Hudson, Roy, Pre-translation testing of bi-directional language display.
Schultz, Dale M.; Hudson, Roy, Pre-translation testing of bi-directional language display.
Wason, James R., Rich text handling for a web application.
Wason, James R., Rich text handling for a web application.
Wason, James R., Rich text handling for a web application.
Wason, James R., Rich text handling for a web application.
Wason, James R., Rich text handling for a web application.
Wason, James R., Rich text handling for a web application.
Haller, Daniel M.; Ho, Shyh-Mei F.; Hughes, Gerald D.; Hung, Jenny C.; Huyah, Bill T.; Kuo, Steve T., System and method for facilitating XML enabled IMS transactions.
Chiang, Chenhuei J.; Ho, Shyh-Mei F.; Sheats, Benjamin Johnson; Yep, Eddie Raymond, System and method for representing MFS control blocks in XML for MFS-based IMS applications.
Chiang, Chenhuei J.; Ho, Shyh Mei F.; Sheats, Benjamin Johnson; Yep, Eddie Raymond, System and method for representing MFS control blocks in XML for MFS-based IMS applications.
Haller, Daniel M.; Ho, Shyh Mei F.; Hughes, Gerald D.; Hung, Jenny C.; Huynh, Bill T.; Kuo, Steve T., System and method to facilitate XML enabled IMS transactions between a remote client and an IMS application program.
Dinh, Thu Tram T.; Ho, Shyh Mei F.; Hung, Jenny ChengYin; Lo, Kevin Yu Chang, System for facilitating transactions between thin-clients and message format service (MFS)-based information management system (IMS) applications.
Knoblach, Gerald M.; Frische, Eric A.; Barkley, Bruce Alan, Systems and applications of lighter-than-air (LTA) platforms.
Knoblach, Gerald M.; Frische, Eric A.; Barkley, Bruce Alan, Systems and applications of lighter-than-air (LTA) platforms.
Knoblach, Gerald M.; Frische, Eric A.; Barkley, Bruce Alan, Systems and applications of lighter-than-air (LTA) platforms.
Knoblach, Gerald M.; Frische, Eric A.; Barkley, Bruce Alan, Systems and applications of lighter-than-air (LTA) platforms.
Knoblach, Gerald M.; Frische, Eric A.; Barkley, Bruce Alan, Systems and applications of lighter-than-air (LTA) platforms.
Knoblach, Gerald M.; Frische, Eric A.; Barkley, Bruce Alan, Systems and applications of lighter-than-air (LTA) platforms.
Knoblach, Gerald M.; Frische, Eric A.; Barkley, Bruce Alan, Systems and applications of lighter-than-air (LTA) platforms.
Knoblach, Gerald M.; Frische, Eric A.; Barkley, Bruce Alan, Systems and applications of lighter-than-air (LTA) platforms.
Knoblach, Gerald M; Frische, Eric A, Unmanned lighter-than-air-safe termination and recovery methods.
