[Patent] Aural skimming and scrolling


IPC classification information
Country / Type	United States (US) Patent, granted
International Patent Classification (IPC 7th edition)	G10L-013/00 G10L-013/08 G10L-021/00 G10L-025/00 G06F-017/00 G06F-017/20 G10L-013/027
Application number	US-0600346 (2006-11-15)
Registration number	US-9087507 (2015-07-21)
Priority information	IN-2035/DEL/2006 (2006-09-15)
Inventor / Address	Sengamedu, Srinivasan H.
Applicant / Address	Yahoo! Inc.
Agent / Address	Hickman Palermo Becker Bingham LLP
Citation information	Times cited: 1; Patents cited: 29

Abstract

Computer-based skimming and scrolling of aurally presented information is described. Different levels of skimming are achieved in aural presentations by allowing a user to navigate an aural presentation according to significant points identified within an information source. The significant points are identified using various indicia that suggest logical arrangements for the information contained within the source, such as semantics, syntax, typography, formatting, named entities, and markup tags. The identified significant points signal changes in playback mode for the audio presentation, such as different tones, pitches, volumes, or voices. Similar indicia may be used to generate identifying markers from the information source that can be aurally presented in lieu of the information source itself to allow for aural scrolling of the information.
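The skimming idea in the abstract can be illustrated with a minimal sketch. It assumes an HTML information source and uses only markup tags as indicia; the tag-to-mode table, the skim levels, and the names `PLAYBACK_MODES`, `SKIM_LEVELS`, and `find_significant_points` are hypothetical and do not come from the patent. Nested significant tags are not handled specially.

```python
from html.parser import HTMLParser

# Hypothetical mapping from a markup tag to a playback-mode change.
PLAYBACK_MODES = {
    "h1": "voice:announcer",
    "h2": "pitch:high",
    "b": "volume:loud",
    "strong": "volume:loud",
}

# Hypothetical skim levels: a lower number means a coarser skim.
SKIM_LEVELS = {"h1": 1, "h2": 2, "b": 3, "strong": 3}


class _PointFinder(HTMLParser):
    """Records the text of every element whose tag is in PLAYBACK_MODES."""

    def __init__(self):
        super().__init__()
        self.points = []   # (tag, playback mode, text) in document order
        self._open = []    # stack of currently open significant tags
        self._chunks = []  # one list of text chunks per open tag

    def handle_starttag(self, tag, attrs):
        if tag in PLAYBACK_MODES:
            self._open.append(tag)
            self._chunks.append([])

    def handle_data(self, data):
        if self._open:
            self._chunks[-1].append(data)

    def handle_endtag(self, tag):
        if self._open and self._open[-1] == tag:
            self._open.pop()
            text = " ".join("".join(self._chunks.pop()).split())
            if text:
                self.points.append((tag, PLAYBACK_MODES[tag], text))


def find_significant_points(html, level=3):
    """Return the significant points visible at the given skim level."""
    finder = _PointFinder()
    finder.feed(html)
    finder.close()
    return [p for p in finder.points if SKIM_LEVELS[p[0]] <= level]
```

Calling `find_significant_points(page, level=1)` keeps only top-level headings, while higher levels add finer points such as emphasized phrases. A real system would pass the mode strings to a speech synthesizer rather than return them.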

Representative claims

1. A computer-implemented method for aurally scrolling an information source, comprising:
    analyzing an information source;
    wherein the information source comprises a plurality of markup tags;
    wherein analyzing the information source comprises using the plurality of markup tags to identify a plurality of segments of the information source from which to derive corresponding marker texts;
    generating and storing, separate from the information source, a set of a plurality of marker texts based at least on the analyzing of the information source including generating each marker text in the set of marker texts based at least on an analysis of a corresponding segment, of the plurality of identified segments, of the information source;
    wherein the analysis of a particular segment, of the plurality of identified segments, corresponding to a particular marker text of the set of marker texts comprises applying a summarization technique to the particular segment to derive the particular marker text;
    wherein the analysis of the particular segment comprises determining a significance of the particular segment based at least in part on a relative amount of text content of the particular segment;
    generating and storing data that comprises, for each marker text in the set of marker texts, an association between the marker text and a location within the information source, the location corresponding to the segment of the information source that corresponds to the marker text;
    arranging the plurality of marker texts in a sequence, the particular marker text having an order in the sequence;
    wherein the order of the particular marker text in the sequence is dependent on the determined significance of the particular segment that was determined based at least in part on the relative amount of text content of the particular segment;
    initiating an aural presentation of the sequence, the aural presentation comprising computerized text-to-speech synthesis of at least a portion of the sequence;
    during the aural presentation of the sequence, receiving input while the particular marker text of the set of marker texts is being aurally presented; and
    in response to the input:
        ceasing the aural presentation of the particular marker text;
        inspecting the data to identify the location associated with the particular marker text; and
        initiating an aural presentation of the information source at the location associated with the particular marker text, the aural presentation comprising computerized text-to-speech synthesis of at least a portion of the information source;
    wherein the method is performed by one or more computing devices.

2. The computer-implemented method as recited in claim 1, wherein the sequence corresponds to the chronological order of the associated locations within the information source.

3. The computer-implemented method as recited in claim 1, wherein the sequence corresponds to the sequential order of the associated locations within the information source.

4. The computer-implemented method as recited in claim 1, further comprising: aurally presenting at least a portion of the information source; wherein the sequence begins with a marker text of the set of marker texts associated with the location of a current playback point in the aural presentation.

5. The computer-implemented method as recited in claim 1, wherein the sequence corresponds to an order associated with the set of marker texts.

6. The computer-implemented method as recited in claim 1, wherein the sequence reflects a perceived significance of each marker text of the plurality of marker texts.

7. The computer-implemented method as recited in claim 1, wherein the set of marker texts comprises a first set of marker texts and a second set of marker texts, the method further comprising: storing metadata that indicates that the first set of marker texts have a first logical significance and that the at least second set of marker texts have at least a second logical significance.

8. The computer-implemented method as recited in claim 7, wherein the plurality of marker texts comprises one or more marker texts belonging to the first set of marker texts.

9. The computer-implemented method as recited in claim 1, wherein the input comprises at least one of an aural input and a text based input.

10. The computer-implemented method as recited in claim 1, wherein the input comprises at least one of a speech based input and a tactile input.

11. The computer-implemented method as recited in claim 10, wherein the tactile input is received from an interface comprising at least one of a keyboard, a mouse, a joystick, a touchpad, a sensor bearing glove, a speech input interface, and a button.

12. The computer-implemented method as recited in claim 1, wherein the information source comprises a text-based information source.

13. The computer-implemented method as recited in claim 1, wherein the information source comprises at least one of: an electronic mail message; output of a messaging client; a voicemail message; a document produced by an optical content recognition application; an electronic document; textual output of a software application; an audio stream with accompanying transcription; and a video stream with accompanying transcription.

14. The computer-implemented method as recited in claim 1, wherein, prior to the analyzing step, the information source is converted into representative text.

15. The computer-implemented method as recited in claim 1, wherein the particular marker text comprises an excerpt of the information source identified based on at least one of: a font characteristic of the information source that changes near the location associated with the particular marker text; a typographic characteristic of the information source that changes near the location associated with the particular marker text; a semantic significance of the information source identified near the location associated with the particular marker text; a syntactic significance of the information source identified near the location associated with the particular marker text; a named entity of the information source identified near the location associated with the particular marker text; and a markup tag of the information source identified near the location associated with the particular marker text.

16. The computer-implemented method as recited in claim 1, wherein the particular marker text is generated from an analysis of a segment of the information source at the location associated with the particular marker text, wherein the analysis comprises at least one of summarization, categorization, shallow parsing, grammar tagging, semantic tagging, and named entity recognition.

17. One or more non-transitory computer-readable media storing instructions which, when executed by one or more computing devices, cause performance of a computer-implemented method for aurally scrolling an information source comprising the steps of:
    analyzing an information source;
    wherein the information source comprises a plurality of markup tags;
    wherein analyzing the information source comprises using the plurality of markup tags to identify a plurality of segments of the information source from which to derive corresponding marker texts;
    generating and storing, separate from the information source, a set of a plurality of marker texts based at least on the analyzing of the information source including generating each marker text in the set of marker texts based at least on an analysis of a corresponding segment, of the plurality of identified segments, of the information source;
    wherein the analysis of a particular segment, of the plurality of identified segments, corresponding to a particular marker text of the set of marker texts comprises applying a summarization technique to the particular segment to derive the particular marker text;
    wherein the analysis of the particular segment comprises determining a significance of the particular segment based at least in part on a relative amount of text content of the particular segment;
    generating and storing data that comprises, for each marker text in the set of marker texts, an association between the marker text and a location within the information source, the location corresponding to the segment of the information source that corresponds to the marker text;
    arranging the plurality of marker texts in a sequence, the particular marker text having an order in the sequence;
    wherein the order of the particular marker text in the sequence is dependent on the determined significance of the particular segment that was determined based at least in part on the relative amount of text content of the particular segment;
    initiating an aural presentation of the sequence, the aural presentation comprising computerized text-to-speech synthesis of at least a portion of the sequence;
    during the aural presentation of the sequence, receiving input while the particular marker text of the set of marker texts is being aurally presented; and
    in response to the input:
        ceasing the aural presentation of the particular marker text;
        inspecting the data to identify the location associated with the particular marker text; and
        initiating an aural presentation of the information source at the location associated with the particular marker text, the aural presentation comprising computerized text-to-speech synthesis of at least a portion of the information source.

18. The one or more non-transitory computer-readable media as recited in claim 17, wherein the sequence corresponds to the chronological order of the associated locations within the information source.

19. The one or more non-transitory computer-readable media as recited in claim 17, wherein the sequence corresponds to the sequential order of the associated locations within the information source.

20. The one or more non-transitory computer-readable media as recited in claim 17, the method further comprising: aurally presenting at least a portion of the information source; wherein the sequence begins with a marker text of the set of marker texts associated with the location of a current playback point in the aural presentation.

21. The one or more non-transitory computer-readable media as recited in claim 17, wherein the sequence corresponds to an order associated with the set of marker texts.

22. The one or more non-transitory computer-readable media as recited in claim 17, wherein the sequence reflects a perceived significance of each marker text of the plurality of marker texts.

23. The one or more non-transitory computer-readable media as recited in claim 17, wherein the set of marker texts comprises a first set of marker texts and a second set of marker texts, the method further comprising: storing metadata that indicates that the first set of marker texts have a first logical significance and that the at least second set of marker texts have at least a second logical significance.

24. The one or more non-transitory computer-readable media as recited in claim 23, wherein the plurality of marker texts comprises one or more marker texts belonging to the first set of marker texts.

25. The one or more non-transitory computer-readable media as recited in claim 17, wherein the input comprises at least one of an aural input and a text based input.

26. The one or more non-transitory computer-readable media as recited in claim 17, wherein the input comprises at least one of a speech based input and a tactile input.

27. The one or more non-transitory computer-readable media as recited in claim 26, wherein the tactile input is received from an interface comprising at least one of a keyboard, a mouse, a joystick, a touchpad, a sensor bearing glove, a speech input interface, and a button.

28. The one or more non-transitory computer-readable media as recited in claim 17, wherein the information source comprises a text-based information source.

29. The one or more non-transitory computer-readable media as recited in claim 17, wherein the information source comprises at least one of: an electronic mail message; output of a messaging client; a voicemail message; a document produced by an optical content recognition application; an electronic document; textual output of a software application; an audio stream with accompanying transcription; and a video stream with accompanying transcription.

30. The one or more non-transitory computer-readable media as recited in claim 17, wherein, prior to the analyzing step, the information source is converted into representative text.

31. The one or more non-transitory computer-readable media as recited in claim 17, wherein the particular marker text comprises an excerpt of the information source identified based on at least one of: a font characteristic of the information source that changes near the location associated with the particular marker text; a typographic characteristic of the information source that changes near the location associated with the particular marker text; a semantic significance of the information source identified near the location associated with the particular marker text; a syntactic significance of the information source identified near the location associated with the particular marker text; a named entity of the information source identified near the location associated with the particular marker text; and a markup tag of the information source identified near the location associated with the particular marker text.

32. The one or more non-transitory computer-readable media as recited in claim 17, wherein the particular marker text is generated from an analysis of a segment of the information source at the location associated with the particular marker text, wherein the analysis comprises at least one of summarization, categorization, shallow parsing, grammar tagging, semantic tagging, and named entity recognition.
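The steps recited in claim 1 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the source is assumed to be HTML with `<p>` elements as segments, the "summarization technique" is replaced by a stand-in that keeps the first few words, significance is taken as a segment's share of the total text length, and speech synthesis and user input are simulated by a `speak` callback and a `select_at` index. All names (`Marker`, `build_markers`, `scroll`) are hypothetical.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass
class Marker:
    text: str            # marker text derived from the segment
    location: int        # index of the segment within the source
    significance: float  # segment's share of the source's text


class _SegmentParser(HTMLParser):
    """Collects the text of each <p> element as one segment."""

    def __init__(self):
        super().__init__()
        self.segments = []
        self._in_p = False

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self.segments.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self.segments[-1] += data


def summarize(segment, max_words=4):
    """Stand-in summarizer (assumption): keep the first few words."""
    return " ".join(segment.split()[:max_words])


def build_markers(html):
    """Return (segments, markers), markers ordered by significance."""
    parser = _SegmentParser()
    parser.feed(html)
    parser.close()
    segments = [" ".join(s.split()) for s in parser.segments]
    total = sum(len(s) for s in segments) or 1
    markers = [
        Marker(summarize(s), i, len(s) / total)
        for i, s in enumerate(segments)
        if s
    ]
    # Segments holding relatively more text come first in the sequence.
    markers.sort(key=lambda m: m.significance, reverse=True)
    return segments, markers


def scroll(segments, markers, speak, select_at):
    """Speak markers in order; at position select_at, jump to the source.

    A real system would interrupt synthesis mid-marker when input
    arrives; here the input is simulated after the marker is spoken.
    """
    for position, marker in enumerate(markers):
        speak(marker.text)
        if position == select_at:
            speak(segments[marker.location])
            return marker.location
    return None
```

Storing the markers separately from the segments, with only an index linking them, mirrors the claim's requirement that marker texts be kept apart from the information source and associated with a location in it.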

Patents cited by this patent (29)

Dodrill, Lewis D.; Danner, Ryan A.; Martin, Steven J., Apparatus and methods for providing an audibly controlled user interface for audio-based communication devices.
Dvorak Joseph L., Audio interface for document based information resource navigation and method therefor.
MacKenty Edmund R.; Owen David E., Auditorially representing pages of SGML data.
Slotznick, Benjamin; Sheetz, Stephen C., Clickless user interaction with text-to-speech enabled web page for users who have reading difficulty.
Nagao, Katashi, Electronic document processing apparatus.
Nagao, Katashi, Electronic document processing apparatus and method for forming summary text and speech read-out.
Brocious, Larry A.; Howland, Michael J.; Pritke, Steven M., Explicitly registering markup based on verbal commands and exploiting audio context.
Gudrun Socher; Mohan Vishwanath; Anurag Mendhekar, Intelligent text-to-speech synthesis.
Nolting, Daniel L., Media presentation system controlled by voice to text commands.
Henton Caroline G., Method and apparatus for automatic generation of vocal emotion in a synthetic text-to-speech system.
Packingham, Kevin; Roche, Elizabeth; Thenthiruperai, Balaji S., Method and system for bookmarking navigation points in a voice command title platform.
Huffman James R.; Jambhekar Shrirang Nikanth, Method and system for encoding a book for reading using an electronic book.
Raman T. V. (Ithaca NY); Gries David (Ithaca NY), Method for generating audio renderings of digitized works having highly technical content.
Slotznick, Benjamin; Sheetz, Stephen C., Method of displaying web pages to enable user access to text information that the user has difficulty reading.
Sakai, Keiichi; Kosaka, Tetsuo, Multimodal document reception apparatus and multimodal document transmission apparatus, multimodal document transmission/reception system, their control method, and program.
Kurzweil Raymond C.; Bhathena Firdaus, Reading system which reads aloud from an image representation of a document.
Ireton, Mark, Remote-directed management of media content.
Yumura Takeshi, JPX; Ohnishi Hiroki, JPX; Miyatake Masanori, JPX; Yoden Naoyuki, JPX; Ochiiwa Masashi, JPX; Izumi Takashi, JPX, Speech synthesis apparatus and read out time calculating apparatus to finish reading out text.
Nielsen Jakob, Style sheets for speech-based presentation of web pages.
O'Conor, William C.; Bradley, Nathan T., System and method for audible web site navigation.
Premkumar V. Uppaluru, System and method for providing and using universally accessible voice and speech data files.
Kryze, David; Rigazio, Luca; Nguyen, Patrick; Junqua, Jean-Claude, System and method of media file access and retrieval using speech recognition.
Chen, Zesen; Chou, Peter; Aspell, Steve, System and method of providing audio content.
Profit, Jr., Jack H.; Brown, N. Gregg; Mezey, Peter S.; Colombo, Lianne M., System and process for voice-controlled information retrieval.
Thenthiruperai, Balaji S., Systems and method for archiving and retrieving navigation points in a voice command platform.
Luther Willis J. (Irvine CA), Text parser for use with a text-to-speech converter.
Holm Frode; Pearson Steve, User interface controller for text-to-speech synthesizer.
Kivimaki, Mika, User interface for text to speech conversion.
Thrift, Philip R.; Hemphill, Charles T., Voice activated apparatus for accessing information on the World Wide Web.

Patents citing this patent (1)

Shih, Sheng-Yao; Kung, Yun-Chiang; Che, Chiwei; Wang, Chih-Chung, Audio output of a document from mobile device.

