Computer-based skimming and scrolling of aurally presented information is described. Different levels of skimming are achieved by allowing a user to navigate an aural presentation according to significant points identified within an information source. The significant points are identified using various indicia that suggest logical arrangements for the information contained within the source, such as semantics, syntax, typography, formatting, named entities, and markup tags. The identified significant points signal changes in playback mode for the audio presentation, such as different tones, pitches, volumes, or voices. Similar indicia may be used to generate identifying markers from the information source that can be aurally presented in lieu of the information source itself to allow for aural scrolling of the information.
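As an illustration of the skimming idea in the abstract, the following minimal sketch uses markup tags as the indicia for significant points and attaches a playback mode to each span of text. The tag-to-mode table, the `Span` type, and the function names are hypothetical and chosen only for this example; the patent does not prescribe them, and a real system would pass the modes to a speech synthesizer rather than return them.

```python
from dataclasses import dataclass
from html.parser import HTMLParser

# Hypothetical mapping from markup tag to playback settings (an assumption).
PLAYBACK_MODES = {
    "h1": {"pitch": "high", "volume": "loud"},
    "h2": {"pitch": "high", "volume": "normal"},
    "b": {"pitch": "normal", "volume": "loud"},
}
DEFAULT_MODE = {"pitch": "normal", "volume": "normal"}


@dataclass
class Span:
    text: str          # text to be spoken
    mode: dict         # playback settings for this span
    significant: bool  # True if a markup tag flagged the span


class SkimParser(HTMLParser):
    """Collects text spans and tags each with a playback mode."""

    def __init__(self):
        super().__init__()
        self.stack = []  # currently open tags, outermost first
        self.spans = []

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, if there is one.
        if tag in self.stack:
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # The innermost enclosing tag with a known mode decides the playback mode.
        tag = next((t for t in reversed(self.stack) if t in PLAYBACK_MODES), None)
        mode = PLAYBACK_MODES.get(tag, DEFAULT_MODE)
        self.spans.append(Span(text, mode, tag is not None))


def plan_playback(html):
    """Return every span of the source with its playback mode."""
    parser = SkimParser()
    parser.feed(html)
    parser.close()
    return parser.spans


def skim(spans):
    """A coarser skimming level: keep only the significant points."""
    return [s.text for s in spans if s.significant]
```

Calling `skim(plan_playback(html))` yields only the flagged spans, while the full list from `plan_playback` corresponds to ordinary playback with mode changes at the significant points.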
Representative claim
1. A computer-implemented method for aurally scrolling an information source, comprising:
analyzing an information source;
wherein the information source comprises a plurality of markup tags;
wherein analyzing the information source comprises using the plurality of markup tags to identify a plurality of segments of the information source from which to derive corresponding marker texts;
generating and storing, separate from the information source, a set of a plurality of marker texts based at least on the analyzing of the information source including generating each marker text in the set of marker texts based at least on an analysis of a corresponding segment, of the plurality of identified segments, of the information source;
wherein the analysis of a particular segment, of the plurality of identified segments, corresponding to a particular marker text of the set of marker texts comprises applying a summarization technique to the particular segment to derive the particular marker text;
wherein the analysis of the particular segment comprises determining a significance of the particular segment based at least in part on a relative amount of text content of the particular segment;
generating and storing data that comprises, for each marker text in the set of marker texts, an association between the marker text and a location within the information source, the location corresponding to the segment of the information source that corresponds to the marker text;
arranging the plurality of marker texts in a sequence, the particular marker text having an order in the sequence;
wherein the order of the particular marker text in the sequence is dependent on the determined significance of the particular segment that was determined based at least in part on the relative amount of text content of the particular segment;
initiating an aural presentation of the sequence, the aural presentation comprising computerized text-to-speech synthesis of at least a portion of the sequence;
during the aural presentation of the sequence, receiving input while the particular marker text of the set of marker texts is being aurally presented; and
in response to the input:
ceasing the aural presentation of the particular marker text;
inspecting the data to identify the location associated with the particular marker text; and
initiating an aural presentation of the information source at the location associated with the particular marker text, the aural presentation comprising computerized text-to-speech synthesis of at least a portion of the information source;
wherein the method is performed by one or more computing devices.

2. The computer-implemented method as recited in claim 1, wherein the sequence corresponds to the chronological order of the associated locations within the information source.

3. The computer-implemented method as recited in claim 1, wherein the sequence corresponds to the sequential order of the associated locations within the information source.

4. The computer-implemented method as recited in claim 1, further comprising: aurally presenting at least a portion of the information source; wherein the sequence begins with a marker text of the set of marker texts associated with the location of a current playback point in the aural presentation.

5. The computer-implemented method as recited in claim 1, wherein the sequence corresponds to an order associated with the set of marker texts.

6. The computer-implemented method as recited in claim 1, wherein the sequence reflects a perceived significance of each marker text of the plurality of marker texts.

7. The computer-implemented method as recited in claim 1, wherein the set of marker texts comprises a first set of marker texts and a second set of marker texts, the method further comprising: storing metadata that indicates that the first set of marker texts have a first logical significance and that the at least second set of marker texts have at least a second logical significance.

8. The computer-implemented method as recited in claim 7, wherein the plurality of marker texts comprises one or more marker texts belonging to the first set of marker texts.

9. The computer-implemented method as recited in claim 1, wherein the input comprises at least one of an aural input and a text based input.

10. The computer-implemented method as recited in claim 1, wherein the input comprises at least one of a speech based input and a tactile input.

11. The computer-implemented method as recited in claim 10, wherein the tactile input is received from an interface comprising at least one of a keyboard, a mouse, a joystick, a touchpad, a sensor bearing glove, a speech input interface, and a button.

12. The computer-implemented method as recited in claim 1, wherein the information source comprises a text-based information source.

13. The computer-implemented method as recited in claim 1, wherein the information source comprises at least one of: an electronic mail message; output of a messaging client; a voicemail message; a document produced by an optical content recognition application; an electronic document; textual output of a software application; an audio stream with accompanying transcription; and a video stream with accompanying transcription.

14. The computer-implemented method as recited in claim 1, wherein, prior to the analyzing step, the information source is converted into representative text.

15. The computer-implemented method as recited in claim 1, wherein the particular marker text comprises an excerpt of the information source identified based on at least one of: a font characteristic of the information source that changes near the location associated with the particular marker text; a typographic characteristic of the information source that changes near the location associated with the particular marker text; a semantic significance of the information source identified near the location associated with the particular marker text; a syntactic significance of the information source identified near the location associated with the particular marker text; a named entity of the information source identified near the location associated with the particular marker text; and a markup tag of the information source identified near the location associated with the particular marker text.

16. The computer-implemented method as recited in claim 1, wherein the particular marker text is generated from an analysis of a segment of the information source at the location associated with the particular marker text, wherein the analysis comprises at least one of summarization, categorization, shallow parsing, grammar tagging, semantic tagging, and named entity recognition.

17. One or more non-transitory computer-readable media storing instructions which, when executed by one or more computing devices, cause performance of a computer-implemented method for aurally scrolling an information source comprising the steps of:
analyzing an information source;
wherein the information source comprises a plurality of markup tags;
wherein analyzing the information source comprises using the plurality of markup tags to identify a plurality of segments of the information source from which to derive corresponding marker texts;
generating and storing, separate from the information source, a set of a plurality of marker texts based at least on the analyzing of the information source including generating each marker text in the set of marker texts based at least on an analysis of a corresponding segment, of the plurality of identified segments, of the information source;
wherein the analysis of a particular segment, of the plurality of identified segments, corresponding to a particular marker text of the set of marker texts comprises applying a summarization technique to the particular segment to derive the particular marker text;
wherein the analysis of the particular segment comprises determining a significance of the particular segment based at least in part on a relative amount of text content of the particular segment;
generating and storing data that comprises, for each marker text in the set of marker texts, an association between the marker text and a location within the information source, the location corresponding to the segment of the information source that corresponds to the marker text;
arranging the plurality of marker texts in a sequence, the particular marker text having an order in the sequence;
wherein the order of the particular marker text in the sequence is dependent on the determined significance of the particular segment that was determined based at least in part on the relative amount of text content of the particular segment;
initiating an aural presentation of the sequence, the aural presentation comprising computerized text-to-speech synthesis of at least a portion of the sequence;
during the aural presentation of the sequence, receiving input while the particular marker text of the set of marker texts is being aurally presented; and
in response to the input:
ceasing the aural presentation of the particular marker text;
inspecting the data to identify the location associated with the particular marker text; and
initiating an aural presentation of the information source at the location associated with the particular marker text, the aural presentation comprising computerized text-to-speech synthesis of at least a portion of the information source.

18. The one or more non-transitory computer-readable media as recited in claim 17, wherein the sequence corresponds to the chronological order of the associated locations within the information source.

19. The one or more non-transitory computer-readable media as recited in claim 17, wherein the sequence corresponds to the sequential order of the associated locations within the information source.

20. The one or more non-transitory computer-readable media as recited in claim 17, the method further comprising: aurally presenting at least a portion of the information source; wherein the sequence begins with a marker text of the set of marker texts associated with the location of a current playback point in the aural presentation.

21. The one or more non-transitory computer-readable media as recited in claim 17, wherein the sequence corresponds to an order associated with the set of marker texts.

22. The one or more non-transitory computer-readable media as recited in claim 17, wherein the sequence reflects a perceived significance of each marker text of the plurality of marker texts.

23. The one or more non-transitory computer-readable media as recited in claim 17, wherein the set of marker texts comprises a first set of marker texts and a second set of marker texts, the method further comprising: storing metadata that indicates that the first set of marker texts have a first logical significance and that the at least second set of marker texts have at least a second logical significance.

24. The one or more non-transitory computer-readable media as recited in claim 23, wherein the plurality of marker texts comprises one or more marker texts belonging to the first set of marker texts.

25. The one or more non-transitory computer-readable media as recited in claim 17, wherein the input comprises at least one of an aural input and a text based input.

26. The one or more non-transitory computer-readable media as recited in claim 17, wherein the input comprises at least one of a speech based input and a tactile input.

27. The one or more non-transitory computer-readable media as recited in claim 26, wherein the tactile input is received from an interface comprising at least one of a keyboard, a mouse, a joystick, a touchpad, a sensor bearing glove, a speech input interface, and a button.

28. The one or more non-transitory computer-readable media as recited in claim 17 wherein the information source comprises a text-based information source.

29. The one or more non-transitory computer-readable media as recited in claim 17, wherein the information source comprises at least one of: an electronic mail message; output of a messaging client; a voicemail message; a document produced by an optical content recognition application; an electronic document; textual output of a software application; an audio stream with accompanying transcription; and a video stream with accompanying transcription.

30. The one or more non-transitory computer-readable media as recited in claim 17, wherein, prior to the analyzing step, the information source is converted into representative text.

31. The one or more non-transitory computer-readable media as recited in claim 17, wherein the particular marker text comprises an excerpt of the information source identified based on at least one of: a font characteristic of the information source that changes near the location associated with the particular marker text; a typographic characteristic of the information source that changes near the location associated with the particular marker text; a semantic significance of the information source identified near the location associated with the particular marker text; a syntactic significance of the information source identified near the location associated with the particular marker text; a named entity of the information source identified near the location associated with the particular marker text; and a markup tag of the information source identified near the location associated with the particular marker text.

32. The one or more non-transitory computer-readable media as recited in claim 17, wherein the particular marker text is generated from an analysis of a segment of the information source at the location associated with the particular marker text, wherein the analysis comprises at least one of summarization, categorization, shallow parsing, grammar tagging, semantic tagging, and named entity recognition.
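The steps recited in claim 1 can be pictured with a small sketch. Everything named here is an assumption made for illustration: segments are taken from `<p>` tags only, the "summarization technique" is replaced by a first-sentence stand-in, significance is the segment's share of the total text length, and speech synthesis is abstracted behind a caller-supplied `speak` callback. The sketch also checks for input only after a marker has been spoken, whereas the claim describes ceasing the marker's presentation when input arrives.

```python
import re
from dataclasses import dataclass


@dataclass
class Marker:
    text: str            # marker text derived from the segment
    location: int        # character offset of the segment in the source
    significance: float  # segment's share of the total text content


def find_segments(source):
    """Use <p>...</p> markup tags to identify (offset, text) segments."""
    return [(m.start(), m.group(1).strip())
            for m in re.finditer(r"<p>(.*?)</p>", source, flags=re.S)]


def summarize(text):
    """Stand-in summarizer (an assumption): keep the first sentence."""
    return re.split(r"(?<=[.!?])\s+", text, maxsplit=1)[0]


def build_markers(source):
    """Derive marker texts, store their locations, and order by significance."""
    segments = find_segments(source)
    total = sum(len(text) for _, text in segments) or 1
    markers = [Marker(summarize(text), offset, len(text) / total)
               for offset, text in segments]
    # Segments with relatively more text content come first in the sequence.
    return sorted(markers, key=lambda m: m.significance, reverse=True)


def scroll(source, markers, speak, input_received):
    """Speak markers in sequence; on input, jump to that marker's location.

    `speak` and `input_received` are hypothetical callbacks standing in for
    a text-to-speech engine and a user-input check.
    """
    for marker in markers:
        speak(marker.text)
        if input_received(marker):
            speak(source[marker.location:])
            return marker.location
    return None
```

Keeping the markers in their own list, separate from the source and linked to it only by `location`, mirrors the claim's requirement that marker texts be stored separately together with an association to a location in the information source.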