[US Patent]
Context sensitive text recognition and marking from speech
Country/Type: United States (US) Patent, Granted
International Patent Classification (IPC, 7th ed.): G06F-003/00; G06F-003/13
Application No.: US-0421601 (filed 2006-06-01)
Patent No.: US-8171412 (granted 2012-05-01)
Inventors: Sand, Anne R.; Miller, Steven M.
Applicant: International Business Machines Corporation
Attorney/Agent: Bauer, Andrea
Citation information: cited by 1 patent; cites 12 patents
Abstract
A visual presentation system and method for synchronizing presentation data being viewed in a display with speech input. A system is disclosed that includes: a speech recognition system for recognizing speech input; an association system for determining a context of the speech input and matching the context with a relevant portion of the presentation data; and a visual coordination system for coordinating the display of a data item from the presentation data based on a match made by the association system.
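The abstract names three cooperating subsystems: a speech recognition system, an association system, and a visual coordination system. The following is a minimal Python sketch of how those parts might be wired together; the class names, the bag-of-words matching heuristic, and the sample slide data are illustrative assumptions, not details taken from the patent.

from __future__ import annotations

# Illustrative sketch of the three subsystems named in the abstract.
class SpeechRecognitionSystem:
    def recognize(self, audio_chunk: str) -> list[str]:
        # Stand-in: a real recognizer would decode audio; here we tokenize text.
        return audio_chunk.lower().split()

class AssociationSystem:
    def __init__(self, presentation_data: dict[str, str]):
        # Maps a slide or section id to its text content.
        self.presentation_data = presentation_data

    def match(self, words: list[str]) -> str | None:
        # Pick the portion sharing the most words with the recognized speech.
        best_id, best_score = None, 0
        for item_id, text in self.presentation_data.items():
            score = len(set(words) & set(text.lower().split()))
            if score > best_score:
                best_id, best_score = item_id, score
        return best_id

class VisualCoordinationSystem:
    def display(self, item_id: str | None) -> None:
        if item_id is not None:
            print(f"Now showing: {item_id}")

# Wiring the three systems together for one speech input.
slides = {"slide-1": "quarterly revenue growth", "slide-2": "hiring plan"}
recognizer = SpeechRecognitionSystem()
associator = AssociationSystem(slides)
coordinator = VisualCoordinationSystem()
words = recognizer.recognize("let us talk about revenue growth")
coordinator.display(associator.match(words))  # Now showing: slide-1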
Representative claims
1. A visual presentation system for synchronizing a display of a set of presentation data with a speech input of a speaker, the system comprising: a speech recognition system for processing the speech input of the speaker during a visual presentation, wherein the processing of the speech input of the speaker includes converting the speech input into a spoken information data set; an association system for processing the spoken information data set to determine a display order of portions of the set of presentation data, wherein relevant portions of the set of presentation data are displayed in an order based upon the speech input of the speaker, wherein the processing of the spoken information data set includes: preprocessing the set of presentation data to exclude a predetermined list of terms from consideration in a presentation data match determination; determining a context of the speech input by analyzing speech patterns in the speech input over a predetermined time interval; and determining a set of matches of the context with the relevant portions of the set of presentation data; and a visual coordination system for coordinating the display of the relevant portions of the set of presentation data based on the set of matches determined by the association system, wherein the coordinating of the display of the relevant portions includes adjusting the display order of the relevant portions to synchronize the visual presentation with the speech input, and wherein the visual coordination system includes a user selection system for allowing a user to select which of the relevant portions to display from among the set of matches during the visual presentation.
2. The visual presentation system of claim 1, wherein the association system includes a system for controlling a sensitivity for determining the context.
3. The visual presentation system of claim 1, wherein the preprocessing is performed prior to the beginning of the visual presentation and the relevant portion of the set of presentation data is selected from the group consisting of: data items, metadata, and location data.
4. The visual presentation system of claim 1, wherein the context is determined based on a frequency, volume, or speed of an uttered set of words in the speech input.
5. The visual presentation system of claim 1, wherein the visual coordination system selects and displays locations within the set of presentation data, wherein the locations are displayed independently to an audience on an audience display and to the user on a user display and are selected from the group consisting of: a view, a word, a phrase, a text segment, a graphic object, a visual element, a section, a slide, and a page.
6. The visual presentation system of claim 1, wherein the visual coordination system further includes a system for marking data items and saving an output of the presentation.
7. The visual presentation system of claim 6, wherein the visual coordination system includes a marking selected from the group consisting of: a first type of marking for visually identifying a data item in the display that is yet to be discussed; a second type of marking for visually identifying a data item in the display currently being discussed; and a third type of marking for visually identifying a data item in the display that was previously discussed.
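Claim 1 (echoed in claim 19) recites a user selection system that lets the presenter choose which relevant portion to display when the association step returns several matches. A hedged sketch of that coordination decision follows; the function shape, the single-match shortcut, and the callback-based chooser are assumptions for illustration.

from __future__ import annotations
from typing import Callable

def coordinate_display(matches: list[str],
                       choose: Callable[[list[str]], str]) -> str | None:
    # No relevant portion: leave the current display unchanged.
    if not matches:
        return None
    # Exactly one match: adjust the display automatically.
    if len(matches) == 1:
        return matches[0]
    # Several matches: defer to the presenter's selection (claims 1 and 19).
    return choose(matches)

# Example: a trivial chooser that always takes the first candidate.
print(coordinate_display(["slide-3", "slide-7"], choose=lambda ms: ms[0]))  # slide-3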
8. A method for synchronizing a display of a set of presentation data with a set of speech inputs of a speaker, the method comprising: preprocessing the set of presentation data to exclude a predetermined list of terms from consideration in a presentation data match determination; capturing a speech input of the speaker during a visual presentation; providing a speech recognition system to process the speech input, wherein the processing of the speech input includes converting the speech input into a spoken information data set; determining a context of the speech input based on the spoken information data set, wherein the context is determined in response to a word or a phrase being recognized a predetermined plurality of times over a predetermined time interval; matching the context with a relevant portion of the set of presentation data; and coordinating the display of the relevant portion of the set of presentation data based on the matching, wherein the coordinating of the display of the relevant portion includes automatically adjusting the display of the relevant portion to synchronize the visual presentation with the speech input.
9. The method of claim 8, wherein the predetermined plurality of times is adjustable via a sensitivity control.
10. The method of claim 8, wherein the preprocessing is performed prior to the beginning of the visual presentation and the relevant portion of the set of presentation data is selected from the group consisting of: data items, metadata, and location data.
11. The method of claim 8, wherein the context is determined based on a frequency, volume, or speed of an uttered set of words in the speech input.
12. The method of claim 8, wherein the coordinating step selects and displays locations within the set of presentation data, wherein the locations are displayed independently to an audience on an audience display and to the user on a user display and are selected from the group consisting of: a view, a word, a phrase, a text segment, a graphic object, a visual element, a section, a slide, and a page.
13. The method of claim 8, wherein the coordinating step further includes marking data items and saving an output of the presentation.
14. The method of claim 13, wherein the coordinating step includes a step selected from the group consisting of: using a first type of marking for visually identifying a data item in the display that is yet to be discussed; using a second type of marking for visually identifying a data item in the display currently being discussed; and using a third type of marking for visually identifying a data item in the display that was previously discussed.
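Claim 8 pins the context test to a word or phrase being recognized a predetermined number of times within a predetermined time interval, claim 9 makes that threshold adjustable via a sensitivity control, and the preprocessing step excludes a list of terms from matching. The sketch below shows one way to realize this, assuming a sliding time window; the window length, defaults, and data structure are illustrative, not specified by the patent.

from __future__ import annotations
from collections import deque

class ContextDetector:
    def __init__(self, window_seconds: float = 30.0, sensitivity: int = 2,
                 excluded_terms: frozenset[str] = frozenset()):
        self.window_seconds = window_seconds
        self.sensitivity = sensitivity        # claim 9: adjustable threshold
        self.excluded = set(excluded_terms)   # claim 8: preprocessed exclusion list
        self.events: deque[tuple[float, str]] = deque()

    def observe(self, timestamp: float, word: str) -> set[str]:
        if word not in self.excluded:
            self.events.append((timestamp, word))
        # Drop recognitions that fell outside the predetermined time interval.
        while self.events and timestamp - self.events[0][0] > self.window_seconds:
            self.events.popleft()
        # Context: terms recognized at least `sensitivity` times in the window.
        counts: dict[str, int] = {}
        for _, w in self.events:
            counts[w] = counts.get(w, 0) + 1
        return {w for w, c in counts.items() if c >= self.sensitivity}

detector = ContextDetector(excluded_terms=frozenset({"the", "and", "a"}))
detector.observe(0.0, "revenue")         # first mention: below threshold
print(detector.observe(5.0, "revenue"))  # {'revenue'}: threshold reached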
15. A computer program product stored on a computer useable medium for synchronizing a display of a set of presentation data with a set of speech inputs of a speaker, the program product comprising: program code configured to preprocess the set of presentation data to exclude a predetermined list of terms from consideration in a presentation data match determination; program code configured for determining a context of a speech input over a predetermined time interval, wherein the context is determined by analyzing speech patterns in the speech input of the speaker over the predetermined time interval; program code configured for matching the context with a plurality of the relevant portions of the set of presentation data; and program code configured for coordinating the display of the relevant portions of the set of presentation data during a visual presentation based on the matching, wherein the coordinating of the display of the relevant portions includes automatically adjusting the display of the relevant portions to synchronize the visual presentation with the speech input.
16. The computer program product of claim 15, wherein the program code for determining the context is adjustable via a sensitivity control.
17. The computer program product of claim 15, wherein the preprocessing is performed prior to the beginning of the visual presentation and the relevant portion of the set of presentation data is selected from the group consisting of: data items, metadata, and location data.
18. The computer program product of claim 15, wherein the context is determined based on a frequency, volume, or speed of an uttered set of words in the speech input.
19. The computer program product of claim 15, wherein the program code configured for coordinating the display of the presentation data selects and displays a plurality of matching data items, and provides a user selection system for allowing a user to select which relevant portions to display.
20. The computer program product of claim 15, wherein the program code configured for coordinating the display of the presentation data automatically selects which relevant portions to display from the plurality of matches.
21. The computer program product of claim 20, wherein the program code configured for coordinating the display of the presentation data includes a function selected from the group consisting of: using a first type of marking for visually identifying a data item in the display that is yet to be discussed; using a second type of marking for visually identifying a data item in the display currently being discussed; and using a third type of marking for visually identifying a data item in the display that was previously discussed.
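Claims 7, 14, and 21 each recite three distinct marking types: one for data items not yet discussed, one for the item currently being discussed, and one for items previously discussed. The sketch below assigns those three states to a list of items; the enum and its style strings are assumptions, not markings specified by the patent.

from __future__ import annotations
from enum import Enum

class Marking(Enum):
    NOT_YET_DISCUSSED = "dim"            # first type of marking
    BEING_DISCUSSED = "highlight"        # second type of marking
    PREVIOUSLY_DISCUSSED = "checkmark"   # third type of marking

def mark_items(items: list[str], current: str,
               discussed: set[str]) -> dict[str, Marking]:
    markings = {}
    for item in items:
        if item == current:
            markings[item] = Marking.BEING_DISCUSSED
        elif item in discussed:
            markings[item] = Marking.PREVIOUSLY_DISCUSSED
        else:
            markings[item] = Marking.NOT_YET_DISCUSSED
    return markings

print(mark_items(["a", "b", "c"], current="b", discussed={"a"}))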
22. A method for deploying a system for synchronizing a display of a set of presentation data with a set of speech inputs of a speaker, the method comprising: providing a computer infrastructure being operable to: preprocess the set of presentation data to exclude a predetermined list of terms from consideration in a presentation data match determination; determine a context of a speech input over a predetermined time interval, wherein the context is determined in response to a word or phrase being recognized a predetermined plurality of times over the predetermined time interval; match the context with a relevant portion of the set of presentation data; and coordinate the display of the set of presentation data based on the matching step, including displaying the relevant portion of the set of presentation data, wherein the coordinating of the display of the relevant portion includes automatically adjusting the display of the relevant portion to synchronize a visual presentation with the speech input.
Cited patents (selected):
Drake, Samuel (San Jose, CA); Griefer, Allan D. (San Jose, CA); Powers, Jr., John T. (Morgan Hill, CA); Thomas, John G. (Santa Cruz, CA), Automated presentation capture, storage and playback system.
Eastwood, Peter Rowland; Happ, Alan J.; Klein, Alice G.; Kruse, Daniel William; Milenkovic, Maria, Display indications of speech processing states in speech recognition system.
Rtischev, Dimitry (Menlo Park, CA); Bernstein, Jared C. (Palo Alto, CA); Chen, George T. (Menlo Park, CA); Butzberger, John W. (Foster City, CA), Method and apparatus for voice-interactive language instruction.
Brocious, Larry A.; Gabel, Jonathan L.; Loose, David C.; VanBuskirk, Ronald E.; Wang, Huifang; Woodward, Steven G., Method, system, and apparatus for limiting available selections in a speech recognition system.