IPC Classification Information
Country / Type |
United States (US) Patent
Registered
|
International Patent Classification (IPC, 7th edition) |
|
Application number |
US-0240437
(2008-09-29)
|
Registration number |
US-8352268
(2013-01-08)
|
Inventor / Address |
- Naik, DeVang
- Silverman, Kim
- Bellegarda, Jerome
|
Applicant / Address |
|
Agent / Address |
Morgan, Lewis & Bockius LLP
|
Citation information |
Times cited: 93
Cited patents: 241 |
Abstract
Algorithms for synthesizing speech used to identify media assets are provided. Speech may be selectively synthesized from text strings associated with media assets. A text string may be normalized and its native language determined for obtaining a target phoneme for providing human-sounding speech in a language (e.g., dialect or accent) that is familiar to a user. The algorithms may be implemented on a system including several dedicated render engines. The system may be part of a back end coupled to a front end including storage for media assets and associated synthesized speech, and a request processor for receiving and processing requests that result in providing the synthesized speech. The front end may communicate media assets and associated synthesized speech content over a network to host devices coupled to portable electronic devices on which the media assets and synthesized speech are played back.
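The normalization step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which normalization rules are applied, so the abbreviation table and the function name `normalize_text` below are hypothetical and only show the general idea of expanding a media asset's text string before it is passed to a synthesizer.

```python
# Hypothetical abbreviation table; the abstract does not list specific rules.
ABBREVIATIONS = {
    "feat.": "featuring",
    "vol.": "volume",
    "pt.": "part",
    "&": "and",
}


def normalize_text(text: str) -> str:
    """Expand known abbreviations and collapse runs of whitespace."""
    tokens = text.split()  # split() also discards repeated whitespace
    expanded = [ABBREVIATIONS.get(token.lower(), token) for token in tokens]
    return " ".join(expanded)


if __name__ == "__main__":
    print(normalize_text("Song  Title feat. Artist & Band"))
```

Language detection and phoneme selection, which the abstract also mentions, are left out of this sketch because the record gives no detail on how they are performed.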
Representative claims
1. A method for customizing delivery of synthesized speech, the method comprising: generating a speech segment from one or more text strings describing or identifying a media asset having audio data distinct from the generated speech segment; obtaining user input requesting a variation in speech delivery accompanying the media asset; in response to the user input, customizing the speech segment by modifying selected portions of the speech segment at a server device, wherein the customizing further comprises: automatically detecting one or more repeated portions in the speech segment; and automatically modifying the speech segment by performing one or more of: (1) omitting at least one of the repeated portions from the speech segment, (2) using faster speech patterns for at least one of the repeated portions, (3) shortening breaks between words in at least one of the repeated portions, and (4) truncating one or more phrases in at least one of the repeated portions; and providing the customized speech segment from the server device to a user device for playback with the media asset.
2. The method of claim 1 wherein customizing the speech segment by modifying selected portions of the speech segment further comprises: shortening breaks between words within the speech segment to generate the customized speech segment.
3. The method of claim 1 wherein the user input specifies one or more preferred information fields among a plurality of information fields available in the speech segment.
4. The method of claim 1 wherein the user input requests at least one of fast forwarding and skipping playback of speech content at the user device.
5. The method of claim 1 wherein the user input requests omission of repeated information from speech content delivered to the user device.
6. The method of claim 1 wherein customizing the speech segment by modifying selected portions of the speech segment further comprises: including in the customized speech segment respective portions of the speech segment corresponding to one or more user-selected information fields, while omitting at least one field of information in the speech segment from the customized speech segment.
7. The method of claim 1, further comprising: detecting user input fast forwarding or skipping playback of at least a first speech segment previously delivered to the user device; and in response to the detecting, modifying a delivery rate for a second speech segment to be delivered from the client device to the user device.
8. The method of claim 1, further comprising: detecting user input fast forwarding or skipping playback of at least a first speech segment previously delivered to the user device; and in response to the detecting, customizing speech delivery for a second speech segment to be delivered from the client device to the user device.
9. The method of claim 8, wherein customizing speech delivery for the second speech segment comprises at least one of: (1) shortening breaks between words within the second speech segment before delivering the second speech segment to the user device, (2) truncating one or more phrases within the second speech segment before delivering the second speech segment to the user device, and (3) omitting delivery of the second speech segment to the user device.
10. A non-transitory computer-readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors, cause the one or more processors to: generate a speech segment from one or more text strings describing or identifying a media asset having audio data distinct from the generated speech segment; obtain user input requesting a variation in speech delivery accompanying the media asset; in response to the user input, customize the speech segment by modifying selected portions of the speech segment at a server device, wherein the customizing further comprises: automatically detecting one or more repeated portions in the speech segment; and automatically modifying the speech segment by performing one or more of: (1) omitting at least one of the repeated portions from the speech segment, (2) using faster speech patterns for at least one of the repeated portions, (3) shortening breaks between words in at least one of the repeated portions, and (4) truncating one or more phrases in at least one of the repeated portions; and provide the customized speech segment from the server device to a user device for playback with the media asset.
11. The computer-readable storage medium of claim 10 wherein customizing the speech segment by modifying selected portions of the speech segment further comprises: shortening breaks between words within the speech segment to generate the customized speech segment.
12. The computer-readable storage medium of claim 10 wherein the user input specifies one or more preferred information fields among a plurality of information fields available in the speech segment.
13. The computer-readable storage medium of claim 10 wherein the user input requests at least one of fast forwarding and skipping playback of speech content at the user device.
14. The computer-readable storage medium of claim 10 wherein the user input requests omission of repeated information from speech content delivered to the user device.
15. The computer-readable storage medium of claim 10 wherein customizing the speech segment by modifying selected portions of the speech segment further comprises: including in the customized speech segment respective portions of the speech segment corresponding to one or more user-selected information fields, while omitting at least one field of information in the speech segment from the customized speech segment.
16. The computer-readable storage medium of claim 10, wherein the instructions further cause the one or more processors to: detect user input fast forwarding or skipping playback of at least a first speech segment previously delivered to the user device; and in response to the detecting, modify a delivery rate for a second speech segment to be delivered from the client device to the user device.
17. The computer-readable storage medium of claim 10, wherein the instructions further cause the one or more processors to: detect user input fast forwarding or skipping playback of at least a first speech segment previously delivered to the user device; and in response to the detecting, customize speech delivery for a second speech segment to be delivered from the client device to the user device.
18. The computer-readable storage medium of claim 17, wherein customizing speech delivery for the second speech segment comprises at least one of: (1) shortening breaks between words within the second speech segment before delivering the second speech segment to the user device, (2) truncating one or more phrases within the second speech segment before delivering the second speech segment to the user device, and (3) omitting delivery of the second speech segment to the user device.
19. A system, comprising: one or more processors; and memory, the memory storing one or more programs, the one or more programs comprising instructions, which when executed by the one or more processors, cause the one or more processors to: generate a speech segment from one or more text strings describing or identifying a media asset having audio data distinct from the generated speech segment; obtain user input requesting a variation in speech delivery accompanying the media asset; in response to the user input, customize the speech segment by modifying selected portions of the speech segment at a server device, wherein the customizing further comprises: automatically detecting one or more repeated portions in the speech segment; and automatically modifying the speech segment by performing one or more of: (1) omitting at least one of the repeated portions from the speech segment, (2) using faster speech patterns for at least one of the repeated portions, (3) shortening breaks between words in at least one of the repeated portions, and (4) truncating one or more phrases in at least one of the repeated portions; and provide the customized speech segment from the server device to a user device for playback with the media asset.
20. The system of claim 19 wherein customizing the speech segment by modifying selected portions of the speech segment further comprises: shortening breaks between words within the speech segment to generate the customized speech segment.
21. The system of claim 19 wherein the user input specifies one or more preferred information fields among a plurality of information fields available in the speech segment.
22. The system of claim 19 wherein the user input requests at least one of fast forwarding and skipping playback of speech content at the user device.
23. The system of claim 19 wherein the user input requests omission of repeated information from speech content delivered to the user device.
24. The system of claim 19 wherein customizing the speech segment by modifying selected portions of the speech segment further comprises: including in the customized speech segment respective portions of the speech segment corresponding to one or more user-selected information fields, while omitting at least one field of information in the speech segment from the customized speech segment.
25. The system of claim 19, wherein the instructions further cause the one or more processors to: detect user input fast forwarding or skipping playback of at least a first speech segment previously delivered to the user device; and in response to the detecting, modify a delivery rate for a second speech segment to be delivered from the client device to the user device.
26. The system of claim 19, wherein the instructions further cause the one or more processors to: detect user input fast forwarding or skipping playback of at least a first speech segment previously delivered to the user device; and in response to the detecting, customize speech delivery for a second speech segment to be delivered from the client device to the user device.
27. The system of claim 26, wherein customizing speech delivery for the second speech segment comprises at least one of: (1) shortening breaks between words within the second speech segment before delivering the second speech segment to the user device, (2) truncating one or more phrases within the second speech segment before delivering the second speech segment to the user device, and (3) omitting delivery of the second speech segment to the user device.
28. A method for customizing delivery of synthesized speech, the method comprising: generating a speech segment from one or more text strings associated with or identifying a media asset; obtaining user input requesting a variation in speech delivery accompanying the media asset; in response to the user input, customizing the speech segment by modifying selected portions of the speech segment at a server device; and providing the customized speech segment from the server device to a user device for playback with the media asset, wherein customizing the speech segment by modifying selected portions of the speech segment further comprises: automatically detecting one or more repeated portions in the speech segment; and automatically modifying the speech segment by performing one or more of: (1) omitting at least one of the repeated portions from the speech segment, (2) using faster speech patterns for at least one of the repeated portions, (3) shortening breaks between words in at least one of the repeated portions, and (4) truncating one or more phrases in at least one of the repeated portions.
29. A non-transitory computer-readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by one or more processors, cause the one or more processors to: generate a speech segment from one or more text strings associated with or identifying a media asset; obtain user input requesting a variation in speech delivery accompanying the media asset; in response to the user input, customize the speech segment by modifying selected portions of the speech segment at a server device; and provide the customized speech segment from the server device to a user device for playback with the media asset, wherein customizing the speech segment by modifying selected portions of the speech segment further comprises: automatically detecting one or more repeated portions in the speech segment; and automatically modifying the speech segment by performing one or more of: (1) omitting at least one of the repeated portions from the speech segment, (2) using faster speech patterns for at least one of the repeated portions, (3) shortening breaks between words in at least one of the repeated portions, and (4) truncating one or more phrases in at least one of the repeated portions.
30. A system, comprising: one or more processors; and memory, the memory storing one or more programs, the one or more programs comprising instructions, which when executed by the one or more processors, cause the one or more processors to: generate a speech segment from one or more text strings associated with or identifying a media asset; obtain user input requesting a variation in speech delivery accompanying the media asset; in response to the user input, customize the speech segment by modifying selected portions of the speech segment at a server device; and provide the customized speech segment from the server device to a user device for playback with the media asset, wherein customizing the speech segment by modifying selected portions of the speech segment further comprises: automatically detecting one or more repeated portions in the speech segment; and automatically modifying the speech segment by performing one or more of: (1) omitting at least one of the repeated portions from the speech segment, (2) using faster speech patterns for at least one of the repeated portions, (3) shortening breaks between words in at least one of the repeated portions, and (4) truncating one or more phrases in at least one of the repeated portions.
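The repeated-portion handling recited in the independent claims can be illustrated with a rough sketch. It is not the patented implementation: it assumes, purely for illustration, that a speech segment is represented as an ordered list of text portions prior to audio rendering, and it shows only option (1), omitting repeated portions. The function names `detect_repeated` and `omit_repeated` are hypothetical.

```python
from typing import List


def detect_repeated(portions: List[str]) -> List[int]:
    """Return indices of portions whose text already occurred earlier."""
    seen = set()
    repeated = []
    for index, portion in enumerate(portions):
        key = portion.strip().lower()  # compare case-insensitively
        if key in seen:
            repeated.append(index)
        else:
            seen.add(key)
    return repeated


def omit_repeated(portions: List[str]) -> List[str]:
    """Drop every repeated portion, keeping the first occurrence of each."""
    skip = set(detect_repeated(portions))
    return [p for i, p in enumerate(portions) if i not in skip]


if __name__ == "__main__":
    segment = ["Track one", "by The Band", "Track two", "by The Band"]
    print(detect_repeated(segment))
    print(omit_repeated(segment))
```

The other recited options (faster speech patterns, shorter breaks between words, truncated phrases) operate on the rendered audio or on synthesis parameters and would require a speech synthesizer, so they are not sketched here.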