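Country / Type | United States (US) Patent, Registered |
---|---|
International Patent Classification (IPC, 7th edition) | |
Application number | US-0090544 (2016-04-04) |
Registration number | US-9626955 (2017-04-18) |
Inventor / Address | |
Applicant / Address | |
Agent / Address | |
Citation information | Times cited: 6; Patents cited: 2022 |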
Techniques for improved text-to-speech processing are disclosed. The improved text-to-speech processing can convert text from an electronic document into an audio output that includes speech associated with the text as well as audio contextual cues. One aspect provides audio contextual cues to the listener when outputting speech (spoken text) pertaining to a document. The audio contextual cues can be based on an analysis of a document prior to a text-to-speech conversion. Another aspect can produce an audio summary for a file. The audio summary for a document can thereafter be presented to a user so that the user can hear a summary of the document without having to process the document to produce its spoken text via text-to-speech conversion.
1. A method for converting text to speech, the method comprising: at an electronic device with a processor and memory storing one or more programs for execution by the processor: parsing a document to identify a plurality of text elements in the document to be converted to speech, the subset of text having a markup tag; creating an announcement comprising a spoken description of a context related to the markup tag; and generating audio that includes the spoken form of the subset of text and the announcement, wherein the announcement is spoken prior to the spoken form of the subset of text.
2. The method of claim 1, wherein the context is a footnote.
3. The method of claim 1, wherein the context is a title.
4. The method of claim 1, further comprising: identifying a second subset of text while parsing the document, the second subset of text having a second markup tag that is different from the markup tag; and creating a second announcement comprising a spoken description of a second context; wherein the generated audio includes a spoken form of the second subset of text and the second announcement, wherein the second announcement is spoken prior to the spoken form of the second subset of text.
5. The method of claim 1, wherein the document does not include text corresponding to the announcement.
6. The method of claim 1, further comprising: identifying a non-text element of the document while parsing the document; and creating an audio cue that represents the non-text element in the document, wherein the generated audio includes the audio cue.
7. The method of claim 6, wherein the non-text element is an image.
8. The method of claim 6, wherein the non-text element is a hyperlink.
9. The method of claim 1, further comprising: generating a text-to-speech processing script that includes the subset of text and the announcement, wherein the text-to-speech processing script is processed to generate the audio.
10. A non-transitory computer-readable storage medium comprising instructions for causing one or more processors to: parse a document to identify a plurality of text elements in the document to be converted to speech, the subset of text having a markup tag; create an announcement comprising a spoken description of a context related to the markup tag; and generate audio that includes the spoken form of the subset of text and the announcement, wherein the announcement is spoken prior to the spoken form of the subset of text.
11. The non-transitory computer-readable storage medium of claim 10, wherein the context is a footnote.
12. The non-transitory computer-readable storage medium of claim 10, wherein the context is a title.
13. The non-transitory computer-readable storage medium of claim 10, further comprising instructions for: identifying a second subset of text while parsing the document, the second subset of text having a second markup tag that is different from the markup tag; and creating a second announcement comprising a spoken description of a second context; wherein the generated audio includes a spoken form of the second subset of text and the second announcement, wherein the second announcement is spoken prior to the spoken form of the second subset of text.
14. The non-transitory computer-readable storage medium of claim 10, wherein the document does not include text corresponding to the announcement.
15. The non-transitory computer-readable storage medium of claim 10, further comprising: identifying a non-text element of the document while parsing the document; and creating an audio cue that represents the non-text element in the document, wherein the generated audio includes the audio cue.
16. The non-transitory computer-readable storage medium of claim 15, wherein the non-text element is an image.
17. The non-transitory computer-readable storage medium of claim 15, wherein the non-text element is a hyperlink.
18. An electronic device comprising: one or more processors; memory; one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for: parsing a document to identify a plurality of text elements in the document to be converted to speech, the subset of text having a markup tag; creating an announcement comprising a spoken description of a context related to the markup tag; and generating audio that includes the spoken form of the subset of text and the announcement, wherein the announcement is spoken prior to the spoken form of the subset of text.
19. The electronic device of claim 18, wherein the context is a footnote.
20. The electronic device of claim 18, wherein the context is a title.
21. The electronic device of claim 18, wherein the one or more programs further comprise instructions for: identifying a second subset of text while parsing the document, the second subset of text having a second markup tag that is different from the markup tag; and creating a second announcement comprising a spoken description of a second context; wherein the generated audio includes a spoken form of the second subset of text and the second announcement, wherein the second announcement is spoken prior to the spoken form of the second subset of text.
22. The electronic device of claim 18, wherein the one or more programs further comprise instructions for: identifying a non-text element of the document while parsing the document; and creating an audio cue that represents the non-text element in the document, wherein the generated audio includes the audio cue.
23. The electronic device of claim 18, wherein the document does not include text corresponding to the announcement.
24. The electronic device of claim 23, wherein the non-text element is an image.
25. The electronic device of claim 23, wherein the non-text element is a hyperlink.
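The flow recited in the claims (parse a document, attach a spoken announcement to text carrying a markup tag, add audio cues for non-text elements, and collect everything into a text-to-speech processing script) can be illustrated with a minimal sketch. This is not the patented implementation. It assumes an HTML-like document, and the tag-to-announcement mapping, the cue labels, and the names `ScriptBuilder` and `build_script` are hypothetical choices made only for illustration. The output is an ordered list of entries that a speech synthesizer could later consume; no audio is produced here.

```python
from html.parser import HTMLParser

# Hypothetical mapping from markup tags to spoken context descriptions.
ANNOUNCEMENTS = {"h1": "Title.", "aside": "Footnote."}
# Hypothetical mapping from non-text elements to audio cue placeholders.
AUDIO_CUES = {"img": "[cue: image]", "a": "[cue: hyperlink]"}


class ScriptBuilder(HTMLParser):
    """Collects an ordered text-to-speech script from a marked-up document."""

    def __init__(self):
        super().__init__()
        self.script = []     # ordered entries to be spoken or played
        self.pending = None  # announcement waiting for its subset of text

    def handle_starttag(self, tag, attrs):
        if tag in ANNOUNCEMENTS:
            self.pending = ANNOUNCEMENTS[tag]
        elif tag in AUDIO_CUES:
            self.script.append(AUDIO_CUES[tag])

    def handle_endtag(self, tag):
        # Drop an announcement whose element contained no text.
        if tag in ANNOUNCEMENTS:
            self.pending = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self.pending:
            # The announcement is placed before the text it describes.
            self.script.append(self.pending)
            self.pending = None
        self.script.append(text)


def build_script(document):
    """Return the script entries for an HTML-like document string."""
    builder = ScriptBuilder()
    builder.feed(document)
    builder.close()
    return builder.script
```

For example, `build_script('<h1>Report</h1><aside>See page 2.</aside>')` places the entry "Title." before "Report" and "Footnote." before "See page 2.", even though neither announcement appears as text in the document itself.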
Copyright KISTI. All Rights Reserved.