[Patent] Latency reduction for content playback

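IPC classification information
Country / Type	United States (US) Patent, Granted
International Patent Classification (IPC 7th ed.)	G06F-003/16 G10L-013/04 G06F-017/30
Application number	US-0195464 (2016-06-28)
Registration number	US-9990176 (2018-06-05)
Inventor / Address	Gray, Timothy Thomas
Applicant / Address	Amazon Technologies, Inc.
Agent / Address	Pierce Atwood LLP
Citation information	Times cited: 0; Patents cited: 4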

Abstract

Methods and devices for determining whether a local version of content is stored on an electronic device associated with a user account on a backend system are described herein. In a non-limiting embodiment, the backend system may track and monitor the content stored on the electronic device using the associated user account. If an individual speaks an utterance requesting a particular content item, the backend system may determine, prior to sending the content to the electronic device, whether a local version is stored within the electronic device's memory. If so, the backend system may instruct the electronic device to output the local version, thereby reducing the amount of bandwidth consumed. The backend system may further be capable of predictively generating and then caching certain audio data to the electronic device. For instance, frequent utterances may be tracked, and likely responses to those utterances may be generated prior to the utterance being spoken so that the response is available substantially instantaneously.
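
The predictive caching described at the end of the abstract can be illustrated with a minimal sketch. It assumes a simple per-account counter; the class name `UtteranceTracker`, the text normalization, and the threshold value are hypothetical and do not come from the patent, which only states that frequent utterances are tracked and that their responses are generated in advance.

```python
from collections import Counter

# Hypothetical value; the patent only refers to a "frequent utterance threshold value".
FREQUENT_UTTERANCE_THRESHOLD = 3


class UtteranceTracker:
    """Counts utterances for one account and flags those that become frequent."""

    def __init__(self, threshold=FREQUENT_UTTERANCE_THRESHOLD):
        self.threshold = threshold
        self.counts = Counter()
        self.cached = set()  # utterances whose response was already pre-generated

    @staticmethod
    def _normalize(utterance_text):
        # Assumed normalization: lowercase and collapse whitespace.
        return " ".join(utterance_text.lower().split())

    def record(self, utterance_text):
        """Record one utterance; return True when a response should be pre-generated."""
        key = self._normalize(utterance_text)
        self.counts[key] += 1
        # Strictly greater than the threshold, and only once per utterance.
        if self.counts[key] > self.threshold and key not in self.cached:
            self.cached.add(key)
            return True
        return False

    def is_cached(self, utterance_text):
        """Report whether a response for this utterance was already pre-generated."""
        return self._normalize(utterance_text) in self.cached
```

In this sketch, a `True` result from `record` is the point at which a backend would generate the response audio and send it to the device for storage, so that a later instance of the same utterance can be answered from the device's memory.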

Representative claims

1. A method, comprising:
    receiving from a first user device, at an electronic device, first audio data representing a first utterance;
    determining a first customer identifier associated with the first user device;
    determining, using the first customer identifier, a user account on the electronic device, wherein the user account is associated with the first user device;
    generating first text data representing the first audio data by executing speech-to-text functionality on the first audio data;
    determining, using the first text data, that a first intent of the first utterance is for a song to be played;
    determining a download history for the user account, the download history indicating content that has been downloaded from the electronic device by one or more devices associated with the user account;
    determining, based on the download history, that first song audio data representing the song was previously downloaded to the first user device from the electronic device;
    determining a first user device profile associated with the user account, the first user device profile being associated with the first user device and indicating content items that are currently stored by the first user device;
    determining, from the first user device profile, that the first song audio data is stored in memory by the first user device;
    generating a first instruction to cause the first user device to play the first song audio data;
    sending the first instruction to the first user device;
    receiving, at the electronic device, second audio data representing a second utterance that requests additional music to be played, the second audio data being received from the first user device;
    generating second text data representing the second audio data by executing the speech-to-text functionality on the second audio data;
    determining, using the second text data, that a second intent of the second utterance is for a new song to be played;
    determining, based on the download history, that second song audio data representing the new song is not stored within the memory;
    determining, based on the download history, that a second user device associated with the user account had previously downloaded the second song audio data;
    determining that the first user device and the second user device are capable of communicating directly with each other using a direct communications link;
    generating a second instruction that causes the first user device to request that the second user device send the second song audio data to the first user device using the direct communications link; and
    sending the second instruction to the first user device.

2. The method of claim 1, further comprising:
    generating, in response to determining that the first song audio data is stored in the memory, third text data representing a first audio message to introduce the song to be played;
    generating third audio data representing the third text data by executing text-to-speech functionality on the third text data; and
    sending the third audio data to the first user device such that the first audio message is played prior to the first song audio data being played.

3. The method of claim 1, further comprising:
    determining a number of instances with which a third utterance is received from the first user device;
    determining that the number is greater than a frequent utterance threshold value indicating that the third utterance is a frequent utterance;
    determining a response for the third utterance prior to receiving an additional instance of the third utterance from the first user device;
    generating third text data representing the response;
    generating third audio data representing the third text data by executing text-to-speech functionality on the third text data;
    sending the third audio data to the first user device such that the third audio data is stored within the memory;
    receiving, at the electronic device, fourth audio data representing a fourth utterance;
    generating fourth text data representing the fourth audio data by executing the speech-to-text functionality on the fourth audio data;
    determining, using fourth text data, that the fourth utterance is the frequent utterance;
    determining, based on the first user device profile, that the first user device includes the third audio data stored within the memory;
    generating a third instruction to cause the first user device to play the third audio data; and
    sending the third instruction to the first user device.

4. A method, comprising:
    receiving, from a first device, first audio data representing a first utterance;
    determining a user account associated with the first device;
    determining, based on first text data representing the first audio data, that a first intent of the first utterance is for first content to be output;
    determining, for the user account, content information associated with at least the first device;
    determining, based on the content information, that a first local version of the first content is stored on the first device;
    generating a first instruction for the first local version to be output by the first device;
    sending the first instruction to the first device;
    receiving, from the first device, second audio data representing a second utterance;
    determining, based on second text data representing the second audio data, that a second intent of the second utterance is for second content to be output;
    determining that a second device is also associated with the user account;
    determining, based on the content information, that a second local version of the second content is stored on the second device; and
    determining that the second device and the first device are capable of communicating using at least one short-range communications protocol.

5. The method of claim 4, further comprising:
    generating, prior to generating the first instruction, third text data representing a first response;
    generating third audio data representing the third text data; and
    sending the third audio data to the first device such that the first response outputs prior to the first local version.

6. The method of claim 4, further comprising:
    generating a second instruction that causes the second device to send the second local version to the first device using the at least one short-range communications protocol; and
    sending the second instruction to the first device.

7. The method of claim 4, further comprising:
    determining, prior to generating the first instruction, a first file size of the first content;
    determining that the first file size is greater than a predefined file size threshold; and
    determining that, for the user account, the first local version is to be output prior to sending a link to the first content to the first device based on the first file size being greater than the predefined file size threshold.

8. The method of claim 4, further comprising:
    determining frequent utterances associated with the user account;
    generating, prior to receiving third audio data representing one of the frequent utterances, third text data representing at least one response to the frequent utterances;
    generating third audio data representing the third text data; and
    sending the third audio data to the first device such that the at least one response is available to be output by the first device.

9. The method of claim 4, further comprising:
    receiving, from the first device, third audio data representing a third utterance;
    determining, based on third text data representing the third audio data, that a third intent of the third utterance is for third content to be output by the first device;
    determining, from the content information, that the first device does not include a third local version of the third content;
    determining that the second device is incapable of communicating with the first device using the at least one short range communications protocol;
    generating a link for the third content stored with a remote device; and
    sending the link to the first device such that the third content is output.

10. The method of claim 4, further comprising:
    receiving, from the first device, third audio data representing a third utterance;
    determining that a response is to be output, the response having a first temporal duration;
    determining, from the content information, that fourth audio data of the response is stored on the first device;
    generating a second instruction that causes the fourth audio data to be output by the first device; and
    sending the second instruction to the first device such that the response is output while a third intent of the third utterance is being determined.

11. An electronic device, comprising:
    communications circuitry operable to communicate with at least a first device;
    memory; and
    at least one processor operable to:
        receive, from a first device, first audio data representing a first utterance;
        determine a user account associated with the first device;
        determine, based on first text data representing the first audio data, that a first intent of the first utterance is for first content to be output;
        determine that a first local version of the first content is stored on the first device;
        generate second text data representing a first response;
        generate second audio data representing the second text data;
        generate a first instruction for the first local version to be output by the first device;
        send, using the communications circuitry, the first instruction and the second audio data to the first device such that the first local version is output after the second audio data;
        receive, from the first device, second audio data representing a second utterance;
        generate second text data from the second audio data by applying speech-to-text processing to the second audio data;
        determine, based on the second text data, that a second intent of the second utterance is for second content to be output by the first device;
        determine, from content information associated with at least the first device, that the first device does not include a second local version of the second content;
        determine that there are no additional devices associated with the user account that are capable to send content to the first device using a short-range communications protocol;
        generate a link between the first device and a remote device storing a third local version of the second content; and
        send, using the communications circuitry, the link to the remote device such that the second content is output to the first device.

12. The electronic device of claim 11, wherein the at least one processor is further operable to:
    determine, using the content information, that the first local version is stored on the first device.

13. The electronic device of claim 11, wherein the at least one processor is further operable to:
    receive, from the first device, third audio data representing a third utterance;
    determine, based on third text data representing the third audio data, that a third intent of the third utterance is for third content to be output;
    determine that a second device is also associated with the user account;
    determine, based on the content information, that a third local version of the third content is stored on the second device; and
    determine, based on a first separation distance between the first device and the second device being less than a separation distance threshold, that the second device and the first device are capable of communicating using at least one short-range communications protocol.

14. The electronic device of claim 13, wherein the at least one processor is further operable to:
    generate a second instruction that causes the second device to send the third local version to the first device using the at least one short-range communications protocol; and
    send, using the communications circuitry, the second instruction to the first device.

15. The electronic device of claim 11, wherein the at least one processor is further operable to:
    determine, prior to generating the first instruction, a first file size of the first content;
    determine that the first file size is greater than a predefined file size threshold; and
    determine that, for the user account, the first local version is to be output prior to sending a link to the first content to the first device based on the first file size being greater than the predefined file size threshold.

16. The electronic device of claim 11, wherein the at least one processor is further operable to:
    determine frequent utterances associated with the user account;
    generate, prior to receiving further audio data representing one of the frequent utterances, third text data representing at least one second response to the frequent utterances;
    generate third audio data representing the third text data; and
    send, using the communications circuitry, the third audio data to the first device such that the at least one second response is available to be output by the first device.

17. The electronic device of claim 11, wherein the at least one processor is further operable to:
    receive third audio data representing a third utterance from the first device;
    determine that a second response is to be output, the second response having a first temporal duration;
    determine, from the content information, that fourth audio data of the second response is stored on the first device;
    generate a second instruction that causes the fourth audio data to be output by the first device; and
    send, using the communications circuitry, the second instruction to the first device such that the second response is output while a third intent of the third utterance is being determined.
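
Read together, claims 4, 6, 9 and 11 suggest an order of preference for where requested content comes from: a local version on the requesting device, then another account device reachable over a short-range protocol, then a link to a remote copy. The following is a rough sketch of that ordering only, not the claimed implementation. `DeviceProfile`, `choose_source`, the action labels, and the representation of short-range reachability as a set of peer IDs are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DeviceProfile:
    """Hypothetical record of what the backend knows about one account device."""
    device_id: str
    stored_content: set = field(default_factory=set)
    # IDs of devices this one can reach over a short-range link (assumed representation).
    short_range_peers: set = field(default_factory=set)


def choose_source(content_id, requester, account_devices):
    """Return an (action, target) pair describing where the content should come from."""
    # 1. Prefer a local version already stored on the requesting device.
    if content_id in requester.stored_content:
        return ("play_local", requester.device_id)

    # 2. Otherwise look for another account device that stores the content
    #    and can reach the requester over a short-range protocol.
    for device in account_devices:
        if device.device_id == requester.device_id:
            continue
        has_copy = content_id in device.stored_content
        reachable = device.device_id in requester.short_range_peers
        if has_copy and reachable:
            return ("fetch_from_peer", device.device_id)

    # 3. Fall back to sending a link to a remote copy of the content.
    return ("send_link", content_id)
```

For example, with an account holding three devices where only the first two are within short-range reach of each other, a request from the first device for content stored only on the third falls through to the remote link:

```python
echo = DeviceProfile("echo", {"song-a"}, {"phone"})
phone = DeviceProfile("phone", {"song-b"})
tablet = DeviceProfile("tablet", {"song-c"})
print(choose_source("song-c", echo, [echo, phone, tablet]))
```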

Patents cited by this patent (4)

Chi, Liang-Yu (Tom), Dynamic presentation of data items based on prioritized associations.
Sone, Masahiro, Household consumable item automatic replenishment system including intelligent refrigerator.
Volkert, Christopher, Search assistant for digital media assets.
Roberts, Linda; Nguyen, Hong Thi; Schroeter, Horst J, System and method for synthetically generated speech describing media content.

