IPC Classification Information
Country / Type |
United States(US) Patent
Granted
|
International Patent Classification (IPC, 7th edition) |
|
Application Number |
US-0079964
(2013-11-14)
|
Registration Number |
US-9247204
(2016-01-26)
|
Inventor / Address |
- Yin, Li
- Zhang, Hong
- Wang, Zhe
|
Applicant / Address |
|
Agent / Address |
Shumaker & Sieffert, P.A.
|
Citation Information |
Times cited: 3
Patents cited: 5
|
Abstract
In one example, a device executes one or more video communication processes that receive audio streams and video streams from a plurality of computing devices participating in a video communication session associated with the one or more video communication processes. The device evaluates one or more properties of the audio streams, including the volume of an audio signal among the audio streams. The device selects a first group of the audio streams to mute in the video communication session, based at least in part on the one or more properties of the audio streams. The device distributes a second group of the audio streams in the video communication session, while muting the first group of audio streams in the video communication session.
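The selection step in the abstract can be illustrated with a minimal sketch. It assumes that volume is measured as the root-mean-square level of a block of samples and that a stream is muted when that level falls below a fixed threshold. The abstract names volume as one evaluated property but does not specify a measure or a rule, so `rms_volume`, `split_streams`, and `min_volume` are hypothetical names chosen for illustration only.

```python
import math
from typing import Dict, List, Sequence, Tuple


def rms_volume(samples: Sequence[float]) -> float:
    """Root-mean-square level of one block of samples (0.0 for an empty block)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def split_streams(
    streams: Dict[str, Sequence[float]],
    min_volume: float = 0.05,  # assumed threshold, not taken from the patent
) -> Tuple[List[str], List[str]]:
    """Split stream ids into (first group to mute, second group to distribute)."""
    muted: List[str] = []
    distributed: List[str] = []
    for stream_id, samples in streams.items():
        if rms_volume(samples) < min_volume:
            muted.append(stream_id)
        else:
            distributed.append(stream_id)
    return muted, distributed
```

A real system would likely combine volume with other properties; the single-threshold rule here only shows the shape of the two-group split.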
Representative Claim
1. A method comprising: receiving, by a computing device and from a plurality of communication devices participating in a video communication session, a plurality of audio streams and one or more video streams; determining, by the computing device, whether each respective audio stream of the plurality of audio streams is associated with human speech from a single speaker; determining, by the computing device, respective user roles associated with the plurality of audio streams, the respective user roles including a main presenter of the video communication session; identifying, by the computing device and based on the respective user roles, a main audio stream of the plurality of audio streams, wherein the main audio stream is associated with a respective user role of the main presenter of the video communication session; responsive to determining that one or more audio streams of the plurality of audio streams are not associated with human speech from the single speaker and that none of the one or more audio streams is the main audio stream, muting, by the computing device and in the video communication session, the one or more audio streams; and outputting, by the computing device and in the video communication session, one or more remaining audio streams of the plurality of audio streams that are not included in the one or more audio streams.
2. The method of claim 1, further comprising: outputting at least one video stream of the one or more video streams received from the communication devices in the video communication session.
3. The method of claim 1, wherein each respective audio stream included in the plurality of the audio streams is based at least in part on audio data captured by an audio input device associated with a respective communication device of the plurality of the communication devices, and wherein each video stream of the one or more video streams is based at least in part on video data captured by a video input device associated with a respective one of the communication devices.
4. The method of claim 1, further comprising: receiving, by the computing device, an input for manipulating a document represented by document data, wherein muting the one or more audio streams is further responsive to determining that the one or more audio streams are not conveying the input for manipulating the document represented by the document data.
5. The method of claim 1, wherein determining the respective user roles associated with the plurality of audio streams comprises receiving, by the computing device, data indicating respective user roles associated with the one or more of the communication devices, and wherein muting the one or more audio streams is further responsive to determining the respective user roles.
6. The method of claim 1, further comprising: detecting, by the first computing device, a decrease in volume in at least one audio stream of the one or more remaining audio streams; and responsive to detecting the decrease in volume in the at least one audio stream of the one or more remaining audio streams, outputting, by the computing device, at least one audio stream of the one or more audio streams that were previously muted.
7. The method of claim 6, wherein detecting the decrease in volume in the at least one audio stream of the one or more remaining audio streams comprises evaluating, by the computing device and over multiple intervals of time, an audio signal in the at least one audio stream of the one or more remaining audio streams.
8. The method of claim 7, wherein evaluating the audio signal in the at least one audio stream of the one or more remaining audio streams over the multiple intervals of time comprises evaluating the audio signal in the at least one audio stream of the one or more remaining audio streams over intervals of between one second and ten seconds inclusive.
9. The method of claim 1, further comprising: evaluating, by the computing device, the one or more video streams to determine whether each respective video stream of the one or more video streams is conveying video data that represents a human figure, wherein selecting the one or more of audio streams to mute in the video communication session is further based at least in part on determining that the one or more of the video streams are not conveying the video data that represents the human figure.
10. A computing device, comprising: a memory; and at least one processor configured to: receive, from a plurality of communication devices participating in a video communication session, a plurality of audio streams and one or more video streams; determine whether each respective audio stream of the plurality of audio streams is associated with human speech from a single speaker; determine respective user roles associated with the plurality of audio streams, the respective user roles including a main presenter of the video communication session; identify, based on the respective user roles, a main audio stream of the plurality of audio streams, wherein the main audio stream is associated with a respective user role of the main presenter of the video communication session; responsive to a determination that one or more audio streams of the plurality of audio streams are not associated with human speech from the single speaker and that none of the respective audio streams is the main audio stream, select one or more mute, in the video communication session, the one or more respective audio streams; and output, in the video communication session, one or more remaining audio streams of the plurality of audio streams that are not included in the audio streams.
11. The computing device of claim 10, wherein the at least one processor is further configured to: output at least one video stream of the one or more video streams received from each of the plurality of communication devices in the video communication session.
12. The computing device of claim 10, wherein each respective audio stream included in the plurality of the audio streams is based at least in part on audio data captured by an audio input device associated with a respective communication device of the plurality of the communication devices, and wherein each video stream of the one or more video streams is based at least in part on video data captured by a video input device associated with a respective one of the communication devices.
13. The computing device of claim 10, wherein the at least one processor is further configured to: receive an input for manipulating a document represented by document data, wherein the muting of the one or more audio streams is further responsive to a determination that the one or more audio streams are not conveying the input for manipulating the document represented by the document data.
14. The computing device of claim 10, wherein to determine the respective user roles associated with the plurality of audio streams, the at least one processor is configured to receive data indicating respective user roles associated with the one or more of the communication devices, and wherein, to mute the one or more respective audio streams, the at least one processor is configured to select the one or more audio streams responsive to a determination of the user roles.
15. The computing device of claim 10, wherein the at least one processor is further configured to: detect a decrease in volume in at least one audio stream of the one or more remaining audio streams; and responsive to the detection of the decrease in volume in the at least one audio stream, output at least one audio stream of the one or more audio streams that were previously muted.
16. The computing device of claim 15, wherein, to detect the decrease in volume in the at least one audio stream of the one or more remaining audio streams, the at least one processor is configured to evaluate, over multiple intervals of time, an audio signal in the at least one audio stream of the one or more remaining audio streams.
17. The computing device of claim 16, wherein, to evaluate the audio signal in the at least one audio stream of the one or more remaining audio streams over the multiple intervals of time, the at least one processor is configured to evaluate the audio signal in the at least one audio stream of the one or more remaining audio streams over intervals of between one second and ten seconds inclusive.
18. The computing device of claim 10, wherein the at least one processor is further configured to: evaluate the one or more video streams to determine whether each respective video stream of the one or more video streams is conveying video data that represents a human figure, wherein, to select the one or more audio streams to mute in the video communication session, the at least one processor is configured to select the one or more respective audio streams further based at least in part on a determination that the one or more of the video streams are not conveying the video data that represents the human figure.
19. A non-transitory computer-readable storage medium comprising executable instructions for causing at least one processor of a computing device to perform operations comprising: receive, from a plurality of communication devices participating in a video communication session, a plurality of audio streams and one or more video streams; determining whether each respective audio stream of the plurality of audio streams is associated with human speech from a single speaker; determining respective user roles associated with the plurality of audio streams, the respective user roles including a main presenter of the video communication session; identifying, based on the respective user roles, a main audio stream of the plurality of audio streams, wherein the main audio stream is associated with a respective user role of the main presenter of the video communication session; responsive to determining that one or more audio streams of the plurality of audio streams are not associated with human speech from the single speaker and that none of the one or more audio streams is the main audio stream, muting, in the video communication session, the one or more audio streams; and outputting, in the video communication session, one or more remaining audio streams of the plurality of audio streams that are not included in the one or more audio streams.
20. The non-transitory computer-readable storage medium of claim 19, further comprising executable instructions for causing the at least one processor of the computing device to perform operations comprising: receiving an input for manipulating a document represented by document data, wherein muting the one or more audio streams is further responsive to determining that the one or more audio streams are not conveying the input for manipulating the document represented by the document data.
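As an informal aid to reading the claims, the sketch below shows one possible reading of the muting condition in claim 1 and of the volume-decrease check in claims 6 to 8. It is not the patented implementation and carries no weight for claim interpretation. The speech classification is assumed to be supplied by an upstream component, and the names `AudioStream`, `MAIN_PRESENTER`, `partition_streams`, `volume_decreased`, and `drop_ratio` are hypothetical. The claims do not say how a decrease is quantified, so the ratio test is an assumption.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

MAIN_PRESENTER = "main_presenter"  # hypothetical role label


@dataclass
class AudioStream:
    stream_id: str
    role: str                    # user role reported for the sending device
    single_speaker_speech: bool  # assumed output of an upstream speech classifier


def partition_streams(
    streams: Sequence[AudioStream],
) -> Tuple[List[AudioStream], List[AudioStream]]:
    """Return (muted, remaining) under one reading of claim 1.

    A stream is muted only if it is not single-speaker speech AND it is not
    the main presenter's stream; every other stream is output.
    """
    muted: List[AudioStream] = []
    remaining: List[AudioStream] = []
    for stream in streams:
        is_main = stream.role == MAIN_PRESENTER
        if not stream.single_speaker_speech and not is_main:
            muted.append(stream)
        else:
            remaining.append(stream)
    return muted, remaining


def volume_decreased(interval_levels: Sequence[float], drop_ratio: float = 0.5) -> bool:
    """Check for a volume drop across several intervals (claims 6 to 8).

    Each entry is the level measured over one interval, which claim 8 bounds
    at one to ten seconds. The latest interval is compared with the mean of
    the earlier ones; the ratio test itself is an assumption of this sketch.
    """
    if len(interval_levels) < 2:
        return False
    *earlier, latest = interval_levels
    baseline = sum(earlier) / len(earlier)
    return baseline > 0 and latest < drop_ratio * baseline
```

Under this reading, when `volume_decreased` returns true for a stream in the remaining group, a previously muted stream could be moved back into the output, as claim 6 describes.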