| Country / Type | United States (US) Patent, Granted |
|---|---|
| International Patent Classification (IPC, 7th ed.) | |
| Application Number | US-0160659 (2002-05-31) |
| Inventor / Address | |
| Applicant / Address | |
| Agent / Address | |
| Citation Information | Cited by: 221 / Cites: 4 patents |
The present invention is directed toward a system and process that controls a group of networked electronic components using a multimodal integration scheme in which inputs from a speech recognition subsystem, a gesture recognition subsystem employing a wireless pointing device, and a pointing analysis subsystem also employing the pointing device are combined to determine what component a user wants to control and what control action is desired. In this multimodal integration scheme, the desired action concerning an electronic component is decomposed into a command and a referent pair. The referent can be identified using the pointing device to identify the component by pointing at the component or an object associated with it, by using speech recognition, or both. The command may be specified by pressing a button on the pointing device, by a gesture performed with the pointing device, by a speech recognition event, or by any combination of these inputs.
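As a rough illustration of the command/referent decomposition described above, the sketch below resolves a (referent, command) pair from whichever modalities produced input. The field names, fusion priorities, and the fallback "toggle" command are assumptions for illustration; the patent's actual integration is the dynamic Bayes network of claim 1 below.

```python
# Illustrative sketch only: field names, fusion priorities, and the fallback
# "toggle" command are assumptions, not the patented integration method.
from dataclasses import dataclass
from typing import Optional

@dataclass
class MultimodalInput:
    pointing_target: Optional[str] = None  # component the pointer is aimed at
    speech_referent: Optional[str] = None  # component named by speech
    speech_command: Optional[str] = None   # command word or phrase recognized
    gesture_command: Optional[str] = None  # command gesture recognized
    button_click: bool = False             # switch on the pointing device

def interpret(inp: MultimodalInput) -> Optional[tuple]:
    """Resolve a (referent, command) pair from whichever modalities fired."""
    # The referent may come from pointing, speech, or both.
    referent = inp.speech_referent or inp.pointing_target
    # The command may come from speech, a gesture, or a button press.
    command = inp.speech_command or inp.gesture_command or (
        "toggle" if inp.button_click else None)
    # An incomplete pair means the system should wait for more input.
    return (referent, command) if referent and command else None
```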
Wherefore, what is claimed is:

1. A multimodal electronic component control system comprising: an object selection subsystem; a gesture recognition subsystem; a speech control subsystem; and an integration subsystem into which the object selection, gesture recognition and speech control subsystems provide inputs, said integration subsystem integrating said inputs to arrive at a unified interpretation of what component a user wants to control and what control action is desired, and wherein the integration subsystem comprises a dynamic Bayes network which determines, from the individual inputs of the object selection, gesture recognition, and speech control subsystems, the identity of a component the user wants to control (i.e., the referent), a command that the user wishes to implement (i.e., the command), and the appropriate control action to be taken to affect the identified referent in view of the command, said dynamic Bayes network comprising input, referent, command and action nodes, wherein the input nodes include said individual inputs which provide information as to their state to at least one of a referent, command, or action node, said inputs determining the state of the referent and command nodes, and wherein the states of the referent and command nodes are fed into an action node whose state indicates the action that is to be implemented to affect the referent, and wherein said referent, command and action node states comprise probability distributions indicating the probability that each possible referent, command and action is the respective referent, command and action.

2. The system of claim 1, wherein the object selection subsystem is pointer based in that an electronic component is selected by a user pointing a pointing device at an object that corresponds to, or is associated with, the electronic component, and wherein the input nodes providing information to the referent node comprise a pointing target node which indicates to the referent node what electronic component is associated with the object that the user is currently pointing at.

3. The system of claim 1, wherein the speech control subsystem identifies an electronic component corresponding to a prescribed component name spoken by a user, and wherein the input nodes providing information to the referent node comprise a speech referent node which indicates to the referent node what electronic component the user currently identified by speaking the prescribed name for that component.

4. The system of claim 1, wherein the states of all nodes of the dynamic Bayes network are updated on a prescribed periodic basis to incorporate new information, and wherein the input nodes providing information to the referent node comprise a prior state node whose state reflects the referent node's state in the update period immediately preceding a current period, and wherein the input from the prior state node is weighted such that its influence on the state of the referent node decreases in proportion to the amount of time that has passed since the prior state node first acquired the state being input.
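Claims 1 through 4 describe nodes holding probability distributions over referents and commands, updated periodically, with a prior-state input whose influence fades over time. The following is a minimal sketch of that behavior, assuming an exponential decay toward a uniform distribution; the actual network structure, conditional probabilities, and decay rule are not given here, so these are illustrative choices.

```python
# Minimal sketch of a node holding a probability distribution, updated by
# evidence and by a time-decaying prior state (claims 1-4). The exponential
# decay toward uniform is an assumed weighting, not the patented one.
import time

class DistributionNode:
    def __init__(self, states, decay_per_sec=0.5):
        self.p = {s: 1.0 / len(states) for s in states}  # start uniform
        self.decay = decay_per_sec
        self.last_evidence = time.time()

    def observe(self, likelihood):
        """Fold in evidence; likelihood maps state -> P(input | state)."""
        for s in self.p:
            self.p[s] *= likelihood.get(s, 1e-6)
        self._normalize()
        self.last_evidence = time.time()

    def step(self):
        """Carry the prior state forward with influence that decreases the
        longer it has been since evidence arrived (claim 4's weighting)."""
        age = time.time() - self.last_evidence
        w = self.decay ** age              # weight of the prior state
        u = 1.0 / len(self.p)
        for s in self.p:
            self.p[s] = w * self.p[s] + (1.0 - w) * u
        self._normalize()

    def _normalize(self):
        z = sum(self.p.values())
        for s in self.p:
            self.p[s] /= z

def most_likely(referent: DistributionNode, command: DistributionNode):
    """Action-node sketch: report the most probable referent and command."""
    return (max(referent.p, key=referent.p.get),
            max(command.p, key=command.p.get))
```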
5. The system of claim 1, wherein the object selection subsystem is pointer based in that an electronic component is selected by a user pointing a pointing device at an object that corresponds to, or is associated with, the electronic component, and wherein the input nodes providing information to the command node comprise a button click node which indicates to the command node that a user-operated switch resident on the pointer of the object selection subsystem has been activated.

6. The system of claim 1, wherein the gesture recognition subsystem identifies a command indicative of a desired action for affecting an electronic component which corresponds to a prescribed gesture performed by the user with a pointing device, and wherein the input nodes providing information to the command node comprise a gesture node which indicates to the command node what command the user currently identified by performing the gesture associated with that command.

7. The system of claim 1, wherein the speech control subsystem identifies a command indicative of a desired action for affecting an electronic component corresponding to a prescribed command word or phrase spoken by a user, and wherein the input nodes providing information to the command node comprise a speech command node which indicates to the command node what command the user currently identified by speaking the prescribed word or phrase for that command.

8. The system of claim 1, wherein the states of all nodes of the dynamic Bayes network are updated on a prescribed periodic basis to incorporate new information, and wherein the input nodes providing information to the command node comprise a prior state node whose state reflects the command node's state in the update period immediately preceding a current period, and wherein the input from the prior state node is weighted such that its influence on the state of the command node decreases in proportion to the amount of time that has passed since the prior state node first acquired the state being input.

9. The system of claim 1, wherein input nodes providing information to the action node comprise one or more device state nodes whose state is set to reflect the current condition of an electronic component associated with the node, and wherein each device state node provides information as to the current state of the associated component to the action node whenever the referent node probability distribution indicates the referent is the component associated with the device state node.
10. A computer-implemented multimodal electronic component control process comprising: a pointer-based object selection process module; a gesture recognition process module; a speech control process module; and a dynamic Bayes network into which the object selection, gesture recognition and speech control process modules provide inputs, said dynamic Bayes network integrating the inputs to arrive at a unified interpretation of what component a user wants to control and what control action is desired by determining, from the individual inputs of the object selection, gesture recognition and speech control process modules, the identity of a component the user wants to control (i.e., the referent), a command that the user wishes to implement (i.e., the command), and the appropriate control action to be taken to affect the identified referent in view of the command, wherein the dynamic Bayes network has a process flow architecture comprising a series of input nodes including said individual inputs which provide information as to their state to at least one of a referent node, a command node and an action node, said inputs determining the states of the referent and command nodes, and wherein the states of the referent and command nodes are fed into an action node whose state is determined by the input from the referent, command and input nodes, and whose state indicates the action that is to be implemented to affect the referent, and wherein said referent, command and action node states comprise probability distributions indicating the probability that each possible referent, command and action is the respective referent, command and action.

11. The process of claim 10, wherein the states of all nodes of the Bayes network are updated on a prescribed periodic basis to incorporate new information, and wherein said input nodes further comprise nodes whose state reflects the same node's state in the update period immediately preceding a current period, wherein the prior node state is input into the same node in the current period, thereby allowing the prior states of the nodes to influence the node states in the current period so as to preserve prior incomplete or ambiguous inputs until enough information is available to the network to determine the state of an undecided node.

12. The process of claim 11, wherein the input from an input node associated with a prior node state is weighted such that its influence on the state of the corresponding node in the current period decreases in proportion to the amount of time that has passed since the node first acquired the state being input.

13. The process of claim 10, wherein said input nodes further comprise one or more device state nodes whose state is set to reflect the current condition of an electronic component associated with the node, and wherein each device state node provides information as to the current state of the associated component to the action node whenever the referent node probability distribution indicates the referent is the component associated with the device state node.

14. The process of claim 13, wherein the device state nodes are set to indicate whether the associated component is activated or deactivated.
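Claims 13 through 15 reduce to a simple rule for two-state components: once the referent distribution identifies the device, its device-state input forces the action to be the complement of the current state, regardless of the command node. A hypothetical sketch:

```python
# Hypothetical sketch of claims 13-15: for a two-state component the action
# node simply toggles the device identified by the referent distribution.
def resolve_action(referent_dist, device_on):
    """referent_dist: {component: probability}; device_on: {component: bool}."""
    referent = max(referent_dist, key=referent_dist.get)  # most probable referent
    # Claim 15: the command node is ignored for two-state components.
    action = "deactivate" if device_on[referent] else "activate"
    return referent, action

# e.g. resolve_action({"lamp": 0.9, "tv": 0.1}, {"lamp": False, "tv": True})
# -> ("lamp", "activate")
```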
15. The process of claim 14, wherein only two action node states are possible for a particular electronic component in that when the component is activated the only action is to deactivate it, and when the component is deactivated the only action is to activate it, and wherein whenever a device state node provides information to the action node as to whether the associated component is activated or deactivated, the action node is set to a state indicating the component is to be activated if it is currently deactivated, or a state indicating the component is to be deactivated if it is currently activated, regardless of the state of the command node.

16. A computer-implemented process for controlling a user-selected electronic component within an environment using a pointing device, comprising using a computer to perform the following process actions: computing a similarity between an input sequence of sensor values output by the pointing device and recorded over a prescribed period of time and at least one stored prototype sequence, wherein each prototype sequence represents the sequence of said sensor values that are generated if the user performs a unique gesture representing a different control action for the selected electronic component using the pointing device; determining if the computed similarity between the input sequence and any prototype sequence exceeds a prescribed similarity threshold; and whenever it is determined that one of the computed similarities exceeds the similarity threshold, implementing a command represented by the gesture.

17. The process of claim 16, wherein the prescribed period of time in which the sensor values output by the pointing device are recorded is long enough to ensure that characteristics evidenced by the sensor values which distinguish each gesture from any other gesture representing a different control action are captured in the input sequence.

18. The process of claim 17, wherein the prescribed period of time in which the sensor values output by the pointing device are recorded is within a range of approximately 1 second to approximately 2 seconds.

19. The process of claim 16, wherein the process action of computing the similarity between the sequence of sensor values output by the pointing device and recorded over the prescribed period of time and a prototype sequence comprises an action of computing the similarity using a squared Euclidean distance technique.

20. The process of claim 16, wherein the prescribed similarity threshold is large enough to ensure that characteristics in a prototype sequence, as evidenced by the sensor values thereof, that distinguish the gesture associated with the prototype sequence from any other gesture represented by a different prototype sequence exist in the input sequence, thereby causing the similarity computed between the input sequence and the prototype sequence to exceed the threshold, and causing the similarity computed between the input sequence and any other prototype sequence associated with a different gesture to not exceed the threshold.

21. The process of claim 16, wherein the process is performed continuously for each consecutive portion of the inputted sequence of sensor values output by the pointing device having the prescribed period of time for as long as the electronic component remains selected.
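Claims 16 through 20 match a 1 to 2 second window of sensor readings against stored prototype sequences using a squared Euclidean distance and a similarity threshold. The sketch below converts the distance into a similarity score; the conversion function and the threshold value are illustrative assumptions, as is the requirement that all sequences share a common length.

```python
# Sketch of claims 16-20, assuming equal-length sequences; the distance-to-
# similarity mapping and the threshold value are illustrative choices.
from typing import Optional
import numpy as np

def match_gesture(window: np.ndarray,
                  prototypes: dict,
                  threshold: float = 0.8) -> Optional[str]:
    """window and each prototype: arrays of shape (T, n_sensors)."""
    best_name, best_sim = None, float("-inf")
    for name, proto in prototypes.items():
        dist = float(np.sum((window - proto) ** 2))  # squared Euclidean distance
        sim = 1.0 / (1.0 + dist)                     # assumed similarity mapping
        if sim > best_sim:                           # keep the best match
            best_name, best_sim = name, sim
    # Claims 16 and 26: report a gesture only if the best similarity clears the bar.
    return best_name if best_sim > threshold else None
```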
22. The process of claim 16, wherein each gesture is a series of movements of the pointing device in arbitrary directions, and wherein the sensor output or outputs recorded are those that are sure to characterize the motion of the pointing device associated with the gesture.

23. The process of claim 22, wherein the pointing device sensors comprise a magnetometer which has outputs that characterize the movement of the pointer in pitch, roll and yaw, and are used exclusively as the sensor outputs that are recorded.

24. The process of claim 22, wherein the pointing device sensors comprise an accelerometer which has outputs that characterize the movement of the pointer in pitch and roll only, and a gyroscope which has outputs that characterize the movement of the pointer in yaw only, and the combination of the outputs from these two sensors is used as the sensor outputs that are recorded.

25. The process of claim 24, wherein the pointing device sensors comprise a magnetometer which has outputs that characterize the movement of the pointer in pitch, roll and yaw, and wherein the outputs of the magnetometer are used in conjunction with the outputs of the accelerometer and gyroscope as the sensor outputs that are recorded.

26. A system for controlling a user-selected electronic component within an environment using a pointing device, comprising: a general purpose computing device; and a computer program comprising program modules executable by the computing device, wherein the computing device is directed by the program modules of the computer program to: input orientation messages transmitted by the pointing device, said orientation messages comprising orientation sensor readings generated by orientation sensors of the pointing device; generate and store at least one prototype sequence in a training phase, each prototype sequence comprising a prescribed one or ones of the orientation sensor readings that are generated by the pointing device during the time a user moves the pointing device in a gesture representing a particular command for controlling the selected electronic component, wherein each prototype sequence is associated with a different gesture and a different command; record said prescribed one or ones of the orientation sensor readings from each orientation message inputted for a prescribed period of time to create an input sequence of sensor readings; for each stored prototype sequence, compute a similarity indicator between the input sequence and the prototype sequence under consideration, wherein the similarity indicator is a measure of the similarity between the sequences; identify the largest of the computed similarity indicators; determine if the identified largest computed similarity indicator exceeds a prescribed similarity threshold; whenever the identified similarity indicator exceeds the prescribed similarity threshold, designate that the user has performed the gesture corresponding to the prototype sequence associated with that similarity indicator; and implement the command represented by the gesture that the user was designated to have performed.
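Claims 23 through 25 enumerate which sensor outputs can make up a recorded sample: a magnetometer alone (pitch, roll, yaw), an accelerometer (pitch, roll) paired with a gyroscope (yaw), or all three together. A small sketch of assembling one sample, with hypothetical argument shapes:

```python
# Hypothetical sketch of claims 24-25: combine accelerometer (pitch, roll)
# and gyroscope (yaw) readings, optionally appending magnetometer outputs.
def fuse_sample(accel, gyro_yaw, mag=None):
    """accel: (pitch, roll); gyro_yaw: float; mag: optional (pitch, roll, yaw)."""
    sample = [accel[0], accel[1], gyro_yaw]
    if mag is not None:
        sample.extend(mag)  # claim 25: magnetometer used in conjunction
    return sample
```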
27. The system of claim 26, wherein the program module for generating and storing at least one prototype sequence in a training phase comprises, for each prototype sequence generated, sub-modules for: (a) inputting a user-specified identity of the electronic component that the gesture associated with the prototype sequence applies to; (b) inputting a user-specified control action that is to be taken in regard to the identified electronic component in response to the user performing the gesture; (c) causing periodic requests for orientation messages to be sent to the pointing device; (d) waiting for an orientation message to be received; (e) whenever an orientation message is received, determining whether the switch state indicates that the pointing device's switch has been activated by the user to indicate that the gesture is being performed; (f) whenever it is determined that the switch state does not indicate that the pointing device's switch is activated, repeating actions (d) and (e); (g) whenever it is determined that the switch state does indicate that the pointing device's switch is activated, recording a prescribed one or ones of the pointing device sensor outputs taken from the last received orientation message; (h) waiting for the next orientation message to be received; (i) whenever an orientation message is received, determining whether the switch state indicates that the pointing device's switch is still activated, and if so repeating actions (g) and (h); (j) whenever it is determined that the switch state no longer indicates that the pointing device's switch is activated, thereby indicating that the user has completed performing the gesture, designating the recorded sensor outputs in the order recorded as the prototype sequence; and (k) storing the prototype sequence.

28. A computer-implemented process for controlling a user-selected electronic component within an environment using a pointing device, comprising using a computer to perform the following process actions: inputting orientation messages transmitted by the pointing device, said orientation messages comprising orientation sensor readings generated by orientation sensors of the pointing device; recording a prescribed one or ones of the pointing device sensor outputs taken from an orientation message inputted whenever a user indicates that a gesture representing a control action for the selected electronic component is being performed; for each gesture threshold definition assigned to the selected electronic component, determining whether the threshold, if just one, or all the thresholds, if more than one, of the gesture threshold definition under consideration are exceeded by the recorded sensor output associated with the same sensor output as the threshold; whenever it is determined that the threshold, if just one, or all the thresholds, if more than one, of one of the gesture threshold definitions are exceeded by the recorded sensor output associated with the same sensor output, designating that the user has performed the gesture associated with that gesture threshold definition; and implementing the command represented by the gesture that the user was designated to have performed.
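Claim 27's training loop can be pictured as: poll for orientation messages, wait for the switch to go down, record the chosen sensor outputs while it stays down, then keep the run as the prototype. The message format and polling interface below are hypothetical stand-ins.

```python
# Sketch of claim 27's recording loop; poll_message and its 'switch' field
# are hypothetical stand-ins for the orientation-message interface.
def record_prototype(poll_message, sensor_keys):
    """poll_message() -> dict with a 'switch' bool and per-sensor readings."""
    msg = poll_message()
    while not msg["switch"]:      # steps (d)-(f): wait for the gesture to start
        msg = poll_message()
    prototype = []
    while msg["switch"]:          # steps (g)-(i): record while the switch is held
        prototype.append([msg[k] for k in sensor_keys])
        msg = poll_message()
    return prototype              # step (j): recorded outputs, in order
```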
29. The process of claim 28, wherein the process action of recording the prescribed one or ones of the pointing device sensor outputs comprises the actions of: (a) waiting for an orientation message to be received; (b) whenever an orientation message is received, determining whether the switch state indicates that the pointing device's switch has been activated by the user to indicate that a control gesture is being performed; (c) whenever it is determined that the switch state does not indicate that the pointing device's switch is activated, repeating actions (a) and (b); and (d) whenever it is determined that the switch state does indicate that the pointing device's switch is activated, recording the prescribed one or ones of the pointing device sensor outputs taken from the last received orientation message.

30. The process of claim 28, wherein each gesture is a movement of the pointer in a single prescribed direction, and wherein the sensor output or outputs included in each gesture threshold definition, and so the recorded sensor outputs, are those that characterize the motion of the pointing device in the prescribed direction associated with the gesture.

31. The process of claim 30, wherein the pointing device sensors comprise a magnetometer which has outputs that characterize the movement of the pointer in pitch, roll and yaw.

32. The process of claim 30, wherein the pointing device sensors comprise an accelerometer which has outputs that characterize the movement of the pointer in pitch and roll.

33. The process of claim 30, wherein the pointing device sensors comprise a gyroscope which has outputs that characterize the movement of the pointer in yaw only.

34. The process of claim 28, wherein prior to performing the process action of determining if the threshold or thresholds of each gesture threshold definition assigned to the selected electronic component are exceeded, performing a process action of establishing, for each gesture representing a control action for the selected electronic component, a gesture threshold definition, each of said gesture threshold definitions comprising either (i) a prescribed threshold applicable to a particular single sensor output or (ii) a set of thresholds applicable to a particular group of sensor outputs, which are indicative of the pointing device being moved in a particular direction from a starting point wherein the pointing device is pointed at the object in the environment corresponding to or associated with the selected electronic component.
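Claims 28 and 34 define a gesture by one threshold, or a set of thresholds, over particular sensor outputs; the gesture is recognized when every threshold in its definition is exceeded. A sketch with illustrative sensor names and values:

```python
# Sketch of claims 28 and 34: a gesture fires when every threshold in its
# definition is exceeded by the matching recorded sensor output. Sensor
# names and threshold values below are illustrative only.
def detect_gesture(recorded, definitions):
    """recorded: {sensor: value}; definitions: {gesture: {sensor: threshold}}."""
    for gesture, thresholds in definitions.items():
        if all(recorded.get(sensor, 0.0) > t for sensor, t in thresholds.items()):
            return gesture
    return None

# e.g. a hypothetical "volume up" gesture defined as a strong pitch-up motion:
# detect_gesture({"pitch": 0.9, "yaw": 0.1}, {"volume_up": {"pitch": 0.5}})
# -> "volume_up"
```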
35. A system for controlling electronic components within an environment using a pointing device, comprising: a pointing device comprising a transceiver and orientation sensors, wherein the outputs of the sensors are periodically packaged as orientation messages and transmitted using the transceiver; a base station comprising a transceiver which receives orientation messages transmitted by the pointing device; a pair of imaging devices each of which is located so as to capture images of the environment from different viewpoints; and a computing device which is in communication with the base station and the imaging devices so as to receive orientation messages forwarded to it by the base station and images captured by the imaging devices, and which computes the orientation and location of the pointer from the received orientation message and captured images, determines if the pointing device is pointing at an object in the environment which corresponds to, or is associated with, an electronic component that is controllable by the computing device using the orientation and location of the pointing device, and if so selects the electronic component, and affects the selected electronic component in accordance with a command received from the pointing device.

36. The system of claim 35, wherein the pointing device further comprises a manually-operated switch whose state with regard to whether it is activated or deactivated at the time an orientation message is packaged for transmission is included in that orientation message, and wherein said command from the pointing device comprises a switch state which indicates that the switch is activated.

37. The system of claim 36, wherein the selected electronic component can be activated or deactivated by the computing device, and wherein a switch state indicating that the pointing device's switch is activated, received in an orientation message transmitted by the pointing device after the electronic component is selected, is interpreted by the computing device as a command to one of (i) activate the selected component if the component is deactivated, or (ii) deactivate the component if the component is activated.

38. The system of claim 35, wherein the pointing device further comprises a pair of visible spectrum light emitting diodes (LEDs) which are visible when lit from outside of the pointer, said visible spectrum LEDs being employed to provide status or feedback information to the user.

39. The system of claim 38, wherein the computing device instructs the pointing device to light one or both of the visible spectrum LEDs by causing the base station to transmit a command to the pointing device instructing it to light one or both of the visible spectrum LEDs.

40. The system of claim 39, wherein the computing device instructs the pointing device to light one of the visible spectrum LEDs whenever the pointer is pointing at an object in the environment which corresponds to, or is associated with, an electronic component that is controllable by the computing device.

41. The system of claim 39, wherein the computing device instructs the pointing device to light a first one of the visible spectrum LEDs whenever the pointer is pointing at an object in the environment which corresponds to, or is associated with, an electronic component that is activated, and to light the other visible spectrum LED whenever the pointer is pointing at an object in the environment which corresponds to, or is associated with, an electronic component that is deactivated.
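Claims 35 and 36 describe the orientation message as periodically packaged sensor outputs plus the switch state. Sketched as a plain data record with assumed field names:

```python
# The orientation message of claims 35-36, sketched as a data record; the
# field names and the timestamp are assumptions about the packaging.
from dataclasses import dataclass

@dataclass
class OrientationMessage:
    pitch: float          # orientation sensor readings
    roll: float
    yaw: float
    switch_active: bool   # claim 36: state of the manually-operated switch
    timestamp_ms: int     # assumed: when the message was packaged
```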
42. The system of claim 39, wherein the computing device instructs the pointing device to light one of the visible spectrum LEDs whenever the pointer is pointing at an object in the environment which corresponds to, or is associated with, an audio device that is controllable by the computing device, and to vary the intensity of that LED in proportion to the volume setting of the audio device.

43. The system of claim 39, wherein the computing device emits a user-audible sound whenever the pointer is pointing at an object in the environment which corresponds to, or is associated with, an electronic component that is controllable by the computing device.