IPC Classification Information
Country / Type: United States (US) Patent, Granted
International Patent Classification (IPC, 7th edition):
Application number: UP-0788639 (2004-02-27)
Registration number: US-7706616 (2010-05-20)
Inventors / Address:
- Kristensson, Per-Ola
- Wang, Jingtao
- Zhai, Shumin
Applicant / Address:
- International Business Machines Corporation
Agent / Address:
Citation information: Times cited: 132; Cited patents: 7
Abstract
A word pattern recognition system based on a virtual keyboard layout combines handwriting recognition with a virtual, graphical, or on-screen keyboard to provide a text input method with relative ease of use. The system allows the user to input text quickly with little or no visual attention from the user. The system supports a very large vocabulary of gesture templates in a lexicon, including practically all words needed for a particular user. In addition, the system utilizes various techniques and methods to achieve reliable recognition of a very large gesture vocabulary. Further, the system provides feedback and display methods to help the user effectively use and learn shorthand gestures for words. Word patterns are recognized independent of gesture scale and location. The present system uses language rules to recognize and connect suffixes with a preceding word, allowing users to break complex words into easily remembered segments.
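The statement that word patterns are recognized independent of gesture scale and location can be illustrated with a minimal sketch. It assumes a stroke is a list of (x, y) points; the function names, the 32-point resampling, and the bounding-box scaling are illustrative choices and are not taken from the patent.

```python
import math

def resample(points, n=32):
    """Resample a polyline to n points equally spaced along its length."""
    seg = [math.dist(a, b) for a, b in zip(points, points[1:])]
    total = sum(seg)
    if total == 0.0:
        # Degenerate stroke (a single point or a tap): repeat the point.
        return [tuple(points[0])] * n
    out = []
    i, walked = 0, 0.0  # current segment index, path length before it
    for j in range(n):
        target = total * j / (n - 1)
        while i < len(seg) - 1 and walked + seg[i] < target:
            walked += seg[i]
            i += 1
        t = 0.0 if seg[i] == 0.0 else min((target - walked) / seg[i], 1.0)
        (x0, y0), (x1, y1) = points[i], points[i + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out

def normalize(points, n=32):
    """Remove location (centroid to origin) and scale (bounding box to 1)."""
    pts = resample(points, n)
    cx = sum(x for x, _ in pts) / n
    cy = sum(y for _, y in pts) / n
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    size = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0
    return [((x - cx) / size, (y - cy) / size) for x, y in pts]

def shape_distance(stroke, template, n=32):
    """Mean point-to-point distance between two normalized shapes."""
    a, b = normalize(stroke, n), normalize(template, n)
    return sum(math.dist(p, q) for p, q in zip(a, b)) / n

# Hypothetical data: the same L-shaped gesture drawn larger and elsewhere.
l_shape = [(0, 0), (1, 0), (1, 1)]
moved = [(10 + 3 * x, 20 + 3 * y) for x, y in l_shape]
straight = [(0, 0), (1, 0), (2, 0)]
```

Under these assumptions, `shape_distance(l_shape, moved)` is essentially zero, while `shape_distance(l_shape, straight)` is clearly larger, which is the behaviour a scale- and location-independent shape comparison should show.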
Representative claim
What is claimed is:

1. A method of recognizing words, comprising: accepting a stroke as an input on a virtual keyboard coupled to a computer, the computer programmed to perform the steps of: defining word patterns of a plurality of known words by a plurality of paths, wherein each path connects elements in the known word on the virtual keyboard, wherein the virtual keyboard comprises virtual keys, each virtual key representing a letter in a word without a temporary target letter being placed adjacent to a location of a stroke; processing the stroke using a combination of a plurality of channels, each channel selectively measuring a different aspect of a similarity of the stroke to a plurality of possible paths on the virtual keyboard; converting each different aspect of the stroke's similarity to probability estimates; a shape channel of the plurality of channels measuring a shape aspect of the stroke, and outputting a probability estimate; a location channel of the plurality of channels measuring location aspect of the stroke, and outputting a probability estimate, wherein the location channel measures the location aspect of the stroke concurrently with the shape channel measuring the shape aspect of the stroke; mathematically integrating, using Bayes' theorem, the probability estimates of the plurality of channels to produce integrated probability estimates of candidate words corresponding to the stroke; and based on the integrated probability estimates of the candidate words, recognizing the stroke as a known word.

2. The method of claim 1, wherein the shape channel outputs normalized shape information independent of location and scale.

3. The method of claim 1, wherein the plurality of channels comprises a tunnel model channel, wherein the tunnel of a word pattern comprises a predetermined width on either side of a set of the virtual keys representing a set of letters of a word on the virtual keyboard, and wherein the tunnel model channel is applied to the stroke before any other channel is applied to the stroke.

4. The method of claim 1, wherein the plurality of channels comprises a language context channel that stores recognized known words, and wherein the language context channel provides clues for recognizing a word based on a stored previously recognized known word.

5. The method of claim 2, wherein recognizing a word pattern using the normalized shape information comprises at least one of: template matching and feature extraction.

6. The method of claim 1, wherein the location channel recognizes a word pattern by sampling a plurality of points on the stroke, including at least one sampling point between a beginning and an end of the stroke, each sampling point having a weight, and by applying weights to sampling points of the stroke, wherein each sampling point has a different weight, and wherein a sampling point at the beginning of the stroke has a greatest weight and a sampling point at the end of the stroke has a least weight.

7. The method of claim 3, wherein recognizing a word pattern using the tunnel model channel comprises traversing keys passed by the word pattern and identifying potential word candidates by partial string matching.

8. The method of claim 3, wherein recognizing a word pattern using the tunnel model channel comprises transforming a tunnel and the stroke passing the tunnel.

9. The method of claim 2, further comprising: generating an intermediate shape that represents a difference between the stroke and an ideal template of the word pattern by morphing the stroke with an ideal template; and displaying the intermediate shape.

10. The method of claim 1, further comprising analyzing the stroke to differentiate between a tapping and a shorthand gesture input; and inputting at least one letter of a word by tapping the letter.

11. The method of claim 2, further comprising: comparing a normalized word pattern and a normalized stroke; sampling the normalized word pattern to a fixed number of a plurality of points; sampling the stroke to a same fixed number of a plurality of points; and measuring the plurality of points relative to each other.

12. The method of claim 9, further comprising: comparing a feature vector of the stroke with a feature vector of the ideal template of the word pattern; computing a similarity score from said comparing; and obtaining a distance measurement between the stroke and the ideal template of the word pattern from said similarity score.

13. A shorthand symbol system for recognizing words, comprising: a graphical keyboard layout for accepting a stroke as an input, wherein the keyboard layout contains a set of characters forming elements in the word without a temporary target element being placed adjacent to a current stroke location; a storage for storing word patterns of a plurality of paths, wherein each path connects a set of letters received from the graphical keyboard layout; a pattern recognition engine that recognizes a word pattern by processing the stroke using a combination of a plurality of channels, each channel selectively processing, in parallel, a different aspect of the stroke in relation to the plurality of the paths on the graphical keyboard layout and producing an output representing a probability estimate for a candidate word, one channel of the plurality of channels processing a location-based similarity probability estimate, another channel of the plurality of channels processing a shape-based similarity probability estimate, still another channel of the plurality of channels processing a path-based similarity probability estimate, and yet another channel of the plurality of channels processing a language context-based similarity probability estimate; and a computer for producing a probability estimate of a candidate word, wherein the computer produces the probability estimate of the candidate word by first serially applying the output of each channel of the plurality of channels alone and separately, and if a recognized word cannot be identified from the output of any one channel, then the computer mathematically integrates outputs of at least two channels of the plurality of channels to produce an integrated probability estimate of the candidate word.

14. The method of claim 13, wherein the channel that processes a shape-based similarity probability estimate outputs normalized shape information independent of location and scale.

15. The method of claim 13, wherein the one channel of the plurality of channels comprises location information regarding sampling points of the stroke, wherein each sampling point has a different weight.

16. The system of claim 13, wherein the word patterns comprise letters from an alphabet.

17. The system of claim 13, wherein the word patterns comprise letters from Chinese pinyin characters.

18. The system of claim 13, wherein the word patterns are based on a lexicon, and wherein the lexicon comprises a very large collection of words used in a natural language, and wherein words in the lexicon are rank ordered by usage frequency, and more frequent words are given higher a priori probability.

19. The system of claim 13, wherein the word patterns are based on a lexicon, wherein the lexicon is customized from an individual user's previous documents for a specific application, and wherein part of the customized lexicon is based on a computer programming language.

20. The system of claim 13, wherein the word patterns are based on a lexicon, and wherein the lexicon is customized for a specific domain.

21. The method of claim 1, including ranking the candidate words in order of probability.

22. The method of claim 1, including: determining a time spent inputting the stroke; and modifying at least one probability estimate according to a path of the stroke on the virtual keyboard and the time spent inputting the stroke, to produce an output of at least one channel of the plurality of channels.

23. A method of recognizing words, comprising: using a computer to perform the steps of: defining word patterns of a plurality of known words by a plurality of paths, wherein each path connects elements in the known word on a virtual keyboard, wherein the virtual keyboard comprises virtual keys, each virtual key representing a character, each character forming an element in a word; accepting a stroke as a candidate word inputted on the virtual keyboard; recognizing a word pattern by processing the stroke using a combination of a plurality of channels, each channel selectively processing a different aspect of the stroke, one channel of the plurality of channels determining a weighted location-based similarity probability estimate from a location-based similarity probability estimate; determining a time spent inputting the stroke; modifying the weighted location-based similarity probability estimate according to a path of the stroke on the virtual keyboard and the time spent inputting the stroke, to produce an output of the one channel, wherein modifying further comprises; calculating a total normative time of inputting the stroke for each word $i$, as follows:

$$t_n(i) = na + b \sum_{k=1}^{n-1} \log_2\left(\frac{D_{k,k+1}}{W} + 1\right)$$

where $D_{k,k+1}$ is a distance between the $k$th and the $(k+1)$th letters of word $i$ on the keyboard; $W$ is a key width, $n$ is a number of letters in the word; and $a$ and $b$ are two constants in Fitts' law, calculating a total normative time of inputting the stroke for all words of a gesture production, as follows:

$$t_a = \sum t_n(i)$$

and if $t_a \le t_n(i)$, then a ratio $t_n(i)/t_a$ is used to adjust distribution of the probability estimates so as to lower the weight of the location channel; and mathematically integrating outputs of the plurality of channels to produce an integrated probability estimate of the candidate word.

24. The method of claim 23, including the step of ranking the candidate words in order of probability.

25. The method of claim 23, wherein another channel of the plurality of channels comprise a tunnel model channel.

26. The method of claim 25, wherein still another channel of the plurality of channels comprise a language context channel.

27. The method of claim 26, wherein yet another channel of the plurality of channels comprises shape information.

28. The method of claim 27, wherein recognizing a word pattern using the shape information comprises template matching.

29. The method of claim 28, wherein recognizing a word pattern using the shape information comprises feature extraction.

30. The method of claim 1, including, prior to mathematically integrating, the steps of: setting a threshold for probability estimates; and pruning words whose probability estimates are lower than the threshold.

31. The method of claim 1, wherein the shape channel outputs for each word a probability estimate $x_S(i)$, and wherein the location channel outputs for each word a probability estimate $x_L(i)$, and including, for each word, the step of: mathematically integrating the probability estimates of the plurality of channels first by producing for each channel a score $y(i) \in [0,1]$:

$$y(i) = e^{-x(i)/\theta}$$

where $y$ is a variable between 0 and 1, and where $\theta$ is a weighting coefficient, and second by adding $y_S(i)$; and $y_L(i)$ such that the sum is an integrated probability estimate $y(i)$ of a candidate word.

32. The method of claim 31, including pruning all scores $y(i) < 0.04$ prior to adding $y_S(i)$ and $y_L(i)$.

33. The method of claim 31, including the steps of: setting a threshold for probability estimates; pruning words whose probability estimates are lower than the threshold; calculating a probability $p$ of a word $i$ based on values of $x_S$ and $x_L$ provided by the shape channel and the location channel, respectively, for those candidate words for the stroke that have not been pruned ($i \in W$) as follows,

$$p(i) = \frac{y(i)}{\sum_{i \in W} y(i)}$$

and after calculating $p_S$ from the shape channel and $p_L$ from the location channel, integrating candidate words from the shape channel and the location channel as follows,

$$p(i) = \frac{p_S(i)\, p_L(i)}{\sum_{j \in W_S \cap W_L} p_S(j)\, p_L(j)}$$

where $p_S$ is the probability estimate from the shape channel, $p_L$ is the probability estimate from the location channel, $W_S$ is a set of candidate words outputted from the shape channel, and $W_L$ is a set of candidate words outputted from the shape channel.

34. The method of claim 1, wherein the location channel places varied weights on different sampling points of the stroke as follows:

$$x_L(i) = \sum_{k=0}^{N} \alpha(k)\, d_2(k)$$

where $\alpha(k)$ is a relative weight placed on a $k$th sampling point of the stroke (from $k=0$ to $k=N$), where $d_2$ is a distance between a point of the stroke and a corresponding point of a template, and where $x_L(i)$ is a probability estimate that the stroke is word $i$ intended by a user who makes the stroke.

35. The method of claim 34, wherein $\alpha(k)$ is calculated as follows:

$$\alpha(k) = \frac{N - k(1-\beta)}{(1+N)N - (1-\beta)\sum_{k=1}^{N} k}$$

where $\beta = \alpha(0)/\alpha(N)$, and where $\alpha(0) + \alpha(1) + \alpha(2) + \dots + \alpha(k) + \dots + \alpha(N) = 1$.

36. The method of claim 3, wherein the tunnel model channel includes two stages, a first stage generates a small candidate list efficiently by partial string matching, and a second stage verifies the word candidate by checking whether all of the points in a stroke falls into a virtual tunnel.

37. The method of claim 36, wherein the first stage of the tunnel model channel includes a first step that sequentially compares all the characters in the stroke with all the characters in templates in a lexicon to match all the characters in the known word, a second step in which captured points from the stroke are translated to raw trace characters one by one, according to the virtual button at which the point is located, a third step in which adjacent duplicate characters as well as none alphabetic characters in the raw trace characters are removed to generate a trace string, and a fourth step in which each of the words in the lexicon is matched with the trace string to verify whether all the characters of a word appears in the trace string sequentially, and if so, the tunnel model channel saves that word in a word candidate list.

38. The method of claim 37, wherein the second stage of the tunnel model channel includes a first step that constructs a corresponding tunnel model for each word in the word candidate list, a second step in which all the words in the word candidate list are tested to determine whether the stroke exceed the tunnel boundaries of a corresponding known word, and a third step in which each point in the stroke is checked to determine whether it stays within a word tunnel, and if so, that word is selected as an output of the tunnel channel model.

39. The method of claim 23, wherein the step of recognizing includes determining a weighted location-based similarity probability estimate $y(i)$ from a location-based similarity probability estimate $x(i)$, as follows, $y(i) \in [0,1]$:

$$y(i) = e^{-x(i)/\theta}$$

where $y$ is a variable between 0 and 1, and where $\theta$ is a weighting coefficient.
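The score conversion, pruning, normalization, and product-based integration recited in claims 31 to 33 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes each channel reports a distance-like value x(i) per word (lower is better), uses a hypothetical θ of 1.0, takes the 0.04 pruning threshold from claim 32, and follows the normalized-product form of claim 33 instead of the additive form of claim 31. All function names and the example words and values are invented for illustration.

```python
import math

def to_scores(distances, theta=1.0, threshold=0.04):
    """Map x(i) to y(i) = exp(-x(i)/theta) and prune scores below threshold."""
    scores = {w: math.exp(-x / theta) for w, x in distances.items()}
    return {w: y for w, y in scores.items() if y >= threshold}

def to_probabilities(scores):
    """Normalize surviving scores so they sum to 1 within one channel."""
    total = sum(scores.values())
    if total == 0:
        return {}
    return {w: y / total for w, y in scores.items()}

def integrate(p_shape, p_location):
    """Combine two channels over the words both channels kept."""
    common = p_shape.keys() & p_location.keys()
    z = sum(p_shape[w] * p_location[w] for w in common)
    if z == 0:
        return {}
    return {w: p_shape[w] * p_location[w] / z for w in common}

def recognize(shape_dist, location_dist, theta=1.0, threshold=0.04):
    """Return candidate words ranked by integrated probability."""
    p_s = to_probabilities(to_scores(shape_dist, theta, threshold))
    p_l = to_probabilities(to_scores(location_dist, theta, threshold))
    merged = integrate(p_s, p_l)
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical channel outputs for one stroke (made-up values).
shape_dist = {"the": 0.2, "they": 0.5, "tie": 5.0}
location_dist = {"the": 0.3, "they": 1.0, "toe": 0.1}
ranked = recognize(shape_dist, location_dist)
```

With these made-up inputs, "tie" is pruned in the shape channel because exp(-5) falls below 0.04, "toe" drops out because only the location channel proposes it, and "the" ranks ahead of "they". Restricting the result to words present in both channels mirrors the intersection of the two candidate sets in the claim 33 formula.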