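Country / Type | United States (US) Patent, Registered |
---|---|
International Patent Classification (IPC, 7th edition) | |
Application number | US-0906592 (2007-10-02) |
Registration number | US-9053089 (2015-06-09) |
Inventor / Address | |
Applicant / Address | |
Agent / Address | |
Citation information | Times cited: 3; patents cited: 517 |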
Methods and apparatuses to assign part-of-speech tags to words are described. An input sequence of words is received. A global fabric of a corpus having training sequences of words may be analyzed in a vector space. A global semantic information associated with the input sequence of words may be extracted based on the analyzing. A part-of-speech tag may be assigned to a word of the input sequence based on POS tags from pertinent words in relevant training sequences identified using the global semantic information. The input sequence may be mapped into a vector space. A neighborhood associated with the input sequence may be formed in the vector space wherein the neighborhood represents one or more training sequences that are globally relevant to the input sequence.
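The mapping and neighborhood steps described in the abstract can be illustrated with a small latent semantic analysis sketch. This is not the patent's implementation: the toy corpus `TRAIN`, the raw word counts (no weighting), the rank of 2, the cosine closeness measure, the threshold of 0.5, and all function names are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical toy corpus: each training sequence is a list of (word, POS tag) pairs.
TRAIN = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
    [("stocks", "NNS"), ("fell", "VBD"), ("sharply", "RB")],
    [("markets", "NNS"), ("fell", "VBD"), ("today", "NN")],
]


def build_space(train, rank=2):
    """Build a word-by-sequence count matrix and keep its top `rank` singular triplets."""
    vocab = sorted({w for seq in train for w, _ in seq})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(train)))
    for j, seq in enumerate(train):
        for w, _ in seq:
            counts[index[w], j] += 1.0
    u, s, vt = np.linalg.svd(counts, full_matrices=False)
    u, s, vt = u[:, :rank], s[:rank], vt[:rank, :]
    seq_vecs = vt.T * s  # one scaled vector per training sequence
    return index, u, seq_vecs


def fold_in(words, index, u):
    """Map an input sequence into the same space; unknown words are ignored."""
    d = np.zeros(u.shape[0])
    for w in words:
        if w in index:
            d[index[w]] += 1.0
    return d @ u


def closeness(a, b):
    """Cosine similarity, used here as the closeness measure (an assumption)."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def neighborhood(words, train, rank=2, threshold=0.5):
    """Return (closeness, training index) pairs whose closeness exceeds the threshold."""
    index, u, seq_vecs = build_space(train, rank)
    q = fold_in(words, index, u)
    scored = [(closeness(q, seq_vecs[j]), j) for j in range(len(train))]
    return sorted([p for p in scored if p[0] > threshold], reverse=True)


if __name__ == "__main__":
    for score, j in neighborhood(["the", "dog", "sleeps"], TRAIN):
        print(round(score, 3), [w for w, _ in TRAIN[j]])
```

In this toy setting the input "the dog sleeps" lands next to the two animal sentences and far from the two market sentences, so only the former enter the neighborhood.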
1. A method, comprising: analyzing a corpus having first training sequences of words in a semantic vector space; extracting a global semantic information associated with an input sequence of words from the semantic vector space; selecting second training sequences of words having part-of-speech tags in the semantic vector space based on the global semantic information and the first training sequences; and assigning a part-of-speech tag to at least one word of the input sequence based on the part-of-speech tags of the second training sequences, wherein at least one of the analyzing, extracting, selecting, and assigning is performed by a processor.
2. The method of claim 1, wherein the semantic vector space includes a latent semantic space.
3. The method of claim 1, wherein the analyzing comprises mapping the input sequence into the semantic vector space; and forming a neighborhood associated with the input sequence in the semantic vector space, wherein the neighborhood represents one or more second training sequences that are globally semantically relevant to the input sequence.
4. The method of claim 1, wherein the analyzing comprises determining a closeness measure between the first training sequences and the input sequence in the semantic vector space.
5. The method of claim 1, wherein the global semantic information is used to identify the second training sequences that are globally semantically relevant to the input sequence.
6. A method to assign part-of-speech tags to words, comprising: receiving an input sequence of words; mapping the input sequence into a semantic vector space, wherein the semantic vector space includes representations of a first plurality of training sequences of words; and forming a neighborhood associated with the input sequence in the semantic vector space to obtain a part-of-speech tag for at least one word of the input sequence, wherein the neighborhood represents one or more second training sequences having part-of-speech tags selected from the first plurality of training sequences that are globally semantically relevant to the input sequence in the semantic vector space, wherein at least one of the receiving, mapping, and forming is performed by a processor.
7. The method of claim 6, further comprising assigning a part-of-speech tag to the at least one word of the input sequence based on the part-of-speech characteristics.
8. The method of claim 6, wherein the semantic vector space includes a latent semantic space.
9. The method of claim 6, wherein the forming the neighborhood comprises determining a closeness measure between representations of a first training sequence of the first plurality of the training sequences and the input sequence in the semantic vector space; and selecting a second training sequence out of the first plurality of the training sequences based on the closeness measure.
10. The method of claim 9, further comprising determining whether the closeness measure exceeds a predetermined threshold, and selecting the training sequence if the closeness measure exceeds the predetermined threshold.
11. The method of claim 9, further comprising ranking the training sequences according to the closeness measure; and selecting the second training sequence that has rank higher than a predetermined rank.
12. The method of claim 6, further comprising determining whether a training sequence in the neighborhood contains a first word that is similar to an input word of the input sequence; forming one or more sub-sequences of the training sequence that contain one or more first words that are similar to the input words; aligning the one or more sub-sequences to obtain one or more part-of-speech characteristics of the first words; and determining a part-of-speech tag for the input word based on the one or more part-of-speech characteristics of the first word.
13. An article of manufacture comprising: a non-transitory machine-accessible medium including data that, when accessed by a machine, cause the machine to perform operations comprising, analyzing a corpus having first training sequences of words in a semantic vector space; extracting a global semantic information associated with an input sequence of words from the semantic vector space; selecting second training sequences of words having part-of-speech tags in the semantic vector space based on the global semantic information and the first training sequences; and assigning a part-of-speech tag to at least one word of the input sequence based on the part-of-speech tags of the second training sequences.
14. The article of manufacture of claim 13, wherein the semantic vector space includes a latent semantic space.
15. The article of manufacture of claim 13, wherein the analyzing comprises mapping the input sequence into the semantic vector space; and forming a neighborhood associated with the input sequence in the semantic vector space, wherein the neighborhood represents one or more second training sequences that are globally semantically relevant to the input sequence.
16. The article of manufacture of claim 13, wherein the analyzing comprises determining a closeness measure between the first training sequences and the input sequence in the semantic vector space.
17. The article of manufacture of claim 13, wherein the global semantic information is used to identify the second training sequences that are globally semantically relevant to the input sequence.
18. An article of manufacture comprising: a non-transitory machine-accessible medium including data that, when accessed by a machine, cause the machine to perform operations to assign part-of-speech tags to words, comprising: receiving an input sequence of words; mapping the input sequence into a semantic vector space, wherein the semantic vector space includes representations of a first plurality of training sequences of words; and forming a neighborhood associated with the input sequence in the semantic vector space to obtain a part-of-speech tag for at least one word of the input sequence, wherein the neighborhood represents one or more second training sequences having part-of-speech tags selected from the first plurality of training sequences that are globally semantically relevant to the input sequence in the semantic vector space.
19. The article of manufacture of claim 18, wherein the machine-accessible medium further includes data that causes the machine to perform operations comprising, assigning a part-of-speech tag to the at least one word of the input sequence based on the part-of-speech characteristics.
20. The article of manufacture of claim 18, wherein the semantic vector space includes a latent semantic space.
21. The article of manufacture of claim 18, wherein the forming the neighborhood comprises determining a closeness measure between representations of a first training sequence of the first plurality of the training sequences and the input sequence in the semantic vector space; and selecting a second training sequence out of the first plurality of the training sequences based on the closeness measure.
22. The article of manufacture of claim 21, wherein the machine-accessible medium further includes data that causes the machine to perform operations comprising, determining whether the closeness measure exceeds a predetermined threshold, and selecting the training sequence if the closeness measure exceeds the predetermined threshold.
23. The article of manufacture of claim 21, wherein the machine-accessible medium further includes data that causes the machine to perform operations comprising, ranking the training sequences according to the closeness measure; and selecting the second training sequence that has rank higher than a predetermined rank.
24. The article of manufacture of claim 18, wherein the machine-accessible medium further includes data that causes the machine to perform operations comprising, determining whether a training sequence in the neighborhood contains a first word that is similar to an input word of the input sequence; forming one or more sub-sequences of the training sequence that contain one or more first words that are similar to the input words; aligning the one or more sub-sequences to obtain one or more part-of-speech characteristics of the first words; and determining a part-of-speech tag for the input word based on the one or more part-of-speech characteristics of the first word.
25. A data processing system, comprising: means for analyzing a corpus having first training sequences of words in a semantic vector space; means for extracting a global semantic information associated with an input sequence of words from the semantic vector space; means for selecting second training sequences of words having part-of-speech tags in the semantic vector space based on the global semantic information and the first training sequences; and means for assigning a part-of-speech tag to at least one word of the input sequence based on the part-of-speech tags of the second training sequences.
26. A data processing system, comprising: means for receiving an input sequence of words; means for mapping the input sequence into a semantic vector space, wherein the semantic vector space includes representations of a first plurality of training sequences of words; and means for forming a neighborhood associated with the input sequence in the semantic vector space to obtain a part-of-speech tag for at least one word of the input sequence, wherein the neighborhood represents one or more second training sequences having part-of-speech tags selected from the first plurality of training sequences that are globally semantically relevant to the input sequence in the semantic vector space.
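The rank-based selection and sub-sequence alignment recited in the claims could be sketched roughly as follows. The claims do not specify how similarity or alignment is computed, so this sketch makes several assumptions: "similar" means an exact word match, "alignment" means lining up fixed-width windows on the matching word, and the tag is chosen by majority vote. The data in `SCORED` and all function names are hypothetical.

```python
from collections import Counter

# Hypothetical (closeness, tagged training sequence) pairs, e.g. from a semantic-space lookup.
SCORED = [
    (0.9, [("I", "PRP"), ("saw", "VBD"), ("the", "DT"), ("dog", "NN")]),
    (0.8, [("she", "PRP"), ("saw", "VBD"), ("a", "DT"), ("cat", "NN")]),
    (0.1, [("the", "DT"), ("saw", "NN"), ("is", "VBZ"), ("sharp", "JJ")]),
]


def select_by_rank(scored, top_n):
    """Keep the top_n sequences by closeness (one reading of a 'predetermined rank')."""
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [seq for _, seq in ranked[:top_n]]


def sub_sequences(neighborhood, word, width=1):
    """Collect windows around each occurrence of `word`, with the word's offset in the window."""
    out = []
    for seq in neighborhood:
        for i, (w, _) in enumerate(seq):
            if w == word:  # assumption: similarity is exact match
                lo = max(0, i - width)
                out.append((seq[lo:i + width + 1], i - lo))
    return out


def tag_word(neighborhood, word):
    """Align windows on the matching word and take the most common tag at that position."""
    votes = Counter(window[anchor][1] for window, anchor in sub_sequences(neighborhood, word))
    return votes.most_common(1)[0][0] if votes else None


def tag_sequence(neighborhood, words):
    """Tag each input word; words absent from the neighborhood get None."""
    return [(w, tag_word(neighborhood, w)) for w in words]


if __name__ == "__main__":
    hood = select_by_rank(SCORED, 2)
    print(tag_sequence(hood, ["he", "saw", "the", "cat"]))
```

Because the low-closeness sequence about a sharp saw is excluded before voting, "saw" receives the verb tag from the two relevant sequences, which is the kind of disambiguation the global semantic neighborhood is meant to support.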
Copyright KISTI. All Rights Reserved.