System and method of pattern recognition in very high-dimensional space
IPC classification (IPC, 7th edition): G10L-015/06; G10L-015/00; G10L-015/04
Country / Type: United States (US) Patent, Granted
Application number: US-0617834 (2006-12-29)
Registration number: US-7369993 (2008-05-06)
Inventor / Address: Atal, Bishnu Saroop
Applicant / Address: AT&T Corp.
Citation information: cited by 1 patent; cites 22 patents
Abstract
A system and method of recognizing speech comprises an audio-receiving element and a computer server, which together perform the process steps of the method. The method involves training a stored set of phonemes by converting them into n-dimensional space, where n is a relatively large number. Once the stored phonemes are converted, they are transformed using singular-value decomposition to conform the data generally into a hypersphere. The phonemes received from the audio-receiving element are likewise converted into n-dimensional space and transformed using singular-value decomposition to conform the data into the hypersphere. The method compares the transformed received phoneme to each transformed stored phoneme by comparing a first distance, from the center of the hypersphere to a point associated with the transformed received phoneme, with a second distance, from the center of the hypersphere to a point associated with the respective transformed stored phoneme.
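The abstract's core operation — stacking the expanded phoneme vectors into a matrix and applying a singular-value decomposition so that the resulting orthogonal vectors lie on a unit hypersphere — can be sketched with NumPy. All dimensions and data below are illustrative placeholders, not values from the patent:

```python
import numpy as np

# Hypothetical setup: m expanded phoneme vectors in n-dimensional space,
# stored as the columns x_k of the matrix X.
rng = np.random.default_rng(0)
m, n = 10, 50
X = rng.normal(size=(n, m))

# Singular-value decomposition X = U @ diag(s) @ Vt, matching the
# patent's notation [x1 ... xm] = [u1 ... um] Lambda V^t, where
# Lambda is diagonal and V is unitary.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# The columns u_k of U are orthonormal, so each transformed phoneme
# vector has unit length, i.e. it lies on the unit hypersphere.
print(np.allclose(np.linalg.norm(U, axis=0), 1.0))  # → True

# Sanity check: the decomposition reconstructs the original vectors.
print(np.allclose(U @ np.diag(s) @ Vt, X))  # → True
```

Note that conforming the data to a hypersphere falls out of the decomposition itself: the left singular vectors are orthonormal by construction, so no separate normalization step is needed.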
Representative claims
The invention claimed is:

1. A method of training phonemes for use in recognizing a received phoneme having an associated received-signal vector using a stored plurality of phoneme classes, each of the plurality of phoneme classes comprising class phonemes, the method comprising, for each class phoneme: generating an expanded stored-phoneme vector from the class phoneme; and transforming the expanded stored-phoneme vector into an orthogonal form associated with a hypersphere having a center and a radius, wherein a received phoneme may be recognized by generating an expanded received-signal vector into an orthogonal form for analysis in the hypersphere.

2. The method of claim 1, wherein generating the expanded stored-phoneme vector from the class phoneme further comprises: determining a phoneme vector as a time-frequency representation of the class phoneme; dividing the phoneme vector into phoneme segments; assigning each phoneme segment into a plurality of phoneme parameters; and expanding each phoneme segment and plurality of phoneme parameters into an expanded stored-phoneme vector with expanded vector parameters.

3. The method of claim 1, wherein transforming the expanded stored-phoneme vector into an orthogonal form further comprises: setting [x1 x2 . . . xm] = [u1 u2 . . . um] ΛV^t, where xk is a kth acoustic vector for a corresponding stored phoneme, uk is the corresponding orthogonal vector, and Λ and V are diagonal and unitary matrices, respectively.

4. The method of claim 1, further comprising transforming the expanded received-signal vector, which is associated with the received phoneme, into an orthogonal form using a singular-value decomposition to conform the expanded received-signal vector into the hypersphere.

5. A method of recognizing a received phoneme having an associated received-signal vector using a plurality of phoneme classes, each of the plurality of phoneme classes comprising class phonemes, the method comprising: generating an expanded received-signal vector from a received analog acoustic signal; transforming the expanded received-signal vector into an orthogonal form associated with a hypersphere having a center and a radius; determining a first distance associated with the orthogonal form of the expanded received-signal vector and a second distance associated respectively with each orthogonal form of expanded stored-phoneme vectors; and recognizing the received phoneme according to a comparison of the first distance with the second distance.

6. The method of claim 5, wherein generating the expanded received-signal vector further comprises: receiving the analog acoustic signal; converting the analog acoustic signal into a digital signal; determining the received-signal vector as a time-frequency representation of the received digital signal; dividing the received-signal vector into received-signal segments; assigning each received-signal segment into a plurality of received-signal parameters; and expanding each received-signal segment and plurality of received-signal parameters into an expanded received-signal vector.

7. The method of claim 5, wherein transforming the expanded received-signal vector into an orthogonal form further comprises: setting [yk] = [zk] ΛV^t, where yk is a kth acoustic vector for a corresponding received phoneme, zk is the corresponding orthogonal vector, and Λ and V are diagonal and unitary matrices, respectively.

8. The method of claim 5, wherein transforming the expanded stored-phoneme vector into an orthogonal form uses singular-value decomposition and wherein transforming the expanded received-signal vector into an orthogonal form using singular-value decomposition further conforms the stored-phoneme vector into the hypersphere.

9. The method of claim 8, wherein determining a distance associated with the orthogonal form of the expanded received-signal vector and each orthogonal form of the expanded stored-phoneme vectors further comprises: comparing a distance from the center of the hypersphere of the orthogonal form of the expanded received-signal vector with a distance from the center of the hypersphere for each orthogonal form of the expanded stored-phoneme vector.

10. The method of claim 9, wherein determining a distance associated with the orthogonal form of the expanded received-signal vector and each orthogonal form of the expanded stored-phoneme vectors further comprises: determining a difference between the distance from the center of the hypersphere of the orthogonal form of the expanded received-signal vector and the distance from the center of the hypersphere for each orthogonal form of the expanded stored-phoneme vectors, wherein the expanded stored-phoneme vectors associated with the m shortest such differences are recognized as most likely to be associated with the received phoneme.

11. A computing device for recognizing a received phoneme having an associated received-signal vector using a stored plurality of phoneme classes, each of the plurality of phoneme classes comprising class phonemes, the computing device comprising: a module configured to generate an expanded stored-phoneme vector from each respective class phoneme; and a module configured to transform the expanded stored-phoneme vector into an orthogonal form associated with a hypersphere having a center and a radius, wherein a received phoneme may be recognized by generating an expanded received-signal vector into an orthogonal form for analysis in the hypersphere.

12. The computing device of claim 11, wherein the module configured to generate the expanded stored-phoneme vector from the class phoneme further: determines the phoneme vector as a time-frequency representation of the class phoneme; divides the phoneme vector into phoneme segments; assigns each phoneme segment into a plurality of phoneme parameters; and expands each phoneme segment and plurality of phoneme parameters into an expanded stored-phoneme vector with expanded vector parameters.

13. The computing device of claim 11, wherein the module configured to transform the expanded stored-phoneme vector into an orthogonal form further: sets [x1 x2 . . . xm] = [u1 u2 . . . um] ΛV^t, where xk is a kth acoustic vector for a corresponding stored phoneme, uk is the corresponding orthogonal vector, and Λ and V are diagonal and unitary matrices, respectively.

14. The computing device of claim 11, further comprising a module configured to transform the expanded received-signal vector, which is associated with the received phoneme, into an orthogonal form using a singular-value decomposition to conform the expanded received-signal vector into the hypersphere.

15. A computing device for recognizing a received phoneme having an associated received-signal vector using a plurality of phoneme classes, each of the plurality of phoneme classes comprising class phonemes, the computing device comprising: a module configured to generate an expanded received-signal vector from a received analog acoustic signal; a module configured to transform the expanded received-signal vector into an orthogonal form associated with a hypersphere having a center and a radius; a module configured to determine a first distance associated with the orthogonal form of the expanded received-signal vector and a second distance associated with each orthogonal form of expanded stored-phoneme vectors; and a module configured to recognize the received phoneme according to a comparison of the first distance with the second distance.

16. The computing device of claim 15, wherein the module configured to generate the expanded received-signal vector further: receives the analog acoustic signal; converts the analog acoustic signal into a digital signal; determines the received-signal vector as a time-frequency representation of the received digital signal; divides the received-signal vector into received-signal segments; assigns each received-signal segment into a plurality of received-signal parameters; and expands each received-signal segment and plurality of received-signal parameters into an expanded received-signal vector.

17. The computing device of claim 15, wherein the module configured to transform the expanded received-signal vector into an orthogonal form further: sets [yk] = [zk] ΛV^t, where yk is a kth acoustic vector for a corresponding received phoneme, zk is the corresponding orthogonal vector, and Λ and V are diagonal and unitary matrices, respectively.

18. The computing device of claim 15, wherein the module configured to transform the expanded stored-phoneme vector into an orthogonal form uses singular-value decomposition and wherein transforming the expanded received-signal vector into an orthogonal form using singular-value decomposition further conforms the stored-phoneme vector into the hypersphere.

19. The computing device of claim 18, wherein determining a distance associated with the orthogonal form of the expanded received-signal vector and each orthogonal form of the expanded stored-phoneme vectors further comprises: comparing a distance from the center of the hypersphere of the orthogonal form of the expanded received-signal vector with a distance from the center of the hypersphere for each orthogonal form of the expanded stored-phoneme vector.

20. The computing device of claim 19, wherein determining a distance associated with the orthogonal form of the expanded received-signal vector and each orthogonal form of the expanded stored-phoneme vectors further comprises: determining a difference between the distance from the center of the hypersphere of the orthogonal form of the expanded received-signal vector and the distance from the center of the hypersphere for each orthogonal form of the expanded stored-phoneme vectors, wherein the expanded stored-phoneme vectors associated with the m shortest such differences are recognized as most likely to be associated with the received phoneme.
Patents cited by this patent (22)
Baji Toru (Burlingame CA) Noguchi Kouki (Kokubunji CA JPX) Nakagawa Tetsuya (Millbrae CA) Tonomura Motonobu (Kodaira JPX) Akimoto Hajime (Mobara JPX) Masuhara Toshiaki (Tokyo JPX), Apparatus including a pair of neural networks having disparate functions cooperating to perform instruction recognition.
Prasad K. Venkatesh (Cupertino CA) Stork David G. (Stanford CA), Facial feature extraction method and apparatus for a neural network acoustic and visual speech recognition system.
Stork David G. (Stanford CA) Wolff Gregory J. (Mountain View CA), Neural network acoustic and visual speech recognition system training method and apparatus.
Inazumi Mitsuhiro (Suwa JPX), Neural network speech recognition apparatus recognizing the frequency of successively input identical speech data sequen.
Campbell William Michael ; Kleider John Eric ; Broun Charles Conway ; Gifford Carl Steven ; Assaleh Khaled, Speaker independent speech recognition system and method.
Tian,Jilei; Nurminen,Jani K.; Popa,Victor, Method, apparatus, mobile terminal and computer program product for providing efficient evaluation of feature transformation.