[Patent] Neural network acoustic and visual speech recognition system training method and apparatus

IPC classification information
Country / Type	United States (US) Patent, Granted
International Patent Classification (IPC 7th ed.)	G01L-005/06 G01L-009/00
Application number	US-0137318 (1993-10-14)
Inventor / Address	Stork David G. (Stanford CA) Wolff Gregory J. (Mountain View CA)
Applicant / Address	Ricoh Corporation (Menlo Park CA 02) Ricoh Company, Ltd. (Tokyo JPX 03)
Citation information	Times cited: 37; Patents cited: 7

Abstract

The apparatus for the recognition of speech includes an acoustic preprocessor, a visual preprocessor, and a speech classifier that operates on the acoustic and visual preprocessed data. The acoustic preprocessor comprises a log mel spectrum analyzer that produces an equal mel bandwidth log power spectrum. The visual processor detects the motion of a set of fiducial markers on the speaker's face and extracts a set of normalized distance vectors describing lip and mouth movement. The speech classifier uses a multilevel time-delay neural network operating on the preprocessed acoustic and visual data to form an output probability distribution that indicates the probability of each candidate utterance having been spoken, based on the acoustic and visual data. The training system includes the speech recognition apparatus and a control processor with an associated memory. Noisy acoustic input training data together with visual data is used to generate acoustic and visual feature training vectors for processing by the speech classifier. A control computer adjusts the synaptic weights of the speech classifier based upon the noisy input training data and exemplar output vectors for producing a robustly trained classifier based on the analogous visual counterpart of the Lombard effect.
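The two preprocessing steps named in the abstract can be illustrated with a minimal sketch. It is not the patent's implementation: the mel formula, the rectangular (non-overlapping) bands, the marker labels, and the choice of a nose-to-chin reference distance are all assumptions made for illustration, since the abstract does not specify them.

```python
import math

def hz_to_mel(f_hz):
    """Common mel-scale formula (an assumption; the abstract gives none)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_band_edges(n_bands, f_lo, f_hi):
    """Return n_bands + 1 edge frequencies (Hz) equally spaced on the mel scale."""
    m_lo, m_hi = hz_to_mel(f_lo), hz_to_mel(f_hi)
    step = (m_hi - m_lo) / n_bands
    return [mel_to_hz(m_lo + i * step) for i in range(n_bands + 1)]

def log_mel_spectrum(power, sample_rate, n_bands, floor=1e-10):
    """Sum bin powers inside each equal-mel-width band, then take the log.

    `power` holds one power value per FFT bin from 0 Hz up to Nyquist.
    Rectangular bands are an assumption of this sketch.
    """
    nyquist = sample_rate / 2.0
    bin_hz = nyquist / (len(power) - 1)
    edges = mel_band_edges(n_bands, 0.0, nyquist)
    bands = [0.0] * n_bands
    for k, p in enumerate(power):
        f = k * bin_hz
        for b in range(n_bands):
            in_band = edges[b] <= f < edges[b + 1]
            # The top edge (Nyquist) is assigned to the last band.
            if in_band or (b == n_bands - 1 and f >= edges[b]):
                bands[b] += p
                break
    return [math.log(max(e, floor)) for e in bands]

def normalized_distances(markers, pairs, ref_pair):
    """Distances between marker pairs, divided by a reference distance.

    `markers` maps a (hypothetical) label to an (x, y) position in one frame.
    """
    def dist(a, b):
        (xa, ya), (xb, yb) = markers[a], markers[b]
        return math.hypot(xa - xb, ya - yb)
    ref = dist(*ref_pair)
    return [dist(a, b) / ref for a, b in pairs]

# Toy data: a flat power spectrum and six hypothetical fiducial markers.
edges = mel_band_edges(14, 0.0, 8000.0)
acoustic_features = log_mel_spectrum([1.0] * 257, sample_rate=16000, n_bands=14)
frame = {"nose": (0, 0), "chin": (0, 10), "lip_top": (0, 4),
         "lip_bottom": (0, 6), "corner_l": (-3, 5), "corner_r": (3, 5)}
visual_features = normalized_distances(
    frame,
    pairs=[("lip_top", "lip_bottom"), ("corner_l", "corner_r")],
    ref_pair=("nose", "chin"),
)
```

Dividing by a reference distance makes the visual features insensitive to how far the speaker is from the camera, which is one plausible reading of "normalized" here.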

Representative claim

A training system for a speech recognition system comprising:
(a) a speech recognition system for recognizing utterances belonging to a pre-established set of allowable candidate utterances using acoustic speech signals and selected concomitant dynamic visual facial feature motion between selected facial features associated with acoustic speech generation, comprising,
  (i) an acoustic feature extraction apparatus for converting signals representative of dynamic acoustic speech into a corresponding dynamic acoustic feature vector set of signals,
  (ii) a dynamic visual feature extraction apparatus for converting signals representative of the selected concomitant dynamic facial feature motion associated with acoustic speech generation into a corresponding dynamic visual feature vector set of signals, and
  (iii) a time delay neural network classifying apparatus with an input-to-output transfer characteristic controlled by a set of adjustable synaptic weights for generating an output response vector representing a conditional probability distribution of the allowable candidate speech utterances by accepting and operating on a set of corresponding time-delayed dynamic acoustic and visual feature vector pairs that are respectively supplied by the acoustic and visual feature extraction apparatus to a set of inputs; and
(b) a control system comprising a control processor and an associated memory coupled to the speech recognition system for initializing parameters, for controlling the speech recognition system, for storing acoustic and visual output exemplar vectors, for computing output errors, and for adjusting the time delay neural network classifying apparatus synaptic weights based on the computed errors in accordance with a prescribed training procedure.
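The classifier and control loop described in the claim can be sketched as a toy example. This is a deliberately reduced illustration, not the claimed apparatus: it uses a single time-delay layer rather than a multilevel network, and the tanh units, the summation over time, the softmax output, and the cross-entropy gradient step are assumptions chosen to make the idea concrete. All function names and the toy data are hypothetical.

```python
import math

def joint_frames(acoustic, visual):
    """Pair each acoustic frame with its visual frame by concatenation."""
    return [a + v for a, v in zip(acoustic, visual)]

def softmax(scores):
    """Turn scores into a probability distribution over candidate utterances."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def forward(weights, frames):
    """weights[c][d][i]: candidate c, delay d, input feature i.

    Returns (probabilities, hidden), where hidden[c][t] is the unit's
    activation for the window of delayed frames starting at frame t.
    """
    n_delays, n_feats = len(weights[0]), len(frames[0])
    n_windows = len(frames) - n_delays + 1
    hidden = []
    for w_c in weights:
        row = []
        for t in range(n_windows):
            s = sum(w_c[d][i] * frames[t + d][i]
                    for d in range(n_delays) for i in range(n_feats))
            row.append(math.tanh(s))
        hidden.append(row)
    scores = [sum(row) for row in hidden]  # integrate evidence over time
    return softmax(scores), hidden

def train_step(weights, frames, target, lr=0.05):
    """One gradient-descent step on cross-entropy; returns the pre-update loss."""
    probs, hidden = forward(weights, frames)
    n_delays, n_feats = len(weights[0]), len(frames[0])
    for c, w_c in enumerate(weights):
        err = probs[c] - target[c]  # output error against the exemplar vector
        for d in range(n_delays):
            for i in range(n_feats):
                grad = err * sum((1.0 - h * h) * frames[t + d][i]
                                 for t, h in enumerate(hidden[c]))
                w_c[d][i] -= lr * grad
    return -sum(y * math.log(p) for y, p in zip(target, probs) if y > 0)

# Toy data (made up): 5 frames, 2 acoustic + 1 visual feature, 2 candidates.
acoustic = [[0.1, 0.2], [0.3, 0.1], [0.2, 0.4], [0.5, 0.3], [0.4, 0.2]]
visual = [[0.2], [0.6], [0.9], [0.5], [0.1]]
frames = joint_frames(acoustic, visual)
weights = [[[0.0] * 3 for _ in range(2)] for _ in range(2)]  # 2 delays
target = [1.0, 0.0]  # exemplar output: candidate 0 was spoken
losses = [train_step(weights, frames, target) for _ in range(50)]
probs, _ = forward(weights, frames)
```

In this sketch the "control system" is simply the loop that stores the exemplar vector, computes the output error, and adjusts the weights. The same weights are applied at every window position, which is what makes the layer a time-delay layer.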

Patents cited by this patent (7)

Beadles Robert L. (Durham NC), Audio visual speech recognition.
Baji Toru (Burlingame CA) Noguchi Kouki (Kokubunji CA JPX) Nakagawa Tetsuya (Millbrae CA) Tonomura Motonobu (Kodaira JPX) Akimoto Hajime (Mobara JPX) Masuhara Toshiaki (Tokyo JPX), Customized personal terminal device.
Petajan Eric D. (25 Cypress St. Millburn NJ 07041), Electronic facial tracking and detection system and method and apparatus for automated speech recognition.
Roberts Jed (Cambridge MA) Baker James K. (West Newton MA) Porter Edward W. (Boston MA), Method for interactive speech recognition and training.
Hopfield John J. (Pasadena CA) Tank David W. (Maplewood NJ), Neural computation by time concentration.
Smith Allen R. (Shelton CT) Tan Chuan-Chieh (Orange CT) Slack Thomas B. (Oxford CT) Denenberg Jeffrey N. (Trumbull CT), Probabilistic learning element.
Sakamoto Kenji (Nara JPX) Yamaguchi Kouichi (Tenri JPX), Recognition apparatus using articulation positions for recognizing a voice.

Patents citing this patent (37)

Velusamy, Kavitha; Chu, Wai C.; Gopalan, Ramya; Chhetri, Amit S., Acoustic echo cancellation using visual cues.
Velusamy, Kavitha; Chu, Wai C.; Gopalan, Ramya; Chhetri, Amit S., Acoustic echo cancellation using visual cues.
Tan, Bozhao, Acoustic sound signature detection based on sparse features.
Burke, Paul M.; Yacoub, Sherif, Allocation of speech recognition tasks and combination of results thereof.
Carey, Ryan Michael; Chan, Victor Hokkiu, Analog signal reconstruction and recognition via sub-threshold modulation.
Cho, Jeong-Mi; Kim, Jeong-Su; Bang, Won-Chul; Kim, Nam-Hoon, Apparatus and method for predicting user's intention based on multimodal information.
Deligne, Sabine; Neti, Chalapathy V.; Potamianos, Gerasimos, Audio-visual codebook dependent cepstral normalization.
Deligne, Sabine; Neti, Chalapathy V.; Potamianos, Gerasimos, Audio-visual codebook dependent cepstral normalization.
Marcheret, Etienne; Vopicka, Josef; Goel, Vaibhava, Audio-visual speech recognition with scattering operators.
Marcheret, Etienne; Vopicka, Josef; Goel, Vaibhava, Audio-visual speech recognition with scattering operators.
Morrison, Andrew R., Camera-assisted noise cancellation and speech recognition.
Lahr, Roy J., Head-worn, trimodal device to increase transcription accuracy in a voice recognition system and to process unvocalized speech.
Zhou, Dong; Hovden, Gunnar; Noble, Isaac S.; Ivanchenko, Volodymyr V.; Karakotsios, Kenneth M., Managing resource usage for task performance.
Chen Tsuhan; Rao Ram R., Method and apparatus for cross-modal predictive coding for talking head sequences.
Geppert, Nicolas Andre; Sattler, Jürgen, Method and system for the processing and storing of voice information and corresponding timeline information.
Geppert, Nicolas Andre; Sattler, Jürgen, Method and system for the processing of voice data and for the recognition of a language.
Choo, Ki hyun; Kim, Jeong su; Lee, Jae won; Lee, Ki seung, Method of setting optimum-partitioned classified neural network and method and apparatus for automatic labeling using optimum-partitioned classified neural network.
Peterson Richard John; Russell Dale William; Karaali Orhan; Bliss Harry Martin, Method, device and system for noise-tolerant language understanding.
Wagner Thomas, DEX; Boebel Friedrich G., FRX; Bauer Norbert, DEX, Person identification based on movement information.
Hart, Gregory M.; Bezos, Jeffrey P.; Kwee, Frances MHH; Brown, James Samuel, Relative position-inclusive device interfaces.
Capless, Jonathan, Scrolling display of electronic program guide utilizing images of user lip movements.
Colmenarez, Antonio; Kellner, Andreas, Speech activity detection using acoustic and facial characteristics in an automatic speech recognition system.
Campbell William Michael, Speech classifier and method using delay elements.
Harada Masaaki, JPX; Takeuchi Shin, JPX; Fukui Motofumi, JPX; Shimizu Tadashi, JPX, Speech detection apparatus using specularly reflected light.
Hart, Gregory M.; Freed, Ian W.; Zehr, Gregg Elliott; Bezos, Jeffrey P., Speech-inclusive device interfaces.
Atal, Bishnu Saroop, System and method of pattern recognition in very high dimensional space.
Atal, Bishnu Saroop, System and method of pattern recognition in very high-dimensional space.
Atal, Bishnu Saroop, System and method of pattern recognition in very high-dimensional space.
Atal, Bishnu Saroop, System and method of pattern recognition in very high-dimensional space.
Thomas, David R., Telescopic reconstruction of facial features from a speech pattern.
Gruenstein, Alexander H., Training multiple neural networks with different accuracy.
Costello, Kevin Robert, User interface techniques for simulating three-dimensional depth.
White, Marc, Using a physical phenomenon detector to control operation of a speech recognition engine.
White, Marc, Using a physical phenomenon detector to control operation of a speech recognition engine.
Bernd Girod DE, Video-assisted audio signal processing system and method.
Girod, Bernd, Video-assisted audio signal processing system and method.
Margolis, Jeffrey, Virtual object.
