IPC Classification Information
Country/Type | United States (US) Patent - Granted
International Patent Classification (IPC, 7th ed.) |
Application No. | US-0056001 (2008-03-26)
Registration No. | US-8781832 (2014-07-15)
Inventors / Address |
- Comerford, Liam D.
- Frank, David Carl
- Lewis, Burn L.
- Rachevsky, Leonid
- Viswanathan, Mahesh
Applicant / Address |
- Nuance Communications, Inc.
Agent / Address | Wolf, Greenfield & Sacks, P.C.
Citation Info | Cited by: 2 / Patents cited: 30
Abstract
Techniques are disclosed for overcoming errors in speech recognition systems. For example, a technique for processing acoustic data in accordance with a speech recognition system comprises the following steps/operations. Acoustic data is obtained in association with the speech recognition system. The acoustic data is recorded using a combination of a first buffer area and a second buffer area, such that the recording of the acoustic data using the combination of the two buffer areas at least substantially minimizes one or more truncation errors associated with operation of the speech recognition system.
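The two-buffer recording scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name, the chunk granularity, and the mic-on trigger method are all assumptions made for the example.

```python
from collections import deque

class DualBufferRecorder:
    """Sketch of the two-buffer scheme: a fixed-size circular buffer
    records continuously, and a second (linear) buffer takes over once
    the recognizer is addressed (e.g. a microphone-on event)."""

    def __init__(self, ring_capacity):
        # A deque with maxlen acts as a circular buffer: once full,
        # each append silently discards the oldest sample.
        self.ring = deque(maxlen=ring_capacity)
        self.linear = []          # second buffer, grows while addressed
        self.addressed = False

    def feed(self, sample):
        if self.addressed:
            self.linear.append(sample)
        else:
            self.ring.append(sample)

    def on_mic_on(self):
        # Indication that the system is being addressed: stop filling
        # the ring and start recording into the second buffer.
        self.addressed = True

    def combined(self):
        # Prepend the ring contents to the second buffer; the boundary
        # index is the reference location used for endpoint analysis.
        first = list(self.ring)
        return first + self.linear, len(first)

# Usage: samples 0..4 arrive before mic-on, 5..7 after.
rec = DualBufferRecorder(ring_capacity=3)
for s in range(5):
    rec.feed(s)
rec.on_mic_on()
for s in range(5, 8):
    rec.feed(s)
data, boundary = rec.combined()
# The ring kept only the 3 newest pre-trigger samples.
print(data, boundary)   # [2, 3, 4, 5, 6, 7] 3
```

Because the ring always holds the most recent pre-trigger audio, speech that began shortly before the mic-on event is preserved rather than truncated, which is the error the abstract targets.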
Representative Claims
1. A method for processing acoustic data to reduce one or more truncation errors associated with operation of a speech recognition system, the method comprising acts of: continuously recording acoustic data in a circular buffer; when an indication that the speech recognition system is being addressed is detected, starting recording of acoustic data in a second buffer that is separate from the circular buffer; obtaining combined acoustic data at least in part by prepending first acoustic data recorded in the circular buffer to a beginning of second acoustic data recorded in the second buffer; and analyzing the combined acoustic data, which comprises data from the circular buffer and data from the second buffer, to identify a likely speech endpoint in the combined acoustic data, wherein the act of analyzing comprises using a boundary between the first and second acoustic data as a reference location, and wherein the act of analyzing the combined acoustic data comprises an act of identifying, among one or more regions in the combined acoustic data likely to correspond to silence, a region of silence closest to the reference location.

2. The method of claim 1, wherein the act of obtaining combined acoustic data comprises an act of forming a composite buffer area comprising the first acoustic data prepended to the second acoustic data.

3. The method of claim 2, wherein: the composite buffer area contains, at a start of the first acoustic data prepended to the second acoustic data, oldest acoustic data in the circular buffer; acoustic data recorded in the circular buffer immediately before the indication that the speech recognition system is being addressed ends the first acoustic data; and in the composite buffer area, the acoustic data recorded in the circular buffer immediately before the indication that the speech recognition system is being addressed is contiguous in memory with acoustic data which is recorded in the second buffer immediately following the indication that the speech recognition system is being addressed.

4. The method of claim 2, wherein the act of analyzing the combined acoustic data comprises processing acoustic data in the composite buffer area to detect one or more features indicating silence.

5. The method of claim 4, wherein a location in the region of silence closest to the reference location is used as a location in the composite buffer area at which speech intended for the speech recognition system to process begins.

6. The method of claim 2, further comprising an act of decoding acoustic data in the composite buffer area into text.

7. The method of claim 2, wherein the act of forming the composite buffer area comprises: copying the first acoustic data recorded in the circular buffer to the composite buffer area.

8. The method of claim 1, wherein the region of silence closest to the reference location is in the first acoustic data if the indication that the speech recognition system is being addressed was given after speech started.

9. The method of claim 1, wherein the recording of acoustic data in the second buffer continues until an indication that the speech recognition system is no longer being addressed is detected and a feature indicating silence is detected in the acoustic data recorded in the second buffer.

10. The method of claim 1, further comprising: stopping recording of acoustic data in the circular buffer when recording of acoustic data is started in the second buffer; and stopping recording of acoustic data in the second buffer and restarting recording of acoustic data in the circular buffer, when an indication that the speech recognition system is no longer being addressed is detected and a feature indicating silence is detected in the acoustic data recorded in the second buffer.

11. The method of claim 10, wherein the indication that the speech recognition system is being addressed comprises a microphone on event, and the indication that the speech recognition system is no longer being addressed comprises a microphone off event.

12. The method of claim 1, wherein the second buffer comprises a linear buffer.

13. The method of claim 1, wherein the circular buffer and the second buffer are at least part of a single storage data structure.

14. The method of claim 1, wherein the circular buffer and the second buffer are at least part of separate storage data structures.

15. Apparatus for processing acoustic data to reduce one or more truncation errors associated with operation of a speech recognition system, comprising: at least one memory comprising a circular buffer and a second buffer that is separate from the circular buffer; and at least one processor coupled to the memory and operative to: continuously record acoustic data in the circular buffer; when an indication that the speech recognition system is being addressed is detected, start recording of acoustic data in a second buffer; obtain combined acoustic data at least in part by prepending first acoustic data recorded in the circular buffer to a beginning of second acoustic data recorded in the second buffer; and analyze the combined acoustic data, which comprises data from the circular buffer and data from the second buffer, to identify a likely speech endpoint in the combined acoustic data, wherein the act of analyzing comprises using a boundary between the first and second acoustic data as a reference location, and wherein the at least one processor is further operative to analyze the combined acoustic data at least in part by identifying, among one or more regions in the combined acoustic data likely to correspond to silence, a region of silence closest to the reference location.

16. The apparatus of claim 15, wherein prepending the first acoustic data comprises copying the acoustic data recorded in the circular buffer to a composite buffer area such that the composite buffer area comprises the first acoustic data prepended to the second acoustic data.

17. The apparatus of claim 15, wherein the region of silence closest to the reference location is in the first acoustic data if the indication that the speech recognition system is being addressed was given after speech started.

18. The apparatus of claim 15, wherein the at least one processor is further operative to: stop recording of acoustic data in the circular buffer when recording of acoustic data is started in the second buffer; and stop recording of acoustic data in the second buffer and restart recording of acoustic data in the circular buffer, when an indication that the speech recognition system is no longer being addressed is detected and a feature indicating silence is detected in the acoustic data recorded in the second buffer.

19. At least one article of manufacture for use in processing acoustic data to reduce one or more truncation errors associated with operation of a speech recognition system, comprising at least one machine readable medium having encoded thereon one or more programs which when executed implement acts of: continuously recording acoustic data in a circular buffer; when an indication that the speech recognition system is being addressed is detected, starting recording of acoustic data in a second buffer that is separate from the circular buffer; obtaining combined acoustic data at least in part by prepending first acoustic data recorded in the circular buffer to a beginning of second acoustic data recorded in the second buffer; and analyzing the combined acoustic data, which comprises data from the circular buffer and data from the second buffer, to identify a likely speech endpoint in the combined acoustic data, wherein the act of analyzing comprises using a boundary between the first and second acoustic data as a reference location, and wherein the act of analyzing the combined acoustic data comprises an act of identifying, among one or more regions in the combined acoustic data likely to correspond to silence, a region of silence closest to the reference location.

20. The at least one article of manufacture of claim 19, wherein prepending the first acoustic data comprises copying the acoustic data recorded in the circular buffer to a composite buffer area such that the composite buffer area comprises the first acoustic data prepended to the second acoustic data.

21. The at least one article of manufacture of claim 19, wherein the one or more programs further implement: stopping recording of acoustic data in the circular buffer when recording of acoustic data is started in the second buffer; and stopping recording of acoustic data in the second buffer and restarting recording of acoustic data in the circular buffer, when an indication that the speech recognition system is no longer being addressed is detected and a feature indicating silence is detected in the acoustic data recorded in the second buffer.

22. A method for processing acoustic data in accordance with a speech recognition system, the method comprising acts of: recording acoustic data in at least one recording medium; detecting, at a first time, a user-generated input event instructing the speech recognition system to start speech recognition processing, the first time corresponding to a first location of the recorded acoustic data recorded in the at least one recording medium; searching in the recorded acoustic data to identify a silence region having the shortest distance, among all silence regions in the recorded acoustic data, relative to the first location in the recorded acoustic data corresponding to the first time at which the user-generated input event was detected; and identifying a location in the identified silence region as a start location for speech recognition processing of at least a portion of the recorded acoustic data, wherein: if the recorded acoustic data is such that the identified silence region entirely follows the first location, the start location for speech recognition processing follows the first location; and if the recorded acoustic data is such that the identified silence region entirely precedes the first location, the start location for speech recognition processing precedes the first location.

23. The method of claim 22, further comprising: detecting, at a second time later than the first time, an indication to stop speech recognition processing, the second time corresponding to a second location of the recorded acoustic data; continuing to record acoustic data after the second time; and performing speech recognition processing on at least a portion of the recorded acoustic data recorded after the second time.

24. The method of claim 23, further comprising: searching for acoustic data representing silence in the acoustic data recorded after the second time; identifying a third location having acoustic data representing silence; and performing speech recognition processing on the recorded acoustic data between the second and third locations.

25. A system for processing acoustic data in accordance with a speech recognition system, the system comprising: at least one memory for storing executable instructions; and at least one processor programmed by the executable instructions to: record acoustic data in at least one recording medium; detect, at a first time, a user-generated input event instructing the speech recognition system to start speech recognition processing, the first time corresponding to a first location of the recorded acoustic data recorded in the at least one recording medium; search in the recorded acoustic data to identify a silence region having the shortest distance, among all silence regions in the recorded acoustic data, relative to the first location in the recorded acoustic data corresponding to the first time at which the user-generated input event was detected; and identify a location in the identified silence region as a start location for speech recognition processing of at least a portion of the recorded acoustic data, wherein: if the recorded acoustic data is such that the identified silence region entirely follows the first location, the start location for speech recognition processing follows the first location; and if the recorded acoustic data is such that the identified silence region entirely precedes the first location, the start location for speech recognition processing precedes the first location.

26. The system of claim 25, wherein the at least one processor is further programmed to: detect, at a second time later than the first time, an indication to stop speech recognition processing, the second time corresponding to a second location of the recorded acoustic data; continue to record acoustic data after the second time; and perform speech recognition processing on at least a portion of the recorded acoustic data recorded after the second time.

27. The system of claim 26, wherein the at least one processor is further programmed to: search for acoustic data representing silence in the acoustic data recorded after the second time; identify a third location having acoustic data representing silence; and perform speech recognition processing on the recorded acoustic data between the second and third locations.

28. At least one computer readable memory encoded with instructions that, when executed, perform a method for processing acoustic data in accordance with a speech recognition system, the method comprising acts of: recording acoustic data in at least one recording medium; detecting, at a first time, a user-generated input event instructing the speech recognition system to start speech recognition processing, the first time corresponding to a first location of the recorded acoustic data recorded in the at least one recording medium; searching in the recorded acoustic data to identify a silence region having the shortest distance, among all silence regions in the recorded acoustic data, relative to the first location in the recorded acoustic data corresponding to the first time at which the user-generated input event was detected; and identifying a location in the identified silence region as a start location for speech recognition processing of at least a portion of the recorded acoustic data, wherein: if the recorded acoustic data is such that the identified silence region entirely follows the first location, the start location for speech recognition processing follows the first location; and if the recorded acoustic data is such that the identified silence region entirely precedes the first location, the start location for speech recognition processing precedes the first location.

29. The at least one computer readable memory of claim 28, wherein the method further comprises: detecting, at a second time later than the first time, an indication to stop speech recognition processing, the second time corresponding to a second location of the recorded acoustic data; continuing to record acoustic data after the second time; and performing speech recognition processing on at least a portion of the recorded acoustic data recorded after the second time.

30. The at least one computer readable memory of claim 29, wherein the method further comprises: searching for acoustic data representing silence in the acoustic data recorded after the second time; identifying a third location having acoustic data representing silence; and performing speech recognition processing on the recorded acoustic data between the second and third locations.
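Claims 1 and 22 both hinge on finding the silence region closest to a reference location (the buffer boundary, or the point where the user's start event was detected). The search can be sketched as below; the frame-energy representation and the 0.1 threshold are illustrative assumptions, not values from the patent.

```python
def silence_regions(frames, threshold):
    """Return (start, end) index pairs of maximal runs of frames whose
    energy is below `threshold` (treated here as silence)."""
    regions, start = [], None
    for i, e in enumerate(frames):
        if e < threshold and start is None:
            start = i                      # silence run begins
        elif e >= threshold and start is not None:
            regions.append((start, i))     # silence run ends
            start = None
    if start is not None:
        regions.append((start, len(frames)))
    return regions

def nearest_silence(frames, reference, threshold=0.1):
    """Among all silence regions, pick the one closest to `reference`
    (distance 0 if the reference falls inside a region). This mirrors
    the claims' choice of a likely speech start/endpoint: the selected
    region may precede or follow the reference location."""
    def distance(region):
        start, end = region
        if start <= reference < end:
            return 0
        return min(abs(reference - start), abs(reference - (end - 1)))
    regions = silence_regions(frames, threshold)
    return min(regions, key=distance) if regions else None

# Two silence regions exist (indices 0-1 and 4-5); the second one is
# closer to the reference at index 6, so it is selected.
energies = [0.0, 0.0, 0.9, 0.8, 0.05, 0.02, 0.9, 0.7, 0.9]
print(nearest_silence(energies, reference=6))   # (4, 6)
```

Note how this covers claim 8's case: if speech began before the trigger, the nearest silence region lies in the pre-trigger (circular-buffer) portion of the data, and the chosen start location precedes the reference.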