[Patent] Method, medium, and system detecting speech using energy levels of speech frames


IPC classification information
Country / Type	United States (US) Patent, Granted
International Patent Classification (IPC 7th ed.)	G10L-015/04 G10L-015/02 G10L-015/00 G10L-021/00 G10L-025/78
Application number	US-0882444 (2007-08-01)
Registration number	US-9009048 (2015-04-14)
Priority	KR-10-2006-0073386 (2006-08-03)
Inventors / Address	Jang, Giljin; Kim, Jeongsu; Bridle, John S.; Hunt, Melvyn J.
Applicant / Address	Samsung Electronics Co., Ltd.
Agent / Address	Staas & Halsey LLP
Citation information	Times cited: 3; Patents cited: 25

Abstract

A speech recognition method, medium, and system. The method includes detecting an energy change of each frame making up signals including speech and non-speech signals, and identifying a speech segment corresponding to frames that include only speech signals from among the frames based on the detect
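
The first step the abstract names, detecting an energy change for each frame, can be sketched as follows. This is only an illustration under stated assumptions: the record does not give the frame length, the energy measure, or any function names, so non-overlapping frames, mean squared amplitude, and the names `frame_energies` and `energy_changes` are hypothetical choices.

```python
from typing import List, Sequence


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Split samples into non-overlapping frames; return mean squared amplitude per frame.

    Assumption: the record does not define the energy measure; mean squared
    amplitude is used here purely for illustration.
    """
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    energies = []
    # Trailing samples that do not fill a whole frame are ignored.
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def energy_changes(energies: Sequence[float]) -> List[float]:
    """Return the energy difference between each pair of neighboring frames."""
    return [b - a for a, b in zip(energies, energies[1:])]


# Toy signal: silence, a short burst, silence (hypothetical data).
signal = [0.0, 0.0, 0.0, 0.0, 1.0, -1.0, 1.0, -1.0, 0.0, 0.0, 0.0, 0.0]
energies = frame_energies(signal, frame_len=4)
changes = energy_changes(energies)
print(energies)  # [0.0, 1.0, 0.0]
print(changes)   # [1.0, -1.0]
```

Each frame is shorter than the whole signal, and a nonzero entry in `changes` marks a boundary where neighboring frames differ in energy.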

Representative claims

1. A speech recognition method, comprising: detecting, using at least one processing device, energy changes between a plurality of frames distinguishing portions of a signal, each of the plurality of frames having time lengths less than a whole time length of the signal; and identifying speech segments and/or non-speech segments from the plurality of frames based on the detected energy changes between the plurality of frames by assigning a predetermined weight to a segment in which an energy level of a respective frame is changed and when an energy difference exists between two neighboring frames.

2. The method of claim 1, further comprising classifying each of the plurality of frames according to respective energy levels based on predetermined criteria, wherein in the detecting of the energy changes between the plurality of frames, detection of the energy change is based on differences in the respective classified energy levels.

3. The method of claim 2, wherein the identifying of the speech segment and/or non-speech segments comprises: repeatedly performing processes of assigning the predetermined weight to a segment in which an energy level of a respective frame is changed and calculating weights for all respective segments; and identifying a segment corresponding to a minimum weight, among the calculated weights, as being a speech segment, wherein the segment corresponding to the minimum weight has a lower energy level than the other speech segments.

4. The method of claim 2, wherein, in the classifying of the frames, frames are classified according to calculated energies of respective frames.

5. The method of claim 2, further comprising modifying a classified energy level of a frame by changing the classified energy level of the frame, wherein in the detecting the energy changes, a segment in which the classified energy level of the frame is changed is identified.

6. The method of claim 5, wherein the energy change includes a change between energy levels of neighboring frames and a change between an initial energy level of a frame and a changed energy level of the frame.

7. The method of claim 2, further comprising updating the predetermined criteria according to detected energies of the signal.

8. The method of claim 7, wherein frames are classified into three levels including high, medium, and low levels based on the detected energies.

9. The method of claim 1, further comprising combining the identified speech segments with other speech and/or non-speech segments of the signal.

10. The method of claim 1, wherein the non-speech segments include a burst noise which has a frequency characteristic that remarkably changes within a short period of time compared to the whole time length of the signal.

11. At least one non-transitory recording medium comprising computer readable code to control at least one processing element to implement a speech recognition method, comprising: detecting, using at least one processing device, energy changes between a plurality of frames distinguishing portions of a signal, each of the plurality of frames having time lengths less than a whole time length of the signal; and identifying speech segments and/or non-speech segments from the plurality of frames based on the detected energy changes between the plurality of frames by assigning a predetermined weight to a segment in which an energy level of a respective frame is changed and when an energy difference exists between two neighboring frames.

12. A speech recognition system including at least one processing device, the system comprising: a change detector to detect, using the at least one processing device, energy changes between a plurality of frames distinguishing portions of a signal, each of the plurality of frames having lengths less than a whole time length of the signal; and a determiner to identify speech segments and/or non-speech segments from the plurality of frames based on the detected energy changes between the plurality of frames by assigning a predetermined weight to a segment in which an energy level of a respective frame is changed and when an energy difference exists between two neighboring frames.

13. The system of claim 12, further comprising an energy level classifier to classify each of the plurality of frames according to respective energy levels based on predetermined criteria, wherein the change detector detects a segment in which respective energies of each frame are changed based on the classified energy level.

14. The system of claim 13, further comprising: an energy calculator to calculate energies of each frame; an energy level updater to update the predetermined criteria according to the energies of each signal; wherein the energy level classifier classifies frames into three levels including high, medium, and low levels.

15. The system of claim 13, further comprising a generator to modify an energy level of a frame by changing the classified energy level of the frame, wherein the change detector detects a segment in which the classified energy level of the frame is changed.

16. The system of claim 12, wherein the determiner repeatedly performs processes of assigning the predetermined weight to a segment in which an energy level of a respective frame is changed and calculating weights for all respective segments in order to identify a segment corresponding to a minimum weight, among the calculated weights, as being a speech segment, wherein the segment corresponding to the minimum weight has a lower energy level than the other speech segments.

17. The system of claim 12, further comprising a combiner to combine the identified speech segment with other speech and/or non-speech segments of the signal.

18. A speech recognition system, comprising: an A/D converter to convert an analog input signal including speech and/or non-speech signals transmitted through an audio transducer into a digital input signal; a frame generator to generate a plurality of frames corresponding to the digital input signal; a phoneme detector to generate a phoneme sequence from the frames; a vocabulary recognition device to extract a phoneme sequence most similar to the phoneme detector generated phoneme sequence from a dictionary that stores reference phoneme sequences; a speech segment detection device including a determiner to detect energy changes between the frames distinguishing portions of the signal, each of the frames having time lengths less than a whole time length of the signal, and to identify a speech segment from the frames based on the detected energy changes between the frames by assigning a predetermined weight to a segment in which an energy level of a respective frame is changed and when an energy difference exists between two neighboring frames; and a phoneme sequence editor to edit the phoneme detector generated phoneme sequence based on information on speech segments provided from the speech segment detection device.

19. The system of claim 18, wherein the speech segment detection device combines identified speech segments with other speech and/or non-speech segments and outputs a result of the combination to the phoneme sequence editor.

20. The system of claim 18, wherein the phoneme sequence editor removes phoneme sequences, except phoneme sequences corresponding to speech segments, based on information on the identified speech segment.
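
The classification and weighting steps recited in claims 1, 2, and 8 can be illustrated with a minimal sketch. It is not the patented procedure: the claims do not state the threshold values, the size of the predetermined weight, or how weights are combined across segments, so the two fixed thresholds, the unit weight, and the helper names `classify`, `change_weights`, and `runs` are all assumptions made for illustration.

```python
from itertools import groupby
from typing import List, Sequence, Tuple

LOW, MEDIUM, HIGH = 0, 1, 2  # three energy levels, as in claims 8 and 14


def classify(energies: Sequence[float], low_thr: float, high_thr: float) -> List[int]:
    """Map each frame energy to LOW, MEDIUM, or HIGH.

    Assumption: two fixed thresholds stand in for the "predetermined
    criteria"; the claims do not say how the criteria are defined.
    """
    levels = []
    for e in energies:
        if e >= high_thr:
            levels.append(HIGH)
        elif e >= low_thr:
            levels.append(MEDIUM)
        else:
            levels.append(LOW)
    return levels


def change_weights(levels: Sequence[int], weight: float = 1.0) -> List[float]:
    """Assign `weight` to every boundary where neighboring frames differ in level."""
    return [weight if a != b else 0.0 for a, b in zip(levels, levels[1:])]


def runs(levels: Sequence[int]) -> List[Tuple[int, int, int]]:
    """Group consecutive frames of equal level into (level, start, end_exclusive)."""
    out = []
    start = 0
    for level, group in groupby(levels):
        n = len(list(group))
        out.append((level, start, start + n))
        start += n
    return out


# Hypothetical per-frame energies: quiet, rising, loud, falling, quiet.
energies = [0.01, 0.02, 0.30, 0.90, 0.85, 0.40, 0.02]
levels = classify(energies, low_thr=0.1, high_thr=0.5)
weights = change_weights(levels)
segments = runs(levels)
print(levels)    # [0, 0, 1, 2, 2, 1, 0]
print(weights)   # [0.0, 1.0, 1.0, 0.0, 1.0, 1.0]
print(segments)  # [(0, 0, 2), (1, 2, 3), (2, 3, 5), (1, 5, 6), (0, 6, 7)]
```

Working on classified levels rather than raw energies means small fluctuations inside one level produce no change, so only transitions between levels receive a weight. How those weights are accumulated to pick the minimum-weight segment of claim 3 is left out here because the record does not describe it.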

Patents cited by this patent (25)

Graumann David L., Adaptive noise reduction technique for multi-point communication system.
Wark, Timothy John, Audio segmentation with energy-weighted bandwidth bias.
Russell Martin J. (Worcestershire GB3) Series Robert W. (Worcestershire GB3) Wallace Julie L. (Worcester GB3), Children's speech training aid.
Zinser, Jr., Richard L.; Koch, Steven R., Compressed domain voice activity detector.
Ariyoshi Takashi, JPX, Integrated endpoint detection for improved speech recognition method and system.
Husain Mohammad Aamir, CAX ; Bhattacharya Bhaskar, CAX, Location and coding of unvoiced plosives in linear predictive coding of speech.
Hamilton Chris A. (Montclair NJ), Method and apparatus for identifying speech in telephone signals.
Ashley James P. (Naperville IL), Method and apparatus for suppressing noise in a communication system.
Raman Vijay Rangan, Method and system for differentiating between speech and noise.
Mumolo Enzo (Pomerzia ITX), Method of and arrangement for distinguishing between voiced and unvoiced speech elements.
Das Amitava ; Manjunath Sharath, Multipulse interpolative coding of transition speech frames.
Taleb, Anisse, Partial spectral loss concealment in transform codecs.
Tzirkel-Hancock Eli, GBX, Pattern matching method, apparatus and computer readable memory medium for speech recognition using dynamic programming.
Nishiguchi Masayuki, JPX ; Iijima Kazuyuki, JPX ; Matsumoto Jun, JPX ; Omori Shiro, JPX, Perceptual speech coding using prediction residuals, having harmonic magnitude codebook for voiced and waveform codebook.
Gupta Vishwa N. (Brossard CAX) Lennig Matthew (Montreal CAX) Kenny Patrick J. (Montreal CAX) Toulson Christopher K. (Dollard des Ormeaux CAX), Phoneme based speech recognition.
Dudemaine Martin, CAX ; Pelletier Claude, CAX, Selection of decoys for non-vocabulary utterances rejection.
Albesano Dario (Pianezza ITX) Gemello Roberto (Turin ITX) Mana Franco (Turin ITX), Speaker independent isolated word recognition system using neural networks.
Rajasekaran Periagaram K. (Richardson TX) Yoshino Toshiaki (Tokyo JPX), Speaker-independent word recognition method and system based upon zero-crossing rate and energy measurement of analog sp.
Keiller, Robert Alexander, Speech processing apparatus and method.
Klovstad John W. (Dorchester MA) Lee Chin-Hui (Cambridge MA) Ganesan Kalyan (Burlington MA), Speech recognition method having noise immunity.
Nitta Tsuneo (Yokohama JPX), Speech recognition system utilizing both a long-term strategic and a short-term strategic scoring operation in a transit.
Florencio, Dinei; Chou, Philip; He, Li Wei, System and method for providing high-quality stretching and compression of a digital audio signal.
Hunt Melvyn (Fareham GBX), System for separating speech from background noise.
Bahl Lalit Rai ; Gopalakrishnan Ponani ; Gopinath Ramesh Ambat ; Maes Stephane Herman ; Panmanabhan Mukund ; Polymenakos Lazaros, Transcription of speech data with segments from acoustically dissimilar environments.
Jacobs Paul E. (San Diego CA) Gardner William R. (San Diego CA) Lee Chong U. (San Diego CA) Gilhousen Klein S. (San Diego CA) Lam S. Katherine (San Diego CA) Tsai Ming-Chang (San Diego CA), Variable rate vocoder.

Patents citing this patent (3)

Norair, John Peter, Method and apparatus for adaptive traffic management in a resource-constrained network.
Norair, John Peter, Method and apparatus for low-power, long-range networking.
Norair, John Peter; Burns, Patrick, Protective case for adding wireless functionality to a handheld electronic device.
