IPC classification information
Country / Type |
United States(US) Patent
Granted
|
International Patent Classification (IPC, 7th edition) |
|
Application number |
US-0282809
(1999-03-31)
|
Inventor / Address |
- Menéndez-Pidal, Xavier
- Tanaka, Miyuki
- Chen, Ruxin
|
Applicant / Address |
- Sony Corporation, Sony Electronics Inc.
|
Agent / Address |
|
Citation information |
Times cited: 127
Patents cited: 14 |
Abstract
The noise suppressor utilizes statistical characteristics of the noise signal to attenuate amplitude values of the noisy speech signal that have a probability of containing noise. In one embodiment, the noise suppressor utilizes an attenuation function having a shape determined in part by a noise average and a noise standard deviation. In a further embodiment, the noise suppressor also utilizes an adaptive attenuation coefficient that depends on signal-to-noise conditions in the speech recognition system.
Representative claims
1. An apparatus for noise attenuation in an electronic system, comprising: a noise suppressor configured to selectively attenuate additive noise in an electronic signal, said electronic signal being a noisy speech signal that includes a noise signal combined with a speech signal, said noise suppressor selectively attenuating said noise signal by utilizing statistical characteristics of amplitude energy values of said noise signal, said statistical characteristics of said amplitude energy values of said noise signal include a noise average and a noise standard deviation, said noise suppressor generating an attenuated noisy speech signal according to a formula: [formula not reproduced in source] where Yat_k is said attenuated noisy speech signal for a frequency k, Y_k is said noisy speech signal for said frequency k, μ_k is said noise average for said frequency k, σ_k is said noise standard deviation for said frequency k, α is a overestimation coefficient, and A is an attenuation coefficient; and a processor coupled to said electronic system to control said noise suppressor.
2. The apparatus of claim 1, wherein said electronic includes a speech recognition system.
3. The apparatus of claim 2, wherein said speech recognition system is implemented in a motor vehicle.
4. The apparatus of claim 1, wherein said noise suppressor selectively attenuates said noise signal using an attenuation function that varies from a maximum attenuation to a minimum attenuation in a manner inverse to a probability density curve of said noise signal.
5. The apparatus of claim 1, wherein said attenuation coefficient includes an adaptive attenuation coefficient that is dependent on a frequency and a signal-to-noise ratio of said noisy speech signal.
6. The apparatus of claim 1, wherein said attenuation coefficient is replaced by an adaptive attenuation coefficient determined according to a formula: [formula not reproduced in source] where A_k(t) is said adaptive attenuation coefficient for a frequency index k at a frame t, A is said attenuation coefficient, α is said overestimation coefficient, μ_k(t) is said noise average for frequency index k at frame t, Sp_k(t) is a noisy speech average for frequency index k at frame t, μ_k(prev) is a noise average for a noise period immediately previous to a current utterance, and Sp_k(prev) is a noisy speech average for an utterance immediately previous to a current noise period.
7. The apparatus of claim 6, wherein said noise suppressor calculates said noisy speech average according to a formula: $Sp_k(t) = y\,Sp_k(t-1) + (1-y)\,Y_k(t)$ where Sp_k(t) is said noisy speech average for frequency index k at frame t, Y_k(t) is a noisy speech amplitude energy value for frequency index k at frame t, and y is a speech forgetting coefficient.
8. The apparatus of claim 1, wherein said noise suppressor determines a noise average and a noise standard deviation of said energy amplitude values of said noise signal, utilizes said noise average and said noise standard deviation to identify selected ones of said amplitude energy values of said noisy speech signal that have a probability of containing noise, and selectively attenuates said amplitude energy values of said noisy speech signal according to said probability.
9. The apparatus of claim 1, wherein said noise suppressor calculates said noise average according to a formula: [formula not reproduced in source] where μ_k is said noise average for a frequency index k, N_k(t) is a noise energy amplitude value for frequency index k at a frame t for t equal to 1 through T, and T is a total number of frames in a noise period.
10. The apparatus of claim 9, wherein said noise suppressor calculates said noise standard deviation according to a formula: [formula not reproduced in source] where σ_k is said noise standard deviation for frequency index k, μ_k is said noise average for frequency index k, N_k(t) is said noise energy amplitude value for frequency index k at said frame t for t equal to 1 through T, and T is said total number of frames in said noise period.
11. An apparatus for noise attenuation in an electronic system, comprising: a noise suppressor configured to selectively attenuate additive noise in an electronic signal, said electronic signal being a noisy speech signal that includes a noise signal combined with a speech signal, said noise suppressor selective attenuating said noise signal by utilizing statistical characteristics of amplitude energy values of said noise signal, said statistical characteristics of said amplitude energy values of said noise signal include a noise average and a noise standard deviation, said noise suppressor generating an attenuated noisy speech signal according to a formula: [formula not reproduced in source] where Yat_k is said attenuated noisy speech signal for a frequency k, Y_k is said noisy speech signal for said frequency k, μ_k is said noise average for said frequency k, σ_k is said noise standard deviation for said frequency k, α_v is an overestimation coefficient related to said noise standard deviation, and A is an attenuation coefficient; and a processor coupled to said electronic system to control said noise suppressor.
12. An apparatus for noise attenuation in an electronic system, comprising: a noise suppressor configured to selectively attenuate additive noise in an electronic signal, said electronic signal being a noisy speech signal that includes a noise signal combined with a speech signal, said noise suppressor selectively attenuating said noise signal by utilizing statistical characteristics of amplitude energy values of said noise signal, said statistical characteristics of said amplitude energy values of said noise signal include a noise average and a noise standard deviation, said noise suppressor calculating said noise average according to a formula: $\mu_k(t) = \beta\,\mu_k(t-1) + (1-\beta)\,N_k(t)$ where μ_k(t) is said noise average for a frequency index k at a frame t, N_k(t) is a noise energy amplitude value for frequency index k at frame t, and β is a noise forgetting coefficient; and a processor coupled to said electronic system to control said noise suppressor.
13. The apparatus of claim 12, wherein said noise suppressor calculates a noise second moment according to a formula: $S_k(t) = \beta\,S_k(t-1) + (1-\beta)\,N_k(t)\,N_k(t)$ where S_k(t) is said noise second moment for frequency index k at frame t, N_k(t) is said noise energy amplitude value for frequency index k at frame t, and β is said noise forgetting coefficient.
14. The apparatus of claim 13, wherein said noise suppressor calculates said noise standard deviation according to a formula: $\sigma_k(t) = \sqrt{S_k(t) - \mu_k(t)\,\mu_k(t)}$ where σ_k(t) is said noise standard deviation for frequency index k at frame t, S_k(t) is said noise second moment for frequency index k at frame t, and μ_k(t) is said noise average for frequency index k at frame t.
15. A method for noise attenuation in an electronic system, comprising the steps of: selectively attenuating additive noise in an electronic signal using a noise suppressor, said electronic signal being a noisy speech signal that includes a noise signal combined with a speech signal, said noise suppressor selectively attenuating said noise signal by utilizing statistical characteristics of amplitude energy values of said noise signal, said statistical characteristics of said amplitude energy values of said noise signal include a noise average and a noise standard deviation, said noise suppressor generating an attenuated noisy speech signal according to a formula: [formula not reproduced in source] where Yat_k said attenuated noisy speech for signal for a frequency k, Y_k is said noisy speech signal for said frequency k, μ_k is said noise average for said frequency k, σ_k is said noise standard deviation for said frequency k, α is a overestimation coefficient, and A is an attenuation coefficient; and controlling said noise suppressor with a processor coupled to said electronic system.
16. The method of claim 15, wherein said electronic includes a speech recognition system.
17. The method of claim 16, wherein said speech recognition system is implemented in a motor vehicle.
18. The method of claim 15, wherein said noise suppressor selectively attenuates said noise signal using an attenuation function that varies from a maximum attenuation to a minimum attenuation in a manner inverse to a probability density curve of said noise signal.
19. The method of claim 15, wherein said attenuation coefficient includes an adaptive attenuation coefficient that is dependent on a frequency and a signal-to-noise ratio of said noisy speech signal.
20. The method of claim 15, wherein said attenuation coefficient is replaced by an adaptive attenuation coefficient determined according to a formula: [formula not reproduced in source] where A_k(t) is said adaptive attenuation coefficient for a frequency index k at a frame t, A is said attenuation coefficient, α is said overestimation coefficient, μ_k(t) is said noise average for frequency index k at frame t, Sp_k(t) is a noisy speech average for frequency index k at frame t, μ_k(prev) is a noise average for a noise period immediately previous to a current utterance, and Sp_k(prev) is a noisy speech average for an utterance immediately previous to a current noise period.
21. The method of claim 20, wherein said noise suppressor calculates said noisy speech average according to a formula: $Sp_k(t) = y\,Sp_k(t-1) + (1-y)\,Y_k(t)$ where Sp_k(t) is said noisy speech average for frequency index k at frame t, Y_k(t) is a noisy speech amplitude energy value for frequency index k at frame t, and y is a speech forgetting coefficient.
22. The method of claim 15, wherein said noise suppressor determines a noise average and a noise standard deviation of said energy amplitude values of said noise signal, utilizes said noise average and said noise standard deviation to identify selected ones of said amplitude energy values of said noisy speech signal that have a probability of containing noise, and selectively attenuates said amplitude energy values of said noisy speech signal according to said probability.
23. The method of claim 15, wherein said noise suppressor calculates said noise average according to a formula: [formula not reproduced in source] where μ_k is said noise average for a frequency index k, N_k(t) is a noise energy amplitude value for frequency index k at a frame t for t equal to 1 through T, and T is a total number of frames in a noise period.
24. The method of claim 23, wherein said noise suppressor calculates said noise standard deviation according to a formula: [formula not reproduced in source] where σ_k is said noise standard deviation for frequency index k, μ_k is said noise average for frequency index k, N_k(t) is said noise energy amplitude value for frequency index k at said frame t for t equal to 1 through T, and T is said total number of frames in said noise period.
25. The method of claim 15, further comprising the step of generating amplitude energy values of said noisy speech signal using a Fast Fourier transformer.
26. The method of claim 25, further comprising the steps of providing attenuated noisy speech amplitude energy values to a filter bank, and generating channel energies using said filter bank.
27. The method of claim 26, further comprising the step of converting said channel energies into logarithmic channel energies using a logarithmic compressor.
28. The method of claim 27, further comprising the step of converting said logarithmic channel energies into corresponding static features using a frequency cosine transformer.
29. The method of claim 28, further comprising the step of providing said corresponding static features to a normalizer, a first cosine transformer, and a second cosine transformer.
30. The method of claim 29, further comprising the steps of converting said corresponding static features into delta features using said first cosine transformer, converting said corresponding static features into delta-delta features using said second cosine transformer, and providing said delta features and said delta-delta features to said normalizer.
31. The method of claim 30, further comprising the step of normalizing said static features, said delta features, and said delta-delta features using said normalizer to produce normalized static features, normalized delta features, and normalized delta-delta features.
32. The method of claim 31, further comprising the step of analyzing said normalized static features, said normalized delta features, and said normalized delta-delta features using a recognizer to produce a speech recognition result.
33. A method for noise attenuation in an electronic system, comprising the steps of: selectively attenuating additive noise in an electronic signal using a noise suppressor, said electronic signal being a noisy speech signal that includes a noise signal combined with a speech signal, said noise suppressor selectively attenuating said noise signal by utilizing statistical characteristics of amplitude energy values of said noise signal, said statistical characteristics of said amplitude energy values of said noise signal include a noise average and a noise standard deviation, said noise suppressor generating an attenuated noisy speech signal according to a formula: [formula not reproduced in source] where Yat_k is said attenuated noisy speech signal for a frequency k, Y_k is said noisy speech signal for said frequency k, μ_k is said noise average for said frequency k, σ_k is said noise standard deviation for said frequency k, α_v is an overestimation coefficient related to said noise standard deviation, and A is an attenuation coefficient; and controlling said noise suppressor with a processor coupled to said electronic system.
34. A method for noise attenuation in an electronic system, comprising the steps of: selectively attenuating additive noise in an electronic signal using a noise suppressor, said electronic signal being a noisy speech signal that includes a noise signal combined with a speech signal, said noise suppressor selectively attenuating said noise signal by utilizing statistical characteristics of amplitude energy values of said noise signal, said statistical characteristics of said amplitude energy values of said noise signal include a noise average and a noise standard deviation, said noise suppressor calculating said noise average according to a formula: $\mu_k(t) = \beta\,\mu_k(t-1) + (1-\beta)\,N_k(t)$ where μ_k(t) is said noise average for a frequency index k at a frame t, N_k(t) is a noise energy amplitude value for frequency index k at frame t, and β is a noise forgetting coefficient; and controlling said noise suppressor with a processor coupled to said electronic system.
35. The method of claim 34, wherein said noise suppressor calculates a noise second moment according to a formula: $S_k(t) = \beta\,S_k(t-1) + (1-\beta)\,N_k(t)\,N_k(t)$ where S_k(t) is said noise second moment for frequency index k at frame t, N_k(t) is said noise energy amplitude value for frequency index k at frame t, and β is said noise forgetting coefficient.
36. The method of claim 35, wherein said noise suppressor calculates said noise standard deviation according to a formula: $\sigma_k(t) = \sqrt{S_k(t) - \mu_k(t)\,\mu_k(t)}$ where σ_k(t) is noise standard deviation for frequency index k at frame t, S_k(t) is said noise second moment for frequency index k at frame t, and μ_k(t) is said noise average for frequency index k at frame t.
37. An apparatus for noise attenuation in an electronic system, comprising: a noise suppressor configured to selectively attenuate additive noise in an electronic signal, said noise suppressor determining a noise average and a noise standard deviation of energy amplitude values of a noisy speech signal, said noise suppressor utilizing said noise average and said noise standard deviation to identify said amplitude energy values of said noisy speech signal that have a statistical probability of containing said additive noise, said noise suppressor selectively attenuating said amplitude energy values of said noisy speech signal according to said statistical probability, said noise suppressor calculating said noise average according to a formula: $\mu_k(t) = \beta\,\mu_k(t-1) + (1-\beta)\,N_k(t)$ where μ_k(t) is said noise average for a frequency index k at a frame t, N_k(t) is a noise energy amplitude value for frequency index k at frame t, and β is a noise forgetting coefficient; and a processor coupled to said electronic system to control said noise suppressor.
38. The apparatus of claim 37, wherein said noise suppressor calculates a noise second moment according to a formula: $S_k(t) = \beta\,S_k(t-1) + (1-\beta)\,N_k(t)\,N_k(t)$ where S_k(t) is said noise second moment for frequency index k at frame t, N_k(t) is said noise energy amplitude value for frequency index k at frame t, and β is said noise forgetting coefficient.
39. The apparatus of claim 38, wherein said noise suppressor calculates said noise standard deviation according to a formula: $\sigma_k(t) = \sqrt{S_k(t) - \mu_k(t)\,\mu_k(t)}$ where σ_k(t) is said noise standard deviation for frequency index k at frame t, S_k(t) is noise second moment for frequency index k at frame t, and μ_k(t) is said noise average for frequency index k at frame t.
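The recursive noise statistics recited in claims 12 to 14 (and repeated in claims 34 to 36 and 37 to 39) can be illustrated with a short Python sketch for a single frequency index k. This is only an illustration under stated assumptions, not the patented implementation: the function name update_noise_stats, the sample frame values, the choice of β, the first-frame initialisation, and the clamp at zero before the square root are all hypothetical and do not appear in the claims.

```python
import math


def update_noise_stats(mu_prev, s_prev, n, beta):
    """One recursive update of the noise statistics for one frequency index k.

    mu_prev, s_prev: noise average and noise second moment at frame t-1
    n:               noise energy amplitude value N_k(t) at frame t
    beta:            noise forgetting coefficient
    Returns (mu, s, sigma) for frame t.
    """
    # Noise average: mu_k(t) = beta * mu_k(t-1) + (1 - beta) * N_k(t)
    mu = beta * mu_prev + (1.0 - beta) * n
    # Noise second moment: S_k(t) = beta * S_k(t-1) + (1 - beta) * N_k(t) * N_k(t)
    s = beta * s_prev + (1.0 - beta) * n * n
    # Noise standard deviation: sigma_k(t) = sqrt(S_k(t) - mu_k(t) * mu_k(t)).
    # The clamp at zero is an added safeguard against floating-point rounding
    # (an assumption of this sketch, not part of the claimed formula).
    sigma = math.sqrt(max(s - mu * mu, 0.0))
    return mu, s, sigma


if __name__ == "__main__":
    # Hypothetical noise-only frames for one frequency bin.
    frames = [4.0, 6.0, 5.0, 7.0]
    beta = 0.9  # illustrative value only
    # Assumed initialisation from the first frame (not specified in the claims).
    mu, s = frames[0], frames[0] ** 2
    for n in frames[1:]:
        mu, s, sigma = update_noise_stats(mu, s, n, beta)
        print(mu, s, sigma)
```

A value of β close to 1 makes the estimates change slowly from frame to frame, while a smaller β weights recent frames more heavily. In a full system the same update would run independently for every frequency index k.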