Method and apparatus for quantizing pitch, amplitude, phase and linear spectrum of voiced speech
IPC classification information
Country / Type
United States(US) Patent
Granted
International Patent Classification (IPC, 7th edition)
G10L-019/14
G10L-019/00
Application number
US-0897746
(2004-07-22)
Registration number
US-7426466
(2008-09-16)
Inventors / Address
Ananthapadmanabhan, Arasanipalai K.
Manjunath, Sharath
Huang, Pengjun
Choy, Eddie Lun Tik
DeJaco, Andrew P.
Applicant / Address
QUALCOMM Incorporated
Agent / Address
Macek, Kyong
Citation information
Times cited: 25
Cited patents: 22
Abstract
A method and apparatus for predictively quantizing voiced speech includes a parameter generator and a quantizer. The parameter generator is configured to extract parameters from frames of predictive speech such as voiced speech, and to transform the extracted information to a frequency-domain representation. The quantizer is configured to subtract a weighted sum of the parameters for previous frames from the parameter for the current frame. The quantizer is configured to quantize the difference value. A prototype extractor may be added to first extract a pitch period prototype to be processed by the parameter generator.
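The predictive step described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the uniform scalar quantizer, and the step size are all assumptions introduced here, since the abstract only states that a weighted sum of previous-frame parameters is subtracted from the current parameter and that the difference is quantized.

```python
from typing import Sequence


def predict(previous: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the parameter values from previous frames."""
    if len(previous) != len(weights):
        raise ValueError("need exactly one weight per previous frame")
    return sum(w * p for w, p in zip(weights, previous))


def encode_frame(current: float, previous: Sequence[float],
                 weights: Sequence[float], step: float) -> int:
    """Quantize the difference between the current value and its prediction.

    The uniform quantizer (nearest multiple of `step`) is a hypothetical
    stand-in; the abstract does not say which quantizer is used.
    """
    residual = current - predict(previous, weights)
    return round(residual / step)


def decode_frame(index: int, previous: Sequence[float],
                 weights: Sequence[float], step: float) -> float:
    """Rebuild the parameter from the quantizer index and the same prediction."""
    return predict(previous, weights) + index * step


# Assumed example values: one previous pitch lag of 50.0 with weight 1.0.
index = encode_frame(52.3, [50.0], [1.0], step=0.5)
decoded = decode_frame(index, [50.0], [1.0], step=0.5)
```

Only the small index is sent per frame instead of the full parameter value. A practical coder would normally form the prediction from previously decoded values so that encoder and decoder stay in step; that detail is an assumption here and is not stated in the abstract.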
Representative claims
What is claimed is:

1. A processor operable to execute a set of instructions stored in a storage medium to produce a set of quantized speech frame parameters, the parameters comprising: a predictively quantized pitch lag value; a quantized target error vector of amplitude components; predictively quantized phase values; and a quantized target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized pitch lag value $\delta L_m$, based on a formula: $\delta L_m = L_m - \eta_{m_1} L_{m_1} - \eta_{m_2} L_{m_2} - \ldots - \eta_{m_N} L_{m_N}$, wherein the values $L_{m_1}, L_{m_2}, \ldots, L_{m_N}$ are the pitch lags for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\eta_{m_1}, \eta_{m_2}, \ldots, \eta_{m_N}$ are weights corresponding to frames $m_1, m_2, \ldots, m_N$, respectively.

2. A processor operable to execute a set of instructions stored in a storage medium to produce a set of quantized speech frame parameters, the parameters comprising: a predictively quantized pitch lag value; a quantized target error vector of amplitude components; predictively quantized phase values; and a quantized target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized target error vector of amplitude components is based on a target error vector of amplitude components ($\delta A_m$) that is described by a formula: $\delta A_m = A_m - \alpha_{m_1}^{T} A_{m_1} - \alpha_{m_2}^{T} A_{m_2} - \ldots - \alpha_{m_N}^{T} A_{m_N}$, wherein the values $A_{m_1}, A_{m_2}, \ldots, A_{m_N}$ are a subset of the amplitude vector for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\alpha_{m_1}^{T}, \alpha_{m_2}^{T}, \ldots, \alpha_{m_N}^{T}$ are the transposes of corresponding weight vectors.

3. A processor operable to execute a set of instructions stored in a storage medium to produce a set of quantized speech frame parameters, the parameters comprising: a predictively quantized pitch lag value; a quantized target error vector of amplitude components; predictively quantized phase values; and a quantized target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized phase values are based on a formula: $\phi_m = \phi'_{m-1}$, wherein $\phi'_{m-1}$ represent the phases of an extracted prototype.

4. A processor operable to execute a set of instructions stored in a storage medium to produce a set of quantized speech frame parameters, the parameters comprising: a predictively quantized pitch lag value; a quantized target error vector of amplitude components; predictively quantized phase values; and a quantized target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized target error vector of linear spectral information components is based on a target error vector of linear spectral information components ($T_M^n$) that is described by a formula [formula not reproduced in the source], wherein $L_M^n$ refers to an n-dimensional linear spectral information vector for frame $M$, the values $\{[?]_{M-1}^{n}, [?]_{M-2}^{n}, \ldots, [?]_{M-P}^{n};\ n = 0, 1, \ldots, N-1\}$ (base symbol illegible in the source) are the contributions of linear spectral information parameters of a number of frames, $P$, immediately prior to frame $M$, and the values $\{\beta_1^n, \beta_2^n, \ldots, \beta_P^n;\ n = 0, 1, \ldots, N-1\}$ are respective weights such that $\{\beta_0^n + \beta_1^n + \ldots + \beta_P^n = 1;\ n = 0, 1, \ldots, N-1\}$.

5. A method for forming a set of quantized speech frame parameters, comprising: quantizing a pitch lag value; quantizing a target error vector of amplitude components; quantizing phase values; and quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized pitch lag value is obtained from value $\delta L_m$, based on a formula: $\delta L_m = L_m - \eta_{m_1} L_{m_1} - \eta_{m_2} L_{m_2} - \ldots - \eta_{m_N} L_{m_N}$, wherein the values $L_{m_1}, L_{m_2}, \ldots, L_{m_N}$ are the pitch lags for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\eta_{m_1}, \eta_{m_2}, \ldots, \eta_{m_N}$ are weights corresponding to frames $m_1, m_2, \ldots, m_N$, respectively.

6. A method for forming a set of quantized speech frame parameters, comprising: quantizing a pitch lag value; quantizing a target error vector of amplitude components; quantizing phase values; and quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized target error vector of amplitude components is based on a target error vector of amplitude components ($\delta A_m$) that is described by a formula: $\delta A_m = A_m - \alpha_{m_1}^{T} A_{m_1} - \alpha_{m_2}^{T} A_{m_2} - \ldots - \alpha_{m_N}^{T} A_{m_N}$, wherein the values $A_{m_1}, A_{m_2}, \ldots, A_{m_N}$ are a subset of the amplitude vector for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\alpha_{m_1}^{T}, \alpha_{m_2}^{T}, \ldots, \alpha_{m_N}^{T}$ are the transposes of corresponding weight vectors.

7. A method for forming a set of quantized speech frame parameters, comprising: quantizing a pitch lag value; quantizing a target error vector of amplitude components; quantizing phase values; and quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized phase values are based on a formula: $\phi_m = \phi'_{m-1}$, wherein $\phi'_{m-1}$ represent the phases of an extracted prototype.

8. A method for forming a set of quantized speech frame parameters, comprising: quantizing a pitch lag value; quantizing a target error vector of amplitude components; quantizing phase values; and quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized target error vector of linear spectral information components is based on a target error vector of linear spectral information components ($T_M^n$) that is described by a formula [formula not reproduced in the source], wherein $L_M^n$ refers to an n-dimensional linear spectral information vector for frame $M$, the values $\{[?]_{M-1}^{n}, [?]_{M-2}^{n}, \ldots, [?]_{M-P}^{n};\ n = 0, 1, \ldots, N-1\}$ (base symbol illegible in the source) are the contributions of linear spectral information parameters of a number of frames, $P$, immediately prior to frame $M$, and the values $\{\beta_1^n, \beta_2^n, \ldots, \beta_P^n;\ n = 0, 1, \ldots, N-1\}$ are respective weights such that $\{\beta_0^n + \beta_1^n + \ldots + \beta_P^n = 1;\ n = 0, 1, \ldots, N-1\}$.

9. A method for forming a set of quantized speech frame parameters, comprising: quantizing a pitch lag value; quantizing a target error vector of amplitude components; quantizing phase values; and quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, further comprising extracting the pitch lag value, the amplitude components, the phase values, and the linear spectral information components from a plurality of voiced speech frames.

10. A method for forming a set of quantized speech frame parameters, comprising: quantizing a pitch lag value; quantizing a target error vector of amplitude components; quantizing phase values; and quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, further comprising transmitting the set of quantized speech frame parameters across a wireless communication channel.

11. An apparatus comprising: means for quantizing a pitch lag value; means for quantizing a target error vector of amplitude components; means for quantizing phase values; means for quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame; and means for transmitting a packet of the quantized error vectors across a wireless communication channel.

12. An apparatus comprising: means for quantizing a pitch lag value; means for quantizing a target error vector of amplitude components; means for quantizing phase values; and means for quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized pitch lag value is obtained from value $\delta L_m$, based on a formula: $\delta L_m = L_m - \eta_{m_1} L_{m_1} - \eta_{m_2} L_{m_2} - \ldots - \eta_{m_N} L_{m_N}$, wherein the values $L_{m_1}, L_{m_2}, \ldots, L_{m_N}$ are the pitch lags for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\eta_{m_1}, \eta_{m_2}, \ldots, \eta_{m_N}$ are weights corresponding to frames $m_1, m_2, \ldots, m_N$, respectively.

13. An apparatus comprising: means for quantizing a pitch lag value; means for quantizing a target error vector of amplitude components; means for quantizing phase values; and means for quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized target error vector of amplitude components is based on a target error vector of amplitude components ($\delta A_m$) that is described by a formula: $\delta A_m = A_m - \alpha_{m_1}^{T} A_{m_1} - \alpha_{m_2}^{T} A_{m_2} - \ldots - \alpha_{m_N}^{T} A_{m_N}$, wherein the values $A_{m_1}, A_{m_2}, \ldots, A_{m_N}$ are a subset of the amplitude vector for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\alpha_{m_1}^{T}, \alpha_{m_2}^{T}, \ldots, \alpha_{m_N}^{T}$ are the transposes of corresponding weight vectors.

14. An apparatus comprising: means for quantizing a pitch lag value; means for quantizing a target error vector of amplitude components; means for quantizing phase values; and means for quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized phase values are based on a formula: $\phi_m = \phi'_{m-1}$, wherein $\phi'_{m-1}$ represent the phases of an extracted prototype.

15. An apparatus comprising: means for quantizing a pitch lag value; means for quantizing a target error vector of amplitude components; means for quantizing phase values; and means for quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized target error vector of linear spectral information components is based on a target error vector of linear spectral information components ($T_M^n$) that is described by a formula [formula not reproduced in the source], wherein $L_M^n$ refers to an n-dimensional linear spectral information vector for frame $M$, the values $\{[?]_{M-1}^{n}, [?]_{M-2}^{n}, \ldots, [?]_{M-P}^{n};\ n = 0, 1, \ldots, N-1\}$ (base symbol illegible in the source) are the contributions of linear spectral information parameters of a number of frames, $P$, immediately prior to frame $M$, and the values $\{\beta_1^n, \beta_2^n, \ldots, \beta_P^n;\ n = 0, 1, \ldots, N-1\}$ are respective weights such that $\{\beta_0^n + \beta_1^n + \ldots + \beta_P^n = 1;\ n = 0, 1, \ldots, N-1\}$.

16. A processor operable to execute a set of instructions stored in a storage medium to produce a set of quantized speech frame parameters, the parameters comprising: a predictively quantized pitch lag value; a quantized target error vector of amplitude components; predictively quantized phase values; and a quantized target error vector of linear spectral information components, wherein the pitch lag value, amplitude components, phase values, and the linear spectral information components have been extracted from a voiced speech frame, the processor being further operable to execute a set of instructions stored in a storage medium to extract the pitch lag value, the amplitude components, the phase values, and the linear spectral information components from a plurality of voiced speech frames.

17. A processor operable to execute a set of instructions stored in a storage medium to produce a set of quantized speech frame parameters, the parameters comprising: a predictively quantized pitch lag value; a quantized target error vector of amplitude components; predictively quantized phase values; and a quantized target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, the processor being further operable to execute a set of instructions stored in a storage medium to transmit the set of quantized speech frame parameters across a wireless communication channel.

18. An apparatus comprising: means for quantizing a pitch lag value; means for quantizing a target error vector of amplitude components; means for quantizing phase values; means for quantizing a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame; and means for extracting the pitch lag value, the amplitude components, the phase values, and the linear spectral information components from a plurality of voiced speech frames.

19. A computer-readable medium comprising instructions that upon execution in a processor cause the processor to: quantize a pitch lag value; quantize a target error vector of amplitude components; quantize phase values; and quantize a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized pitch lag value is obtained from value $\delta L_m$, based on a formula: $\delta L_m = L_m - \eta_{m_1} L_{m_1} - \eta_{m_2} L_{m_2} - \ldots - \eta_{m_N} L_{m_N}$, wherein the values $L_{m_1}, L_{m_2}, \ldots, L_{m_N}$ are the pitch lags for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\eta_{m_1}, \eta_{m_2}, \ldots, \eta_{m_N}$ are weights corresponding to frames $m_1, m_2, \ldots, m_N$, respectively.

20. A computer-readable medium comprising instructions that upon execution in a processor cause the processor to: quantize a pitch lag value; quantize a target error vector of amplitude components; quantize phase values; and quantize a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized target error vector of amplitude components is based on a target error vector of amplitude components ($\delta A_m$) that is described by a formula: $\delta A_m = A_m - \alpha_{m_1}^{T} A_{m_1} - \alpha_{m_2}^{T} A_{m_2} - \ldots - \alpha_{m_N}^{T} A_{m_N}$, wherein the values $A_{m_1}, A_{m_2}, \ldots, A_{m_N}$ are a subset of the amplitude vector for frames $m_1, m_2, \ldots, m_N$, respectively, and the values $\alpha_{m_1}^{T}, \alpha_{m_2}^{T}, \ldots, \alpha_{m_N}^{T}$ are the transposes of corresponding weight vectors.

21. A computer-readable medium comprising instructions that upon execution in a processor cause the processor to: quantize a pitch lag value; quantize a target error vector of amplitude components; quantize phase values; and quantize a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized phase values are based on a formula: $\phi_m = \phi'_{m-1}$, wherein $\phi'_{m-1}$ represent the phases of an extracted prototype.

22. A computer-readable medium comprising instructions that upon execution in a processor cause the processor to: quantize a pitch lag value; quantize a target error vector of amplitude components; quantize phase values; and quantize a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame, wherein the quantized target error vector of linear spectral information components is based on a target error vector of linear spectral information components ($T_M^n$) that is described by a formula [formula not reproduced in the source], wherein $L_M^n$ refers to an n-dimensional linear spectral information vector for frame $M$, the values $\{[?]_{M-1}^{n}, [?]_{M-2}^{n}, \ldots, [?]_{M-P}^{n};\ n = 0, 1, \ldots, N-1\}$ (base symbol illegible in the source) are contributions of linear spectral information parameters of a number of frames, $P$, immediately prior to frame $M$, and the values $\{\beta_1^n, \beta_2^n, \ldots, \beta_P^n;\ n = 0, 1, \ldots, N-1\}$ are respective weights such that $\{\beta_0^n + \beta_1^n + \ldots + \beta_P^n = 1;\ n = 0, 1, \ldots, N-1\}$.

23. A computer-readable medium comprising instructions that upon execution in a processor cause the processor to: quantize a pitch lag value; quantize a target error vector of amplitude components; quantize phase values; quantize a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame; and extract the pitch lag value, the amplitude components, the phase values, and the linear spectral information components from a plurality of voiced speech frames.

24. A computer-readable medium comprising instructions that upon execution in a processor cause the processor to: quantize a pitch lag value; quantize a target error vector of amplitude components; quantize phase values; quantize a target error vector of linear spectral information components, wherein the pitch lag value, the amplitude components, the phase values, and the linear spectral information components have been extracted from a voiced speech frame; and transmit the set of quantized speech frame parameters across a wireless communication channel.
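As a worked illustration of the pitch lag formula recited in claims 1, 5, 12, and 19, consider N = 2 previous frames. The lag values and the equal weights below are assumed purely for illustration; the claims do not specify any numeric values.

```latex
% Assumed values: L_m = 51, L_{m_1} = 50, L_{m_2} = 48, \eta_{m_1} = \eta_{m_2} = 0.5
\delta L_m = L_m - \eta_{m_1} L_{m_1} - \eta_{m_2} L_{m_2}
           = 51 - 0.5 \cdot 50 - 0.5 \cdot 48
           = 51 - 25 - 24
           = 2
```

Under these assumed values the quantity to be quantized is the residual 2 rather than the full lag 51, which is the reason a predictive scheme can spend fewer bits when the pitch changes slowly from frame to frame.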
Patents cited by this patent (22)
Kleider John ; Fette Bruce Alan ; Campbell William Michael ; Jaskie Cynthia Ann, Adaptive rate system and method for network communications.
McDonough John G. ; Chang Chienchung ; Singh Randeep ; Sakamaki Charles E. ; Tsai Ming-Chang ; Kantak Prashant, Application specific integrated circuit (ASIC) for performing rapid speech compression in a mobile telephone system.
Arasanipalai K. Ananthapadmanabhan ; Sharath Manjunath, Method and apparatus for interleaving line spectral information quantization methods in a speech coder.
Gilhousen Klein S. (San Diego CA) Jacobs Irwin M. (La Jolla CA) Weaver ; Jr. Lindsay A. (San Diego CA), Spread spectrum multiple access communication system using satellite or terrestrial repeaters.
Gilhousen Klein S. (San Diego CA) Jacobs Irwin M. (La Jolla CA) Padovani Roberto (San Diego CA) Weaver ; Jr. Lindsay A. (San Diego CA) Wheatley ; III Charles E. (Del Mar CA) Viterbi Andrew J. (La Jol, System and method for generating signal waveforms in a CDMA cellular telephone system.
Jacobs Paul E. (San Diego CA) Gardner William R. (San Diego CA) Lee Chong U. (San Diego CA) Gilhousen Klein S. (San Diego CA) Lam S. Katherine (San Diego CA) Tsai Ming-Chang (San Diego CA), Variable rate vocoder.
Preston, Dan A.; Preston, Joseph D.; Leyendecker, Robert; Eatherly, Wayne; Proctor, Rod L.; Smith, Phillip R., In-band signaling for data communications over digital wireless telecommunications networks.
Jelinek, Milan; Salami, Redwan, Method and device for efficient in-band dim-and-burst signaling and half-rate max operation in variable bit-rate wideband speech coding for CDMA wireless systems.