Method of setting optimum-partitioned classified neural network and method and apparatus for automatic labeling using optimum-partitioned classified neural network
IPC Classification Information
Country / Type: United States (US) Patent, Granted
International Patent Classification (IPC, 7th edition):
G10L-011/00
G10L-021/00
G10L-011/06
G10L-021/06
G10L-015/00
G10L-015/16
G10L-015/06
G10L-013/00
G06E-001/00
G06E-003/00
G06F-015/18
G06G-007/00
G06N-003/08
G06N-003/00
Application Number: US-0788301 (2004-03-01)
Registration Number: US-7444282 (2008-10-28)
Priority Information: KR-10-2003-0012700 (2003-02-28)
Inventors / Address:
Choo, Ki-hyun
Kim, Jeong-su
Lee, Jae-won
Lee, Ki-seung
Applicant / Address: Samsung Electronics Co., Ltd.
Agent / Address: Staas & Halsey LLP
Citation Information: Times cited: 15; Patents cited: 13
Abstract
A method of automatic labeling using an optimum-partitioned classified neural network includes searching for neural networks having minimum errors with respect to a number of L phoneme combinations from a number of K neural network combinations generated at an initial stage or updated, updating weights during learning of the K neural networks by K phoneme combination groups searched with the same neural networks, and composing an optimum-partitioned classified neural network combination using the K neural networks of which a total error sum has converged; and tuning a phoneme boundary of a first label file by using the phoneme combination group classification result and the optimum-partitioned classified neural network combination, and generating a final label file reflecting the tuning result.
Representative Claims
What is claimed is:

1. A method of automatic labeling to tune a phoneme boundary of a first label file generated by performing automatic labeling of a manual label file, the method comprising: searching for neural networks having minimum errors with respect to a number of L phoneme combinations from a number of K neural network combinations generated at an initial stage or updated; updating weights during learning of the K neural networks by K phoneme combination groups searched with the same neural networks; composing an optimum-partitioned classified neural network combination using the K neural networks of which a total error sum has converged; tuning a phoneme boundary of a first label file using a phoneme combination group classification result and the optimum-partitioned classified neural network combination from the composing of the optimum-partitioned classified neural network combination; and generating a final label file reflecting the tuning result, wherein a phoneme boundary tuning field in the tuning of the phoneme boundary of the first label file and the generating of a final label file reflecting the tuning result is set to a predetermined field of a duration time of left and right phonemes of the phoneme combination.

2. The method of claim 1, further comprising setting an output value of the neural network to 1 for the part applicable to a boundary between phonemes, setting the output value for the part not applicable to a boundary between phonemes to 0, and setting the output value for the part of 1 frame left or right apart from a phoneme boundary to 0.5.

3. The method of claim 1, further comprising setting the predetermined field to a length which divides the duration time of the left and right phonemes into three equal parts and segments one third each to the left and right near each phonemic boundary of the first label file.

4. The method of claim 1, further comprising using a computer readable medium having recorded thereon a computer readable program code to tune the phoneme boundary of the first label file generated by performing automatic labeling of the manual label file.

5. An apparatus for automatic labeling using an optimum-partitioned classified neural network, comprising: a labeling unit to generate a first label file by performing automatic labeling for a manual label file; an optimum-partitioned classified neural network composing unit searching neural networks having minimum errors with respect to a number of L phoneme combinations from a number of K neural network combinations generated at an initial stage or updated, updating weights during learning of the K neural networks by K phoneme combination groups searched with the same neural networks, and composing an optimum-partitioned classified neural network combination using the K neural networks of which a total error sum has converged; and a phoneme boundary tuning unit tuning a phoneme boundary of the first label file by using a phoneme combination group classification result and the optimum-partitioned classified neural network combination supplied from the optimum-partitioned classified neural network composing unit, and generating a final label file reflecting the tuning result, wherein the phoneme boundary tuning field of the phoneme boundary tuning unit is set to a predetermined field of a duration time of left and right phonemes of the phoneme combination.
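Claims 2 and 3 are concrete enough to sketch in code. The Python fragment below is an illustrative reading, not the patented implementation; the frame indexing, the time units, and both function names are assumptions made for this example.

```python
import numpy as np

def boundary_targets(num_frames: int, boundary_frames: list[int]) -> np.ndarray:
    """Claim 2: desired network output is 1 at a frame on a phoneme
    boundary, 0.5 one frame to the left or right of it, and 0 elsewhere."""
    t = np.zeros(num_frames)
    for b in boundary_frames:
        for offset, value in ((-1, 0.5), (1, 0.5), (0, 1.0)):
            idx = b + offset
            if 0 <= idx < num_frames:
                t[idx] = max(t[idx], value)
    return t

def tuning_field(left_start: float, boundary: float, right_end: float):
    """Claim 3: the tuning field spans one third of the left phoneme's
    duration and one third of the right phoneme's duration, taken on
    either side of the first-label boundary."""
    return (boundary - (boundary - left_start) / 3.0,
            boundary + (right_end - boundary) / 3.0)

print(boundary_targets(10, [4]))      # 0.5 at frames 3 and 5, 1.0 at frame 4
print(tuning_field(0.0, 0.30, 0.60))  # ≈ (0.2, 0.4) for two 300 ms phonemes
```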
6. The apparatus of claim 5, wherein the optimum-partitioned classified neural network composing unit comprises: a training corpus storing input variables, wherein the input variables include acoustic feature variables, additional variables, and a manual label file; a minimum error classifying unit generating L phoneme combinations realized with names of left and right phonemes by using a phoneme boundary obtained from input variables and a manual label file stored in the training corpus, searching a neural network having minimum errors with respect to the L phoneme combinations from K neural network combinations generated or updated at an initial time, and classifying the L phoneme combinations into K phoneme combination groups searched with the same neural networks; and a re-training unit updating weights until individual errors of the neural networks have converged during learning with applicable learning data for the K neural networks by the K phoneme combination groups classified in the minimum error classifying unit and re-training the neural networks until a total error of the K neural networks, of which individual errors have converged, has converged.

7. The apparatus of claim 5, further comprising setting an output value of the neural network to 1 for the part applicable to a boundary between phonemes, setting the output value for the part not applicable to a boundary between phonemes to 0, and setting the output value for the part of 1 frame left or right apart from a phoneme boundary to 0.5.

8. The apparatus of claim 5, wherein the predetermined field is set to a length which divides the duration time of the left and right phonemes into three equal parts and segments one third each to the left and right near each phonemic boundary of the first label file.

9. An apparatus for automatic labeling using an optimum-partitioned classified neural network, comprising: a labeling unit to perform automatic labeling of a manual label file and generate a first label file; an optimum-partitioned classified neural network composing unit to receive input variables, segment phoneme combinations into partitions applicable to neural networks, and compose optimum-partitioned classified neural networks of Multi-Layer Perceptron-type from re-learned partitions; and a phoneme boundary tuning unit to tune a phoneme boundary of the first label file supplied from the labeling unit and to generate a final label file reflecting the tuning result, wherein the phoneme boundary tuning unit tunes the phoneme boundary by using the optimum-partitioned classified neural networks composed after completing learning in the optimum-partitioned classified neural network composing unit and judges the phoneme boundary according to whether an output of a neural network is 1 or 0 after applying the same input variable as the input variable used during the learning.

10. The apparatus of claim 9, wherein if there is a nonlinear clustering of neural networks of a Multi-Layer Perceptron-type, weights are determined by an iterative modification, wherein the iterative modification is performed by a back-propagation algorithm.
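The minimum error classifying unit of claim 6 reduces to an argmin over an L-by-K error table followed by a merge of the combinations that picked the same network. A minimal sketch, assuming the per-combination errors have already been measured; the array E is toy data and classify_min_error is a hypothetical name:

```python
from collections import defaultdict
import numpy as np

def classify_min_error(errors: np.ndarray) -> dict[int, list[int]]:
    """errors[l, k] = error of network k on the learning data of phoneme
    combination l. Returns {network index: merged combination indices},
    i.e. the K phoneme combination groups of claim 6."""
    groups = defaultdict(list)
    for l in range(errors.shape[0]):
        k_best = int(np.argmin(errors[l]))  # network with minimum error for l
        groups[k_best].append(l)            # combinations sharing a network merge
    return dict(groups)

# Toy example: L = 4 phoneme combinations, K = 2 networks.
E = np.array([[0.2, 0.9],
              [0.8, 0.1],
              [0.3, 0.7],
              [0.6, 0.2]])
print(classify_min_error(E))  # {0: [0, 2], 1: [1, 3]}
```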
11. The apparatus of claim 9, wherein the optimum-partitioned classified neural network composing unit comprises: a training corpus to store input variables, wherein the input variables comprise acoustic feature variables, additional variables, and a manual label file; a minimum error classifying unit to generate phoneme combinations of names of left and right phonemes by using a phoneme boundary obtained from the input variables and manual label file stored in the training corpus, search an optimum neural network having a minimum error level relating to the phoneme combinations from neural network combinations of a Multi-Layer Perceptron-type, and classify the phoneme combinations into phoneme combination groups with the same neural networks; and a re-training unit to update weights of the neural networks by learning with applicable learning data for the neural networks for as many as a predetermined number of iterations with respect to each of the phoneme combination groups classified in the minimum error classifying unit and converge a total error by adapting the updated weights to an applicable neural network in the neural network combinations.

12. A method of composing the optimum neural network combination minimizing a total error sum, comprising: preparing an initial neural network combination using input variables; searching for the optimum neural network having minimum errors with respect to the phoneme combinations from the initial neural network combination and classifying phoneme combinations with optimum neural networks; merging phoneme combinations with the same neural network and classifying the merged phoneme combinations into new partitions; updating each neural network and learning each neural network according to the partitions generated by the classifying and the merging; determining whether all neural networks are converged, wherein if all neural networks are converged, composing the neural network combination as an optimum-partitioned classified neural network combination, and if all neural networks are not converged, then re-learning the optimum neural network having minimum errors by repeating the classifying, the merging, and the updating operations until all neural networks are converged.

13. The method of claim 12, wherein the preparing of the initial neural network combination further comprises setting learning data for neural network learning and setting a first and a second threshold value.

14. The method of claim 13, further comprising repeatedly performing the updating of each neural network via a procedure that calculates the error using an updated neural network parameter, wherein the updating ends when all neural networks are converged.

15. The method of claim 14, wherein all neural networks are converged when an error-changing rate is less than the first threshold value.

16. The method of claim 12, wherein the setting of learning data for neural network learning and the preparing of the initial neural network combination further comprise setting an iteration times index to 0, setting an initial error sum to infinity, and preparing a position value of the phoneme boundary obtained by manual labeling.

17. The method of claim 12, further comprising calculating the total error for the classified phoneme combinations with optimum neural networks, wherein the total error is a sum of square errors between a target output and an output obtained when inputting all learning data of the phoneme combination to the neural network.
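Claim 12 describes an alternating classify-merge-retrain loop that terminates when the total error sum stops changing. The toy below keeps that control flow but replaces each Multi-Layer Perceptron with a scalar predictor trained to its partition's mean, so the example runs as-is; the patent itself trains MLPs by back-propagation, and all names here are illustrative.

```python
import numpy as np

def compose_optimum_partition(data, K, threshold=1e-6, max_iter=100):
    """Toy rendering of the claim 12 loop: classify each phoneme combination
    to its minimum-error network, merge into partitions, re-train each
    network on its partition, and stop when the total error sum converges."""
    rng = np.random.default_rng(0)
    nets = rng.choice(data, size=K)                    # initial combination
    prev_total = np.inf                                # initial error sum (claim 16)
    assign = np.zeros(len(data), dtype=int)
    for _ in range(max_iter):
        errors = (data[:, None] - nets[None, :]) ** 2  # error of net k on combo l
        assign = errors.argmin(axis=1)                 # minimum-error classification
        for k in range(K):                             # re-train per partition
            if np.any(assign == k):
                nets[k] = data[assign == k].mean()
        total = errors.min(axis=1).sum()               # total error sum
        if abs(prev_total - total) < threshold:        # convergence test (claim 15)
            break
        prev_total = total
    return nets, assign

nets, assign = compose_optimum_partition(np.array([0.1, 0.2, 0.8, 0.9, 0.15]), K=2)
print(nets, assign)  # one "network" per merged partition
```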
18. The method of claim 12, further comprising calculating the total error for the new partition.

19. The method of claim 12, further comprising calculating a weight update value to update each neural network.

20. The method of claim 12, further comprising updating the neural networks by setting the learning gain to a small value and setting the first threshold value for convergence investigation to a comparatively small value.

21. The method of claim 12, further comprising setting an output value of the neural network to 1 for the part applicable to a boundary between phonemes, setting the output value for the part not applicable to a boundary between phonemes to 0, and setting the output value for the part of 1 frame left or right apart from a phoneme boundary to 0.5.

22. The method of claim 12, further comprising assigning all input variables to an applicable phoneme combination by seeking a nearest phoneme boundary from positions of input variables and deciding which names of two phonemes are connected to the boundary.

23. The method of claim 12, further comprising segmenting all input variables such that the number of total partitions is the square of the number of used phonemes.

24. The method of claim 23, further comprising setting the number of individual neural networks to the same value as or less than the number of partitions by a phoneme combination.

25. The method of claim 12, further comprising classifying the optimum neural network having minimum errors using the following equation: [equation omitted in the source record], wherein c_i(P_j) is an optimum neural network index in an ith iteration for a jth phoneme combination (P_j), and W_m is a section in which input variables included in the mth phoneme boundary are selected.

26. The method of claim 25, further comprising selecting the input variables included in the mth phoneme boundary of the W_m section according to the following equation: [equation omitted in the source record], wherein t_m is a nearest frame index from a position of the mth phoneme boundary.

27. The method of claim 25, wherein the total error of a kth neural network is given as a sum of square errors between a target output and an output obtained in the case of inputting all learning data included in the phoneme combination to the kth neural network.

28. The method of claim 12, further comprising determining a total error for the new partitions using the following equation: [equation omitted in the source record], wherein i is an iteration times index and S^i is the partition at an ith iteration.

29. The method of claim 28, further comprising determining the S^i partition at the ith iteration using the following equation: S^i = s_1^i ∪ s_2^i ∪ … ∪ s_K^i.

30. The method of claim 12, further comprising updating individual neural networks and learning neural networks according to partitions generated by the classifying and the merging and determining weight update values using the following equation: [equation omitted in the source record].

31. The method of claim 12, further comprising determining whether all neural networks have converged by confirming whether there is a difference between a total error sum obtained from the current number of iterations and a total error sum obtained from the previous number of iterations using the following equation: ΔD = |D^{i+1}(C^i) − D^i(C^i)|, wherein if the changing rate of the total error sum is smaller than the second threshold value, the learning is finished.
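Claims 17, 27, and 31 pin down the two quantities driving convergence: a per-combination sum of squared errors, and the change ΔD in the total error sum between iterations. A minimal sketch with assumed array shapes and hypothetical function names:

```python
import numpy as np

def combination_error(targets: np.ndarray, outputs: np.ndarray) -> float:
    """Claims 17 and 27: sum of squared differences between target outputs
    and network outputs over one phoneme combination's learning data."""
    return float(((targets - outputs) ** 2).sum())

def learning_finished(d_next: float, d_curr: float, second_threshold: float) -> bool:
    """Claim 31: learning ends once ΔD = |D^{i+1}(C^i) − D^i(C^i)| falls
    below the second threshold value."""
    return abs(d_next - d_curr) < second_threshold

print(combination_error(np.array([1.0, 0.5, 0.0]), np.array([0.9, 0.4, 0.1])))  # ≈ 0.03
print(learning_finished(12.34, 12.31, 0.01))  # False: keep re-training
```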
32. A method of learning and updating a neural network, comprising: preparing initial neural network combinations composing neural networks of a multi-layer perceptron type; searching a multi-layer perceptron index having a minimum error in the initial neural network combinations of all phoneme combinations; classifying merged phoneme combinations into new partitions by merging phoneme combinations with the same multi-layer perceptron index if the multi-layer perceptron index having a minimum error for all phoneme combinations is searched; and re-training neural networks to update weights by learning data applicable to each partition, wherein the re-training procedure of individual neural networks calculates errors using the updated weights and repeats the re-training until a changing rate of errors becomes smaller than a first threshold value.

33. The method of claim 32, further comprising repeating the searching, classifying, and re-training operations until the changing rate of a total error sum is smaller than a second threshold value when the total error sum obtained from the number of present iterations is compared to an error sum obtained from the number of previous iterations.

34. The method of claim 33, wherein the partition segmentation of the phoneme combination is performed without relation to linguistic knowledge.
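Claim 32's inner loop re-trains one network until its error-changing rate drops below the first threshold; claim 33 wraps that in an outer cycle governed by the second threshold. The fragment below shows only the inner criterion, with plain gradient descent on a squared loss standing in for back-propagation; every name and parameter here is an assumption for illustration.

```python
def retrain_network(weight, data, lr=0.1, first_threshold=1e-8):
    """Repeat weight updates, recomputing the error with the updated weight,
    until the change in error falls below the first threshold (claim 32)."""
    prev_err = float("inf")
    while True:
        err = sum((x - weight) ** 2 for x in data)   # error with updated weights
        if abs(prev_err - err) < first_threshold:    # error-changing rate test
            return weight, err
        grad = sum(-2.0 * (x - weight) for x in data)
        weight -= lr * grad / len(data)              # weight update step
        prev_err = err

w, e = retrain_network(0.0, [0.2, 0.4, 0.6])
print(round(w, 3), round(e, 6))  # converges near the partition mean, 0.4
```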
Patents Cited by This Patent (13)
Kitazoe, Tetsuro; Kim, Sung-Ill; Ichiki, Tomoyuki, Acoustic speech recognition method and system using stereo vision neural networks with competition and cooperation.
Karaali, Orhan (Rolling Meadows, IL); Corrigan, Gerald Edward (Chicago, IL); Gerson, Ira Alan (Schaumburg, IL), Method and apparatus for converting text into audible signals using a neural network.
Bergstrom, Chad Scott; Garrison, Sidney Clarence, III (deceased), Method and apparatus for encoding speech using neural network technology for speech classification.
Stork, David G. (Stanford, CA); Wolff, Gregory J. (Mountain View, CA), Neural network acoustic and visual speech recognition system training method and apparatus.