Speech Recognition. Speech Understanding. Speech Synthesis. Speech Coding. Face Recognition. Animation. HCI Processor Chip.
Abstract (translated from the Korean original)
1. Spoken language understanding
(1) Continuous speech recognition - 60,000-word vocabulary; word recognition rate: 93.4% (speaker-independent), 94.9% (speaker-adapted)
(2) Conversational speech understanding - 3,000 intentions; dialogue success rate: 91%
(3) Speaker identification - text-independent speaker identification: 100 speakers, EER (Equal Error Rate): 2.19%
(4) Speech recognizer for the HCI Processor - 10,000-word vocabulary, continuous speech recognition, recognition rate: 95%
2. Speech synthesis and signal processing
(1) Conversational speech synthesizer for robots - 500 MB, MOS score: 4.1
(2) Speech synthesizer for the HCI Processor - 10 MB, MOS score: 3.5
(3) Narrowband/wideband speech coder - 4 kbit/s: equivalent to G.723.1 at 5.3 kbit/s; 16 kbit/s: equivalent to G.722.1 at 48 kbit/s
(4) Automatic labeler - 95.17% accuracy within a tolerance of 20 ms on either side
3. Human recognition and synthesis
(1) Face detection and recognition
 . Face detection - 4 frames per second, detection rate: 98%
 . Face recognition - large scale: 500 persons [95%]; standardization: ANMRR [0.270], 240 bits (descriptor size)
 . 3D Human Face Animation - 24 muscles, 44 action units, 10 mouth shapes
4. HCI processor chip
(1) Multimodal support
 . Speech recognition (10,000 words), speech synthesis (10 MB), speech coder (4k/16k), face recognition (10 persons)
 . Processor hardware implementation of about 2.2 million gates
(2) H/W engines within the HCI chip (FFT, HMM, Image Convolution)
(3) Dual-processor architecture using an ARM920T MCU core and a TeakLite DSP core
(4) Single chip (max. 792 MIPS), low power (300 mW), process (0.25 µm)
Abstract
** 1st Year
O Specifications
 o Baseline construction and problem definition for each core technology
O Details
 o Baseline construction of a large vocabulary continuous speech recognition (LVCSR) system
 o Development of a speaker verification algorithm
 o Preliminary study on spoken language understanding
 o Implementation of a narrowband speech codec (4 kbit/s)
 o Preliminary study on a wideband speech codec
 o Development of a face detection and verification system for office environments
 o Generation of a 3D standard face model
 o Multi-processor design methodology setup
** 2nd Year
O Specifications
 o Baseline construction of a preliminary system and core algorithms for processing speech, face images, and emotion
O Details
 o Development of 10K-word LVCSR
 o Development of speaker verification
 o Design of a spoken language understanding prototype
 o International standardization of the narrowband speech codec (ITU-T)
 o Building a wideband speech codec (16-32 kbit/s)
 o Emotion analysis and feature extraction
 o 3D human face modeling and facial expression
 o Fixed-point modeling
** 3rd Year
O Specifications
 o Spoken language understanding
O Details
 o 40K-word LVCSR (speed: 3 sec, accuracy: 95%)
 o Design of a spoken language understanding algorithm (vocabulary: 5,000 words)
 o Fixed-sentence speaker identification (100 persons, EER 1%)
 o Implementation of a speech synthesis module for generating dialog-type prosody
 o Implementation of a narrowband variable-bit-rate speech codec (1-8 kbit/s)
 o Implementation of a high-quality wideband speech codec (16 kbit/s)
 o Development of large-scale face recognition (500 persons, 95%)
 o Development of facial expression animation
 o Micro-architecture development and verification
** 4th Year
O Specifications
 o HCI Processor
O Details
 o 60K-word LVCSR (speed: 3 sec, accuracy: 95%)
 o Spoken language understanding (speed: 2 sec, accuracy: 90%)
 o Text-free speaker verification (100 persons, EER 1%)
 o Speech synthesis with a self-learning module
 o Optimization of the narrowband speech codec
 o Optimization of the wideband speech codec
 o Development of facial emotion detection
 o Development of a human animation system
 o Implementation of a visual user interface
 o Building a data acquisition and basic library ASIC
 o Micro-architecture setup and design/development/verification
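The speaker identification and verification figures above are reported as an Equal Error Rate (EER): the operating point at which the false-acceptance rate equals the false-rejection rate. As a rough illustration only, the sketch below estimates an EER from two lists of match scores. The function name `equal_error_rate`, the toy scores, and the convention that a higher score means a better match are assumptions made for this example, not details taken from the report.

```python
from typing import Sequence, Tuple


def equal_error_rate(genuine: Sequence[float],
                     impostor: Sequence[float]) -> Tuple[float, float]:
    """Return (eer, threshold) at the point where FAR and FRR are closest.

    Assumes a higher score means a stronger match, and that a trial is
    accepted when its score is >= the threshold (illustrative convention).
    """
    best_gap = float("inf")
    best = (1.0, 0.0)
    # Sweep every observed score as a candidate decision threshold.
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # impostors accepted
        frr = sum(s < t for s in genuine) / len(genuine)     # genuine speakers rejected
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap = gap
            best = ((far + frr) / 2.0, t)
    return best


# Toy scores, invented purely for illustration.
genuine_scores = [0.9, 0.8, 0.75, 0.6, 0.4]
impostor_scores = [0.7, 0.5, 0.3, 0.2, 0.1]

eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With real evaluation data the two error curves rarely cross exactly at an observed score, so averaging FAR and FRR at the closest point, as done here, is one common approximation. Interpolating between neighbouring thresholds is another.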