One of the elements of technology that has become extremely critical within the field of education today is Deep learning. It has been especially used in the area of natural language processing, with some word-representation vectors playing a critical role. However, some of the low-resource language...
One of the elements of technology that has become extremely critical within the field of education today is Deep learning. It has been especially used in the area of natural language processing, with some word-representation vectors playing a critical role. However, some of the low-resource languages, such as Swahili, which is spoken in East and Central Africa, do not fall into this category. Natural Language Processing is a field of artificial intelligence where systems and computational algorithms are built that can automatically understand, analyze, manipulate, and potentially generate human language. After coming to discover that some African languages fail to have a proper representation within language processing, even going so far as to describe them as lower resource languages because of inadequate data for NLP, we decided to study the Swahili language. As it stands currently, language modeling using neural networks requires adequate data to guarantee quality word representation, which is important for natural language processing (NLP) tasks. Most African languages have no data for such processing. The main aim of this project is to recognize and focus on the classification of words in English, Swahili, and Korean with a particular emphasis on the low-resource Swahili language. Finally, we are going to create our own dataset and reprocess the data using Python Script, formulate the syllabic alphabet, and finally develop an English, Swahili, and Korean word analogy dataset.
One of the elements of technology that has become extremely critical within the field of education today is Deep learning. It has been especially used in the area of natural language processing, with some word-representation vectors playing a critical role. However, some of the low-resource languages, such as Swahili, which is spoken in East and Central Africa, do not fall into this category. Natural Language Processing is a field of artificial intelligence where systems and computational algorithms are built that can automatically understand, analyze, manipulate, and potentially generate human language. After coming to discover that some African languages fail to have a proper representation within language processing, even going so far as to describe them as lower resource languages because of inadequate data for NLP, we decided to study the Swahili language. As it stands currently, language modeling using neural networks requires adequate data to guarantee quality word representation, which is important for natural language processing (NLP) tasks. Most African languages have no data for such processing. The main aim of this project is to recognize and focus on the classification of words in English, Swahili, and Korean with a particular emphasis on the low-resource Swahili language. Finally, we are going to create our own dataset and reprocess the data using Python Script, formulate the syllabic alphabet, and finally develop an English, Swahili, and Korean word analogy dataset.
※ AI-Helper는 부적절한 답변을 할 수 있습니다.