Conference Paper

Using RDF Models to Create Knowledge Bases in the Kazakh Language: Comparison with Other Methods

Year 2023, Volume: 26, 633 - 640, 30.12.2023
https://doi.org/10.55549/epstem.1412449

Abstract

Information technologies are developing rapidly, the amount of information on the Internet is growing quickly, and finding the necessary information is becoming increasingly difficult. Keyword search does not return results that match the meaning of the information sought. Creating a technology for designing intelligent question answering systems in the Kazakh language, based on the representation, processing and extraction of knowledge, is therefore a pressing problem, since such a system can take into account the linguistic and semantic relationships between the text of the request and the text of the answer. This paper focuses on the integration of the Resource Description Framework (RDF) model, a semantic web technology, and provides a detailed evaluation of data mining techniques for Kazakh. It examines several methods of collecting Kazakh language data, such as web scraping, community collaboration and translation. It also explores the role of RDF models in organizing knowledge, connecting data points and adding semantic richness to datasets. The paper discusses linguistic features and challenges unique to the Kazakh language and emphasizes the need to address them with domain-specific data. It stresses that thorough cleaning, annotation and quality assurance are needed to guarantee the reliability and usability of the collected datasets. In the context of global communications and technology, the study highlights the importance of languages other than English and examines how semantic web technologies can improve data representation and knowledge retrieval. The study lays the groundwork for future initiatives to address the shortage of datasets for low-resource languages and to create semantic web technologies that support language diversity.
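
To make the role of RDF in organizing knowledge and connecting data points more concrete, the following is a minimal sketch of a triple store with Kazakh-language labels, written with the Python standard library only. It is not the authors' implementation: the `http://example.org/kz/` namespace, the `capitalOf` property, the sample facts, and all helper names are hypothetical and chosen purely for illustration. The serializer covers only the basic N-Triples escapes for backslash and double quote.

```python
from typing import Iterator, List, NamedTuple, Optional, Union

# Hypothetical namespace and property; not taken from the paper.
EX = "http://example.org/kz/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


class Literal(NamedTuple):
    """A plain literal with an optional language tag ('kk' is Kazakh)."""
    value: str
    lang: Optional[str] = None


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: Union[str, Literal]  # an IRI string or a Literal


# Illustrative sample data: one relation plus Kazakh labels.
GRAPH: List[Triple] = [
    Triple(EX + "Astana", EX + "capitalOf", EX + "Kazakhstan"),
    Triple(EX + "Astana", RDFS_LABEL, Literal("Астана", "kk")),
    Triple(EX + "Kazakhstan", RDFS_LABEL, Literal("Қазақстан", "kk")),
]


def match(graph, s=None, p=None, o=None) -> Iterator[Triple]:
    """Yield triples matching the pattern; None acts as a wildcard."""
    for t in graph:
        if ((s is None or t.subject == s)
                and (p is None or t.predicate == p)
                and (o is None or t.obj == o)):
            yield t


def to_ntriples(t: Triple) -> str:
    """Serialize one triple as an N-Triples line (basic escaping only)."""
    if isinstance(t.obj, Literal):
        escaped = t.obj.value.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"' + (f"@{t.obj.lang}" if t.obj.lang else "")
    else:
        obj = f"<{t.obj}>"
    return f"<{t.subject}> <{t.predicate}> {obj} ."


def label_of(graph, resource: str, lang: str = "kk") -> Optional[str]:
    """Return the label of a resource in the requested language, if any."""
    for t in match(graph, s=resource, p=RDFS_LABEL):
        if isinstance(t.obj, Literal) and t.obj.lang == lang:
            return t.obj.value
    return None


def capital_question(graph, country: str) -> Optional[str]:
    """Answer 'what is the capital of <country>?' with a Kazakh label."""
    for t in match(graph, p=EX + "capitalOf", o=country):
        return label_of(graph, t.subject)
    return None


if __name__ == "__main__":
    for triple in GRAPH:
        print(to_ntriples(triple))
    print(capital_question(GRAPH, EX + "Kazakhstan"))
```

The point of the sketch is that a question is answered by following a typed link between resources and then reading a language-tagged label, rather than by matching keywords in text. A production system would more likely rely on an established RDF library and a SPARQL endpoint than on a hand-written matcher.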

Details

Primary Language English
Subjects Environmental and Sustainable Processes
Journal Section Articles
Authors

Assel Mukanova

Gulnazym Abdıkalyk

Aizhan Nazyrova

Assem Dauletkalıyeva

Early Pub Date December 30, 2023
Publication Date December 30, 2023
Published in Issue Year 2023, Volume: 26

Cite

APA Mukanova, A., Abdıkalyk, G., Nazyrova, A., & Dauletkalıyeva, A. (2023). Using RDF Models to Create Knowledge Bases in the Kazakh Language: Comparison with Other Methods. The Eurasia Proceedings of Science Technology Engineering and Mathematics, 26, 633-640. https://doi.org/10.55549/epstem.1412449