Multiscript Language Technologies and Digital Inclusion: A Sociotechnical Study of Kazakh Script Conversion

Aigerim Sadykova; Nurlan Beketov; Madina Tursynbayeva

doi:10.63646/jtis.

Published 2025-09-30

Aigerim Sadykova*

Department of Information Systems, M. Auezov South Kazakhstan University, Shymkent, Kazakhstan
a.sadykova@auezov.edu.kz

Nurlan Beketov

Department of Computer Science, Korkyt Ata Kyzylorda University, Kyzylorda, Kazakhstan

Madina Tursynbayeva

Department of Sociology and Social Work, Shakarim University of Semey, Semey, Kazakhstan

Abstract

Kazakh is a major Turkic language written and read through Arabic-based, Cyrillic-based, and Latin-based scripts across different communities, regions, archives, and digital platforms. This multiscript condition is often treated as a narrow technical problem of transliteration accuracy, yet it also shapes who can search for public information, access education, preserve family records, participate in e-government, and maintain linguistic identity in data-driven societies. This paper develops a sociotechnical study of Kazakh script conversion by connecting neural conversion methods, loanword-aware prompting, corpus governance, and digital inclusion. Building on a secondary analysis of recent Kazakh multiscript conversion benchmarks, the study reinterprets character error rate (CER) and word error rate (WER) as proxies for accessibility friction, institutional reliability, and cultural continuity. The analysis shows that prompt-constrained Transformer conversion substantially reduces word-level friction across six conversion directions but also reveals that model accuracy alone is insufficient for inclusive deployment. Script conversion systems affect people through interface design, standardization choices, metadata practices, education policies, data rights, and community trust. The paper contributes a multilayer framework that links script ecology, linguistic resources, conversion services, access settings, governance arrangements, and inclusion outcomes. It further proposes design principles for transparent, auditable, and community-sensitive Kazakh language technologies. The findings suggest that multiscript conversion should be governed not merely as an automation service but as digital public infrastructure for linguistic equity.

Keywords: Kazakh; multiscript conversion; digital inclusion; sociotechnical systems; language technology; loanword prompts; Transformer

This work is licensed under a Creative Commons Attribution 4.0 International License.

How to Cite

Sadykova, A., Beketov, N., & Tursynbayeva, M. (2025). Multiscript Language Technologies and Digital Inclusion: A Sociotechnical Study of Kazakh Script Conversion. Journal of Technology Innovation and Society, 3(3), 47-67. https://doi.org/10.63646/jtis.
