As cross-border digital communication and the demand for multilingual information processing continue to grow, modeling ASEAN English multilingual corpora and identifying their differences has become an important topic in intelligent translation research. This paper constructs a technical framework that integrates corpus collection, language identification, sentence-level alignment, subword segmentation, Transformer modeling, and intelligent comparative analysis. The framework introduces multi-head self-attention, a difference scoring function, vector representation learning, and feature fusion mechanisms, and uses a joint loss to optimize translation generation and difference discrimination together. Experiments show that the proposed model achieves a BLEU score of 39.63, a TER of 0.347, a semantic similarity of 0.879, and a difference identification accuracy of 92.3%; compared with the Transformer baseline, BLEU increases by 7.3% and accuracy improves by 3.6 percentage points. The method effectively reveals translation differences between ASEAN English and Thai, Vietnamese, Indonesian, and Malay, and has practical significance for computational research on regional English variants and for intelligent translation analysis.
Summary: This study addresses the intelligent comparison of ASEAN English multilingual corpora and constructs a neural machine translation framework that integrates Transformer semantic modeling, difference scoring, feature fusion, and joint optimization. In experiments on four types of corpora, the model reaches a BLEU score of 39.63, lowers TER to 0.347, and identifies differences with 92.3% accuracy, revealing cross-language differences in vocabulary, structure, and semantics.