A Study on Intelligent Evaluation Methods for English Writing Proficiency Based on Semantic Feature Extraction

Liu, Pingping

doi:10.65102/is2026758

Research article

Ingegneria Sismica

Volume 43 Issue 2
Pages: 1-20

Author(s): Pingping Liu¹

¹School of Education, Quanzhou Vocational and Technical University, Quanzhou 362000, Fujian, China

Published: 30/04/2026

Cite

Liu, Pingping. “A Study on Intelligent Evaluation Methods for English Writing Proficiency Based on Semantic Feature Extraction.” Ingegneria Sismica Volume 43 Issue 2: 1-20, doi:10.65102/is2026758.

https://doi.org/10.65102/is2026758

Abstract

To address insufficient scoring granularity, weak cross-task stability, and unclear feedback evidence in university English writing instruction and intelligent grading, this paper proposes an intelligent evaluation method for English writing proficiency based on semantic feature extraction. First, a multi-source writing evaluation sample space is constructed from ASAP++, ICLE++, DREsS, and PERSUADE 1.0, and a unified mapping of composite scores, fine-grained traits, and rubric rules is established. Next, a prompt-aware semantic feature extraction and multi-view joint scoring model is built, which incorporates the global semantics of the prompt and essay, sentence-level coverage and local coherence, paragraph-level organizational relations, and language support features into the scoring framework. Consistency constraints enable joint prediction of the total score and the scores on five trait dimensions. Finally, experiments are conducted under three evaluation settings: within-prompt, leave-one-prompt-out, and rubric-based. The method attains a quadratic weighted kappa (QWK) of 0.862 and a mean absolute error (MAE) of 0.348 on ASAP++, a composite-score QWK of 0.821 and a trait-average QWK of 0.793 on ICLE++, and a Macro-F1 of 0.734 and a QWK of 0.796 on DREsS, outperforming the baseline methods in every case. Low-resource experiments further show that with only 20% of the training data the model still maintains a QWK of 0.713, indicating efficient use of training samples. Case studies and error analysis indicate that the method can reliably distinguish typical problems such as inadequate response to the prompt, disjointed text development, and weak language support, and therefore has potential for integration into human-review workflows.
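
For readers unfamiliar with the agreement metrics reported above, the following is a minimal sketch of how QWK and MAE are conventionally computed for integer essay scores. It is not taken from the paper: the function names, the score range arguments, and the toy score lists are illustrative assumptions, and the paper's own evaluation code may differ in detail.

```python
from collections import Counter


def quadratic_weighted_kappa(human, model, min_score, max_score):
    """Standard QWK between two integer score lists on [min_score, max_score].

    Hypothetical helper for illustration; not the paper's implementation.
    """
    if len(human) != len(model) or not human:
        raise ValueError("score lists must be non-empty and of equal length")
    n_levels = max_score - min_score + 1
    if n_levels < 2:
        raise ValueError("the score scale needs at least two levels")

    n = len(human)
    observed = Counter(zip(human, model))  # joint counts of (human, model) pairs
    human_hist = Counter(human)            # marginal counts for the human rater
    model_hist = Counter(model)            # marginal counts for the model

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(min_score, max_score + 1):
        for j in range(min_score, max_score + 1):
            # Quadratic penalty: larger disagreements cost disproportionately more.
            weight = (i - j) ** 2 / (n_levels - 1) ** 2
            weighted_observed += weight * observed[(i, j)]
            # Expected count if the two raters scored independently.
            weighted_expected += weight * human_hist[i] * model_hist[j] / n

    if weighted_expected == 0:
        # Both raters gave one identical constant score; treated here as full agreement.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


def mean_absolute_error(human, model):
    """Average absolute difference between paired scores."""
    if len(human) != len(model) or not human:
        raise ValueError("score lists must be non-empty and of equal length")
    return sum(abs(h - m) for h, m in zip(human, model)) / len(human)


if __name__ == "__main__":
    # Toy scores on a 1-4 scale, invented purely for illustration.
    human_scores = [1, 2, 3, 4, 2, 3]
    model_scores = [1, 2, 3, 3, 2, 4]
    print(quadratic_weighted_kappa(human_scores, model_scores, 1, 4))
    print(mean_absolute_error(human_scores, model_scores))
```

QWK equals 1 for perfect agreement, is near 0 for chance-level agreement, and becomes negative when the model systematically disagrees with the human rater. This is why it is usually reported alongside MAE, which measures only the average size of the error.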

Keywords
English writing proficiency assessment; automated essay scoring; semantic feature extraction; multi-task joint scoring; cross-task generalization