Ingegneria Sismica

Evaluation, Selection, and Deep Adaptation of General-Purpose Large Models for Power Industry Applications

Author(s): Ke Shi¹, Jing Niu¹, Weixiang Qiao¹, Xing Zhang²
¹ Power Dispatching Control Center of Guizhou Power Grid Co., Ltd., Guizhou, China
² Power Dispatching Control Center of Zunyi Power Supply Bureau of Guizhou Power Grid Co., Ltd., Guizhou, China
Shi, Ke, et al. “Evaluation, Selection, and Deep Adaptation of General-Purpose Large Models for Power Industry Applications.” Ingegneria Sismica, Volume 43, Issue 2: 1-20, doi:10.65102/is20261026.

Abstract

This study builds an empirical research framework for evaluating, selecting, and adapting general-purpose large models in two power-industry application scenarios: intelligent substation inspection and transmission corridor visualisation. The benchmark contains 18,642 retained multimodal evidence records comprising visible images, thermal frames, OCR strings, equipment metadata, corridor attributes, rule clauses, and historical ticket text. Six anonymised models were evaluated under fixed data splits, prompt templates, inference limits, and scoring scripts. The evaluation targeted power-service judgement: object localisation, risk inference, rule-based evidence, unsupervised alarm control, and robustness to field perturbations. After a weighted-score screening of the candidates, the selected model was adapted with retrieval evidence, LoRA tuning, visual-grounding calibration, and a safety verifier. The adapted Power-GM obtained the best comprehensive scores: 89.0% for visual anchoring, 86.0% for risk judgement, 87.0% for rule obedience, 88.0% for hallucination suppression, and 84.0% for robustness. On eight of the selected tasks, it surpassed the strongest open multimodal baseline by 9.9 to 20.3 percentage points and the closed multimodal baseline by 3.3 to 8.5 percentage points. The best response-surface region was LoRA rank 48 with retrieval top-k 6, where the Power-Biz score remained near 89.0% within the latency target. Ablation showed that retrieval improved rule compliance, LoRA strengthened task reasoning, grounding calibration reduced object-region mismatch, and the safety verifier reduced hallucinated risk statements. The conclusion is limited to the two tested scenes, fixed model labels, fixed task definitions, fixed scoring scripts, retained test records, and the evidence types in the benchmark corpus.

Keywords
Large power grid model; Empirical verification; Model choice; Deep adaptation; Substation intelligent inspection; Transmission corridor visualisation.
