Ingegneria Sismica

EdgeDistill: A Knowledge Distillation Approach for Deploying Large Language Models on Resource-Constrained Edge Devices in Industrial IoT

Author(s): Changan Chen1, Yan Ming1
1School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing, 400064, China
Chen, Changan, and Ming, Yan. “EdgeDistill: A Knowledge Distillation Approach for Deploying Large Language Models on Resource-Constrained Edge Devices in Industrial IoT.” Ingegneria Sismica Volume 43 Issue 2: 1-17, doi:10.65102/is2026876.

Abstract

The deployment of large language models (LLMs) on resource-constrained edge devices in Industrial Internet of Things (IIoT) scenarios faces a fundamental mismatch between the enormous computational and memory demands of LLMs and the limited hardware capabilities of edge platforms. Existing model compression techniques, such as uniform quantization and unstructured pruning, often cause substantial performance degradation on domain-specific industrial tasks such as predictive maintenance, anomaly detection, and fault diagnosis. To overcome these limitations, this paper presents EdgeDistill, a task-adaptive knowledge distillation framework that efficiently transfers domain-specific knowledge from a large teacher LLM to a compact student model tailored for IIoT edge deployment. First, an Industrial Semantic Alignment Distillation (ISAD) module is proposed, which employs a dual-granularity alignment strategy that jointly distills token-level logit distributions and sentence-level semantic representations, ensuring that the student model faithfully retains both fine-grained industrial terminology and global contextual understanding. Second, a Frequency-Aware Layer Selection (FALS) mechanism is introduced, which dynamically identifies and prioritizes the most informative intermediate layers of the teacher model for knowledge transfer based on spectral analysis of feature activation patterns, maximizing distillation efficiency while reducing computational overhead. Third, a Hardware-Aware Adaptive Quantization-Distillation (HAQD) co-optimization module is designed, which performs mixed-precision quantization and knowledge distillation jointly within a unified training pipeline, so that the student model is compressed and knowledge-enhanced simultaneously while satisfying the memory and latency constraints of the target edge hardware. Finally, a Domain-Calibrated Evaluation Protocol (DCEP) is established, which introduces a comprehensive set of IIoT-specific metrics, including task accuracy, inference latency, energy consumption, and domain terminology fidelity, to holistically evaluate edge-deployed language models. Experimental results on three IIoT datasets show that EdgeDistill achieves 96.2% of the teacher model's performance while compressing the model by 51.1× (from 13.5 GB to 264 MB) and reducing inference latency by 24.7×, enabling real-time processing on edge devices such as the NVIDIA Jetson Nano and Raspberry Pi 4.
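The abstract does not give the loss formulation of the ISAD module, so the following PyTorch sketch only illustrates one plausible way the dual-granularity alignment could be realized: a temperature-scaled KL term over token-level logits combined with a cosine-similarity term over mean-pooled sentence representations. The function and parameter names (isad_distillation_loss, proj, temperature, alpha) are hypothetical, and the sketch assumes the teacher and student share a tokenizer and vocabulary; it is not the authors' implementation.

import torch
import torch.nn.functional as F

def isad_distillation_loss(student_logits, teacher_logits,
                           student_hidden, teacher_hidden,
                           attention_mask, proj,
                           temperature=2.0, alpha=0.5):
    """Illustrative dual-granularity distillation loss (hypothetical).

    Token level: KL divergence between temperature-softened teacher and
    student next-token distributions.
    Sentence level: cosine distance between mean-pooled hidden states,
    with a learned projection `proj` bridging the student/teacher
    hidden-size gap.
    """
    # --- token-level logit distillation ---
    t = temperature
    token_kl = F.kl_div(
        F.log_softmax(student_logits / t, dim=-1),
        F.softmax(teacher_logits / t, dim=-1),
        reduction="batchmean",
    ) * (t * t)

    # --- sentence-level semantic alignment ---
    mask = attention_mask.unsqueeze(-1).float()          # (batch, seq, 1)
    s_sent = (student_hidden * mask).sum(1) / mask.sum(1).clamp(min=1e-6)
    t_sent = (teacher_hidden * mask).sum(1) / mask.sum(1).clamp(min=1e-6)
    s_sent = proj(s_sent)                                 # student dim -> teacher dim
    sent_loss = 1.0 - F.cosine_similarity(s_sent, t_sent, dim=-1).mean()

    # weighted combination of the two granularities
    return alpha * token_kl + (1.0 - alpha) * sent_loss

In practice `proj` would be a small trainable layer (e.g. torch.nn.Linear from the student hidden size to the teacher hidden size), and the weighting `alpha` would be tuned per task; both are assumptions made here for illustration.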

Keywords
Large Language Models; Knowledge Distillation; Edge Computing; Industrial IoT; Model Compression; Resource-Constrained Deployment
