Rapidly distinguishing the normal state from different severity levels of leakage is a central task in natural gas pipeline leak detection. Single-sensor methods each have limitations: acoustic emission (AE) signals are highly sensitive to rapid changes in jet-flow-induced vibration, but they are easily contaminated by external noise or limited by a high threshold; infrared thermography provides direct thermal images, but it is overly sensitive to slight heat accumulation and to changes in background temperature. To overcome these deficiencies, this paper introduces an ELM-LPP multimodal fusion model that unites AE parameters and infrared thermograms in a shared latent space. The model combines four components: an extreme learning machine (ELM), locality-preserving projection (LPP), adaptive feature weighting, and joint graph regularization. The leakage experiments were carried out on a carbon-steel pipeline loop that uses air as a safe substitute medium and is equipped with an AE sensor and an infrared thermal imager; three leakage levels were set by valve opening. A total of 3,485 valid AE samples and 774 infrared images were obtained, and temporal alignment yielded 774 AE-infrared pairs for fusion learning. The ELM-LPP model was evaluated with confusion matrices, class-wise F1 scores, multiclass accuracy, and binary-subtask accuracy. With multimodal input, the F1 scores for normal, light leakage, moderate leakage, and heavy leakage were 97.03%, 97.33%, 95.16%, and 97.19%, respectively; the multiclass accuracy was 96.90% and the macro F1 score was 96.68%. Compared with AE-only and infrared-only inputs, multimodal fusion reduced leakage-state misclassification and improved discrimination of leakage severity. In the binary subtasks, the proposed method achieved accuracy (ACC) values of 1.0000 for normal vs. light leakage, 0.9123 for light vs. moderate leakage, and 0.9357 for moderate vs. heavy leakage, outperforming AWDR, HGSCCA, MvADL, and OLFG.
These results indicate that acoustic-infrared data fusion provides more stable support for early-stage pipeline leakage detection than either data source alone.
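As a quick consistency check on the reported metrics, the macro F1 score can be recomputed from the four class-wise F1 scores. The sketch below assumes the standard definition of macro F1 as the unweighted mean of the per-class F1 scores; the abstract does not state the averaging rule, and the names `f1_by_class` and `macro_f1` are illustrative only.

```python
# Class-wise F1 scores (in percent) reported for multimodal input.
f1_by_class = {
    "normal": 97.03,
    "light leakage": 97.33,
    "moderate leakage": 95.16,
    "heavy leakage": 97.19,
}


def macro_f1(scores):
    """Unweighted mean of per-class F1 scores (assumed definition)."""
    return sum(scores.values()) / len(scores)


macro = macro_f1(f1_by_class)
print(f"Macro F1 = {macro:.2f}%")
```

Under this assumption, the mean of the four class-wise scores agrees with the reported macro F1 of 96.68% to two decimal places.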