To support intelligent evaluation of spoken English fluency in green energy materials engineering, an automatic evaluation system based on semantic-prosodic collaborative modeling is proposed. The framework integrates speech transcription, term-level semantic representation, prosodic sequence encoding, graded scoring, and error-diagnosis feedback to jointly capture term usage, discourse progression, pausing, stress, and speech-rate variation. The corpus contains 4860 samples from 162 learners and 38 experts, each with synchronized audio, transcribed text, term labels, and prosodic annotations. Experiments show that the proposed model, SSPF-Net, achieves 93.40% classification accuracy, 92.87% Macro-F1, and a quadratic weighted kappa (QWK) of 0.918, outperforming four baseline models while balancing scoring accuracy and deployment efficiency. Its average inference latency is 84 ms, which meets the requirements of edge-device deployment. Performance remains stable, with only 3.6% scoring variance under technical-term and noisy-speech conditions. The feedback module further localizes semantic bias and prosodic imbalance, providing interpretable evaluation for computer-aided spoken English training in engineering scenarios.
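For readers unfamiliar with the reported agreement metrics, the following is a minimal sketch of how Macro-F1 and QWK are conventionally computed for ordinal grade labels. It is not the authors' evaluation code. The function names `macro_f1` and `quadratic_weighted_kappa` are hypothetical, and the three-level grading scale in the usage lines is an assumption, since the abstract does not state how many fluency grades are used.

```python
from typing import Sequence


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int], n_classes: int) -> float:
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    f1_scores = []
    for c in range(n_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # A class with no true or predicted samples contributes F1 = 0 here.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / n_classes


def quadratic_weighted_kappa(
    y_true: Sequence[int], y_pred: Sequence[int], n_classes: int
) -> float:
    """Cohen's kappa with quadratic disagreement weights (hypothetical helper)."""
    n = len(y_true)
    # Observed confusion matrix and the marginal histograms of each rater.
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]

    num = 0.0  # weighted observed disagreement
    den = 0.0  # weighted disagreement expected by chance
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            num += weight * observed[i][j]
            den += weight * hist_true[i] * hist_pred[j] / n
    # If both raters use a single grade, chance disagreement is zero.
    return 1.0 if den == 0 else 1.0 - num / den


# Toy labels on an assumed 3-level scale (0 = low, 2 = high); not corpus data.
reference = [0, 1, 2, 2]
predicted = [0, 1, 2, 2]
print(macro_f1(reference, predicted, 3))
print(quadratic_weighted_kappa(reference, predicted, 3))
```

Because the quadratic weights grow with the distance between grades, QWK penalizes a prediction two levels away from the expert grade more heavily than an adjacent-level error, which plain accuracy and Macro-F1 do not distinguish.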