In smart classrooms, learners engage with the instructor, peers, technological systems, and learning devices through multiple sensory modalities, generating massive multichannel data. This paper combines intelligent classroom systems with multi-source learning data analysis to establish a processing architecture for educational information in university-level smart teaching environments. To validate the framework empirically, a cooperative learning engagement assessment model for intelligent classrooms is first designed and then applied in real classroom teaching. Second, the SDM model is improved, yielding ALSLI, a method that introduces self-attention into short- and long-term user interest learning. The results show that multi-source data gathered from cooperative learning sessions within intelligent classrooms (video recordings, audio, system logs, biometric signals, and self-reported information) serve as characteristic indicators of student engagement. Evaluation based on these multimodal data comprehensively reflects students' learning engagement, enabling teachers to apply targeted interventions in collaborative learning. Moreover, compared with alternative approaches, the ALSLI model achieves superior performance on interaction rate and retrieval accuracy indicators.
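The core idea behind ALSLI, applying self-attention to a user's interaction sequence to pool it into an interest vector, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the toy embeddings, and the mean-pooling step are assumptions, and a full model would combine separate short-term and long-term branches with learned projection matrices.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_interest(item_embs):
    """Pool a sequence of item embeddings (T, d) into a single
    interest vector (d,) via scaled dot-product self-attention.
    Hypothetical sketch: queries, keys, and values all equal the
    raw embeddings (no learned projections)."""
    d = item_embs.shape[-1]
    scores = item_embs @ item_embs.T / np.sqrt(d)  # (T, T) pairwise scores
    weights = softmax(scores, axis=-1)             # each row sums to 1
    attended = weights @ item_embs                 # (T, d) context-mixed items
    return attended.mean(axis=0)                   # (d,) pooled interest

# Toy example: 4 recent interactions, 8-dimensional embeddings.
rng = np.random.default_rng(0)
seq = rng.normal(size=(4, 8))
interest = self_attention_interest(seq)
print(interest.shape)  # (8,)
```

In a full short/long-term model such as SDM, one such attention-pooled vector would summarize the recent session and another the long-range history, with a gating or fusion layer combining the two before retrieval.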