To address the static nature of teaching evaluation, the lag in evaluation results and the underuse of multi-source data in smart classrooms at colleges and universities, a dynamic teaching evaluation model driven by deep reinforcement learning was constructed. A dynamic evaluation index system was established around five elements: teachers' teaching behavior, students' classroom participation, the quality of teacher-student interaction, resource use and classroom feedback effect. Multi-modal data, including classroom video, voice, platform logs, interactive text and classroom tests, were integrated to build the classroom state representation and to model its time series. On this basis, an Actor-Critic structure and a proximal policy optimization (PPO) mechanism were introduced to produce classroom evaluations dynamically and to update them from feedback. Experiments were carried out on 8 courses and 96 classroom records, yielding a total of 41,280 state samples. The results show that the accuracy, recall and F1 score of the proposed model reach 0.923, 0.909 and 0.915 respectively, the mean absolute error and root mean square error are 0.071 and 0.118 respectively, and the average response delay for a single time window is 0.11 s. The stability index and dynamic adaptability index reach 0.94 and 0.92 respectively, exceeding those of baseline models such as AHP, SVM, LSTM and Transformer. The research shows that the model can better support continuous perception, dynamic evaluation and strategy optimization of the smart classroom teaching process.
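The abstract names PPO as the optimization mechanism but does not give its loss. As an illustration only, the standard PPO clipped surrogate objective can be sketched in plain Python as follows. The function name `ppo_clipped_objective`, its argument names and the clipping value `clip_eps=0.2` are assumptions made for this sketch and are not taken from the paper; the model's actual loss, network structure and hyperparameters may differ.

```python
import math


def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates, e.g. derived from the critic.
    clip_eps: clipping range; 0.2 is a common choice, assumed here.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

When the two policies agree, the ratio is 1 and the objective reduces to the mean advantage. Once the ratio leaves the clipping range in the direction that would increase the objective, the clipped term takes over, which limits how far a single update can move the policy.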