Deep learning’s strengths in data mining make it highly valuable for intelligent education applications. This paper annotates ideological and political knowledge points with relational labels and designs a visual-language multimodal interaction architecture to interactively preprocess knowledge point features. We construct a Knowledge Tracking Model (TCKT) based on interactive feature mining, which uses interactive feature embedding and a Learning Behavior Simulation (LBS) module to track students’ ideological and political knowledge states and learning proficiency in depth. The annotation process produced a summary of 30 knowledge points. In knowledge tracking experiments, the TCKT model scored above 90% on all four evaluation metrics. Its predicted probabilities for exercise-concept interactions ranged from 0.61 to 0.69. Deep knowledge tracking achieved an 80.2% interpretability rate for students’ systematic learning of ideological and political education.