Against the background of globalization, the need for intelligent assessment and personalized improvement of English reading comprehension, a core skill for cross-cultural communication, has become increasingly pressing. Traditional assessment approaches, which rely on manual scoring and fixed question banks, suffer from strong subjectivity, delayed feedback, and difficulty in capturing deep comprehension. Moreover, their one-size-fits-all design cannot meet personalized learning needs. Transformer models, with their self-attention mechanisms and ability to model global context, provide important technical support for building integrated assessment and recommendation systems. To address existing research gaps, this study synthesizes relevant theory and applies experimental design and data analysis methods to propose a Transformer-based Collaborative Attention (TBC) framework. The framework incorporates pre-trained models, constructs a multi-dimensional question bank, and builds a user behavior dataset. Experiments show that the model achieves an assessment accuracy of 92.7% and an F1 score of 0.892. On long-text tasks, its accuracy drops by only 3.2 percentage points. The personalized recommendation system analyzes user behavior through multimodal feature fusion and achieves a recommendation click-through rate of 78.3% and a matching accuracy of 89.4%. Compared with the control group, user scores increased by 18.7 points. Furthermore, the system correctly identifies knowledge gaps for 83% of participants.
This research validates the model's effectiveness in assessing English reading comprehension and in personalized recommendation, offering a new path toward personalized intelligent education. Future work may improve modeling for low-frequency users and expand multi-source corpora.