Evaluating the physical fitness of vocational college students is hampered by strong subjectivity, mutual coupling among multi-dimensional indicators, and the difficulty of directly relating heterogeneous data. To address these problems, this paper constructs a closed-loop intelligent evaluation system based on multimodal data modeling. The system simultaneously collects visual skeleton streams, wearable inertial measurement unit (IMU) signals, and real-time heart rate, and builds a spatiotemporal graph attention network (ST-GAT) to model the topological relationships between human biomechanical joints and multi-source sensors. Unlike traditional methods that output a single score, the system uses a multi-task feature decoupling strategy that decomposes the underlying features into three dimensions (physical fitness, technical skill, and psychological state) for joint evaluation. Experiments on a dataset of 6800 samples show that the model achieves a prediction accuracy of 91.6%, a real-time response time of 1.8 seconds, and a system stability of 88.7%. The method overcomes the limitations of manual evaluation and provides methodological support for the digital transformation of physical education and for individualized intervention.
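As an illustration of the two ideas named above, graph attention over joint and sensor nodes followed by separate heads for the three evaluation dimensions, the following is a minimal sketch and not the implementation used in this paper. It assumes a toy four-node graph, a single attention pass with no temporal dimension and no learned projection, and arbitrary hand-picked weights; all node names, feature values, and function names are hypothetical.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def softmax(xs):
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def graph_attention(features, neighbors, attn_vec):
    """Simplified single-head graph attention: each node aggregates its
    neighbours' features, weighted by softmax-normalised attention scores."""
    out = {}
    for i, nbrs in neighbors.items():
        # Score each neighbour j from the concatenation [h_i || h_j].
        scores = [
            leaky_relu(sum(a * v for a, v in zip(attn_vec, features[i] + features[j])))
            for j in nbrs
        ]
        weights = softmax(scores)
        dim = len(features[i])
        out[i] = [
            sum(w * features[j][d] for w, j in zip(weights, nbrs))
            for d in range(dim)
        ]
    return out


def task_heads(shared, heads):
    """One linear head per evaluation dimension on a shared feature vector."""
    return {name: sum(w * x for w, x in zip(ws, shared)) for name, ws in heads.items()}


# Hypothetical nodes: two skeleton joints plus two sensor nodes.
features = {
    "hip": [0.2, 0.9],
    "knee": [0.5, 0.4],
    "imu_thigh": [0.7, 0.1],
    "heart_rate": [0.3, 0.6],
}
# Assumed topology; every node also attends to itself.
neighbors = {
    "hip": ["hip", "knee", "heart_rate"],
    "knee": ["knee", "hip", "imu_thigh"],
    "imu_thigh": ["imu_thigh", "knee"],
    "heart_rate": ["heart_rate", "hip"],
}
attn_vec = [0.5, -0.3, 0.8, 0.1]  # arbitrary, not learned

attended = graph_attention(features, neighbors, attn_vec)

# Mean-pool node features into one shared representation.
shared = [sum(vec[d] for vec in attended.values()) / len(attended) for d in range(2)]

# Arbitrary head weights for the three dimensions named in the abstract.
heads = {
    "physical": [1.0, 0.0],
    "technical": [0.0, 1.0],
    "psychological": [0.5, 0.5],
}
scores = task_heads(shared, heads)
print(scores)
```

In a trained model the attention vector and head weights would be learned jointly, and the attention pass would be applied across time steps as well as across nodes.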