To address three problems in integrated traditional Chinese and Western medicine (reliance on clinician experience when adjusting treatment plans, inadequate modeling of individual differences in patient response, and the difficulty of continuously optimizing combined interventions), this study proposes an adaptive optimization model based on reinforcement learning. The model integrates Western medicine clinical indicators, traditional Chinese medicine (TCM) syndrome characteristics, historical treatment trajectories, and stage-wise patient feedback into a state-action-reward framework, and uses a deep Q-network (DQN) to dynamically update the combined treatment strategy. Sequential treatment decision samples were constructed from anonymized clinical records and time-series follow-up data, and the model was evaluated on indicator improvement, syndrome adjustment, synergy gain, Q-value convergence, patient satisfaction, and treatment interruption. In the model group, mean fasting blood glucose decreased by 1.47 mmol/L, mean glycosylated hemoglobin decreased by 0.83%, the total syndrome score decreased by 9.2 points, the synergy gain index reached 0.34 in the sixth week, the mean satisfaction score was 4.47, and the cumulative interruption rate in the sixth week was held to 14.4%. These results suggest that reinforcement learning can provide computational support for sequential decision-making and joint strategy updating in integrated traditional Chinese and Western medicine, and that it offers a new technical approach to the intelligent optimization of clinical intervention pathways.
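
The state-action-reward framing can be illustrated with a minimal sketch of one Q-learning update. This is not the model evaluated in the study: a linear Q-function stands in for the deep Q-network so that the example needs only the standard library, and the names `Transition`, `build_state`, and `LinearQ`, the feature layout, the discount factor, and the learning rate are all assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Transition:
    """One decision step between two follow-up visits (hypothetical layout)."""
    state: List[float]       # features observed at the current visit
    action: int              # index of the combined treatment option chosen
    reward: float            # stage feedback, e.g. a weighted improvement score
    next_state: List[float]  # features observed at the next visit
    done: bool               # True if the trajectory ended (e.g. interruption)


def build_state(western: Sequence[float],
                tcm: Sequence[float],
                history: Sequence[float]) -> List[float]:
    """Concatenate the three feature groups into a single state vector."""
    return [*western, *tcm, *history]


class LinearQ:
    """Linear Q-function with one weight vector per action.

    Assumption: this replaces the deep Q-network purely to keep the
    sketch short; the temporal-difference target has the same form.
    """

    def __init__(self, n_features: int, n_actions: int) -> None:
        self.w = [[0.0] * n_features for _ in range(n_actions)]

    def q_values(self, state: Sequence[float]) -> List[float]:
        """Return Q(state, a) for every action a."""
        return [sum(wi * si for wi, si in zip(row, state)) for row in self.w]

    def update(self, t: Transition, gamma: float = 0.9, lr: float = 0.01) -> float:
        """Apply one Q-learning step and return the TD error."""
        # Bootstrapped target: r + gamma * max_a' Q(s', a'), or r at episode end.
        target = t.reward
        if not t.done:
            target += gamma * max(self.q_values(t.next_state))
        td_error = target - self.q_values(t.state)[t.action]
        # Gradient step on the weights of the action that was taken.
        row = self.w[t.action]
        for i, s in enumerate(t.state):
            row[i] += lr * td_error * s
        return td_error
```

In a full DQN the linear weights would be replaced by a neural network trained on minibatches from a replay buffer, typically with a separate target network; the sketch only shows how one treatment step maps onto a state, an action, a reward, and a bootstrapped target.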