In a stochastic economic environment, building a robust monetary policy architecture requires abandoning rigid, parameterised heuristics in favour of an adaptable, data-driven control paradigm. This study develops a new approach that improves economic stability by periodically adjusting the central bank’s open market operation target in response to multiple types of structural external shocks. The macroeconomic environment is modelled as a continuous state-space Markov Decision Process, and Proximal Policy Optimisation is used to reconcile the intertemporally inconsistent objectives of price stability, output gap minimisation and interest-rate smoothing. The simulation environment is rigorously calibrated on comprehensive quarterly macroeconomic data and incorporates hidden nonlinearities and endogenous interactions that are not readily identifiable through the linear-quadratic approximations common to traditional DSGE models. Extensive numerical experiments subject the autonomous agent to a range of stochastic demand, supply and financial-friction disturbances, and show that it outperforms both Taylor-type reaction functions and optimal linear-control strategies in minimising intertemporal loss. The autonomous policy agent anticipates and adjusts the nominal trajectory ahead of time, which accelerates the mean reversion of the economic system and dampens cyclical fluctuations. Overall, the algorithm-based approach offers a route to synthesising countercyclical monetary policies that remain dynamically robust under significant macroeconomic instability.
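As an illustration of the three objectives named above, the following minimal sketch shows one common way to write a per-period quadratic loss and a Taylor-type baseline rule. It is not taken from this study: the function names, the loss weights, the 2% inflation target and the 2% neutral real rate are all assumptions chosen for illustration, and the 0.5 response coefficients follow the textbook Taylor (1993) form. The abstract does not state the reward specification actually used by the agent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LossWeights:
    """Hypothetical relative weights on the three policy objectives."""
    inflation: float = 1.0    # price stability
    output_gap: float = 0.5   # output gap minimisation
    smoothing: float = 0.1    # interest-rate smoothing


def period_loss(inflation, target, output_gap, rate, prev_rate,
                weights=LossWeights()):
    """Quadratic one-period loss (all arguments in percentage points)."""
    return (weights.inflation * (inflation - target) ** 2
            + weights.output_gap * output_gap ** 2
            + weights.smoothing * (rate - prev_rate) ** 2)


def reward(inflation, target, output_gap, rate, prev_rate,
           weights=LossWeights()):
    """Reward for a reinforcement-learning agent: the negative of the loss."""
    return -period_loss(inflation, target, output_gap, rate, prev_rate, weights)


def taylor_rule(inflation, output_gap, target=2.0, neutral_real_rate=2.0):
    """Textbook Taylor-type reaction function, used here only as a baseline."""
    return (neutral_real_rate + inflation
            + 0.5 * (inflation - target)
            + 0.5 * output_gap)


if __name__ == "__main__":
    # Inflation one point above target with a one-point positive output gap.
    baseline_rate = taylor_rule(inflation=3.0, output_gap=1.0)
    print(baseline_rate)
    print(reward(3.0, 2.0, 1.0, rate=baseline_rate, prev_rate=4.0))
```

Under this formulation, the intertemporal loss compared across policies would be the discounted sum of `period_loss` over the simulation horizon, so a learned policy and the Taylor-type baseline can be scored on the same criterion.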