Dynamic Optimization of Monetary Policy and Suppression of Macroeconomic Fluctuations through Reinforcement Learning

Wang, Shizhen

doi:10.65102/is2026815

Research article

Ingegneria Sismica

Volume 43 Issue 2
Pages: 1-20

Author(s): Wang, Shizhen¹

¹School of Finance and Accounting, Henan Logistics Vocational College, Zhengzhou, Henan, 450012, China.

Published: 30/04/2026

Cite

Wang, Shizhen. “Dynamic Optimization of Monetary Policy and Suppression of Macroeconomic Fluctuations through Reinforcement Learning.” Ingegneria Sismica Volume 43 Issue 2: 1-20, doi:10.65102/is2026815.

https://doi.org/10.65102/is2026815

Abstract

In a stochastic economic environment, building a robust monetary policy architecture requires replacing rigid, parameterised heuristics with an adaptable, data-driven control paradigm. This study develops a new approach that improves economic stability by periodically adjusting the central bank’s open market operation target in response to multiple types of structural, externally induced shocks. The macroeconomic environment is modelled as a continuous state-space Markov Decision Process, and Proximal Policy Optimization is used to reconcile the intertemporally conflicting objectives of price stability, output gap minimisation, and interest-rate smoothing. The simulation environment is rigorously calibrated on comprehensive quarterly macroeconomic data and incorporates hidden nonlinearities and endogenous interactions that are not readily identifiable through the linear-quadratic approximations common to traditional DSGE models. Extensive numerical experiments expose the autonomous agent to a range of stochastic demand, supply, and financial-friction disturbances and show that it outperforms both Taylor-type reaction functions and optimal linear-control strategies in minimising the intertemporal loss. The policy agent shows a strong capacity to anticipate disturbances and adjust the nominal trajectory in advance, which accelerates the mean reversion of the economic system and dampens cyclical fluctuations. Overall, the study offers an algorithmic route to synthesising countercyclical monetary policies that remain dynamically robust under significant macroeconomic instability.

Keywords
Deep Reinforcement Learning; Monetary Policy Optimization; Macroeconomic Volatility Suppression; Proximal Policy Optimization; Non-linear New Keynesian Framework
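
The objective summarised in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation. It assumes a standard quadratic per-period loss over the inflation gap, the output gap, and the change in the policy rate, and it uses a textbook Taylor-type rule as the baseline. All weights, coefficients, and function names (`LossWeights`, `period_loss`, `discounted_loss`, `taylor_rule`) are hypothetical choices made for illustration, since the abstract does not state them.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LossWeights:
    """Hypothetical weights on the three objectives named in the abstract."""
    inflation: float = 1.0   # price stability (assumed value)
    output: float = 0.5      # output gap minimisation (assumed value)
    smoothing: float = 0.1   # interest-rate smoothing (assumed value)


def period_loss(inflation_gap: float, output_gap: float, rate_change: float,
                w: LossWeights = LossWeights()) -> float:
    """Quadratic one-period loss; an RL agent would receive its negative as reward."""
    return (w.inflation * inflation_gap ** 2
            + w.output * output_gap ** 2
            + w.smoothing * rate_change ** 2)


def discounted_loss(losses: Sequence[float], beta: float = 0.99) -> float:
    """Intertemporal loss: sum of per-period losses discounted by beta**t."""
    return sum((beta ** t) * loss for t, loss in enumerate(losses))


def taylor_rule(inflation: float, output_gap: float,
                neutral_rate: float = 2.0, target: float = 2.0,
                phi_pi: float = 1.5, phi_y: float = 0.5) -> float:
    """Textbook Taylor-type reaction function used here as a baseline (assumed form)."""
    return neutral_rate + target + phi_pi * (inflation - target) + phi_y * output_gap


if __name__ == "__main__":
    # Inflation one point above target with a positive output gap.
    rate = taylor_rule(inflation=3.0, output_gap=1.0)
    loss = period_loss(inflation_gap=1.0, output_gap=1.0, rate_change=rate - 4.0)
    print(f"baseline rate: {rate:.2f}, one-period loss: {loss:.3f}")
```

In a setup of this kind, a learned policy and the Taylor-type baseline would each be rolled out in the simulated economy, and the resulting values of `discounted_loss` would be compared across shock scenarios.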