Automating penetration testing remains a challenge because it demands extensive expertise and experience from security professionals and usually involves a tedious manual testing process. In this study, with the help of an artificial intelligence model, a Markov decision process is constructed to describe the penetration testing process. The objective is to maximize the cumulative reward and to obtain an optimal strategy that maximizes the expected return. The state, action, and reward functions are then defined to complete the penetration test model. A DQN algorithm based on deep reinforcement learning is proposed to obtain the optimal strategy by learning an accurate Q-function through interaction with the environment. In addition, a Dueling DQN algorithm, referred to as empirical campaigning, is proposed to handle the complex state and action spaces more effectively. To verify the penetration success rate of the proposed algorithm, an attacker carries out an attack on a target in the experimental environment. The attack measurement index of the penetration attack carried out in this study is comparatively high, reaching 27.348, which exceeds the attack index values of the other two widely used algorithms. A comparison of the optimal environmental reward values of the algorithms in different scenarios shows that the Dueling_DDQN algorithm converges in fewer training rounds and reaches the desired reward value after 50 training rounds. This indicates that the proposed algorithm can optimize and make decisions on automated penetration test paths.
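As a rough illustration of the two ideas the abstract names, the sketch below shows the standard dueling aggregation of a state value and per-action advantages into Q-values, together with a double-DQN style learning target. The abstract does not give the network architecture or the update rule, so this is a minimal sketch under assumptions: reading "Dueling_DDQN" as dueling plus Double DQN is an assumption, and the function names `dueling_q_values` and `double_dqn_target`, the discount factor, and the toy numbers are all hypothetical.

```python
from typing import List, Sequence


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Standard dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).

    Assumption: the paper's network uses this common mean-subtracted form.
    """
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + adv - mean_advantage for adv in advantages]


def double_dqn_target(
    reward: float,
    gamma: float,
    done: bool,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
) -> float:
    """Double DQN target: the online network selects the next action,
    the target network evaluates it.

    Assumption: "DDQN" in the abstract refers to Double DQN.
    """
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    # Toy values only; they are not taken from the paper's experiments.
    q_next = dueling_q_values(2.0, [1.0, -1.0, 0.0])
    print(q_next)  # [3.0, 1.0, 2.0]
    print(double_dqn_target(1.0, 0.9, False, q_next, [2.5, 4.0, 1.0]))
```

In this toy case the online estimates prefer the first action, so the target bootstraps from the target network's value for that action rather than from the target network's own maximum, which is the usual motivation for the double-DQN form.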