Existing game-theoretic decision-making models for cyber deception focus primarily on the selection of spatial strategies and ignore the optimal timing of defense, which can undermine the execution of a defense strategy. To address this gap, this paper presents a method for selecting deception strategies based on a multi-stage Flipit game. First, based on an analysis of cyber deception attack and defense, we propose the concept of a moving deception attack surface and use the Flipit game model to analyze the characteristics of attack-defense interaction in deception scenarios. We then use the Flipit game model to build a single-stage spatial-temporal decision-making model for deception. Next, we introduce a discount factor and transition probabilities into the single-stage game model to construct a multi-stage cyber deception model. We give the utility function of the multi-stage game and design an algorithm based on Proximal Policy Optimization, a deep reinforcement learning method, to compute the defender's optimal spatial-temporal strategies. Finally, an application example validates the effectiveness of the model and the advantages of the proposed algorithm in generating multi-stage cyber deception strategies.
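As a rough illustration of how a discount factor and transition probabilities can turn single-stage utilities into a multi-stage utility, the following minimal sketch evaluates the expected discounted utility of a fixed strategy over a finite horizon. It is not the utility function defined in the paper: the function name, the two-state example, and all numeric values are hypothetical and chosen only to show the recursion.

```python
def expected_discounted_utility(stage_utility, transition, gamma, horizon):
    """Expected discounted utility per starting state under a fixed strategy.

    stage_utility[s]  -- single-stage utility received in state s (assumed given)
    transition[s][t]  -- probability of moving from state s to state t
    gamma             -- discount factor, 0 <= gamma <= 1
    horizon           -- number of stages to accumulate
    """
    n = len(stage_utility)
    value = [0.0] * n  # utility of a zero-stage game is zero
    for _ in range(horizon):
        # Backward recursion: V(s) = u(s) + gamma * sum_t P(s, t) * V(t)
        value = [
            stage_utility[s]
            + gamma * sum(transition[s][t] * value[t] for t in range(n))
            for s in range(n)
        ]
    return value


# Hypothetical two-state example: the system alternates deterministically
# between a state with utility 1 and a state with utility 0.
utilities = [1.0, 0.0]
transitions = [[0.0, 1.0], [1.0, 0.0]]
values = expected_discounted_utility(utilities, transitions, gamma=0.5, horizon=3)
print(values)
```

Starting in the first state, the three stages contribute 1, 0, and 0.25 after discounting, so later stages count for progressively less. A learning algorithm such as Proximal Policy Optimization would search over strategies rather than evaluate a single fixed one as done here.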