Chinese Space Science and Technology ›› 2024, Vol. 44 ›› Issue (5): 75-82. doi: 10.16708/j.cnki.1000-758X.2024.0075

Improvement and application of MCTS in turn-based orbital games

ZHENG Xinyu, ZHANG Yi, ZHOU Jie, TANG Peijia, PENG Shengren, DANG Zhaohui

  1. Qian Xuesen Laboratory of Space Technology, China Academy of Space Technology, Beijing 100094, China
  2. School of Astronautics, Northwestern Polytechnical University, Xi'an 710072, China
  • Published: 2024-10-25 Online: 2024-10-21

Abstract: In turn-based orbital pursuit-evasion games, the delay in sensing an opponent's orbit change makes differential-game approaches difficult to apply, while deep reinforcement learning algorithms remain risky for engineering use because their decisions are hard to interpret. A predictive-value-accumulate Monte Carlo tree search (PVA-MCTS) algorithm is proposed for the turn-based orbital pursuit-evasion game. Exploiting the predictability of spacecraft orbital motion, the algorithm predicts and accumulates decision values during the game, which addresses the sparse rewards and long time spans of the game and improves learning efficiency. PVA-MCTS is applied to the turn-based orbital pursuit-evasion game and compared with the standard Monte Carlo tree search (MCTS) algorithm. The results show that PVA-MCTS reduces the pursuit time by about 27.6% when used by the pursuer and increases the escape time by about 6.8% when used by the evader. The algorithm is therefore practical for orbital-game applications such as approaching non-cooperative targets and collision avoidance.

Key words: pursuit-evasion of spacecraft, turn-based pursuit-evasion game, Monte Carlo tree search, sensing delay of orbit change, predictive value accumulate
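
The abstract does not spell out how PVA-MCTS predicts and accumulates values, so the following is only a minimal sketch of one possible reading, not the authors' algorithm. It assumes a toy one-dimensional game in place of orbital dynamics, a passive evader, and a hypothetical shaping term `predicted_value` that is summed along each simulated path and added to the sparse capture reward. All names and constants (`step`, `predicted_value`, `search`, `HORIZON`, `CAPTURE`) are illustrative assumptions.

```python
import math
import random

ACTIONS = (-1, 0, 1)   # toy "impulses": change in closing speed per turn (assumption)
HORIZON = 6            # number of turns the search looks ahead (assumption)
CAPTURE = 1            # capture distance in toy units (assumption)


def step(state, action):
    """Apply the pursuer's impulse, then coast one turn (deterministic toy dynamics)."""
    gap, vel = state                 # gap to the evader, closing speed
    vel += action
    return (gap - vel, vel)


def predicted_value(state):
    """Hypothetical shaping term: penalise the gap predicted after one more coasting turn."""
    gap, vel = state
    return -abs(gap - vel) / 10.0


class Node:
    def __init__(self, state, depth):
        self.state, self.depth = state, depth
        self.children = {}           # action -> Node
        self.visits = 0
        self.total = 0.0


def is_terminal(state, depth):
    return abs(state[0]) <= CAPTURE or depth >= HORIZON


def ucb(parent, child, c=1.4):
    """Standard UCT score: mean return plus exploration bonus."""
    return child.total / child.visits + c * math.sqrt(math.log(parent.visits) / child.visits)


def search(root_state, iterations, accumulate, rng):
    """Return the most-visited root action; `accumulate` toggles the summed predicted values."""
    root = Node(root_state, 0)
    for _ in range(iterations):
        node, path, acc = root, [root], 0.0
        # Selection and expansion, accumulating the predicted value of each visited state.
        while not is_terminal(node.state, node.depth):
            untried = [a for a in ACTIONS if a not in node.children]
            if untried:
                a = rng.choice(untried)
                node.children[a] = Node(step(node.state, a), node.depth + 1)
            else:
                a = max(ACTIONS, key=lambda act: ucb(node, node.children[act]))
            expanded = bool(untried)
            node = node.children[a]
            path.append(node)
            acc += predicted_value(node.state)
            if expanded:
                break
        # Random rollout to a terminal state, still accumulating predicted values.
        state, depth = node.state, node.depth
        while not is_terminal(state, depth):
            state, depth = step(state, rng.choice(ACTIONS)), depth + 1
            acc += predicted_value(state)
        reward = 1.0 if abs(state[0]) <= CAPTURE else 0.0   # sparse capture reward
        if accumulate:
            reward += acc                                   # dense accumulated term
        # Backpropagation of the same return along the visited path.
        for n in path:
            n.visits += 1
            n.total += reward
    return max(root.children, key=lambda a: root.children[a].visits)


if __name__ == "__main__":
    rng = random.Random(0)
    print("accumulated:", search((2, 0), 500, True, rng))
    print("plain:      ", search((2, 0), 500, False, rng))
```

In this toy setting, the plain variant scores a capture on the first turn and a capture on a later turn identically, so the terminal reward alone carries no information about time to capture. The accumulated term penalises every extra turn spent at a distance, which is one way a denser signal could arise from predictable dynamics.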