Article Title

Real-time control for fuel-optimal Moon landing based on an interactive deep reinforcement learning algorithm


fuel-optimal landing problem, indirect methods, deep reinforcement learning, interactive network learning, real-time optimal control


This study proposes a real-time optimal control approach for the fuel-optimal Moon landing problem based on an interactive deep reinforcement learning algorithm. Because of remote communication restrictions and environmental uncertainties, Moon landing missions demand advanced landing control techniques that meet stringent requirements on real-time performance and autonomy. Deep reinforcement learning (DRL) algorithms have recently been developed for real-time optimal control, but they suffer from slow convergence and difficult reward function design. To address these problems, a DRL algorithm with an actor-indirect method architecture is developed to achieve optimal control of the Moon landing mission. In this DRL algorithm, an indirect method generates the optimal control actions from which the deep neural networks (DNNs) learn, while the trained DNNs provide good initial guesses for the indirect method, improving the efficiency of training data generation. After sufficient learning of the state-action relationship, the trained DNNs can approximate the optimal actions and steer the spacecraft to the target in real time. Additionally, a nonlinear feedback controller is developed to improve the terminal landing accuracy. Numerical simulations verify the effectiveness of the proposed DRL algorithm and demonstrate the performance of the developed optimal landing controller.
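
The interaction described in the abstract, where a solver produces training data and the learner in turn supplies initial guesses to the solver, can be illustrated with a toy sketch. This is not the paper's implementation and makes strong simplifying assumptions: a scalar root-finding problem stands in for the shooting problem of the indirect method, a nearest-neighbour lookup stands in for the trained DNNs, and every name below (`residual`, `solve_indirect`, `NearestNeighbourGuess`, `interactive_training`) is hypothetical.

```python
import random


def residual(x, s):
    # Toy stand-in for a shooting equation: for a given "state" s,
    # find the unknown x such that x**3 + x - s = 0.
    return x**3 + x - s


def solve_indirect(s, guess, tol=1e-10, max_iter=100):
    """Newton iteration started from `guess`; returns (solution, iterations)."""
    x = guess
    for k in range(max_iter):
        r = residual(x, s)
        if abs(r) < tol:
            return x, k
        x -= r / (3.0 * x**2 + 1.0)  # derivative of the residual, always >= 1
    raise RuntimeError("Newton iteration did not converge")


class NearestNeighbourGuess:
    """Assumed stand-in for the trained networks: returns the stored
    solution whose state is closest to the query state."""

    def __init__(self):
        self.samples = []  # list of (state, solution) pairs

    def predict(self, s):
        if not self.samples:
            return 0.0  # cold start: no learned guess is available yet
        return min(self.samples, key=lambda p: abs(p[0] - s))[1]

    def fit(self, new_samples):
        self.samples.extend(new_samples)


def interactive_training(rounds=4, batch=50, seed=0):
    """Alternate between data generation (solver) and learning (model)."""
    rng = random.Random(seed)
    model = NearestNeighbourGuess()
    mean_iters = []
    for _ in range(rounds):
        data, total = [], 0
        for _ in range(batch):
            s = rng.uniform(-50.0, 50.0)
            # The learner warm-starts the solver ...
            x, k = solve_indirect(s, model.predict(s))
            data.append((s, x))
            total += k
        # ... and the solver's solutions become new training data.
        model.fit(data)
        mean_iters.append(total / batch)
    return model, mean_iters


if __name__ == "__main__":
    _, mean_iters = interactive_training()
    print("mean Newton iterations per round:", mean_iters)
```

In this toy setting, the mean number of solver iterations per round drops once the learner has seen data, which mirrors the efficiency argument in the abstract; it says nothing about the actual landing dynamics, network architecture, or convergence behaviour reported in the article.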


Tsinghua University Press