Automated Aircraft Recovery via Reinforcement Learning: Initial Experiments

Part of Advances in Neural Information Processing Systems 10 (NIPS 1997)



Jeffrey Monaco, David Ward, Andrew Barto


Initial experiments described here were directed toward using reinforcement learning (RL) to develop an automated recovery system (ARS) for high-agility aircraft. An ARS is an outer-loop flight-control system designed to bring an aircraft from a range of out-of-control states to straight-and-level flight in minimum time while satisfying physical and physiological constraints. Here we report on results for a simple version of the problem involving only single-axis (pitch) simulated recoveries. Through simulated control experience using a medium-fidelity aircraft simulation, the RL system approximates an optimal policy for pitch-stick inputs to produce minimum-time transitions to straight-and-level flight in unconstrained cases while avoiding ground-strike. The RL system was also able to adhere to a pilot-station acceleration constraint while executing simulated recoveries.
