GAIL-Based Autonomous Driving: Imitating Human Driving Behavior with Safety Constraints
Abstract
Designing a decision-making system for autonomous driving remains a major challenge. Traditional approaches rely on hand-crafted rules or reinforcement-learning reward functions, but they generalize poorly and make safety redundancy hard to guarantee. Although SHAIL has shown promise, such methods typically require complex architectures and large amounts of labeled data. Building on GAIL, this paper proposes a simple yet effective method that uses a single policy network to directly imitate human driving behavior, eliminating the need for hierarchical controllers. Urban driving, however, involves dynamic multi-vehicle interactions, latent differences in driving style, and sudden safety-critical events, so a safety-constraint mechanism must be introduced on top of GAIL. This paper therefore augments the framework with safety constraints and, through hierarchical policy design, reward augmentation, and latent-variable separation, generates human-like and safe driving behavior in urban environments, ensuring collision-free driving in dynamic scenarios. The method is evaluated on the Interaction dataset and shown to offer advantages in both imitation accuracy and safety metrics.
Keywords
Full Text: PDF
References
[1] Ho, J., & Ermon, S. (2016). Generative adversarial imitation learning. Advances in Neural Information Processing Systems, 29, 4565-4572.
[2] Pinto, L., et al. (2017). Asymmetric actor-critic for safe reinforcement learning. Proceedings of the 1st Annual Conference on Robot Learning, 1-17.
[3] Zhan, E., et al. (2020). Interaction dataset: A decentralized protocol for autonomous driving under partially observable environments. IEEE Transactions on Intelligent Transportation Systems, 22(4), 2077-2090.
[4] Li, Y., et al. (2021). SHAIL: Safety-aware hierarchical adversarial imitation learning for autonomous driving. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 11245-11256.
[5] International Organization for Standardization (ISO). (2022). Road vehicles—Safety of the intended functionality (SOTIF): ISO 21448. Geneva: ISO.