What are the current mainstream approaches to applying deep reinforcement learning in robotics? - Zhihu

Related Highlights

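Reinforcement learning is a branch of machine learning. Unlike supervised learning, in reinforcement learning an agent learns by trial and error through continual interaction with its environment, and its goal is to maximize cumulative reward.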

小米技术 (Xiaomi Technology) • Article
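The trial-and-error loop described in this highlight can be sketched in a few lines. This is a minimal illustration, not code from the quoted article: the five-cell corridor environment, the function names (`step`, `choose_action`, `train`, `greedy_return`), and all hyperparameters are assumptions chosen for the example, and tabular Q-learning stands in for whichever algorithm the article has in mind.

```python
import random

N_STATES = 5        # corridor cells 0..4; reaching cell 4 ends the episode
MOVES = (-1, +1)    # action 0 steps left, action 1 steps right
GAMMA = 0.9         # discount factor applied to future reward


def step(state, action):
    """Hypothetical toy environment: return (next_state, reward, done)."""
    next_state = min(max(state + MOVES[action], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy choice; ties between the two actions are broken at random."""
    if rng.random() < epsilon or q_row[0] == q_row[1]:
        return rng.randrange(2)
    return 0 if q_row[0] > q_row[1] else 1


def train(episodes=500, alpha=0.5, epsilon=0.2, seed=0):
    """Learn action values by interacting with the environment (tabular Q-learning)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = choose_action(q[state], epsilon, rng)
            next_state, reward, done = step(state, action)
            # No bootstrapping from a terminal state.
            bootstrap = 0.0 if done else max(q[next_state])
            q[state][action] += alpha * (reward + GAMMA * bootstrap - q[state][action])
            state = next_state
    return q


def greedy_return(q, max_steps=100):
    """Discounted cumulative reward of the greedy policy, starting from cell 0."""
    state, done, total, discount = 0, False, 0.0, 1.0
    for _ in range(max_steps):  # cap guards against a policy that never reaches the goal
        if done:
            break
        action = 1 if q[state][1] > q[state][0] else 0
        state, reward, done = step(state, action)
        total += discount * reward
        discount *= GAMMA
    return total


if __name__ == "__main__":
    q_table = train()
    print("greedy discounted return:", greedy_return(q_table))
```

With these settings the learned greedy policy steps right in every cell, so the discounted return from cell 0 is the goal reward discounted over three intermediate steps.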

You can do it by learning how much reward certain states or actions can bring (“value” learning), or by simply knowing which strategies tend on the whole to do better than which others (“policy” learning).

Brian Christian • The Alignment Problem

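We can classify the available RL algorithms by whether the agent has access to a model of the environment. Knowing the model lets the agent know the state-transition probability matrix and the future rewards in advance.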

[Major Survey] Deep Reinforcement Learning for Robotic Manipulation - Zhihu
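To make the model-based case concrete: when the transition probabilities and rewards are known in advance, the agent can plan without any interaction. The sketch below runs value iteration on a hypothetical five-cell corridor in which each move succeeds with probability 0.8 and otherwise leaves the agent in place. The environment, the slip probability, and the names `build_model`, `q_value`, `value_iteration`, and `greedy_policy` are assumptions made for illustration and do not come from the survey.

```python
GAMMA = 0.9          # discount factor
N_STATES = 5         # corridor cells 0..4
GOAL = N_STATES - 1  # terminal cell; its value stays 0
MOVES = (-1, +1)     # action 0 steps left, action 1 steps right
SLIP = 0.2           # assumed probability that a move fails and the agent stays put


def build_model():
    """Known model: (state, action) -> list of (probability, next_state, reward)."""
    model = {}
    for s in range(GOAL):  # the goal is terminal, so it has no outgoing transitions
        for a, move in enumerate(MOVES):
            target = min(max(s + move, 0), GOAL)
            reward = 1.0 if target == GOAL else 0.0
            model[(s, a)] = [(1.0 - SLIP, target, reward), (SLIP, s, 0.0)]
    return model


def q_value(model, values, s, a):
    """Expected one-step reward plus discounted value of the successor state."""
    return sum(p * (r + GAMMA * values[s2]) for p, s2, r in model[(s, a)])


def value_iteration(model, tol=1e-10):
    """Sweep the Bellman optimality update in place until values stop changing."""
    values = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(GOAL):
            best = max(q_value(model, values, s, a) for a in range(len(MOVES)))
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


def greedy_policy(model, values):
    """Pick, in each non-terminal cell, the action with the highest expected value."""
    return [
        max(range(len(MOVES)), key=lambda a: q_value(model, values, s, a))
        for s in range(GOAL)
    ]


if __name__ == "__main__":
    mdp = build_model()
    v = value_iteration(mdp)
    print("values:", [round(x, 4) for x in v])
    print("policy:", greedy_policy(mdp, v))
```

A model-free method would instead have to estimate these values from sampled transitions, because the entries of `model` would not be available to it.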

Third, we ideally want to be learning not just after the fact but as we go along.