Abbeel’s behavior was complicated, but his goals were simple; within a matter of seconds, the IRL system picked up on the paramount importance of not hitting other cars, followed by not driving off the road, followed by keeping right if possible.
Brian Christian • The Alignment Problem
BabyAGI differs in that it explicitly plans out a sequence of actions. It then executes on the first one, and then uses the result of that to do another planning step and update its task list. Our intuition is that this enables it to execute better on more complex and involved tasks, by using the planning steps essentially as a state tracking syst...
Autonomous Agents & Agent Simulations
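The plan, execute, re-plan cycle described in the excerpt above can be sketched roughly as follows. This is a minimal illustration, not BabyAGI's actual code: `run_agent`, `toy_plan`, and `toy_execute` are hypothetical names, and the planner and executor are plain stand-in functions where a real system would presumably call a language model.

```python
from collections import deque
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]
# Hypothetical planner signature: (objective, history, pending tasks) -> new task list.
Planner = Callable[[str, History, List[str]], List[str]]
Executor = Callable[[str], str]


def run_agent(objective: str, plan: Planner, execute: Executor,
              max_steps: int = 5) -> History:
    """Plan a task list, run only the first task, then re-plan with the result."""
    tasks = deque(plan(objective, [], []))  # initial planning step
    history: History = []
    for _ in range(max_steps):
        if not tasks:
            break
        task = tasks.popleft()              # execute only the first planned task
        result = execute(task)
        history.append((task, result))
        # Re-plan: the task list doubles as the agent's tracked state.
        tasks = deque(plan(objective, history, list(tasks)))
    return history


def toy_plan(objective: str, history: History, pending: List[str]) -> List[str]:
    # Assumption: a fixed two-step plan; later calls simply keep what is pending.
    if not history:
        return ["research " + objective, "summarize " + objective]
    return pending


def toy_execute(task: str) -> str:
    # Assumption: a stub executor that only labels the task as done.
    return "done: " + task


history = run_agent("tea", toy_plan, toy_execute)
```

With these toy stand-ins, `history` ends up holding the two tasks in order, each paired with its result. A more realistic planner would use the accumulated results to add, drop, or reorder the pending tasks.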
The whole idea behind the Arcade Learning Environment—and the thrilling achievement of DQN—was that of a single algorithm, able to master dozens of completely different game environments from scratch, guided by nothing but the image on the screen and the in-game score.