
The Alignment Problem

we can use these snapshots to watch society change.
Brian Christian • The Alignment Problem
tension at the heart of curiosity, almost a tug-of-war: As we explore an environment and our available behaviors within it—whether that’s the microcosm of an Atari game, the real-world great outdoors, or the nuances of human society—we simultaneously delight in the things that surprise us while at the same time we become harder and harder to surpri...
Brian Christian • The Alignment Problem
There is a broad assumption underlying many machine-learning models that the model itself will not change the reality it’s modeling. In almost all cases, this is false.
Brian Christian • The Alignment Problem
Caplan noted that while there are no legal penalties for ignoring such a tattoo, there may be legal problems if the doctors let a patient die without having their official DNR paperwork. As he puts it: “The safer course is to do something.”
Brian Christian • The Alignment Problem
Rather, we maintain our interest in things that seem to defy our expectations, that behave unpredictably, that dare us to try to understand what will happen next.
Brian Christian • The Alignment Problem
“If you are to act in an environment where [myopic] decision-making works,” says Lieder, “people will learn to rely on that system more and more.”
Brian Christian • The Alignment Problem
Perhaps the most impressive part of this expertise is our ability to infer others’ beliefs, but the foundation is inferring their intentions.
Brian Christian • The Alignment Problem
Modeling the world as it is is one thing. But as soon as you begin using that model, you are changing the world, in ways large and small.
Brian Christian • The Alignment Problem
This means that, in general, as our expectation fluctuates, we get differences between our successive expectations, each of which is a learning opportunity; Sutton called these temporal differences, or TD errors.
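To make the idea of a temporal difference concrete, here is a minimal sketch, not taken from the book. It assumes a short sequence of value estimates (the "successive expectations") and the rewards observed at each step; the names `td_errors`, `td0_update`, `alpha`, and `gamma` are illustrative choices, and the update shown is the standard tabular TD(0) rule rather than anything the quoted passage specifies.

```python
def td_errors(values, rewards, gamma=1.0):
    """Return one TD error per step: reward + gamma * V(next) - V(current).

    values[t] is the expectation held at step t; the value after the
    final step is assumed to be 0 (the episode has ended).
    """
    errors = []
    for t, reward in enumerate(rewards):
        next_value = values[t + 1] if t + 1 < len(values) else 0.0
        errors.append(reward + gamma * next_value - values[t])
    return errors


def td0_update(values, rewards, alpha=0.5, gamma=1.0):
    """Move each expectation a fraction alpha of the way toward its TD target."""
    errors = td_errors(values, rewards, gamma)
    return [v + alpha * e for v, e in zip(values, errors)]


# Hypothetical episode: expectations at three steps, reward only at the end.
values = [0.5, 0.5, 0.9]
rewards = [0.0, 0.0, 1.0]

errors = td_errors(values, rewards)
updated = td0_update(values, rewards)
```

In this made-up episode the first error is zero because the expectation did not move between steps one and two; the later steps produce nonzero errors, and each of those is the kind of learning opportunity the passage describes.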