
proxy necessarily also selects for noise, 2) Extremal, where selection for the
metric pushes the state distribution into a region where old relationships no
longer hold, 3) Causal, where an action on the part of the regulator causes the
collapse, and 4) Adversarial, where an agent with different goals than the
regulator causes the collapse. These varied forms often occur together, but
defining them individually is useful. In doing so, this paper introduces and
explains several subcategories which differ in important ways.
To formalize the intuitive description, we consider a system S with a set
of possible states s ∈ S. For the initial discussion we focus on a single actor,
the regulator, who influences the system by selecting a permissible region of
the state-space, A ⊆ S. For this discussion, we will use Goal to refer to the
true goal of the regulator, which is a mapping from states G(s) → R for s ∈ S.
Because regulators have incomplete knowledge, they cannot act based on G(s)
and instead act only on a proxy M(s) → R for s ∈ S.² For simplicity’s sake,
we will consider actions where the regulator chooses some threshold c and the
permissible states are defined such that s ∈ A if M(s) ≥ c. This creates a
selection pressure that allows the first two Goodhart-like effects to occur.³
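The threshold rule above can be sketched in a few lines. This is only an illustration under stated assumptions: the finite state space `S`, the functions `goal` and `proxy`, and the helper `permissible_region` are hypothetical names chosen here, not definitions from the paper.

```python
from typing import Callable, Iterable, List

State = float  # assumption: each state is represented by a single real number


def permissible_region(
    states: Iterable[State],
    metric: Callable[[State], float],
    c: float,
) -> List[State]:
    """Return A = {s in S : M(s) >= c}, the states the regulator permits."""
    return [s for s in states if metric(s) >= c]


def goal(s: State) -> float:
    """Hypothetical true goal G(s); the regulator cannot observe this."""
    return s


def proxy(s: State) -> float:
    """Hypothetical proxy M(s): the goal plus a state-dependent error."""
    return s + (0.5 if int(s) % 2 == 0 else -0.5)


S = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]   # toy state space (assumption)
A = permissible_region(S, proxy, c=2.5)
print(A)
```

The regulator only ever evaluates `proxy`; `goal` appears in the sketch solely so that the two can be compared on the permitted states.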
1 Regressional Goodhart
Regressional Goodhart - When selecting for a proxy measure, you select not
only for the true goal, but also for the difference between the proxy and
the goal. This is also known as “Tails come apart.” [5]
Simple Model:
\[ M = G + \mathrm{normal}(\mu, \sigma^2) \tag{1} \]
Due to the noise, a point with a large M value will likely have a large G value,
but also a large noise value. Thus, when M is large, you can expect G to be
predictably smaller than M. Despite the lack of bias, for large values of c the
values of G when M > c are expected to be higher than otherwise. While this
is the simplest Goodhart effect, it is also the most fundamental: it cannot be
avoided. No matter what measure is chosen for optimization, an inexact metric
necessarily leads to a divergence between the goal and the metric in the tail.
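A small simulation can make the regressional effect concrete. The sketch below assumes, purely for illustration, that G is standard normal and that the noise has mean zero (µ = 0) and standard deviation `sigma`; the function name `simulate`, the sample size, the threshold, and the seed are all hypothetical choices rather than values from the paper.

```python
import random
import statistics


def simulate(n: int = 100_000, c: float = 2.0, sigma: float = 1.0, seed: int = 0):
    """Draw (G, M) pairs with M = G + noise and compare means under M > c."""
    rng = random.Random(seed)
    # Assumption: the true goal G is standard normal across states.
    goals = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Assumption: unbiased noise, i.e. normal(0, sigma^2).
    metrics = [g + rng.gauss(0.0, sigma) for g in goals]

    # Keep only the states the threshold rule would select.
    selected = [(g, m) for g, m in zip(goals, metrics) if m > c]

    mean_g_all = statistics.fmean(goals)
    mean_g_selected = statistics.fmean([g for g, _ in selected])
    mean_m_selected = statistics.fmean([m for _, m in selected])
    return mean_g_all, mean_g_selected, mean_m_selected


mean_g_all, mean_g_selected, mean_m_selected = simulate()
print(f"mean G over all states:    {mean_g_all:.3f}")
print(f"mean G where M > c:        {mean_g_selected:.3f}")
print(f"mean M where M > c:        {mean_m_selected:.3f}")
```

Under these assumptions the selected states should show a mean G above the population mean but below the mean M of the same states, which is the gap between metric and goal in the tail described above.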
2 Extremal Goodhart
Extremal Goodhart - Worlds in which the proxy takes an extreme value
may be very different from the ordinary worlds in which the relationship
² In general, a mapping from s → R is a measure, and if used for decision-making, it
is known as a metric. The current presentation assumes a single-dimensional case. Use of
multiple metrics and restrictions follows similar dynamics, but for discussing Goodhart effects
the single-dimensional case is ideal.
³ This restriction of the available states is one form of selection pressure. There are other
forms of selection pressure which can apply, but these are unnecessary for presenting the
basic dynamics. For example, we often find that the states are chosen probabilistically and
the distribution can be influenced to prefer certain regions. One important such case is
evolutionary selection, where the most likely states are generated based on a set of states
selected in a previous generation.