Causation does not explain contextuality
Sally Shrapnel and Fabio Costa
Centre for Engineered Quantum Systems, School of Mathematics and Physics, The University of Queensland,
St Lucia, QLD 4072, Australia
15th May 2018
Realist interpretations of quantum mechanics presuppose the existence of
elements of reality that are independent of the actions used to reveal them.
Such a view is challenged by several no-go theorems that show quantum correla-
tions cannot be explained by non-contextual ontological models, where physical
properties are assumed to exist prior to and independently of the act of meas-
urement. However, all such contextuality proofs assume a traditional notion
of causal structure, where causal influence flows from past to future accord-
ing to ordinary dynamical laws. This leaves open the question of whether the
apparent contextuality of quantum mechanics is simply the signature of some
exotic causal structure, where the future might affect the past or distant sys-
tems might get correlated due to non-local constraints. Here we show that
quantum predictions require a deeper form of contextuality: even allowing for
arbitrary causal structure, no model can explain quantum correlations from
non-contextual ontological properties of the world, be they initial states, dy-
namical laws, or global constraints.
Introduction
The appeal of an operational physical theory is that it makes as few unwarranted assump-
tions about nature as possible. One simply assigns probabilities to experimental outcomes,
conditioned on the list of experimental procedures required to realise these outcomes.
Ideally, such operational theories are minimal: procedures that cannot be statistically dis-
criminated are given the same representation in the theory. Quantum mechanics is an
example of such a minimal operational theory: all the statistically significant information
about the preparation procedure is contained in the quantum state, and the probabil-
ity of an event (labelled by a Positive Operator Valued Measure (POVM) element) does
not depend on any other information regarding the manner in which the measurement
was achieved (such as the full POVM). However, one of the most debated questions in
the foundations of the theory is whether one can go beyond this statistical level and also
provide an ontological description of some actual state of affairs that occurs during each
run of an experiment. That is, a statement about the world that tells us what is responsible
for the observed experimental outcomes.
The task of providing such an ontological model for quantum theory has proven to
be exceedingly difficult. A plethora of no-go theorems exists that describe the various
natural assumptions one must forgo in order to produce an ontological model that accords
Sally Shrapnel: s.shrapnel@uq.edu.au
Fabio Costa: f.costa@uq.edu.au
Accepted in Quantum 2018-05-04, click title to verify 1
arXiv:1708.00137v2 [quant-ph] 14 May 2018
with experiment. One such caveat is non-contextuality. Ultimately an a priori assumption,
non-contextual theories posit the existence of physical properties that do not depend on
the way they are measured. There is a large literature discussing the various ways one
may wish to cash out this notion more precisely. Broadly speaking, non-contextuality
no-go theorems fall into two distinct categories. Kochen-Specker style proofs show that
quantum measurements cannot be regarded as deterministically uncovering pre-existing,
or ontic, properties of systems [1–3]. Spekkens style proofs, on the other hand, show that
one cannot explain quantum statistics via ontological properties that mirror the context-independence
seen at the operational level [4–8]. While both approaches are well justified
and have led to interesting and relevant results, our own definition of non-contextuality
is more closely related to the latter. This particular view of non-contextuality can more
broadly be seen as an analogue of the no fine-tuning argument from causal modelling [9], an
analogue of Leibniz’s principle of the Identity of Indiscernibles [4,6], and a methodological
assumption akin to Occam’s razor.
Non-contextuality no-go theorems are not merely of foundational interest but can also
serve as security proofs for a range of simple cryptographic scenarios [10, 11], can herald
a quantum advantage for computation [12], and also for state discrimination [8]. Such
results, however, require the assumption of a fixed background causal structure; at the very
minimum, a single causal arrow from preparation to measurement. This leaves open the
question of whether one can produce a non-contextual ontological model by allowing for a
suitably exotic causal structure. Some authors attempt to explain quantum correlations by
positing backwards-in-time causal influences [13–19], while others claim it is the existence
of non-local constraints that does the explanatory work [20,21]. The rationale in both cases
is that contextuality could emerge naturally in such models: physical properties might
well be “real” and “counterfactually definite”, but depend on future or distant measurements
because of some physically motivated—although radically novel—causal influence. Such
proposals do not fit neatly within the classical causal modelling framework, and so are not
ruled out by recent work in this direction [9,22], nor by any of the existing no-go theorems.
In this paper, we characterise a new ontological models framework to prove that even
if one allows for arbitrary causal structure, ontological models of quantum experiments
are necessarily contextual. Crucially, what is contextual is not just the traditional notion
of “state”, but any supposedly objective feature of the theory, such as a dynamical law
or boundary condition. Our finding suggests that any model that posits unusual causal
relations in the hope of saving “reality” will necessarily be contextual. Finally, this work
also represents a possible approach to how we ought to think of the generalised quantum
processes of recent work [23–37]. It is clear that any ontological reading of such processes
will have to contend with the spectre of contextuality.
The paper is organised as follows. In section 1 we present the traditional ontological
models framework and clarify the rationale behind retrocausal explanations of quantum
statistics. In section 2 we introduce and justify the four primitive elements required to
define our operational model: local regions, local controllables, outcomes and an environ-
ment. In section 3 we define the three classes of operationally indistinguishable elements:
events, instruments and processes. In Section 4 we characterise instrument and process
non-contextuality according to these equivalence classes, and provide a generalised frame-
work for a non-contextual ontological model. As this is the conceptual heart of our result, in
Section 5 we clarify the scope and applicability of this framework via three examples. Using
standard quantum theory and results from previous work [29, 37], in section 6 we charac-
terise an operational model that accords with the experimental predictions of quantum
theory. Section 7 puts these elements together to prove that one cannot produce an on-
tological model that is both process and instrument non-contextual and accords with the
predictions of quantum theory. In Section 8 we consider the constraints imposed on on-
tological models when one only assumes instrument non-contextuality. We finish with a
discussion.
1 An introduction to ontological models and retrocausal approaches.
The ontological models framework assumes that systems possess well defined properties at
all times [18,38,39]. The starting point is the very general claim that all experiments can
be modelled operationally as sets of preparations, followed by transformations, followed by
measurements, all performed upon some physical system. The set of all possible prepar-
ations, transformations and measurements is regarded as capturing the entire possibility
space of any experiment and can be associated with the operational predictions of a par-
ticular theory. For example, an experiment can involve choices of possible preparation
settings (labelled by the random variable $P$) and choices of possible measurement settings
($M$) with associated outcomes ($a$).$^{1}$ An operational model then predicts probabilities for
outcomes for all possible combinations of preparations and measurements:
$$\forall a, M, P: \quad p(a|M, P). \tag{1}$$
Such probabilistic predictions should coincide with the operational predictions of the
theory in question. For example, in the case of quantum theory each preparation choice
is modelled as a density operator ($\rho_P$) on a Hilbert space associated to a quantum system
($\mathcal{H}_A$). Similarly, each measurement choice $M$ is associated with a positive operator valued
measure $\{E_{a|M}\}$, whose elements correspond to particular outcomes $a$. The probabilities
predicted by the theory are:
$$p(a|P, M) = \mathrm{Tr}(\rho_P E_{a|M}). \tag{2}$$
An ontological extension of such an operational model further assumes that the system
possesses well defined ontological properties between the time of preparation and measure-
ment. Such properties are collectively known as the "ontic state" and typically denoted
by $\lambda$. In the ontological models framework each preparation procedure $P$ is presumed to
select a particular ontic state $\lambda$ according to a fixed probability distribution $\mu_P(\lambda)$, and
each measurement choice is presumed to output a particular outcome according to a fixed
response function $\xi_{a|M}(\lambda)$. That is, (i) every preparation $P$ can be associated to a normalised
probability distribution over the ontic state space, $\mu_P(\lambda)$, such that $\int \mu_P(\lambda)\, d\lambda = 1$,
and (ii) every measurement $M$, with outcomes $a$, can be associated to a set of response
functions $\{\xi_{a|M}(\lambda)\}$ over the ontic states, satisfying $\sum_a \xi_{a|M}(\lambda) = 1$ for all $\lambda$.
As the ontic states are not directly observed, the operational statistics are obtained via
marginalisation and we have:
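$$\forall a, M, P: \quad p(a|M, P) = \int \xi_{a|M}(\lambda)\, \mu_P(\lambda)\, d\lambda, \tag{3}$$
where for quantum theory:
$$\forall a, M, P: \quad \mathrm{Tr}(\rho_P E_{a|M}) = \int \xi_{a|M}(\lambda)\, \mu_P(\lambda)\, d\lambda. \tag{4}$$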
$^{1}$ For this example we assume that any transformation between preparation and measurement is trivial.
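The marginalisation in Eq. (3) can be made concrete with a small discrete sketch, in which the integral over $\lambda$ becomes a finite sum. The three ontic states, the two preparations `P1` and `P2`, the binary measurement `M` and all numerical values are hypothetical choices made for illustration; the toy makes no claim to reproduce quantum statistics.

```python
from fractions import Fraction as F

# Hypothetical toy model (not from the paper): three ontic states,
# two preparations and one binary measurement.
# MU[P][k] is mu_P(lambda_k); XI[M][a][k] is xi_{a|M}(lambda_k).
MU = {
    "P1": [F(1, 2), F(1, 2), F(0)],
    "P2": [F(0), F(1, 2), F(1, 2)],
}
XI = {
    "M": {
        0: [F(1), F(1, 2), F(0)],
        1: [F(0), F(1, 2), F(1)],
    }
}


def operational_probability(a, M, P):
    """Discrete analogue of Eq. (3): sum over lambda of xi_{a|M} * mu_P."""
    return sum(xi * mu for xi, mu in zip(XI[M][a], MU[P]))


for P in MU:
    print(P, [operational_probability(a, "M", P) for a in (0, 1)])
```

The ontic state never appears in the printed statistics: only the weighted sums over $\lambda$ are operationally accessible.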
The ontological models framework has been used in numerous works to clarify the
manner in which quantum theory should be considered contextual [4–8, 40]. The key
assumption is that one can infer ontological equivalence from operational equivalence:
for example, if two preparation procedures produce the same distributions over outcomes
for all possible measurements, then any differences between them do not play a role in
determining the ontic states of the system in question. Thus, the justification for why
one can’t distinguish between the two equivalent preparations at the operational level is
because there is no difference between the role the preparations play at the ontological
level. The view is that each use of a preparation device selects one from a set of possible
ontic states according to exactly the same probability distribution in each run. Formally,
if, for all measurements $M$ and outcomes $a$,
$$p(a|M, P_1) = p(a|M, P_2), \tag{5}$$
then both preparations specify the same distribution over ontic states:
$$\mu_{P_1}(\lambda) = \mu_{P_2}(\lambda). \tag{6}$$
Similarly, if two measurements result in the same outcome statistics for all possible preparations
then both measurements are represented by the same fixed response function.
Formally, if, for all preparations $P$ and outcomes $a$,
$$p(a|M_1, P) = p(a|M_2, P), \tag{7}$$
then both measurements specify the same response function over ontic states:
$$\xi_{a|M_1}(\lambda) = \xi_{a|M_2}(\lambda). \tag{8}$$
Thus, in a non-contextual ontological model one can account for operational statistics
according to Eq. 3. Implicit in this model is the belief that the ontic state screens off the
preparations from the measurements (a property also known as λ-mediation [18]). In [40]
it was shown that one can use the assumptions above to derive a contextuality proof: no
model of the form of Eq. 3 can explain the statistics of quantum theory.
In this approach to contextuality [4–8, 40], one assumes that ontic states determine
correlations according to some fixed causal order. Formally, this is captured by Eq. 3: the
preparation is assumed to cause the selection of a particular ontic state λ according to a
fixed distribution µ(λ), and the measurement choice does not alter this value but merely
determines the outcome probability, also according to a fixed probability distribution. This
leaves open the question of whether one can explain the contextuality of quantum theory by
postulating an alternative, retrocausal ontology: If the future can affect the past, then the
state λ could depend on the measurement setting M, and Eq. (3) would not be justified.
Generally speaking, retrocausal approaches posit the existence of backwards-in-time
causal influences to explain quantum correlations. The stated appeal of such approaches
is that the consequent explanations retain some element of our classical notion of reality:
local causality, determinate ontology, and counterfactual definiteness. For example, Price
and Wharton explain Bell correlations by including a "zig-zag" of causal influence, passing
via hidden variables that travel backwards in time from one measurement event to the
source and then forwards in time to the distant measurement event [14]. Although not
explicitly stated, there is also one further assumption underlying these approaches: such
causal influences follow some kind of law-like behaviour. That is, one would not expect the
rules by which such retrocausal influences propagate, or backward-in-time states evolve,
to be completely ad hoc.
As stated in the introduction, we follow the Spekkens-style approach and also define
non-contextuality in terms of operational equivalences. Where we depart, however, is in
our particular choice of operational primitives. The usual primitives of preparations, trans-
formations and measurements do not permit one to consider causal scenarios that move
beyond the most simple causally ordered situations; in these models the notion of reality
is defined in terms of properties that exist before a measurement takes place. The under-
lying ontology is therefore assumed to follow some ordinary causal structure, akin to the
directed acyclic graphs of causal models [41]. In our model we wish to be able to con-
sider more general situations, for example where we include any possible global dynamics,
causal structure, space-time geometry or global constraints. In order to provide this al-
ternative perspective we consider the primitive operational elements to be sets of labelled
local regions, locally controllable properties and an environment.
2 Operational primitives
We define an operational model of any experiment to consist of local labelled regions
(A, B, C, . . . ) where one can perform controlled operations that can be associated with out-
comes. The regions align with concepts such as local laboratories, communicating parties
(e.g. Alice and Bob) and local space-time regions (similar, e.g., to the operational frame-
work of [42]). There is no a priori assumption that these regions be "fixed" or preassigned
in some manner; they are simply labels for the locus of a set of controlled operations. Con-
trolled operations generalise the notion of preparations, measurements, transformations,
and can include the addition or subtraction of ancillary systems. Examples include the
orientation of a wave-plate, the instigation of a microwave pulse, and the use of a photodetector.
We call such local operations the local controllables. Each local controllable is
represented as $\tilde{I}^X$, where the superscript $X = A, B, \dots$ labels the associated region. We
consider outcomes as labels associated to the result of choosing a particular local controllable;
the outcomes for region $A$ are labelled $a = 0, 1, 2, \dots$. Examples include the number
of detected photons, the result of a spin measurement or the time of arrival of a photon.
We allow the outcomes to have infinite possible values as this enables us to use the same
variable for local controllables that have different numbers of possible outcomes. In general
however, we expect that only a finite number of such outcomes is associated with non-zero
probability.
Finally, we consider all the possible properties that could account for correlations
between outcomes in the local regions. These include any global properties, initial states,
connecting mechanisms, causal influence, or global dynamics. We call this the environment,
$\tilde{W}$. Note that in our operational model environments and local controllables are
by construction always uncorrelated. That is, if we see a property change in relation to a
choice of local controllable we label this as an outcome and do not classify it as part of the
environment.
We can thus describe an experiment by a set of regions, outcomes, local controllables
and an environment. If we consider a particular run of an experiment there will in general
be a collection of outcomes that occur, one for each local region. One can associate a
joint probability to this set of outcomes and empirically verify probability assignments for
each possible set of outcomes. An operational model for such an experiment allows one to
calculate expected probabilities:
$$p(a, b, c, \dots | \tilde{I}^A, \tilde{I}^B, \tilde{I}^C, \dots, \tilde{W}). \tag{9}$$
Figure 1: Operational primitives.
The operational model thus specifies a distribution over outcomes for local controllables
$\tilde{I}^A, \tilde{I}^B, \dots$, and a shared environment $\tilde{W}$, Fig. 1. Note that it should be possible to
have ignorance over part of the environment and characterise this accordingly using the
operational model. More explicitly, if $\tilde{\xi}$ represents the part of the environment about which
we are ignorant, then the operational probabilities given the known part of the environment
are obtained by marginalising over $\tilde{\xi}$:
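$$\begin{aligned}
p(a, b, c, \dots | \tilde{I}^A, \tilde{I}^B, \tilde{I}^C, \dots, \tilde{W}) &= \int d\tilde{\xi}\, p(a, b, c, \dots, \tilde{\xi} | \tilde{I}^A, \tilde{I}^B, \tilde{I}^C, \dots, \tilde{W}) \\
&= \int d\tilde{\xi}\, p(a, b, c, \dots | \tilde{I}^A, \tilde{I}^B, \tilde{I}^C, \dots, \tilde{W}, \tilde{\xi})\, p(\tilde{\xi} | \tilde{W}),
\end{aligned} \tag{10}$$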
where the second equality comes from the assumption that the local controllables are
uncorrelated with the environment.
As a concrete example, $\tilde{W}$ can describe the axis along which a spin-$\frac{1}{2}$ particle is prepared,
while $\tilde{\xi}$ represents whether the spin is prepared aligned or anti-aligned with that
axis.$^{2}$ The marginal (10) then describes a scenario where there is some probabilistic uncertainty
of the spin’s direction, i.e. which value of $\tilde{\xi}$ occurs in any given run. Note that,
for the particular case $p(\tilde{\xi}|\tilde{W}) = \frac{1}{2}$, we obtain the maximally mixed state irrespective of
the axis, making the variable $\tilde{W}$ redundant. Such redundancies can be taken into account
via operational equivalences.
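The redundancy of $\tilde{W}$ in the spin-$\frac{1}{2}$ example can be checked numerically. The sketch below assumes the standard Bloch-sphere parametrisation, in which the state aligned (anti-aligned) with a unit vector $n$ is $(\mathbb{1} \pm n\cdot\sigma)/2$; the function names and the sample axes are illustrative choices, not notation from the text.

```python
import math


def spin_state(theta, phi, sign):
    """Density matrix (1 + sign * n.sigma)/2 for the axis n(theta, phi).

    sign = +1 is the spin aligned with n, sign = -1 anti-aligned.
    """
    nx = math.sin(theta) * math.cos(phi)
    ny = math.sin(theta) * math.sin(phi)
    nz = math.cos(theta)
    return [
        [(1 + sign * nz) / 2, sign * (nx - 1j * ny) / 2],
        [sign * (nx + 1j * ny) / 2, (1 - sign * nz) / 2],
    ]


def mixture(p_aligned, theta, phi):
    """Marginal state when the aligned spin occurs with probability p_aligned."""
    up = spin_state(theta, phi, +1)
    down = spin_state(theta, phi, -1)
    return [
        [p_aligned * up[i][j] + (1 - p_aligned) * down[i][j] for j in range(2)]
        for i in range(2)
    ]


# Hypothetical sample axes, given as (theta, phi).
AXES = [(0.0, 0.0), (1.0, 2.0), (math.pi / 2, 0.3)]
for theta, phi in AXES:
    print(mixture(0.5, theta, phi))
```

For equal weights every axis yields the same matrix, up to floating-point rounding, so all such environments fall into a single equivalence class. For unequal weights the axis remains operationally relevant.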
3 Operational equivalences
We next characterise the appropriate operational equivalences in order to define our onto-
logical model. Notationally, we omit the ‘tilde’ for each equivalence class.
$^{2}$ Here (and again in Section 6), we take for simplicity a scenario with a single region where a measurement
is performed, so the specification of a process is equivalent to the specification of a state. More
generally, the variables $W$ and $\xi$ could describe quantum channels, quantum networks, or more general
quantum processes.
3.1 Events
We say that a pair composed of an outcome and the respective local controllable $(a, \tilde{I}^A)$
is operationally equivalent to the pair $(a', \tilde{I}'^A)$ if the joint probabilities for $a, b, c, \dots$ and
$a', b, c, \dots$ are the same for all possible outcomes and local controllables in the other regions
$B, C, \dots$, and for all environments $\tilde{W}$:
$$p(a, b, c, \dots | \tilde{I}^A, \tilde{I}^B, \dots, \tilde{W}) = p(a', b, c, \dots | \tilde{I}'^A, \tilde{I}^B, \dots, \tilde{W}), \tag{11}$$
$$\forall (b, c, \dots, \tilde{I}^B, \tilde{I}^C, \dots, \tilde{W}).$$
We denote an equivalence class of such pairs of outcomes and local controllables as an
event:
$$M^A = [(a, \tilde{I}^A)]. \tag{12}$$
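As a hedged illustration of how events arise as equivalence classes, the sketch below groups (outcome, local controllable) pairs whose probabilities agree in every context. It assumes a single region, so that a context is just an environment label; with more regions the context would also include the outcomes and local controllables elsewhere. The table `TABLE` and the labels `I1`, `I2`, `W1`, `W2` are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction as F

# Hypothetical operational table for one region.
# TABLE[(a, I)][W] is p(a | I, W) for local controllable I and environment W.
TABLE = {
    (0, "I1"): {"W1": F(1, 2), "W2": F(1, 4)},
    (1, "I1"): {"W1": F(1, 2), "W2": F(3, 4)},
    (0, "I2"): {"W1": F(1, 2), "W2": F(1, 4)},
    (1, "I2"): {"W1": F(1, 4), "W2": F(1, 2)},
    (2, "I2"): {"W1": F(1, 4), "W2": F(1, 4)},
}


def event_classes(table):
    """Group pairs (a, I) with identical probabilities in all contexts, cf. Eq. (11)."""
    classes = defaultdict(list)
    for pair, probs in table.items():
        classes[tuple(sorted(probs.items()))].append(pair)
    return sorted(classes.values())


def instruments(table):
    """For each local controllable, the set of indices of its possible events."""
    classes = event_classes(table)
    label = {pair: idx for idx, cls in enumerate(classes) for pair in cls}
    result = defaultdict(set)
    for (a, controllable), probs in table.items():
        if any(p != 0 for p in probs.values()):
            result[controllable].add(label[(a, controllable)])
    return dict(result)


print(event_classes(TABLE))
print(instruments(TABLE))
```

In this toy table the pairs `(0, "I1")` and `(0, "I2")` define the same event, so the two instruments share one event while differing in the others.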
3.2 Instruments
We define an instrument as the list of possible events for a local controllable $\tilde{I}^A$, where an
event $M^A = [(a, \tilde{I}^A)]$ is possible for $\tilde{I}^A$ if
$$p(a, b, c, \dots | \tilde{I}^A, \tilde{I}^B, \dots, \tilde{W}) \neq 0, \tag{13}$$
for some $(b, c, \dots, \tilde{I}^B, \tilde{I}^C, \dots, \tilde{W})$.
We say that $\tilde{I}^A$ is equivalent to $\tilde{I}'^A$ if they define the same list of possible events and we
denote the equivalence class $I^A := [\tilde{I}^A] \equiv \{M^A_1, \dots, M^A_n\}$. Note that our definition allows
distinct instruments to share one or more events. Note also, our definition implies that the
probability for an event doesn’t depend on the particular instrument $I$, once we assume
the event is possible given the instrument. This property we call operational instrument
equivalence.$^{3}$
3.3 Process
The process captures those physical features responsible for generating the joint statistics
for a set of events, independently of the choice of local instruments. A process is defined
as an equivalence class of environments, $W := [\tilde{W}]$, where $\tilde{W}$ is equivalent to $\tilde{W}'$ if
$$p(a, b, c, \dots | \tilde{I}^A, \tilde{I}^B, \dots, \tilde{W}) = p(a, b, c, \dots | \tilde{I}^A, \tilde{I}^B, \dots, \tilde{W}'), \tag{14}$$
$$\forall (a, b, c, \dots, \tilde{I}^A, \tilde{I}^B, \tilde{I}^C, \dots).$$
A simple example is the spatio-temporal ordering of regions. It is clear that the oper-
ational statistics of events in regions A and B can be different for the following two causal
orderings: (i) $A$ is before $B$, (ii) $B$ is before $A$; thus the respective environments, $\tilde{W}^{(i)}$
and $\tilde{W}^{(ii)}$, will not be equivalent. On the other hand, for certain experiments we would
not expect any difference in statistics for a simple rotation of the whole experiment by 45
degrees; these two environments will be represented by the same process $W$.
The above equivalences allow us to define a joint probability distribution over the space
of events (rather than outcomes) conditioned on instruments (rather than local controllables)
and the process (rather than the environment). As discussed above, this distribution
satisfies operational instrument equivalence, which means that the joint probability for a
set of events is either zero or independent of the respective instruments. Therefore, it can
be expressed in terms of a frame function $f_W$ that maps events to probabilities and is
normalised for each instrument:
$$p(M^A, M^B, \dots | I^A, I^B, \dots, W) = f_W(M^A, M^B, \dots) \prod_{X=A,B,\dots} \chi_{I^X}(M^X), \tag{15}$$
where, for a set $S$, $\chi_S$ is the indicator function, $\chi_S(s) = 1$ for $s \in S$ and $\chi_S(s) = 0$ for
$s \notin S$. Note that the indicator functions are necessary to make the whole expression a valid
probability distribution, normalised over the entire space of events. Furthermore, and in
contrast to similar expressions involving POVMs, the dependency on the instruments is
crucial to allow for causal influence across the regions: Integrating over the events of, say,
region $A$, can result in a marginal distribution that still depends on $A$’s instrument and
displays signalling from $A$ to other regions. However, the fact that the dependency on
the instruments is solely through the indicator functions tells us that the causal relations
can be attributed to the particular events realised in each experimental run, rather than
to the whole instruments (which include the specification of events that did not happen).
In other words, the event “screens off” the instrument: once the event in a local region is
known, further knowledge of the instrument does not allow for any better prediction about
events in other regions.
$^{3}$ In other work, where we are not concerned with the possibility of non-contextual hidden variable
theories, we refer to this property as instrument non-contextuality [37]. Here we reserve the term
non-contextuality to refer to an ontological model.
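The role of the indicator functions in Eq. (15) can be seen in a minimal two-region sketch. It assumes a hypothetical scenario in which region $A$ has two single-event instruments (prepare bit 0 or bit 1) and region $B$ has one instrument that reads the bit; the event names and the frame function `f_W` are illustrative, not taken from the text.

```python
from itertools import product

# Hypothetical events and instruments for two regions, A before B.
A_EVENTS = ["prep0", "prep1"]
B_EVENTS = ["read0", "read1"]
I_A0, I_A1 = {"prep0"}, {"prep1"}   # two instruments for A, one event each
I_B = {"read0", "read1"}            # one instrument for B


def f_W(m_a, m_b):
    """Frame function of a noiseless channel from A to B: B reads A's bit."""
    return 1.0 if m_a[-1] == m_b[-1] else 0.0


def prob(m_a, m_b, inst_a, inst_b):
    """Eq. (15): frame function times the indicator functions of the instruments."""
    chi = (m_a in inst_a) * (m_b in inst_b)
    return f_W(m_a, m_b) * chi


def marginal_B(m_b, inst_a, inst_b):
    """Marginal over A's events; it can still depend on A's instrument."""
    return sum(prob(m_a, m_b, inst_a, inst_b) for m_a in A_EVENTS)


for inst_a in (I_A0, I_A1):
    print(sorted(inst_a), [marginal_B(m_b, inst_a, I_B) for m_b in B_EVENTS])
```

The marginal for $B$ changes with $A$'s instrument, which is the signalling described above, yet the dependence enters only through the indicator functions and never through `f_W` itself.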
4 Ontological model
The purpose of an ontological model is to introduce possible elements of reality. Typically,
one assumes that the ontology is encoded in a “state”, representing the physical properties
of a system at a given time. Here we shift the focus from states to more general properties
of the environment that are responsible for mediating correlations between regions. We
represent the collection of all such properties by a single variable ω, named the ontic
process. We wish to clarify at this point that our ontic process captures the physical
properties of the world that remain invariant under our local operations. That is, although
we allow local properties to change under specific operations, we wish our ontic process to
capture those aspects of reality that are independent of this probing. The interpretation
of ontic processes and the relation with the usual notion of ontic states can be seen via the
examples of the following section.
Our ontological model specifies a joint probability for a set of outcomes, one at each
local region, given the ontic process, the environment, and the set of local controllables.
This joint probability reduces to the operational joint probability when the value of the
ontic process is unknown:
$$p(a, b, c, \dots | \tilde{I}^A, \tilde{I}^B, \dots, \tilde{W}) = \int d\omega\, p(a, b, c, \dots, \omega | \tilde{I}^A, \tilde{I}^B, \dots, \tilde{W}). \tag{16}$$
There are three natural assumptions one might require of an ontological model defined
according to these operational equivalences:
Assumption 1. ω-mediation. The ontic process mediates all the correlations between
regions, thus $\omega$ screens off outcomes from the environment, and we have:
$$p(a, b, c, \dots | \tilde{I}^A, \tilde{I}^B, \dots, \tilde{W}) = \int d\omega\, p(a, b, c, \dots | \omega, \tilde{I}^A, \tilde{I}^B, \dots)\, p(\omega | \tilde{W}). \tag{17}$$
Assumption 2. Instrument non-contextuality. Operationally indistinguishable pairs
of outcomes and local controllables should remain indistinguishable at the ontological level.
That is, for operationally equivalent pairs $(a, \tilde{I}^A)$, $(a', \tilde{I}'^A)$,
$$p(a, b, c, \dots | \omega, \tilde{I}^A, \tilde{I}^B, \dots) = p(a', b, c, \dots | \omega, \tilde{I}'^A, \tilde{I}^B, \dots), \tag{18}$$
$$\forall (b, c, \dots, \tilde{I}^B, \dots),\ \omega,$$
which means that we can define a probability distribution on the space of events, conditioned
on instruments and on the ontic process, in terms of a frame function $f_\omega$, such that:
$$p(M^A, M^B, \dots | \omega, I^A, I^B, \dots) = \prod_X \chi_{I^X}(M^X)\, f_\omega(M^A, M^B, \dots), \tag{19}$$
where $\chi_S$ is the indicator function of a set $S$, $\chi_S(s) = 1$ for $s \in S$ and $\chi_S(s) = 0$ for $s \notin S$, and
$f_\omega$ maps events to probabilities:
$$f_\omega(M^A, M^B, \dots) \in [0, 1], \tag{20}$$
and is normalised for each set of events that corresponds to a particular instrument:
$$\sum_{\substack{M^A \in I^A,\ M^B \in I^B, \\ M^C \in I^C, \dots}} f_\omega(M^A, M^B, M^C, \dots) = 1. \tag{21}$$
Assumption 3. Process non-contextuality.
For operationally equivalent environments $\tilde{W}$, $\tilde{W}'$, the assumption of process non-contextuality
implies:
$$p(\omega | \tilde{W}) = p(\omega | \tilde{W}'), \tag{22}$$
and we can define a function $g_W(\omega)$ that maps ontic processes to probabilities, given each
process $W$:
$$g_W(\omega) = p(\omega | \tilde{W}), \quad W = [\tilde{W}], \tag{23}$$
that is normalised over the ontic processes $\omega$:
$$\int d\omega\, g_W(\omega) = 1. \tag{24}$$
For an ontological model that satisfies the above three assumptions, the operational
probability can now be expressed in terms of events, instruments and processes as:
$$p(M^A, M^B, \dots | I^A, I^B, \dots, W) = \prod_{X=A,B,\dots} \chi_{I^X}(M^X) \int d\omega\, f_\omega(M^A, M^B, \dots)\, g_W(\omega). \tag{25}$$
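A discrete sketch of Eq. (25) may help fix ideas. It assumes a single region with two hypothetical instruments sharing one event, takes the ontic processes to be the deterministic (0/1-valued) frame functions satisfying the normalisation (21), and uses an arbitrary distribution `G_W` in place of $g_W$. None of these choices comes from the text.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical single-region scenario: four events, two instruments sharing E0.
EVENTS = ["E0", "E1", "E2", "E3"]
INSTRUMENTS = {"I1": {"E0", "E1"}, "I2": {"E0", "E2", "E3"}}


def is_frame_function(f):
    """Normalisation condition (21): f sums to one on every instrument."""
    return all(sum(f[m] for m in inst) == 1 for inst in INSTRUMENTS.values())


# Deterministic ontic processes: 0/1 assignments that are valid frame functions.
ONTIC = [
    dict(zip(EVENTS, bits))
    for bits in product((0, 1), repeat=len(EVENTS))
    if is_frame_function(dict(zip(EVENTS, bits)))
]

# Assumed distribution over the ontic processes, playing the role of g_W.
G_W = [F(1, 4), F(1, 4), F(1, 2)]


def operational(m, instrument):
    """Discrete analogue of Eq. (25) for one region."""
    chi = 1 if m in INSTRUMENTS[instrument] else 0
    return chi * sum(f[m] * g for f, g in zip(ONTIC, G_W))


for name in INSTRUMENTS:
    print(name, {m: operational(m, name) for m in EVENTS})
```

The shared event `E0` receives the same probability under both instruments, as instrument non-contextuality requires, and each instrument's probabilities sum to one.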
Although ontic states, as they are usually understood, are not represented explicitly in
our framework, they are not excluded. In the following section we present three examples
to illustrate how such ontic states, with or without retrocausality, can be represented in
our model.
5 Examples
5.1 Deterministic, classical models
5.1.1 Causally-ordered models
As a first example, let us consider a classical, deterministic scenario (without retrocausality)
with two regions, A in the past of B, each delimited by a past and future space-like
boundary, see Fig. 2a. For a classical system, we can assign input states $\lambda^A_I$ and $\lambda^B_I$
to the past boundaries of $A$ and $B$, respectively, and output states $\lambda^A_O$ and $\lambda^B_O$ to the
respective future boundaries. As measurements can be performed without disturbance
on a classical system, we associate the input state in each region with the respective
measurement outcome: $a \equiv \lambda^A_I$ and $b \equiv \lambda^B_I$. As local controllables we take deterministic
local operations, defined as functions $f^X$ that map the input state of each region to the
corresponding output:
$$\lambda^X_I \mapsto \lambda^X_O = f^X(\lambda^X_I), \tag{26}$$
where $X$ denotes the respective local region, $A$ or $B$. Assuming ordinary dynamical laws,
the input state at $B$ can depend on the output at $A$ through some function:
$$\lambda^A_O \mapsto \lambda^B_I = w^B(\lambda^A_O). \tag{27}$$
The input state at $A$, on the other hand, does not depend on $B$, and thus has to be
specified as an independent environment variable. The ontic process for this model is thus
identified with the pair
$$\omega = \left(\lambda^A_I, w^B\right). \tag{28}$$
Indeed, knowing $\omega$ and the choice of local operations is sufficient to fully determine the
measured outcomes:
$$a = \lambda^A_I, \quad b = w^B\left(f^A\left(\lambda^A_I\right)\right). \tag{29}$$
As the model is fully deterministic, and we have not introduced any redundant variables,
there are no non-trivial equivalence classes. Explicitly, an event in region $A$ (and similarly
for $B$) is given by the pair $(a, f^A)$, or equivalently by the input-output pair
$$\lambda^A := \left(\lambda^A_I, \lambda^A_O = f^A\left(\lambda^A_I\right)\right), \tag{30}$$
while the instrument is given by the collection of events given a choice of operation,
$$I^A = \left\{\left(\lambda^A_I, f^A\left(\lambda^A_I\right)\right)\right\}_{\lambda^A_I}, \tag{31}$$
which is just to say the instrument can be identified with the function $f^A$. We see in this
example that the ontology, as traditionally understood, lies in the event variables $\lambda$. These
variables are not independent of the local controllables, because the event at $B$ can depend
on the operation performed at $A$. However, there is still an aspect of the ontology that
does not depend on the operations: the initial state $\lambda^A_I$ and the functional relation $w^B$. It
is this invariant aspect of the ontology that we call a process.
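The causally-ordered example can be sketched in a few lines, assuming for illustration that all states are single bits. The particular functions `identity` and `flip` and the value of the initial state are hypothetical choices.

```python
def identity(state):
    return state


def flip(state):
    return 1 - state


def outcomes(omega, f_a):
    """Eq. (29): a is A's input state, b is w_B applied to A's output state."""
    lambda_a_in, w_b = omega
    a = lambda_a_in
    b = w_b(f_a(lambda_a_in))
    return a, b


# Assumed ontic process: initial state 0 and an identity channel from A to B.
OMEGA = (0, identity)

for f_a in (identity, flip):
    print(f_a.__name__, outcomes(OMEGA, f_a))
```

The event at $B$ changes with the operation chosen at $A$, while `OMEGA` stays fixed throughout. This is the sense in which the process is the operation-independent part of the ontology.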
5.1.2 Time-travelling classical systems
General Relativity allows for space-time geometries with closed time-like curves, where a
system can travel back in time and interact with its past self [43], thus providing physically-motivated
examples of scenarios that defy ordinary forward causality. Notably, qualitative
analogies between quantum phenomena and classical time-travelling systems have been
suggested [44], making the latter an interesting test-bed for generalised ontological models.
Figure 2: Classical process with ontological interpretation. (a) We assign input states $\lambda^A_I$ and $\lambda^B_I$
to the past boundaries of $A$ and $B$, respectively, and output states $\lambda^A_O$ and $\lambda^B_O$ to the respective future
boundaries. A deterministic local operation is a function $f^X$ that maps the input state of each region
to the corresponding output. (b) An example of ontic process is one describing classical closed time-like
curves, defined by a pair of functions $\omega = (w^A, w^B)$, where $\lambda^B_O \mapsto w^A(\lambda^B_O) = \lambda^A_I$ and similarly for
$w^B$.
The example in the previous subsection can be readily generalised to a deterministic
model of a classical system near closed time-like curves by allowing the input state at A to
depend on B through some function $\lambda^B_O \mapsto \lambda^A_I = w^A(\lambda^B_O)$. The process is now given by
two functions, $\omega \equiv (w^A, w^B)$ (Fig. 2b), with the causally-ordered case recovered when one
of the two is a constant. Compatibility with arbitrary local operations imposes constraints
on the functions $w^A$, $w^B$ and, in the two-region case, it turns out that one of them must
in fact be constant [45, 46]. However, for three or more regions, it is possible to find
deterministic processes, with no constant component, that are still consistent with arbitrary
local operations$^4$.
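To make the two-region constraint concrete, the following sketch enumerates every pair of functions on a single bit and checks which pairs yield exactly one consistent assignment of input states for every choice of local operations. The restriction to bits, the names `FUNCS`, `fixed_points` and `is_consistent`, and the brute-force strategy are illustrative assumptions and are not taken from Refs. [45, 46].

```python
from itertools import product

BITS = (0, 1)
# The four functions {0,1} -> {0,1}, stored as (value at 0, value at 1).
FUNCS = {"const0": (0, 0), "const1": (1, 1), "id": (0, 1), "not": (1, 0)}


def fixed_points(w_a, w_b, f_a, f_b):
    """Input pairs (x, y) satisfying x = w_a(f_b(y)) and y = w_b(f_a(x))."""
    return [(x, y) for x, y in product(BITS, repeat=2)
            if x == w_a[f_b[y]] and y == w_b[f_a[x]]]


def is_consistent(w_a, w_b):
    """True if every pair of local operations admits exactly one fixed point."""
    return all(len(fixed_points(w_a, w_b, f_a, f_b)) == 1
               for f_a, f_b in product(FUNCS.values(), repeat=2))


# Names of all process pairs (w_A, w_B) compatible with arbitrary operations.
consistent = [(name_a, name_b)
              for (name_a, w_a), (name_b, w_b) in product(FUNCS.items(), repeat=2)
              if is_consistent(w_a, w_b)]

# A closed loop (both components non-constant) with a local "not" at B:
paradox = fixed_points(FUNCS["id"], FUNCS["id"], FUNCS["id"], FUNCS["not"])
```

Under these assumptions, the consistent pairs are exactly those with at least one constant component, while the loop stored in `paradox` admits no fixed point at all, which is the familiar grandfather-type inconsistency.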
Also in this case, the observed outcomes are fully determined once the process and
the local operations are specified, as the unique fixed points $a \equiv \lambda^A_I$, $b \equiv \lambda^B_I$, $\dots$, of
the function obtained by composing the process $\omega = (w^A, w^B, \dots)$ with the operations
$(f^A, f^B, \dots)$ (see Ref. [46] for more details). Crucially, in this case the events in each region
can depend on the choice of operation in all regions, $\lambda^A = \lambda^A(f^A, f^B, \dots)$. Thus, from
the perspective of ordinary ontological models, time-travelling systems appear contextual,
since it is impossible to assign a “state” to any region independently of the operations.
Nonetheless, the relation between events, captured by the process, does not depend on the
operations. Thus, following the terminology introduced here, models such as the above are
both instrument and process non-contextual. (As in the previous causally-ordered example,
there are no non-trivial equivalence classes, so non-contextuality is straightforward.)
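The dependence of local events on all operations, with the process itself held fixed, can be illustrated with a small three-region sketch. The Boolean function `process` below is in the spirit of the examples discussed in Refs. [45, 47], but its specific form, the restriction to bits, and all names are our illustrative assumptions rather than something stated in the text.

```python
from itertools import product

BITS = (0, 1)
# The four functions {0,1} -> {0,1}, stored as (value at 0, value at 1).
FUNCS = {"const0": (0, 0), "const1": (1, 1), "id": (0, 1), "not": (1, 0)}


def process(a, b, c):
    """Hypothetical process: output states (a, b, c) -> input states (x, y, z)."""
    return ((1 - b) & c, (1 - c) & a, (1 - a) & b)


def fixed_points(f_a, f_b, f_c):
    """Input states reproduced by the process once the local operations act."""
    return [s for s in product(BITS, repeat=3)
            if process(f_a[s[0]], f_b[s[1]], f_c[s[2]]) == s]


# Events selected by each of the 64 choices of local operations.
events = {names: fixed_points(*(FUNCS[n] for n in names))
          for names in product(FUNCS, repeat=3)}

# Same operation at A, same process, different operation at C:
input_a_first = events[("id", "const0", "const1")][0][0]
input_a_second = events[("id", "const0", "const0")][0][0]
```

In this sketch every choice of operations selects exactly one set of input states, yet the input state at A changes when only the operation at C is changed. The function `process` is never modified, which is the sense in which the relation between events does not depend on the operations.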
More general models of classical closed time-like curves might impose restrictions on
the accessible local operations$^5$. Even more generally, one can consider models where
$^4$ The incompatibility of such processes with an underlying causal order can be demonstrated rigorously
by showing they can be used to violate causal inequalities [47], device-independent constraints on
probabilities imposed by a definite causal order [29, 48].
$^5$ A constraint on the accessible operations is often invoked to solve “paradoxes”, such as a time-traveller
killing their past self. Although classical studies of closed time-like curves do not support the need for
this type of restriction [49–53], it might be necessary in a general theory. Such a restriction on an agent’s
actions is sometimes interpreted as a violation of “free will”. This worry is however misplaced, since an
agent can still be (or fail to be) free to perform all the physically available operations. A different set of
instruments are not associated with local input-to-output functions but with more general
sets of input-output pairs, $\mathcal{I}^A = \Lambda^A \subsetneq \Lambda^A_I \times \Lambda^A_O$, where $\Lambda^A_{I(O)}$ is the state space associated
with the past (future) boundary of the local region. In such models, a choice of instrument
selects which pairs of input-output states are possible, while a deterministic process would
determine, given all choices of instruments, which pairs are actually realised. Thus, in
such models both the state in the past and in the future of a local region depend on the
choice of instrument, so again they are necessarily contextual from the point of view of
traditional ontological models. Yet, they remain instrument and process non-contextual
as long as deterministic processes are considered.
In the above deterministic examples ω-mediation is satisfied trivially, because ontic and
operational processes coincide. This can be generalised to situations where we have only
partial knowledge about the environment. For example, we might not have full knowledge
of the initial state, but only know the temperature T of a thermal bath from which the
state is extracted; or the system might get coupled to some external environment during
the evolution from one region to another. In all cases, we end up with partial knowledge of
the ontic process, expressed by some probability $p(\omega|W)$, where $W$ represents all relevant
accessible information about the environment (the temperature of the bath or other noise
parameters). The resulting probabilistic operational model naturally satisfies the property
of ω-mediation, because knowing the temperature or noise parameters does not provide
more information than already encoded in the ontic process, namely in the underlying
microstates and functional relations.
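As a minimal sketch of such partial knowledge, consider the causally-ordered two-region case with a single bit, where the only uncertain part of the ontic process is the initial state at A, drawn from a thermal distribution. The two-level energy `gap`, the choice of units, and all function names are hypothetical and serve only to illustrate how $p(\omega|W)$ enters the operational probabilities.

```python
import math

# The four functions {0,1} -> {0,1}, stored as (value at 0, value at 1).
FUNCS = {"const0": (0, 0), "const1": (1, 1), "id": (0, 1), "not": (1, 0)}


def p_omega_given_w(temperature, w_b_name="id", gap=1.0):
    """p(omega | W) with omega = (initial bit at A, name of w_B) and W = temperature."""
    weights = {lam: math.exp(-gap * lam / temperature) for lam in (0, 1)}
    z = sum(weights.values())
    return {(lam, w_b_name): weight / z for lam, weight in weights.items()}


def events(omega, f_a_name):
    """Deterministic events (input at A, input at B) for a given ontic process."""
    lam_a, w_b_name = omega
    return (lam_a, FUNCS[w_b_name][FUNCS[f_a_name][lam_a]])


def operational_probs(temperature, f_a_name):
    """p(events | f_A, W) = sum over omega of p(events | omega, f_A) p(omega | W)."""
    probs = {}
    for omega, p in p_omega_given_w(temperature).items():
        outcome = events(omega, f_a_name)
        probs[outcome] = probs.get(outcome, 0.0) + p
    return probs


probs = operational_probs(temperature=1.0, f_a_name="not")
```

Note that `events` takes the ontic process and the local operation as its only arguments. The temperature enters solely through `p_omega_given_w`, which is how ω-mediation shows up in this toy setting.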
Note also that our construction of an ontological model respects the mobility of the
boundary between local instruments and processes that one sees in ordinary applications
of quantum theory. As a simple example, consider a preparation P of a quantum system,
followed by a measurement M. This can be modelled in three different ways: (i) with
P as part of the environment $\tilde{W}$, and M as an instrument associated to a single local
region, (ii) with P and M as instruments in two distinct local regions, and $\tilde{W}$ capturing
both a channel between preparation and measurement, plus any additional information
about the environment, or (iii) with both P and M characterising the instruments in a
single local region and all other information about the environment modelled as $\tilde{W}$. For
classical processes characterised as causal models, such a shift in perspective is formalised
by the notion of “latent variables” [41]. An analogous notion of “latent laboratories” exists
for quantum processes characterised as quantum causal models, and this formal structure
likewise characterises the mobility of the boundary to which we refer [34].
5.1.3 All-at-once stochastic models
In the above examples of time-travelling systems, the ontic process (or at least certain
aspects of it) can be understood as describing the dynamical evolution of systems between
regions. Some retrocausal approaches attempt to provide an ontology for quantum mech-
anics that does not rely on any dynamical process; rather, one should consider all relevant
events in space-time “at once”. The appearance of quantum probabilities is then justified
by the fact that the information available at a given time is not sufficient to fully determ-
ine the state of the system at all times (with the missing information possibly contained
in some unknown boundary condition in the future). Our framework naturally captures
all such models, because an ontic process need not be interpreted as a transformation: it
simply represents the rule generating all relevant events given the local operations.
operations would simply represent a deviation from classical physics in the local region where the agent
acts.
An instructive example is a toy model by Wharton [16], which represents a space-time
scenario as a system in thermal equilibrium, with events at different space-time locations
represented as states at different points in space. While having a clear ontological in-
terpretation, this model offers qualitative analogies with quantum interference and, when
analysed from an ordinary time-evolution perspective, displays an apparent contextuality.
We show in detail in appendix A how (a generalisation of) Wharton’s model fits within
our framework and satisfies the requirements of ω-mediation and instrument and process
non-contextuality.
The above three examples illustrate that it is indeed easy to represent many pos-
sible physical scenarios via ontological models that are both instrument and process non-
contextual. Given the exotic nature of the latter two examples, it seems plausible that one
could also produce such a model to explain quantum correlations. In the following sections
we prove that this is not the case.
6 Quantum models
If one assumes that the results of experiments in local regions accord with quantum mechanics,
then events can be associated with completely positive trace-non-increasing (CP)
maps $\mathcal{M}^A : A_I \to A_O$, where input and output spaces are the spaces of linear operators
over input and output Hilbert spaces of the local region, $A_I \equiv \mathcal{L}(\mathcal{H}^{A_I})$, $A_O \equiv \mathcal{L}(\mathcal{H}^{A_O})$
respectively [54]. Each set $\mathcal{I}^A$ of CP maps that sums to a completely positive trace preserving
(CPTP) map is a quantum instrument [55]:
$$\operatorname{Tr}\Big[\sum_{\mathcal{M}^A \in \mathcal{I}^A} \mathcal{M}^A(\rho)\Big] = \operatorname{Tr}(\rho). \tag{32}$$
An instrument thus represents the collection of all possible events that can be observed
given a specific choice of local controllable.
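The defining condition (32) can be checked numerically for a simple case. The sketch below represents each CP map by a list of Kraus operators and uses a computational-basis measurement on a qubit as the instrument. The Kraus representation, the particular state, and the helper names are our assumptions for illustration only.

```python
def matmul(a, b):
    """Matrix product of nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def dagger(a):
    """Conjugate transpose."""
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def apply_cp(kraus_ops, rho):
    """Apply the CP map rho -> sum_k K rho K^dagger."""
    n = len(rho)
    out = [[0j] * n for _ in range(n)]
    for k in kraus_ops:
        term = matmul(matmul(k, rho), dagger(k))
        out = [[out[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return out


# Instrument: one CP map per outcome, here projections onto |0> and |1>.
instrument = {
    "0": [[[1, 0], [0, 0]]],
    "1": [[[0, 0], [0, 1]]],
}

# A valid qubit density matrix (Hermitian, unit trace, positive).
rho = [[0.75, 0.25j], [-0.25j, 0.25]]

probs = {label: trace(apply_cp(kraus, rho)).real for label, kraus in instrument.items()}
total = sum(probs.values())
```

Each map on its own is trace-non-increasing (the entries of `probs` are at most one), while their sum preserves the trace, so `total` equals the trace of `rho`.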
Given these definitions of events and instruments, one can predict the joint probability
over possible events using a generalised form of the Born rule:
$$p(\mathcal{M}^A, \mathcal{M}^B, \dots | \mathcal{I}^A, \mathcal{I}^B, \dots, W) = f_W(\mathcal{M}^A, \mathcal{M}^B, \dots) \prod_X \chi_{\mathcal{I}^X}(\mathcal{M}^X), \tag{33}$$
$$f_W(\mathcal{M}^A, \mathcal{M}^B, \dots) = \operatorname{Tr}\left[(M^A \otimes M^B \otimes \dots) W\right], \tag{34}$$
where $M^A, M^B, \dots$ are the Choi-Jamiołkowski representations of the local CP maps associated
to particular events, and $W$ is a positive semi-definite operator associated to
the relevant process [23, 26, 29]. We call $W$ the process matrix, using the terminology of
Ref. [29].
It is possible to derive this trace rule for probabilities by assuming linearity [29], or
alternatively one can derive linearity (and the trace rule) from the assumption of opera-
tional instrument equivalence alone [37]. The significance of this latter derivation is that
the condition of operational instrument equivalence is formally identical to that of instru-
ment non-contextuality, with the only difference that the latter includes the ontic process.
Therefore, for each ontic process ω, the corresponding frame function can be expressed as:
$$f_\omega(\mathcal{M}^A, \mathcal{M}^B, \dots) = \operatorname{Tr}\left[\sigma(\omega) M\right], \tag{35}$$
where we introduced the short-hand notation $M \equiv M^A \otimes M^B \otimes \dots$ and $\sigma(\omega)$ is a process
matrix [37]. We now wish to show that the function $g_W(\omega)$ that features in our ontological
model, under the assumption of process non-contextuality, can be represented as
$$g_W(\omega) = \operatorname{Tr}\left[\eta(\omega) W\right], \tag{36}$$
where $\{\eta(\omega)\}_{\omega \in \Omega}$, $\Omega$ being the set of ontic processes, is a quantum instrument.
It is common in non-contextuality no-go theorems (as well as in the process matrix
formalism) to assume preservation of probabilistic mixtures as an assumption that is in-
dependent of the assumption of non-contextuality. Here we rather derive it from our
assumption of process non-contextuality. Consider two classical variables $\xi$, $W$ used to describe
the process, where we already take operational equivalences into account. Following
the earlier example, we can think of $W$ as describing a Cartesian axis, while $\xi$ (the aspect
of the process about which we are ignorant) describes whether a spin-$\frac{1}{2}$ particle is prepared
aligned or anti-aligned to this axis. The operational probabilities given $W$, and the
corresponding decomposition for ontological probabilities, are obtained by marginalisation:
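$$\begin{aligned}
& p(\mathcal{M}^A, \mathcal{M}^B, \dots | \mathcal{I}^A, \mathcal{I}^B, \dots, W) \\
&= \int p(\mathcal{M}^A, \mathcal{M}^B, \dots | \omega, \mathcal{I}^A, \mathcal{I}^B, \dots, W, \xi)\, p(\omega | \mathcal{I}^A, \mathcal{I}^B, \dots, W, \xi)\, p(\xi | \mathcal{I}^A, \mathcal{I}^B, \dots, W) \\
&= \int p(\mathcal{M}^A, \mathcal{M}^B, \dots | \omega, \mathcal{I}^A, \mathcal{I}^B, \dots)\, p(\omega | W, \xi)\, p(\xi | W),
\end{aligned} \tag{37}$$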
where, in the last identity, we use the fact that p(ω|W, ξ) does not depend on the local
controllables (and thus on the instruments) due to the assumption of ω-mediation; and
p(ξ|W ) is due to our assumption that the environment and local controllables (and thus
process and instruments) are uncorrelated. Additionally, due to ω-mediation, we no longer
need to condition the $\mathcal{M}^A, \mathcal{M}^B, \dots$ directly on $W$ and $\xi$.
Now let us write $W_\xi$ for the process corresponding to the pair $W, \xi$. We have
$$g_{\int W_\xi\, p(\xi|W)}(\omega) = g_W(\omega) = p(\omega|W) = \int g_{W_\xi}(\omega)\, p(\xi|W), \tag{38}$$
thus $g_W(\omega)$ is convex-linear in $W$. The first identity in Eq. (38) comes from the fact that
probabilistic mixtures of quantum processes are represented as convex combinations, thus
$W = \int W_\xi\, p(\xi|W)$. This in turn is a consequence of the trace formula for operational
quantum probabilities (which is itself a consequence of operational instrument equivalence):
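$$p(\mathcal{M}^A, \mathcal{M}^B, \dots | \mathcal{I}^A, \mathcal{I}^B, \dots, W) = \operatorname{Tr}\left[(M^A \otimes M^B \otimes \dots) W\right] \prod_X \chi_{\mathcal{I}^X}(\mathcal{M}^X) \tag{39}$$
$$= \int p(\mathcal{M}^A, \mathcal{M}^B, \dots | \mathcal{I}^A, \mathcal{I}^B, \dots, W, \xi)\, p(\xi|W) \prod_X \chi_{\mathcal{I}^X}(\mathcal{M}^X)$$
$$= \operatorname{Tr}\left[(M^A \otimes M^B \otimes \dots) \int W_\xi\, p(\xi|W)\right] \prod_X \chi_{\mathcal{I}^X}(\mathcal{M}^X) \tag{40}$$
for all CP maps $\mathcal{M}^A, \mathcal{M}^B, \dots$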
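The content of this step can be illustrated in the simplest setting we can think of: a single region with trivial output, where (as an assumption of this sketch) the process matrix reduces to a density matrix and the Choi operators of events reduce to measurement effects. The weights `P_XI` and the chosen effects are hypothetical, and real matrices are used so that transposition conventions play no role.

```python
def trace_product(a, b):
    """Tr[a b] for square nested-list matrices."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))


def mix(components, weights):
    """Convex combination sum_xi p(xi|W) W_xi."""
    n = len(components[0])
    return [[sum(p * w[i][j] for p, w in zip(weights, components)) for j in range(n)]
            for i in range(n)]


# Spin-1/2 prepared aligned or anti-aligned with the axis described by W.
W_XI = [[[1.0, 0.0], [0.0, 0.0]],   # xi = aligned
        [[0.0, 0.0], [0.0, 1.0]]]   # xi = anti-aligned
P_XI = [0.7, 0.3]                   # p(xi|W), hypothetical values

W = mix(W_XI, P_XI)

# Effects for measurements along the same axis and along an orthogonal one.
EFFECTS = {
    "up": [[1.0, 0.0], [0.0, 0.0]],
    "plus": [[0.5, 0.5], [0.5, 0.5]],
    "minus": [[0.5, -0.5], [-0.5, 0.5]],
}

# Left: trace rule applied to the mixed process. Right: average over xi.
from_mixture = {k: trace_product(m, W) for k, m in EFFECTS.items()}
from_average = {k: sum(p * trace_product(m, w) for p, w in zip(P_XI, W_XI))
                for k, m in EFFECTS.items()}
```

The two dictionaries agree for every effect, which is the statement that ignorance about $\xi$ is represented by the convex combination $W = \int W_\xi\, p(\xi|W)$ once probabilities follow the trace rule.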
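Using standard linear-algebra arguments, $g_W(\omega)$ can be extended to a linear function
over $W$, leading to the representation (36), $g_W(\omega) = \operatorname{Tr}[\eta(\omega) W]$. Positivity and normalisation
of probabilities then imply
$$g_W(\omega) \geq 0 \;\Rightarrow\; \eta(\omega) \geq 0 \quad \forall \omega, \tag{41}$$
$$\int g_W(\omega) = 1 \;\Rightarrow\; \operatorname{Tr}\left[\int \eta(\omega)\, W\right] = 1 \quad \forall W. \tag{42}$$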
Operators $\eta(\omega)$ as defined above can be understood as the Choi representation of CP maps
that sum up to a trace preserving map, namely $\{\eta(\omega)\}_{\omega \in \Omega}$ defines an instrument. In
general, the CP maps η(ω) do not have to factorise over the separate regions, therefore it
might not be possible to interpret them as local operations. This is not an obstacle, as
such an interpretation is not required for the rest of the argument.
7 A quantum contradiction
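To summarise the results so far, we have an operational rule for the predictions of the joint
probabilities of outcomes according to quantum theory:
$$p(\mathcal{M}^A, \mathcal{M}^B, \dots | \mathcal{I}^A, \mathcal{I}^B, \dots, W) = \prod_X \chi_{\mathcal{I}^X}(\mathcal{M}^X)\, \operatorname{Tr}\left[M W\right]. \tag{43}$$
We also have an ontological model for predicting the joint probabilities under the assumptions
of ω-mediation, instrument non-contextuality and process non-contextuality:
$$p(\mathcal{M}^A, \mathcal{M}^B, \dots | \mathcal{I}^A, \mathcal{I}^B, \dots, W) = \prod_X \chi_{\mathcal{I}^X}(\mathcal{M}^X) \int f_\omega(\mathcal{M}^A, \mathcal{M}^B, \dots)\, g_W(\omega), \tag{44}$$
which, given the results of the last section, becomes:
$$p(\mathcal{M}^A, \mathcal{M}^B, \dots | \mathcal{I}^A, \mathcal{I}^B, \dots, W) = \prod_X \chi_{\mathcal{I}^X}(\mathcal{M}^X) \int \operatorname{Tr}\left[\sigma(\omega) M\right] \operatorname{Tr}\left[\eta(\omega) W\right]. \tag{45}$$
If this accords with quantum predictions then we should have:
$$\operatorname{Tr}\left[M W\right] = \int \operatorname{Tr}\left[\sigma(\omega) M\right] \operatorname{Tr}\left[\eta(\omega) W\right] \quad \forall M, W. \tag{46}$$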
It has been noted [40] that a decomposition of the form (44) is akin to the expression
of expectation values in terms of quasi-probability distributions [56, 57]. However, the
non-contextuality assumptions force both $f_\omega$ and $g_W$ to be ordinary, positive probability
distributions. It is well known that quantum expectation values cannot be expressed in
such a way. It is however instructive to consider an explicit contradiction within the present
process framework.
From (46),
$$\operatorname{Tr}\left[M W\right] = \operatorname{Tr}\left[M \int \sigma(\omega)\, g_W(\omega)\right] \quad \forall M \tag{47}$$
$$\Rightarrow\; W = \int \sigma(\omega)\, g_W(\omega), \tag{48}$$
which follows from the fact that the operators $M$ span the joint linear space $A_I \otimes A_O \otimes B_I \otimes B_O \otimes \dots$
Eq. (48) tells us that $W$ is a convex mixture of the operators $\sigma(\omega)$. If $W$ is extremal,
namely if it cannot be decomposed into a non-trivial convex combination of other processes,
then $W \propto \sigma(\omega)$ for $g_W(\omega) \neq 0$. Denoting the support of $g_W$ by $\Omega_W$, i.e., $\omega \in \Omega_W \Leftrightarrow g_W(\omega) \neq 0$,
we have $W \propto \sigma(\omega)$ $\forall \omega \in \Omega_W$ for an extremal $W$.
Consider now a process $W$ that can be decomposed into two distinct mixtures of two
sets of extremal processes $W_j$ and $W'_k$ (we take discrete sets for simplicity):
$$W = \sum_j q_j W_j = \sum_k p_k W'_k. \tag{49}$$
Since $g_W$ is convex-linear in $W$, we have $g_W = \sum_j q_j\, g_{W_j}$. This means that, for every
$\omega \in \Omega_W$, there must be a $j$ such that $g_{W_j}(\omega) \neq 0$. In other words, $\Omega_W = \bigcup_j \Omega_{W_j}$.
By a similar argument, we have that $\Omega_W = \bigcup_k \Omega_{W'_k}$. We thus see that each convex
decomposition of $W$ into distinct extremal processes corresponds to a partition of $W$’s
support into the extremal processes’ supports. This in turn implies that each $\omega \in \Omega_W$ belongs
to both $\Omega_{W_j}$ and $\Omega_{W'_k}$, for some $j$ and $k$. As we have seen, this would imply
$$\sigma(\omega) \propto W_j \propto W'_k. \tag{50}$$
However, one can find many examples where no process in one decomposition is proportional
to any process in the other. This implies a contradiction and shows that a
decomposition such as (46) cannot exist for all CP maps and quantum processes. As a
particular example to show the above contradiction, consider a process $W$ corresponding to
a quantum channel from a region with a two-level output, $A_O$, to a region with a two-level
input, $B_I$:
$$W = \sum_j q_j W_j = \sum_k p_k W'_k, \tag{51}$$
formed from the following two combinations of extremal processes:
$$W_1 = [[\mathbb{1}]] = \mathbb{1} + X \otimes X - Y \otimes Y + Z \otimes Z, \tag{52}$$
$$W_2 = [[X]] = \mathbb{1} + X \otimes X + Y \otimes Y - Z \otimes Z, \tag{53}$$
$$W_3 = [[Y]] = \mathbb{1} - X \otimes X - Y \otimes Y - Z \otimes Z, \tag{54}$$
$$W_4 = [[Z]] = \mathbb{1} - X \otimes X + Y \otimes Y + Z \otimes Z, \tag{55}$$
$$W'_1 = [[U\mathbb{1}]] = \mathbb{1} + X \otimes UXU^\dagger - Y \otimes UYU^\dagger + Z \otimes UZU^\dagger, \tag{56}$$
$$W'_2 = [[UX]] = \mathbb{1} + X \otimes UXU^\dagger + Y \otimes UYU^\dagger - Z \otimes UZU^\dagger, \tag{57}$$
$$W'_3 = [[UY]] = \mathbb{1} - X \otimes UXU^\dagger - Y \otimes UYU^\dagger - Z \otimes UZU^\dagger, \tag{58}$$
$$W'_4 = [[UZ]] = \mathbb{1} - X \otimes UXU^\dagger + Y \otimes UYU^\dagger + Z \otimes UZU^\dagger, \tag{59}$$
where $X$, $Y$ and $Z$ are the Pauli matrices, $U$ is a unitary, and we used the notation
$[[V]] := \sum_{rs} |r\rangle\langle s| \otimes V |r\rangle\langle s| V^\dagger$ for the Choi representation of a unitary $V$.
It is clear that no $W_j$ is proportional to any $W'_k$ for an appropriate choice of $U$, and we
have a contradiction with (50).
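The example can be checked numerically. The sketch below builds $[[V]]$ directly from its definition, compares it with the Pauli expansions (52)–(59) up to an overall positive constant (a factor of 2 with the definition as coded here, which does not affect proportionality), and tests proportionality between the two families. Taking $U$ to be the Hadamard gate and using equal weights in both mixtures are our choices for illustration; the text does not fix either.

```python
import math

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(len(a))]
            for i in range(len(a[0]))]

def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def add(a, b, sign=1):
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(a, c):
    return [[c * x for x in row] for row in a]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def close(a, b, tol=1e-9):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def conj_by(u, p):
    """u p u^dagger."""
    return matmul(matmul(u, p), dagger(u))

def choi(v):
    """[[V]] = sum_rs |r><s| (x) V |r><s| V^dagger for a single-qubit V."""
    total = [[0] * 4 for _ in range(4)]
    for r in range(2):
        for s in range(2):
            rs = [[1 if (i, j) == (r, s) else 0 for j in range(2)] for i in range(2)]
            total = add(total, kron(rs, conj_by(v, rs)))
    return total

def pauli_form(signs, u=I2):
    """1 + sx X(x)UXU^dag + sy Y(x)UYU^dag + sz Z(x)UZU^dag."""
    out = kron(I2, I2)
    for sign, p in zip(signs, (X, Y, Z)):
        out = add(out, kron(p, conj_by(u, p)), sign)
    return out

def proportional(a, b):
    """Compare two operators after normalising each to unit trace."""
    return close(scale(a, 1 / trace(a)), scale(b, 1 / trace(b)))

def average(mats):
    total = mats[0]
    for m in mats[1:]:
        total = add(total, m)
    return scale(total, 1 / len(mats))

SIGNS = {"1": (1, -1, 1), "X": (1, 1, -1), "Y": (-1, -1, -1), "Z": (-1, 1, 1)}
PAULI = {"1": I2, "X": X, "Y": Y, "Z": Z}
U = scale([[1, 1], [1, -1]], 1 / math.sqrt(2))  # Hadamard, an illustrative choice

W = {k: choi(v) for k, v in PAULI.items()}
W_PRIME = {k: choi(matmul(U, v)) for k, v in PAULI.items()}

expansions_ok = all(close(scale(W[k], 2), pauli_form(SIGNS[k])) and
                    close(scale(W_PRIME[k], 2), pauli_form(SIGNS[k], U)) for k in PAULI)
same_mixture = close(average(list(W.values())), average(list(W_PRIME.values())))
any_proportional = any(proportional(a, b) for a in W.values() for b in W_PRIME.values())
```

With this choice of $U$, both equal-weight mixtures coincide while no member of one family is proportional to a member of the other. Replacing $U$ by a Pauli matrix merely permutes the first family, which is why the text asks for an appropriate choice of $U$.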
8 Process-contextual extensions of quantum theory
Contextuality proofs do not always require both preparation and measurement non-contextuality.
Indeed, many no-go theorems focus on the requirement of measurement non-contextuality
alone. Interestingly, even without preparation non-contextuality, measurement non-contextuality
imposes strong constraints on the ontology. Essentially, any non-contextual ontology must
reduce to the Beltrametti-Bugajski (BB) model [58], which identifies elements of reality
with the quantum wave function. An important consequence of this result is that no
measurement non-contextual extension of quantum theory exists that can provide more
accurate predictions of experimental outcomes [5].
It is thus interesting to consider dropping the requirement of process non-contextuality
in our framework, leaving instrument non-contextuality as the sole requirement. It is easy
to see that instrument non-contextual, process-contextual models are possible. An example
is a model where the ontic process is directly identified with the quantum process:
$$g_W(\omega) = \delta(W - \omega). \tag{60}$$
Operational probabilities are then recovered simply by using the “quantum process rule”,
Eq. (34), for the ontic frame function:
f
ω
(M
A
, M
B
. . . ) = Tr [M ω] . (61)
This “crude” ontological model is similar to the BB model. A difference is that the BB
model only identifies pure quantum states with elements of reality, while in Eq. (60) any
process counts as ontic, including those corresponding to mixed states or noisy channels.
One could refine the above model by only allowing an appropriately defined “pure process”
to be ontic. (See however Ref. [59] for possible ambiguities regarding such a definition.)
A similar non-extendability result to that of [5] also holds in our case. As already dis-
cussed above, the only instrument non-contextual frame function must be given by Eq. (35),
namely to every ontic process ω is associated a process matrix σ(ω). The implication is
that an instrument non-contextual hidden variable cannot provide more information than
that contained in a process matrix. We thus conclude that quantum mechanics admits no
non-trivial, instrument non-contextual extension. Indeed, this result holds independently
of any assumptions one may make about the causal structure of a possible underlying
ontology. Therefore, even instrument non-contextuality alone poses strong restrictions
on hidden variable models that attempt to leverage exotic causal structures to recover a
non-contextual notion of reality.
9 Discussion
We have shown that it is not possible to construct an ontological model that is both
instrument and process non-contextual and also accords with the predictions of quantum
mechanics. We take both forms of non-contextuality to be very reasonable assumptions if
one wishes some aspect of “reality” to be describable in a manner that is independent of
the act of experimentation. Thus our work shows that models that posit unusual causal,
global or dynamical relations will not solve a key quantum mystery, that of contextuality.
Standard no-go theorems show that quantum theory is not consistent with ontological
models where the properties of a system exist prior to and independently of the way
they are measured. A possible interpretation is that properties do exist, but they are in
fact dependent on future actions. Here we have shown that hidden variable models that
attempt to leverage such influence from the future have to violate some broader form of
non-contextuality. This new notion of non-contextuality refers to the rules that dictate
how local actions influence observed events, rather than to states and measurements.
We have introduced three assumptions in order to analyse non-contextuality in such
scenarios where influence from the future is possible. The core idea is captured by the
assumption of ω-mediation. This states that an agent’s actions should affect the world
according to rules or laws that do not themselves depend on such actions. Indeed, if the
rules changed every time we changed how we intervened on the world, we would not call
them “rules” to begin with. In the context of ontological models, this assumption allows
one to assume that experiments uncover an aspect of nature that is unchanging.
The second assumption, instrument non-contextuality, states that operationally equi-
valent interventions should not produce distinct effects at the ontological level. We have
shown that this assumption is compatible with scenarios that would be interpreted as con-
textual when viewed from an ordinary, time-oriented perspective. For example, we have
illustrated that time-travelling models where states can depend on future interventions sat-
isfy the requirement of instrument non-contextuality. Despite this generality, instrument
non-contextuality is nonetheless sufficient to rule out all non-trivial hidden-variable exten-
sions of quantum theory: Any additional variable that could provide better predictions for
quantum statistics than ordinary quantum mechanics must be instrument contextual.
Our third assumption, process non-contextuality, states that the probabilistic assign-
ment of the ontic description of an experiment should reflect the operationally equivalent
arrangements of the same experiment. Here by “experiment” we mean the specification of
the set of conditions under which agents can operate. That is, we include in this descrip-
tion all aspects of a physical scenario other than the choices of settings and the observed
outcomes. Such aspects include what kind of systems are involved, the laws describing such
systems, boundary conditions, etc. We have shown that no ontic model can satisfy this
requirement of process non-contextuality, including those that directly identify quantum
objects as ontic.
The distinction between background environment variables and locally controllable set-
tings that one makes when describing experiments using our approach is of course mobile.
What counts as a freely chosen parameter in one situation can count as a fixed parameter
in another. Our result is robust under such a shift in perspective: no matter how we decide
to describe a quantum experiment, it will not be possible to find an ontic representation
for it that is both instrument and process non-contextual.
Finally, we draw attention to the fact that our results rely on complete matching to the
operational predictions of quantum theory. This is a recognised feature of all ontological
models that rely on operational equivalence classes and leaves open the possibility that
particular ontological models might allow for some experimentally testable, different pre-
dictions. Thus, for proponents of particular retrocausal models, the door remains open to
develop their ontology such that they can predict some possible deviation from quantum
statistics. In the face of such statistical deviation, the possibility of a non-contextual
ontological model remains open.
Acknowledgments
We thank Časlav Brukner, Eric Cavalcanti, Ravi Kunjwal, Matthew Leifer, Gerard Mil-
burn, Alberto Montina, David Schmid, Robert Spekkens, and Ken Wharton for helpful
discussions. This work was supported by an Australian Research Council Centre of Ex-
cellence for Quantum Engineered Systems grant (CE 110001013), and by the Templeton
World Charity Foundation (TWCF 0064/AB38). F.C. acknowledges support through an
Australian Research Council Discovery Early Career Researcher Award (DE170100712).
This publication was made possible through the support of a grant from the John Tem-
pleton Foundation. The opinions expressed in this publication are those of the authors and
do not necessarily reflect the views of the John Templeton Foundation. We acknowledge
the traditional owners of the land on which the University of Queensland is situated, the
Turrbal and Jagera people.
References
[1] S. Kochen and E. Specker, “The problem of hidden variables in quantum mechanics,”
J. Math. Mech. 17, 59–87 (1967).
[2] J. S. Bell, “On the problem of hidden variables in quantum mechanics,” Rev. Mod.
Phys. 38, 447–452 (1966).
[3] A. Cabello, “Experimentally testable state-independent quantum contextuality,”
Phys. Rev. Lett. 101, 210401 (2008).
[4] R. W. Spekkens, “Contextuality for preparations, transformations, and unsharp
measurements,” Phys. Rev. A 71, 052108 (2005).
[5] Z. Chen and A. Montina, “Measurement contextuality is implied by macroscopic
realism,” Phys. Rev. A 83, 042110 (2011).
[6] R. Kunjwal, “Contextuality beyond the Kochen-Specker theorem,”
arXiv:1612.07250 [quant-ph].
[7] M. D. Mazurek, M. F. Pusey, R. Kunjwal, K. J. Resch, and R. W. Spekkens, “An
experimental test of noncontextuality without unphysical idealizations,” Nat.
Commun. 7, 11780 (2016).
[8] D. Schmid and R. W. Spekkens, “Contextual Advantage for State Discrimination,”
Phys. Rev. X 8, 011015 (2018).
[9] E. G. Cavalcanti, “Classical Causal Models for Bell and Kochen-Specker Inequality
Violations Require Fine-Tuning,” Phys. Rev. X 8, 021018 (2018).
[10] A. Chailloux, I. Kerenidis, S. Kundu, and J. Sikora, “Optimal bounds for
parity-oblivious random access codes,” New J. Phys. 18, 045003 (2016).
[11] R. W. Spekkens, D. H. Buzacott, A. J. Keehn, B. Toner, and G. J. Pryde,
“Preparation contextuality powers parity-oblivious multiplexing,” Phys. Rev. Lett.
102, 010401 (2009).
[12] M. Howard, J. Wallman, V. Veitch, and J. Emerson, “Contextuality supplies the
‘magic’ for quantum computation,” Nature 510, 351–355 (2014).
[13] H. Price, “Does time-symmetry imply retrocausality? How the quantum world says
“Maybe”?,” Studies in History and Philosophy of Science Part B: Studies in History
and Philosophy of Modern Physics 43, 75–83 (2012).
[14] H. Price and K. Wharton, “Disentangling the Quantum World,” Entropy 17,
7752–7767 (2015).
[15] P. W. Evans, H. Price, and K. B. Wharton, “New Slant on the EPR-Bell
Experiment,” Brit. J. Philos. Sci. 64, 297–324 (2013).
[16] K. Wharton, “Quantum States as Ordinary Information,” Information 5, 190–208
(2014).
[17] Y. Aharonov, E. Cohen, and T. Shushi, “Accommodating Retrocausality with Free
Will,” Quanta 5, 53–60 (2016).
[18] M. S. Leifer and M. F. Pusey, “Is a time symmetric interpretation of quantum theory
possible without retrocausality?,” Proceedings of the Royal Society of London A:
Mathematical, Physical and Engineering Sciences 473, (2017).
[19] R. I. Sutherland, “How retrocausality helps,” AIP Conference Proceedings 1841,
020001 (2017).
[20] A. Carati and L. Galgani, “Nonlocality of classical electrodynamics of point particles,
and violation of Bell’s inequalities,” Nuovo Cimento B 114, 489–500 (1999).
[21] S. Weinstein, “Nonlocality Without Nonlocality,” Found. Phys. 39, 921–936 (2009).
[22] C. J. Wood and R. W. Spekkens, “The lesson of causal discovery algorithms for
quantum correlations: Causal explanations of Bell-inequality violations require
fine-tuning,” New J. Phys. 17, 033002 (2015).
[23] G. Gutoski and J. Watrous, “Toward a general theory of quantum games,” in
Proceedings of 39th ACM STOC, pp. 565–574. 2006. arXiv:quant-ph/0611234.
[24] G. Chiribella, G. M. D’Ariano, and P. Perinotti, “Quantum Circuit Architecture,”
Phys. Rev. Lett. 101, 060401 (2008).
[25] G. Chiribella, G. M. D’Ariano, and P. Perinotti, “Memory Effects in Quantum
Channel Discrimination,” Phys. Rev. Lett. 101, 180501 (2008).
[26] G. Chiribella, G. M. D’Ariano, and P. Perinotti, “Theoretical framework for
quantum networks,” Phys. Rev. A 80, 022339 (2009).
[27] A. Bisio, G. Chiribella, G. D’Ariano, and P. Perinotti, “Quantum networks: General
theory and applications,” Acta Physica Slovaca. Reviews and Tutorials 61, 273–390
(2011). arXiv:1601.04864 [quant-ph].
[28] A. Bisio, G. M. D’Ariano, P. Perinotti, and M. Sedlák, “Optimal processing of
reversible quantum channels,” Physics Letters A 378, 1797–1808 (2014).
[29] O. Oreshkov, F. Costa, and Č. Brukner, “Quantum correlations with no causal
order,” Nat. Commun. 3, 1092 (2012).
[30] K. Modi, “Operational approach to open dynamics and quantifying initial
correlations,” Sci. Rep. 2, 581 (2012).
[31] M. S. Leifer and R. W. Spekkens, “Towards a formulation of quantum theory as a
causally neutral theory of Bayesian inference,” Phys. Rev. A 88, 052130 (2013).
[32] M. Ringbauer, C. J. Wood, K. Modi, A. Gilchrist, A. G. White, and A. Fedrizzi,
“Characterizing Quantum Dynamics with Initial System-Environment Correlations,”
Phys. Rev. Lett. 114, 090402 (2015).
[33] F. A. Pollock, C. Rodríguez-Rosario, T. Frauenheim, M. Paternostro, and K. Modi,
“Non-Markovian quantum processes: Complete framework and efficient
characterization,” Phys. Rev. A 97, 012127 (2018).
[34] F. Costa and S. Shrapnel, “Quantum causal modelling,” New J. Phys. 18, 063032
(2016).
[35] J.-M. A. Allen, J. Barrett, D. C. Horsman, C. M. Lee, and R. W. Spekkens,
“Quantum Common Causes and Quantum Causal Models,” Phys. Rev. X 7, 031021
(2017).
[36] S. Milz, F. A. Pollock, and K. Modi, “Reconstructing open quantum system
dynamics with limited control,” arXiv:1610.02152 [quant-ph].
[37] S. Shrapnel, F. Costa, and G. Milburn, “Updating the Born rule,” New J. Phys. 20,
053010 (2018).
[38] N. Harrigan and R. Spekkens, “Einstein, Incompleteness, and the Epistemic View of
Quantum States,” Found. Phys. 40, 125–157 (2010).
[39] M. S. Leifer, “Is the quantum state real? An extended review of ψ-ontology
theorems,” arXiv:1409.1570 [quant-ph].
[40] R. W. Spekkens, “Negativity and Contextuality are Equivalent Notions of
Nonclassicality,” Phys. Rev. Lett. 101, 020401 (2008).
[41] J. Pearl, Causality. Cambridge University Press, 2009.
[42] O. Oreshkov and C. Giarmatzi, “Causal and causally separable processes,” New J.
Phys. 18, 093020 (2016).
[43] M. S. Morris, K. S. Thorne, and U. Yurtsever, “Wormholes, time machines, and the
weak energy condition,” Phys. Rev. Lett. 61, 1446 (1988).
[44] S. Durand, “An amusing analogy: modelling quantum-type behaviours with
wormhole-based time travel,” Journal of Optics B: Quantum and Semiclassical
Optics 4, S351 (2002).
[45] Ä. Baumeler and S. Wolf, “The space of logically consistent classical processes
without causal order,” New J. Phys. 18, 013036 (2016).
[46] Ä. Baumeler, F. Costa, T. C. Ralph, S. Wolf, and M. Zych, “Reversible time travel
with freedom of choice,” arXiv:1703.00779 [quant-ph].
[47] Ä. Baumeler, A. Feix, and S. Wolf, “Maximal incompatibility of locally classical
behavior and global causal order in multi-party scenarios,” Phys. Rev. A 90, 042106
(2014).
[48] C. Branciard, M. Araújo, A. Feix, F. Costa, and Č. Brukner, “The simplest causal
inequalities and their violation,” New J. Phys. 18, 013008 (2016).
[49] J. Friedman, M. S. Morris, I. D. Novikov, F. Echeverria, G. Klinkhammer, K. S.
Thorne, and U. Yurtsever, “Cauchy problem in spacetimes with closed timelike
curves,” Phys. Rev. D 42, 1915–1930 (1990).
[50] F. Echeverria, G. Klinkhammer, and K. S. Thorne, “Billiard balls in wormhole
spacetimes with closed timelike curves: classical theory,” Phys. Rev. D 44,
1077–1099 (1991).
[51] A. Lossev and I. D. Novikov, “The Jinn of the time machine: nontrivial
self-consistent solutions,” Class. Quantum Grav. 9, 2309 (1992).
[52] I. D. Novikov, “Time machine and self-consistent evolution in problems with
self-interaction,” Phys. Rev. D 45, 1989–1994 (1992).
[53] E. V. Mikheeva and I. D. Novikov, “Inelastic billiard ball in a spacetime with a time
machine,” Phys. Rev. D 47, 1432–1436 (1993).
[54] M. Nielsen and I. Chuang, Quantum Computation and Quantum Information.
Cambridge University Press, 2000.
[55] E. Davies and J. Lewis, “An operational approach to quantum probability,” Comm.
Math. Phys. 17, 239–260 (1970).
[56] E. Wigner, “On the Quantum Correction For Thermodynamic Equilibrium,” Phys.
Rev. 40, 749–759 (1932).
[57] M. Scully and M. Zubairy, Quantum Optics. Cambridge University Press, 1997.
[58] E. G. Beltrametti and S. Bugajski, “A classical extension of quantum mechanics,”
J. Phys. A: Math. Gen. 28, 3329 (1995).
[59] M. Araújo, A. Feix, M. Navascués, and Č. Brukner, “A purification postulate for
quantum mechanics with indefinite causal order,” Quantum 1, 10 (2017).
Accepted in Quantum 2018-05-04, click title to verify 21
A Wharton’s retrocausal toy model
The core idea of the model is to represent a system across space-time, analogously to the
representation of a system in space in thermodynamical equilibrium. Rather than being
determined by dynamical evolution, the states at each point in space-time are known
with some probability. This is similar to how macrostates can be considered as providing
probability distributions for microstates.
In this model each event in space-time is represented as a site, labelled by the index j,
within a lattice. At each site j we can have a particle in a state λ, whose possible values
are assumed to be ±1 for simplicity. The entire system across space-time is treated
“all-at-once” in the same way one would treat a spatially extended system, where each
site represents a different location in space. The system is then associated with a
Hamiltonian $H = \sum_{\langle i,j \rangle} \lambda_i \lambda_j$, where the sum is taken over nearest neighbours
according to the geometry of the lattice. All we know about the system is that it is in a
thermal state, with inverse temperature $\beta$, thus the probability for a certain
configuration $\vec{\lambda} := (\lambda_1, \lambda_2, \dots)$ is $p(\vec{\lambda} \mid \beta) \propto e^{-H(\vec{\lambda})\beta}$. If we learn the state of one of the
sites, we need to update the thermal distribution by conditioning on the observed value.
However, since the model is supposed to represent a space-time configuration, the sites
we can observe at any given time are restricted.
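The thermal assignment and the conditioning step described above can be illustrated with a brute-force enumeration. The following is only a minimal sketch under stated assumptions: the three-site chain, the value of `BETA`, and the helper names `gibbs`, `condition` and `marginal` are hypothetical choices made for illustration, and the weight is taken to be the standard Gibbs factor $e^{-\beta H}$.

```python
import itertools
import math


def gibbs(edges, n_sites, beta):
    """Return {configuration: probability} with p proportional to exp(-beta * H)."""
    weights = {}
    for config in itertools.product((-1, 1), repeat=n_sites):
        # H is the sum of lambda_i * lambda_j over nearest-neighbour pairs.
        energy = sum(config[i] * config[j] for i, j in edges)
        weights[config] = math.exp(-beta * energy)
    z = sum(weights.values())
    return {config: w / z for config, w in weights.items()}


def condition(dist, site, value):
    """Update the distribution after observing `value` at `site`."""
    kept = {c: p for c, p in dist.items() if c[site] == value}
    norm = sum(kept.values())
    return {c: p / norm for c, p in kept.items()}


def marginal(dist, site, value):
    """Probability that `site` is found in state `value`."""
    return sum(p for c, p in dist.items() if c[site] == value)


BETA = 0.5                 # assumed inverse temperature
CHAIN = [(0, 1), (1, 2)]   # assumed geometry: chain 0 - 1 - 2

prior = gibbs(CHAIN, n_sites=3, beta=BETA)
posterior = condition(prior, site=0, value=1)
p_before = marginal(prior, site=1, value=1)
p_after = marginal(posterior, site=1, value=1)
```

With this sign convention neighbouring sites tend to anti-align, so learning that site 0 is in state +1 lowers the probability of finding +1 at site 1 below its prior value of 1/2.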
Figure 3: Wharton’s toy model [16]. Each node $j$ represents a location in space-time where a system
can be found in a state $\lambda_j$, $j = 1, 2, \dots$. The state of the entire system is sampled from a thermal
ensemble, defined by a Hamiltonian containing interactions between nodes connected by an edge, where
each node is treated as a site in a spatially distributed lattice. (a) Observing the system at a given
time reveals the state at one of the nodes, e.g. $\lambda_1 = 1$, upon which the probability assignment at the
other nodes has to be updated. (b) The analogue of an interference experiment is represented by the
insertion of an additional node in the future, which results in a different thermal state and thus in a
different probability distribution for all states. An observer at an earlier time who ignores this possibility
might interpret such a dependence on future actions as a form of contextuality.
Retrocausality is introduced by assuming that performing a measurement at any given
time can result in the introduction of a new site, thus changing the geometry of the
system, Fig. 3. Assuming a thermal state with a given temperature, the two geometries
result in different probability distributions for the microstates. If the system is
interpreted as time oriented, and the influence of the future intervention is ignored, then
one might be led to the conclusion that it is impossible to assign non-contextual states of
reality to the system. The analogy is seen with a quantum interference experiment,
where a measurement in the future is assumed to change the conditions that determine
the state of the system in its past. If the influence from the future measurement is
included, argues Wharton, then one might be able to recover an ontic interpretation of
quantum mechanics, where the quantum state simply represents lack of information
about the underlying state.
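The dependence of the earlier statistics on a later insertion can be made concrete in a small numerical sketch. The two geometries below (a two-node lattice, and the same lattice with a hypothetical third node linked to both), the value of `BETA`, and the function name `correlation` are illustrative assumptions and are not taken from Wharton's paper; the Gibbs weight is again assumed to be $e^{-\beta H}$.

```python
import itertools
import math


def correlation(edges, n_sites, beta, i, j):
    """Thermal average of lambda_i * lambda_j for p proportional to exp(-beta * H)."""
    numerator = 0.0
    z = 0.0
    for config in itertools.product((-1, 1), repeat=n_sites):
        energy = sum(config[a] * config[b] for a, b in edges)
        weight = math.exp(-beta * energy)
        numerator += config[i] * config[j] * weight
        z += weight
    return numerator / z


BETA = 0.5  # assumed inverse temperature

# Geometry (a): two connected nodes.
corr_without = correlation([(0, 1)], 2, BETA, 0, 1)

# Geometry (b): an extra node 2, linked to both earlier nodes, is inserted.
corr_with = correlation([(0, 1), (0, 2), (1, 2)], 3, BETA, 0, 1)
```

The correlation between nodes 0 and 1 differs in the two geometries, even though nothing about those two nodes themselves was changed. An observer who ignores the inserted node would see statistics that appear to depend on a later choice.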
This model is interesting because causal influence is not mediated by an explicit
mechanism, as opposed to ordinary dynamical systems including the time-travelling
examples in the main text. Nonetheless, it is possible to fit this model into our general
ontological framework, where the observed probabilities are mediated by an ontic process.
Crucially, the model turns out to be both instrument and process non-contextual,
showing that approaches of this type cannot reproduce the predictions of quantum theory.
Classical systems on an arbitrary geometry
We consider a more general version of Wharton’s model, with arbitrary geometry, an
arbitrary set of discrete values for the states, and arbitrary local interactions.
Consider a set $\mathcal{N}$ of $|\mathcal{N}| = N$ sites. Each site $j \in \mathcal{N}$ can contain a classical system whose
state $\lambda_j$ can take value in some set $S_j$. The state of the entire system is thus described by
a vector $\vec{\lambda} \equiv (\lambda_1, \dots, \lambda_N) \in S := \times_{j \in \mathcal{N}} S_j$.
A Hamiltonian function $H(\vec{\lambda})$ is defined on the system. We assume that this Hamiltonian
is local, namely it is a sum of terms representing local interactions between sites. A subset
of sites $e \subset \mathcal{N}$ contributing to an interaction term is called a “hyperedge” and the set $E$ of
hyperedges defines a “hypergraph” over $\mathcal{N}$. The Hamiltonian can thus be decomposed as
$$H = \sum_{e \in E} h_e, \tag{62}$$
where each term $h_e$ is a function on the space $L_e := \times_{j \in e} S_j$. By convention, we identify
the state $\lambda_j = 0$ of system $j$ with the “empty site”, namely with no system in it. This
implies that, for every hyperedge $e$ containing $j$,
$$h_e(\dots, \lambda_j = 0, \dots) = 0. \tag{63}$$
In other words, each interaction term vanishes when one of the sites on which it acts is
empty. In this way, “different geometries”, corresponding to additional or missing sites,
are simply represented as a particular choice of states in a fixed geometry.
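As an illustration of the decomposition (62) and of the empty-site convention (63), the following minimal sketch encodes a hypergraph as a dictionary from hyperedges to interaction terms. The three-site hypergraph, the product-form terms, and the names `HYPEREDGES`, `hamiltonian` and `terms_vanish_on_empty_sites` are hypothetical choices made for this example only.

```python
import itertools

STATES = (0, -1, 1)  # assumed labels; 0 plays the role of the "empty site"

# Hypothetical hypergraph on three sites: two pairwise terms and one
# three-body term. Product-form terms vanish whenever an argument is 0.
HYPEREDGES = {
    (0, 1): lambda a, b: a * b,
    (1, 2): lambda a, b: a * b,
    (0, 1, 2): lambda a, b, c: a * b * c,
}


def hamiltonian(config):
    """H = sum over hyperedges e of h_e, evaluated on the sites in e (Eq. 62)."""
    return sum(h(*(config[j] for j in e)) for e, h in HYPEREDGES.items())


def terms_vanish_on_empty_sites():
    """Check Eq. (63): every h_e is zero if any site it acts on is empty."""
    for e, h in HYPEREDGES.items():
        for local in itertools.product(STATES, repeat=len(e)):
            if 0 in local and h(*local) != 0:
                return False
    return True
```

Emptying site 2, for instance, leaves only the term on the hyperedge (0, 1), so `hamiltonian((a, b, 0))` coincides with the Hamiltonian of a two-site geometry. This is the sense in which a missing site is just a particular state of a fixed geometry.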
In our terminology, each site $j$ represents a (space-time) region and each state $\lambda_j$
represents an event. We can interpret each event as “ontic”; however, since we assume
that each ontic event can also be observed, ontic and operational events are identified.
No control. Before considering the possibility of interventions, it is useful to see how
our framework applies to the simpler scenario with no interventions. In this case,
“process” is synonymous with “state”. Thus, a deterministic process is simply a specific
microstate $\vec{\lambda}$, while a general probabilistic process is a probability distribution $P(\vec{\lambda})$.
For the case of a thermal state, where the only information we can access about the
environment is the inverse temperature $\beta$, the operational process is probabilistic, given
by the Gibbs distribution
$$p(\vec{\lambda} \mid \beta) = \frac{e^{-\beta H(\vec{\lambda})}}{Z(\beta)}, \qquad Z(\beta) = \sum_{\vec{\lambda} \in S} e^{-\beta H(\vec{\lambda})}. \tag{64}$$
Since there are no irrelevant environment variables in this model, questions about
contextuality do not arise: each value of the environment variable corresponds to just one
process (i.e. to one probability distribution for the “events”). Therefore, at the formal
level, we could identify the operational process with the ontic process; the resulting
model would be process non-contextual by construction (instrument non-contextuality is
even more trivial here, because there is no choice of instruments). A more natural
ontological model is a deterministic one, where each ontic process (or ontic state) is
identified with one microstate $\vec{\lambda}$. As required by the general formalism, the operational
process provides a probability distribution over the possible ontic processes, and knowing
the ontic process makes knowledge of the operational process redundant (the ontic
process “screens off” the operational one), in agreement with the property of ω-mediation.
Local control. Local instruments are defined as subsets of events and represent the
possibility of local control. Thus, in general, the possible instruments at a site
$j \in \mathcal{N}$ correspond to subsets $I^j \subset S_j$. As a simple case-study, we consider the scenario
where the only control is inserting or removing a site, as in Wharton’s example.
Therefore, for each region $j \in \mathcal{N}$ there are two possible instruments:
$$I^j_0 := \{0\}, \qquad I^j_1 := S_j \setminus \{0\}. \tag{65}$$
A prominent feature of this example is that instruments are disjoint sets, so there is
never an event that belongs to two distinct instruments. This ensures the instrument
non-contextuality of the model.
As in the no-control case, a deterministic process corresponds to a specification of all
events, while a probabilistic process corresponds to a probability distribution for the
possible events. The possibility of control means that the events now can depend on the
instruments, so the process must encode this dependency. Thus, a deterministic process
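is given by a set of functions
$$\vec{I} \equiv (I^1, \dots, I^N) \mapsto \vec{\lambda}, \qquad \lambda_j = \omega_j(\vec{I}), \quad j \in \mathcal{N}, \tag{66}$$
such that, for each $j \in \mathcal{N}$,
$$\omega_j(\dots, I^j = I^j_0, \dots) = 0, \qquad \omega_j(\dots, I^j = I^j_1, \dots) \neq 0. \tag{67}$$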
Condition (67) simply says that if we choose to remove the system from site $j$ ($I^j = I^j_0$),
then there will be no system at site $j$ ($\lambda_j = 0$), while if we choose to insert the system
($I^j = I^j_1$) then the system will be there, in one of its possible states ($\lambda_j \in S_j \setminus \{0\}$).
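Condition (67) can be checked mechanically for any candidate deterministic process by enumerating all instrument choices. The sketch below is illustrative only: instruments are identified with binary labels (0 for removal, 1 for insertion), the non-empty states are assumed to be ±1, and the two example processes `parity_process` and `stubborn_process` are hypothetical.

```python
import itertools

# Assumed labels: x_j = 0 selects I_0 = {0} (site removed),
# x_j = 1 selects I_1 = S_j \ {0} (site inserted, states -1 or +1).
INSTRUMENTS = {0: {0}, 1: {-1, 1}}


def is_consistent(process, n_sites):
    """Check condition (67): the event at each site lies in the chosen instrument."""
    for x in itertools.product((0, 1), repeat=n_sites):
        events = process(x)
        for j in range(n_sites):
            if events[j] not in INSTRUMENTS[x[j]]:
                return False
    return True


def parity_process(x):
    """Hypothetical process: occupied sites show +1 if an even number of
    sites is occupied, and -1 otherwise; removed sites stay empty."""
    value = 1 if sum(x) % 2 == 0 else -1
    return tuple(value if xj == 1 else 0 for xj in x)


def stubborn_process(x):
    """Hypothetical inconsistent process: reports +1 even at removed sites."""
    return tuple(1 for _ in x)
```

Note that `parity_process` lets the event at one site depend on the instruments chosen at all other sites, with no restriction on the direction of influence. Condition (67) only constrains each event to be compatible with the local instrument.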
For a probabilistic process, dependency on the instruments is encoded in a conditional
probability distribution $p(\vec{\lambda} \mid \vec{I})$. The probabilistic version of the consistency condition
(67) reads
$$\begin{aligned}
P(\dots, \lambda_j \neq 0, \dots \mid \dots, I^j = I^j_0, \dots) &= 0, \\
P(\dots, \lambda_j = 0, \dots \mid \dots, I^j = I^j_1, \dots) &= 0.
\end{aligned} \tag{68}$$
A more compact way to represent a (non-contextual) process is through a frame function,
which can be defined piece-wise as:
$$f(\vec{\lambda}) := p(\vec{\lambda} \mid \vec{I}) \quad \text{for } \lambda_1 \in I^1, \dots, \lambda_N \in I^N. \tag{69}$$
The consistency condition (68) is then expressed as
$$p(\vec{\lambda} \mid \vec{I}) = f(\vec{\lambda}) \prod_{j \in \mathcal{N}} \chi_{I^j}(\lambda_j). \tag{70}$$
Let us pause for a moment on this definition. The frame function is defined as a function
$f : S \to [0, 1]$. That is, it assigns a probability to each N-tuple $\vec{\lambda} = (\lambda_1, \dots, \lambda_N)$, without
needing any additional information about the instruments. In the case we are
considering, different instruments are non-overlapping sets of states. Therefore, if we
know the state $\lambda_j$, we automatically know the instrument $I^j$ and requiring “independence
from the instruments” is completely trivial: either the site is there, and the instrument is
$I^j_1$, or the site is not there, and $I^j = I^j_0$. Once we know the state, there is nothing more
the instrument can tell us. Technically, each value $\vec{I}$ defines a subset in the domain of $f$,
and the value $f$ takes in each of these subsets is given by the conditional probability (69).
The non-overlapping of different instruments is crucial for this construction: if the same
$\vec{\lambda}$ could belong to two different instruments, we would not know which value of $p(\vec{\lambda} \mid \vec{I})$ to
use to define the frame function. For overlapping instruments, the existence of a frame
function is equivalent to the assumption of instrument non-contextuality.
Processes for the thermal state. Once again, the only environment variable is the
inverse temperature $\beta$, which thus parametrises the operational processes. Given the
above discussion, it should be clear that, for each $\beta$, we can write a conditional
probability in the form (70), where the frame function is defined as in Eq. (69), with
probabilities provided by the Gibbs distribution (64). Explicitly,
$$f_\beta(\vec{\lambda}) = \frac{e^{-\beta H(\vec{\lambda})}}{Z(\beta \mid \vec{I})} \quad \text{for } \vec{\lambda} \in \vec{I}, \tag{71}$$
where
$$Z(\beta \mid \vec{I}) := \sum_{\vec{\lambda} \in \vec{I}} e^{-\beta H(\vec{\lambda})}. \tag{72}$$
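The construction of Eqs. (70) to (72) can be sketched numerically for a small lattice. In the example below the three-site chain, the states ±1 for occupied sites, and all function names are hypothetical. The symbol $\chi_{I^j}$ is read as the indicator function of the instrument $I^j$, and the Gibbs weight is assumed to be $e^{-\beta H}$.

```python
import itertools
import math

EDGES = [(0, 1), (1, 2)]  # hypothetical three-site chain
N_SITES = 3


def energy(config):
    """Pairwise Hamiltonian; terms vanish automatically on empty sites (state 0)."""
    return sum(config[i] * config[j] for i, j in EDGES)


def instrument_of(config):
    """Read the instrument labels off the state: 0 = removed, 1 = inserted."""
    return tuple(0 if s == 0 else 1 for s in config)


def configs_in(x):
    """All configurations compatible with the instrument labels x."""
    options = [(0,) if xj == 0 else (-1, 1) for xj in x]
    return list(itertools.product(*options))


def partition(beta, x):
    """Z(beta | I): sum of Gibbs weights over configurations in I (Eq. 72)."""
    return sum(math.exp(-beta * energy(c)) for c in configs_in(x))


def frame(beta, config):
    """f_beta(lambda) as in Eq. (71); the instrument is inferred from lambda."""
    return math.exp(-beta * energy(config)) / partition(beta, instrument_of(config))


def conditional(beta, config, x):
    """p(lambda | I) = f(lambda) * prod_j chi_{I^j}(lambda_j), as in Eq. (70)."""
    return frame(beta, config) if instrument_of(config) == tuple(x) else 0.0
```

Because the instruments are disjoint, `frame` needs no instrument argument: the labels are recovered from the configuration itself. For each fixed choice of instruments the frame function sums to one over the compatible configurations.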
Let us stress that, from the perspective of our framework, we might as well stop here: we
already have a model that is both instrument and process non-contextual. The point of
our theorem is to see if it is possible to write a given operational model in terms of an
underlying non-contextual model; if that is possible, the operational model cannot
reproduce the predictions of quantum mechanics. In this case we already have a
non-contextual model, so we know it cannot reproduce quantum mechanics. Note that
the theorem does not rely on any interpretation we might assign to ontic processes,
events etc; it is simply a statement about properties that an ontological model can or
cannot have.
For the sake of completeness, and since we would more naturally associate ontology with
determinism, we can write explicitly how a deterministic process model looks in the
present case study. Recall that a deterministic process is a (multi-valued) function
$\vec{\omega} \equiv (\omega_1, \dots, \omega_N)$ from the instruments to the events. For notational convenience, we can
identify the two possible instruments $I^j_{x_j}$ at each site $j$ with their label $x_j \in \{0, 1\}$.
Therefore, a choice of instruments is given by $N$ binary variables $\vec{x} \equiv (x_1, \dots, x_N)$ and a
process is identified with $2^N$ N-tuples $\{\vec{a}_{\vec{x}}\}_{\vec{x} \in \{0,1\}^N}$, where $\vec{a}_{\vec{x}} := \vec{\omega}(\vec{I}_{\vec{x}})$. A deterministic
ontological model is thus defined by a conditional probability distribution
$$p(\vec{\omega} \mid \beta) \equiv P\left(\{\vec{a}_{\vec{x}}\}_{\vec{x} \in \{0,1\}^N} \mid \beta\right) \tag{73}$$
that reproduces the operational probabilities via ω-mediation:
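$$\sum_{\{\vec{a}_{\vec{x}}\}_{\vec{x} \in \{0,1\}^N}} p\left(\vec{\lambda} \mid \vec{I}, \{\vec{a}_{\vec{x}}\}_{\vec{x} \in \{0,1\}^N}\right) p\left(\{\vec{a}_{\vec{x}}\}_{\vec{x} \in \{0,1\}^N} \mid \beta\right) = p\left(\vec{\lambda} \mid \vec{I}, \beta\right), \tag{74}$$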
where the sum should be understood as
$$\sum_{\{\vec{a}_{\vec{x}}\}_{\vec{x} \in \{0,1\}^N}} \equiv \sum_{a^1_0 \in I^1_0} \sum_{a^1_1 \in I^1_1} \cdots \sum_{a^N_0 \in I^N_0} \sum_{a^N_1 \in I^N_1} \tag{75}$$
and the “ontic” probabilities are given by
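$$P\left(\vec{\lambda} \mid \vec{I}, \{\vec{a}_{\vec{x}}\}_{\vec{x} \in \{0,1\}^N}\right) = \prod_{j \in \mathcal{N}} \chi_{I^j}(\lambda_j) \sum_{\vec{x} \in \{0,1\}^N} \delta_{\vec{\lambda}\, \vec{a}_{\vec{x}}}, \tag{76}$$
$$\delta_{\vec{\lambda}\, \vec{a}_{\vec{x}}} := \prod_{j \in \mathcal{N}} \delta_{\lambda_j (a_{\vec{x}})_j}. \tag{77}$$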
We now show that the conditional probabilities for the ontic process in our thermal
model are given by
$$P\left(\{\vec{a}_{\vec{x}}\}_{\vec{x} \in \{0,1\}^N} \mid \beta\right) = \prod_{\vec{x} \in \{0,1\}^N} f_\beta(\vec{a}_{\vec{x}}), \tag{78}$$
where the operational frame function $f_\beta$ is given by expression (71). To see that the
conditional probabilities (78) provide an ontological model for the original thermal-state
model, one can verify that, by putting together expressions (78) and (76) into Eq. (74),
one indeed obtains the operational probabilities (70). Explicitly,
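$$\begin{aligned}
&\sum_{\{\vec{a}_{\vec{x}}\}_{\vec{x} \in \{0,1\}^N}} p\left(\vec{\lambda} \mid \vec{I}, \{\vec{a}_{\vec{x}}\}_{\vec{x} \in \{0,1\}^N}\right) p\left(\{\vec{a}_{\vec{x}}\}_{\vec{x} \in \{0,1\}^N} \mid \beta\right) \\
&\quad = \prod_{j \in \mathcal{N}} \chi_{I^j}(\lambda_j) \sum_{\{\vec{a}_{\vec{x}}\}_{\vec{x} \in \{0,1\}^N}} \sum_{\vec{x}' \in \{0,1\}^N} \delta_{\vec{\lambda}\, \vec{a}_{\vec{x}'}} \prod_{\vec{x} \in \{0,1\}^N} f_\beta(\vec{a}_{\vec{x}}) \\
&\quad = \prod_{j \in \mathcal{N}} \chi_{I^j}(\lambda_j) \sum_{\vec{x}' \in \{0,1\}^N} \sum_{\vec{a}_{\vec{x}'} \in \vec{I}_{\vec{x}'}} \delta_{\vec{\lambda}\, \vec{a}_{\vec{x}'}} \sum_{\{\vec{a}_{\vec{x}}\}_{\vec{x} \neq \vec{x}'}} \prod_{\vec{x} \in \{0,1\}^N} f_\beta(\vec{a}_{\vec{x}}) \\
&\quad = \prod_{j \in \mathcal{N}} \chi_{I^j}(\lambda_j) \sum_{\vec{x}' \in \{0,1\}^N} \sum_{\vec{a}_{\vec{x}'} \in \vec{I}_{\vec{x}'}} \delta_{\vec{\lambda}\, \vec{a}_{\vec{x}'}} f_\beta(\vec{a}_{\vec{x}'}) \prod_{\vec{x} \neq \vec{x}'} \sum_{\vec{a}_{\vec{x}} \in \vec{I}_{\vec{x}}} f_\beta(\vec{a}_{\vec{x}}) \\
&\quad = \prod_{j \in \mathcal{N}} \chi_{I^j}(\lambda_j) f_\beta(\vec{\lambda}),
\end{aligned}$$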
where we used the normalisation of the frame function, $\sum_{\vec{\lambda} \in \vec{I}_{\vec{x}}} f_\beta(\vec{\lambda}) = 1$ for every
collection of instruments $\vec{I}_{\vec{x}} \equiv (I^1_{x_1}, \dots, I^N_{x_N})$.
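The identity derived above can also be checked by brute force on a small instance. The following sketch enumerates every deterministic process for a hypothetical two-site geometry, weights each by the product form (78), applies the response function (76), and compares the result with the frame function (71). The geometry, the states ±1 for occupied sites, the Gibbs weight $e^{-\beta H}$, and all function names are assumptions made for this illustration.

```python
import itertools
import math

EDGES = [(0, 1)]                                     # hypothetical two-site geometry
LABELS = list(itertools.product((0, 1), repeat=2))   # all instrument choices x


def energy(config):
    return sum(config[i] * config[j] for i, j in EDGES)


def instrument_of(config):
    return tuple(0 if s == 0 else 1 for s in config)


def configs_in(x):
    """Configurations compatible with instrument labels x (0 = removed site)."""
    options = [(0,) if xj == 0 else (-1, 1) for xj in x]
    return list(itertools.product(*options))


def frame(beta, config):
    """Thermal frame function f_beta of Eq. (71)."""
    z = sum(math.exp(-beta * energy(c)) for c in configs_in(instrument_of(config)))
    return math.exp(-beta * energy(config)) / z


def ontic_processes():
    """Yield every deterministic process as a dict x -> a_x with a_x in I_x."""
    for choice in itertools.product(*(configs_in(x) for x in LABELS)):
        yield dict(zip(LABELS, choice))


def mediated_probability(beta, config, x):
    """Left-hand side of Eq. (74), built from Eqs. (76) and (78)."""
    if instrument_of(config) != tuple(x):   # product of indicator functions
        return 0.0
    total = 0.0
    for process in ontic_processes():
        weight = math.prod(frame(beta, a) for a in process.values())    # Eq. (78)
        matches = sum(1 for a in process.values() if a == config)       # Eq. (76)
        total += matches * weight
    return total
```

For two sites there are only 16 deterministic processes, so the sum is exhaustive. The same enumeration grows very quickly with the number of sites and is meant purely as a consistency check of the algebra, not as a practical method.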