Beyond the Cabello-Severini-Winter framework: Making
sense of contextuality without sharpness of measurements
Ravi Kunjwal
Perimeter Institute for Theoretical Physics,
31 Caroline Street North, Waterloo, Ontario, Canada, N2L 2Y5.
September 4, 2019
We develop a hypergraph-theoretic framework for Spekkens contextuality applied to Kochen-Specker (KS) type scenarios that goes beyond the Cabello-Severini-Winter (CSW) framework. To do this, we add new hypergraph-theoretic ingredients to the CSW framework. We then obtain noise-robust noncontextuality inequalities in this generalized framework by applying the assumption of (Spekkens) noncontextuality to both preparations and measurements. The resulting framework goes beyond the CSW framework in both senses, conceptual and technical. On the conceptual level: 1) as in any treatment based on the generalized notion of noncontextuality à la Spekkens, we relax the assumption of outcome determinism inherent to the Kochen-Specker theorem but retain measurement noncontextuality, besides introducing preparation noncontextuality, 2) we do not require the exclusivity principle – that pairwise exclusive measurement events must all be mutually exclusive – as a fundamental constraint on measurement events of interest in an experimental test of contextuality, given that this property is not true of general quantum measurements, and 3) as a result, we do not need to presume that measurement events of interest are “sharp” (for any definition of sharpness), where this notion of sharpness is meant to imply the exclusivity principle. On the technical level, we go beyond the CSW framework in the following senses: 1) we introduce a source events hypergraph besides the measurement events hypergraph usually considered and define a new operational quantity Corr that appears in our inequalities, 2) we define a new hypergraph invariant – the weighted max-predictability – that is necessary for our analysis and appears in our inequalities, and 3) our noise-robust noncontextuality inequalities quantify tradeoff relations between three operational quantities – Corr, R, and $p_0$ – only one of which (namely, R) corresponds to the Bell-Kochen-Specker functionals appearing in the CSW framework; when Corr = 1, the inequalities formally reduce to CSW type bounds on R. Along the way, we also consider in detail the scope of our framework vis-à-vis the CSW framework, particularly the role of Specker’s principle in the CSW framework, i.e., what the principle means for an operational theory satisfying it and why we don’t impose it in our framework.

Ravi Kunjwal: rkunjwal@perimeterinstitute.ca
Accepted in Quantum 2019-09-01, click title to verify. Published under CC-BY 4.0.
arXiv:1709.01098v4 [quant-ph] 3 Sep 2019

Contents
1 Introduction
2 Spekkens framework
  2.1 Operational theory
  2.2 Ontological model
  2.3 Representation of coarse-graining
    2.3.1 Coarse-graining of measurements
    2.3.2 Coarse-graining of preparations
  2.4 Joint measurability (or compatibility)
  2.5 Noncontextuality
  2.6 An example of Spekkens contextuality: the fair coin flip inequality
  2.7 Connection to Bell scenarios
3 Hypergraph approach to Kochen-Specker scenarios in the Spekkens framework
  3.1 Measurements
    3.1.1 Classification of probabilistic models
    3.1.2 Distinguishing two consequences of Specker’s principle: Structural Specker’s principle vs. Statistical Specker’s principle
    3.1.3 What does it mean for an operational theory to satisfy structural/statistical Specker’s principle?
    3.1.4 Remark on the classification of probabilistic models: why we haven’t defined “quantum models” as those obtained from projective measurements
    3.1.5 Scope of this framework
  3.2 Sources
4 A key hypergraph invariant: the weighted max-predictability
5 Noise-robust noncontextuality inequalities
  5.1 Key notions from CSW
  5.2 Key notion not from CSW: source-measurement correlation, Corr
  5.3 Obtaining the noise-robust noncontextuality inequalities
    5.3.1 Expressing operational quantities in ontological terms
    5.3.2 Derivation of the noncontextual tradeoff for any graph G
    5.3.3 When is the noncontextual tradeoff violated?
  5.4 Example: KCBS scenario
6 Discussion
  6.1 Measurement-measurement correlations vs. source-measurement correlations
  6.2 Can our noise-robust noncontextuality inequalities be saturated by a noncontextual ontological model?
    6.2.1 The special case of facet-defining Bell-KS inequalities: Corr = 1
    6.2.2 The general case: Corr < 1
  6.3 Can trivial POVMs ever violate these noncontextuality inequalities?
    6.3.1 The case $p \in \mathcal{C}(\Gamma_G)$
    6.3.2 The case $p \in \mathrm{ConvHull}(\mathcal{G}(\Gamma_G)|_{\mathrm{ind}})$
    6.3.3 The general case $p \in \mathcal{G}(\Gamma_G)$
7 Conclusions
Acknowledgments
A Status of KS-contextuality as an experimentally testable notion of nonclassicality for POVMs in quantum theory
  A.1 Limitations of KS-contextuality vis-à-vis POVMs
    A.1.1 KS-contextuality for POVMs in the literature
    A.1.2 Classifying probabilistic models: restriction of quantum models to PVMs
  A.2 Robustness of Bell nonlocality vis-à-vis POVMs
B Ontological models without respecting coarse-graining relations
  B.1 How to construct a “KS-noncontextual” ontological model of the KCBS experiment [47] without coarse-graining relations
  B.2 How to construct a “preparation and measurement noncontextual” ontological model without coarse-graining relations
C Trivial POVMs
  C.1 Bell-CHSH scenario
  C.2 CHSH-type contextuality scenario: 4-cycle
D The KS-uncolourable hypergraph $\Gamma_{18}$
References
1 Introduction
To say that quantum theory is counterintuitive, or
that it requires a revision of our classical intuitions,
requires us to be mathematically precise in our def-
inition of these classical intuitions. Once we have a
precise formulation of such classicality, we can begin
to investigate those features of quantum theory that
power its nonclassicality, i.e., its departure from our
classical intuitions, and thus prove theorems about
such nonclassicality. To the extent that a physical
theory is provisional, likely to be replaced by a better
theory in the future, it also makes sense to articu-
late such notions of classicality in as operational a
manner as possible. By ‘operational’, we refer to a
formulation of the theory that takes the operations –
preparations, measurements, transformations – that
can be carried out in an experiment as primitives and
which specifies the manner in which these operations
combine to produce the data in the experiment. Such
an operational formulation often suggests generaliza-
tions of the theory that can then be used to better
understand its axiomatics [1–3]. At the same time,
an operational formulation also lets us articulate our
notions of nonclassicality in a manner that is experi-
mentally testable and thus allows us to leverage this
nonclassicality in applications of the theory. Indeed,
a key area of research in quantum foundations and
quantum information is the development of methods
to assess nonclassicality in an experiment under min-
imal assumptions on the operational theory describ-
ing it. The paradigmatic example of this is the case of
Bell’s theorem and Bell experiments [4–11], where any
operational theory that is non-signalling between the
different spacelike separated wings of the experiment
is allowed. The notion of classicality at play in Bell’s
theorem is the assumption of local causality: any non-
signalling theory that violates the assumption of local
causality is said to exhibit nonclassicality by the lights
of Bell’s theorem.
More recently, much work [12–17] has been devoted
to obtaining constraints on operational statistics that
follow from a generalized notion of noncontextuality
proposed by Spekkens [18]. This notion of classicality
[18] has its roots in the Kochen-Specker (KS) theo-
rem [19], a no-go theorem that rules out the possibility
that a deterministic underlying ontological model [20]
could reproduce the operational statistics of (projec-
tive) quantum measurements in a manner that sat-
isfies the assumption of KS-noncontextuality. KS-
noncontextuality is the notion of classicality at play
in the Kochen-Specker theorem. The Spekkens framework
abandons the assumption of outcome determinism [18] – the idea
that the ontic state of a system fixes the outcome of any
measurement deterministically – that is intrinsic to
KS-noncontextuality. It also applies to general operational
theories and extends the notion of noncontextuality to general
experimental procedures – preparations, transformations, and
measurements – rather than measurements alone.
Parallel to work along the lines of Spekkens [18],
work seeking to directly operationalize the Kochen-
Specker theorem (rather than revising the notion
of noncontextuality at play) culminated in two re-
cent approaches that classify theories by the de-
gree to which they violate the assumption of KS-
noncontextuality: the graph-theoretic framework of
Cabello, Severini, and Winter (CSW) [21, 22], where a
general approach to obtaining graph-theoretic bounds
on linear Bell-KS functionals was proposed, and the
related hypergraph framework of Acín, Fritz, Leverrier,
and Sainz (AFLS) [23], where an approach
to characterizing sets of correlations was proposed.
The CSW framework relates well-known graph in-
variants to: 1) upper bounds on Bell-KS inequali-
ties that follow from KS-noncontextuality, 2) upper
bounds on maximum quantum violations of these in-
equalities that can be obtained from projective mea-
surements, and 3) upper bounds on their violation
in general probabilistic theories [24] – denoted E1 –
which satisfy the “exclusivity principle” [22]. Complementary
to this, the AFLS framework uses graph
invariants in the service of deciding whether a given
assignment of probabilities to measurement outcomes
in a KS-contextuality experiment belongs to a partic-
ular set of correlations; they showed that membership
in the quantum set of correlations (defined only for
projective measurements in quantum theory) cannot
be witnessed by a graph invariant, cf. Theorem 5.1.3
of Ref. [23]. Another recent approach due to Abram-
sky and Brandenburger [25] employs sheaf-theoretic
ideas to formulate KS-contextuality.
A key achievement of the frameworks of Refs. [22,
23, 25] is a formal unification of Bell scenarios
with KS-contextuality scenarios, treating them on
the same footing. Indeed, the perspective there is
to consider Bell scenarios as a special case of KS-
contextuality scenarios. What is lost in this math-
ematical unification, however, is the fact that Bell-
locality and KS-noncontextuality have physically dis-
tinct, if related, motivations. The physical situation
that Bell’s theorem refers to requires (at least) two
spacelike separated labs (where local measurements
are carried out) so that the assumption of local causality
(or Bell-locality) can be applied. [Footnote 1] On the other
hand, the physical situation that the Kochen-Specker
theorem refers to does not require spacelike separa-
tion as a necessary ingredient and one can there-
fore consider experiments in a single lab. However,
the assumption of KS-noncontextuality entails outcome
determinism [18], something not required by
local causality in Bell scenarios. [Footnote 2] This difference
in the physical situation for the two kinds of ex-
periments is one of the reasons for generalizing KS-
noncontextuality to the notion of noncontextuality in
the Spekkens framework [18] (so that outcome deter-
minism is not assumed) while leaving Bell’s notion of
local causality untouched.
Footnote 1: What do we mean by whether an assumption “can be applied”? Of course, mathematically, one can “apply” any assumption one wants in the service of proving a theorem. But insofar as the mathematics here is trying to model a real experiment, the consistency of those assumptions with some essential facts of the experiment is the minimal requirement for any no-go theorem derived from such assumptions to be physically interesting. Hence, in the presence of signalling (implying the absence of spacelike separation), it makes no sense to assume local causality in a Bell experiment and derive the resulting Bell inequalities: such an assumption on the ontological model is already in conflict with the fact of signalling across the labs and no Bell inequalities are needed to witness this fact. Bell inequalities only become physically interesting when the theories being compared relative to them are all non-signalling: if the experiment itself is signalling, any non-signalling description – locally causal, quantum, or in a general probabilistic theory (GPT) – is ipso facto ruled out.

Footnote 2: Note that this assumption of outcome determinism doesn’t affect the conclusions in a Bell scenario even if one adopted it because of Fine’s theorem [26]: a locally deterministic ontological model entails the same set of (Bell-local) correlations as a locally causal ontological model. Relaxing outcome determinism, however, doesn’t mean the same thing for the kinds of experiments envisaged by the Kochen-Specker theorem – in particular, it doesn’t mean that models satisfying factorizability à la Ref. [25] are the most general outcome-indeterministic models – and thus considerations parallel to Fine’s theorem [26] do not apply, cf. [27, 28].

In the present paper we build a bridge from the CSW approach, where KS-noncontextual correlations are bounded by Bell-KS inequalities, to noise-robust noncontextuality inequalities in the Spekkens framework [18]. That is, we show how the constraints from KS-noncontextuality in the framework of Ref. [22] translate to constraints from generalized noncontextuality in the framework of Ref. [18]. The resulting operational criteria for contextuality à la Spekkens are noise-robust and therefore applicable to arbitrary
positive operator-valued measures (POVMs)
and mixed states in quantum theory. Note that the
insights gleaned from frameworks such as those of
Refs. [22, 23, 25] regarding Bell nonlocality require
no revision in our approach. It is only in the appli-
cation of such frameworks (in particular, the CSW
framework) to the question of contextuality that we
seek to propose an alternative hypergraph framework
(formalizing Spekkens contextuality [18]) that is more
operationally motivated for experimental situations
where one cannot appeal to spacelike separation to
justify locality of the measurements. [Footnote 3] For Kochen-
Specker type experimental scenarios, we will con-
sider the twin notions of preparation noncontextu-
ality and measurement noncontextuality taken to-
gether as a notion of classicality to obtain noise-
robust noncontextuality inequalities that generalize
the KS-noncontextuality inequalities of CSW. These
inequalities witness nonclassicality even when quan-
tum correlations arising from arbitrary (i.e., possibly
nonprojective) quantum measurements on any quan-
tum state are allowed. A key innovation of this ap-
proach is that it treats all measurements in an oper-
ational theory on an equal footing. No definition of
“sharpness” [29–31] is needed to justify or derive noncontextuality
inequalities in this approach. Furthermore,
if certain idealizations are presumed about the
operational statistics, then these inequalities formally
recover the usual Bell-KS inequalities à la CSW. The
Bell-KS inequalities can be viewed as an instance of
the classical marginal problem [25–27, 32, 33], i.e.,
as constraints on the (marginal) probability distributions
over subsets of a set of observables that follow
from requiring the existence of a global joint probability
distribution over the set of all observables.
der certain idealizations, but not otherwise, the noise-
robust noncontextuality inequalities we obtain can-
not in general be viewed as arising from a classical
marginal problem. Hence, they cannot be understood
within existing frameworks that rely on this (reduc-
tion to the classical marginal problem) property to
formally unify the treatment of Bell-nonlocality and
KS-contextuality [22, 23, 25]. This is a crucial dis-
tinction relative to the usual Bell-KS inequality type
witnesses of KS-contextuality.
Footnote 3: Nor the sharpness of the measurements to justify outcome determinism. We discuss these issues in detail – in particular, the physical basis of KS-noncontextuality vis-à-vis Bell-locality and how that influences our framework – in Appendix A for the interested reader.

This paper is based on a previous contribution [16] that laid the conceptual groundwork for the progress we make here. Besides the noise-robust noncontextuality inequalities that generalize constraints from KS-noncontextuality in the CSW framework leveraging the graph invariants of CSW [22] (cf. Section 5), the contributions of this paper also include:
• An exposition of Specker’s principle and how different implications of it (e.g., consistent exclusivity [23]) for a given operational theory arise in the hypergraph framework (cf. Sections 3.1.2 and 3.1.3), in particular the results in Theorems 1, 2, and Corollary 1.
• Introduction of a hypergraph invariant – the weighted max-predictability – that is key to our noise-robust noncontextuality inequalities, cf. Section 4. This invariant is also key to the hypergraph framework of Ref. [34] which is complementary to the present framework.
• A detailed discussion of how KS-noncontextuality for POVMs has been previously treated in the literature and the limitations of those treatments, cf. Appendices A and C. Also, unlike for the case of KS-noncontextuality inequalities, we show that trivial POVMs can never violate our noise-robust noncontextuality inequalities, cf. Section 6.3.
• A discussion of coarse-graining relations in Section 2.3 and their importance for contextuality no-go theorems, in particular a discussion of ontological models that do not respect coarse-graining relations in Appendix B. We show, in Appendix B, how relaxing the constraint from coarse-graining relations on an ontological model renders either notion of noncontextuality – whether Kochen-Specker [19] or Spekkens [18] – vacuous.
• A discussion, by example, of why our generalization of the CSW framework cannot accommodate contextuality scenarios that are KS-uncolourable in Appendix D and why one needs a distinct framework, i.e., the framework of Ref. [34], to treat KS-uncolourable scenarios.
The structure of this paper is as follows: Section 2 reviews
the Spekkens framework for generalized noncontextu-
ality [18]. Section 3 introduces a hypergraph frame-
work that shares features of traditional frameworks
for KS-contextuality [22, 23] but is also augmented
(relative to these traditional frameworks) with the in-
gredients necessary for obtaining noise-robust noncon-
textuality inequalities. In particular, its subsections
3.1.2 and 3.1.3 discuss Specker’s principle [35] and
define its different implications for contextuality scenarios
à la Ref. [23]. Section 4 defines a new hypergraph
invariant – the weighted max-predictability –
that we need later on as a crucial new ingredient in
our inequalities. Section 5 obtains noise-robust non-
contextuality inequalities within the framework de-
fined in Section 3 and using the hypergraph invariant
of Section 4 in addition to two graph invariants from
the CSW framework [22]. These inequalities can be
seen as special cases of the general approach outlined
in Ref. [16]. In Section 6, we include discussions on
various features of our noise-robust noncontextuality
inequalities, in particular the fact that trivial POVMs
can never violate them. Section 7 concludes with some
open questions and directions for future research.
2 Spekkens framework
We concern ourselves with prepare-and-measure ex-
periments. A schematic of such an experiment is
shown in Figure 1 where, for the sake of simplicity, we
imagine a single source device that can perform any
preparation procedure of interest (rather than a col-
lection of source devices, each implementing a partic-
ular preparation procedure) and a single measurement
device that can perform any measurement procedure
of interest (rather than a collection of measurement
devices, each implementing a particular measurement
procedure). Note that this is just a conceptual abstraction: in particular, the various possible measurement settings on the measurement device may, for example, correspond to incompatible measurement procedures in quantum theory. The fact that we represent the different measurement settings by choices of knob settings $M \in \mathbb{M}$ on a single measurement device does not mean that it’s physically possible to implement all the measurement procedures represented by $\mathbb{M}$ jointly; it only means that the experimenter can choose to implement any of the measurements in the set $\mathbb{M}$ in a particular prepare-and-measure experiment. The same is true for our abstraction of preparation procedures to knob settings ($S \in \mathbb{S}$) and outcomes ($s \in V_S$) of a single source device: it’s not that the same device can physically implement all possible preparation procedures; it’s just that an experimenter can choose to implement any procedure in the set $\mathbb{S}$ in a particular prepare-and-measure experiment.
We will consider two levels of description of
prepare-and-measure experiments represented by
Fig. 1: operational and ontological. The operational
description will be specified by an operational theory
that takes source and measurement devices as primi-
tives and describes the experiment solely in terms of
the probabilities associated to their input/output be-
haviour. The ontological description will be specified
by an ontological model that takes the system that
passes between the source and measurement devices
as primitive and describes the experiment in terms of
probabilities associated to properties of this system,
deriving the operational description as a consequence
of coarse-graining over these properties. Let us look
at each description in turn.
2.1 Operational theory
We now describe the various components of Fig. 1 in
more detail. The source device has a source setting labelled by $S$ that can be chosen from a set $\mathbb{S}$. The set $\mathbb{S}$ represents, in general, some subset of the set of all source settings, $\mathcal{S}$, that are admissible in the operational theory, i.e., $\mathbb{S} \subseteq \mathcal{S}$. In a particular prepare-and-measure experiment, $\mathbb{S}$ will typically be a finite set of source settings. Choosing the setting $S$ prepares a system according to an ensemble of preparation procedures, denoted $\{(p(s|S), P_{[s|S]})\}_{s \in V_S}$, where $\{p(s|S)\}_{s \in V_S}$ is a probability distribution over the preparation procedures $\{P_{[s|S]}\}_{s \in V_S}$ in the ensemble. This means that the source device has one classical input $S$ and two outputs: one output is a classical label $s \in V_S$ identifying the preparation procedure (in the ensemble $\{(p(s|S), P_{[s|S]})\}_{s \in V_S}$) that is carried out when source outcome $s$ is observed for source setting $S$ (this source event is denoted $[s|S]$), and the other output is a system prepared according to the source event $[s|S]$, i.e., preparation procedure $P_{[s|S]}$, with probability $p(s|S)$. Thus, the assemblage of possible ensembles that the source device can prepare can be denoted by $\{\{(p(s|S), P_{[s|S]})\}_{s \in V_S}\}_{S \in \mathbb{S}}$.

Figure 1: A prepare-and-measure experiment. [Diagram with a Source device and a Measurement device.]

On the other hand, the measurement device has two inputs, one a classical input $M \in \mathbb{M}$ specifying the choice of measurement setting to be implemented, and the other input receives the system prepared according to preparation procedure $P_{[s|S]}$ and on which this measurement $M$ is carried out. The measurement device has one classical output $m \in V_M$ denoting the outcome of the measurement $M$ implemented on a system prepared according to $P_{[s|S]}$, and which occurs with probability $p(m|M, S, s)$. The set $\mathbb{M}$ represents, in general, some subset of the set of all measurement settings, $\mathcal{M}$, that are admissible in the operational theory, i.e., $\mathbb{M} \subseteq \mathcal{M}$. In a particular prepare-and-measure experiment, $\mathbb{M}$ will typically be a finite set of measurement settings.
We will be interested in the operational joint probability $p(m, s|M, S) \equiv p(m|M, S, s)\,p(s|S)$ for this prepare-and-measure experiment for various choices of $M \in \mathbb{M}$, $S \in \mathbb{S}$. Note how this operational description takes as primitive the operations carried out in the lab and restricts itself to specifying the probabilities of classical outcomes (i.e., $m$, $s$) given some interventions (i.e., classical inputs, $M$, $S$). So far, we haven’t assumed any structure on the operational theory describing the schematic of Fig. 1 beyond the fact that it is a catalogue of input/output probabilities
$$\{\{p(m, s|M, S) \in [0, 1]\}_{m \in V_M,\, s \in V_S}\}_{M \in \mathbb{M},\, S \in \mathbb{S}}$$
for various interventions $S \in \mathbb{S}$ and $M \in \mathbb{M}$ that we will consider in a prepare-and-measure experiment. We now require more structure in the operational theory underlying this experiment, beyond a mere specification of these probabilities.
We require that the operational theory admits
equivalence relations that partition experimental pro-
cedures of any type, whether preparations or measure-
ments, into equivalence classes of that type. These
equivalence relations are defined relative to the op-
erational probabilities (not necessarily restricted to a
particular prepare-and-measure experiment) that are
admissible in the theory. We will call these equiv-
alence relations “operational equivalences”, in keep-
ing with standard terminology [18]. This means that
any distinctions of labels between procedures in an
equivalence class of procedures do not affect the oper-
ational probabilities associated with the procedures.
We specify these equivalence relations for measure-
ment and preparation procedures below.
Two measurement events $[m|M]$ and $[m'|M']$ are said to be operationally equivalent, denoted $[m|M] \simeq [m'|M']$, if there exists no source event in the operational theory that can distinguish them, i.e.,
$$p(m, s|M, S) = p(m', s|M', S) \quad \forall [s|S],\ s \in V_S,\ S \in \mathcal{S}. \tag{1}$$
Note that the statistical indistinguishability of $[m|M]$ and $[m'|M']$ must hold for all possible source settings $\mathcal{S}$ in the operational theory, not merely the source settings $\mathbb{S}$ that are of direct interest in a particular prepare-and-measure experiment. Hence, the “distinction of labels”, $[m|M]$ or $[m'|M']$, is empirically inconsequential since the two procedures are, in principle, indistinguishable by the lights of the operational theory.
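The condition of Eq. (1) can be illustrated numerically. The following Python sketch assumes a quantum model in which $p(m, s|M, S) = p(s|S)\,\mathrm{Tr}(E\rho)$ for an effect $E$ and a state $\rho$; the function names, the three effects, and the six-state source set are hypothetical choices made for this illustration and do not come from the paper. A finite source set can stand in for "all source settings in the theory" only when it is tomographically complete, which the six qubit states used here are.

```python
import numpy as np

def joint_prob(effect, state, p_s):
    """p(m, s | M, S) = p(s|S) * Tr(E rho), assuming a quantum model."""
    return p_s * float(np.real(np.trace(effect @ state)))

def operationally_equivalent(effect_a, effect_b, source_events, tol=1e-9):
    """Check the condition of Eq. (1) on every supplied source event."""
    return all(
        abs(joint_prob(effect_a, rho, p) - joint_prob(effect_b, rho, p)) <= tol
        for rho, p in source_events
    )

def projector(vec):
    """Rank-1 projector onto the normalized vector vec."""
    v = np.asarray(vec, dtype=complex)
    v = v / np.linalg.norm(v)
    return np.outer(v, v.conj())

# Six qubit states (eigenstates of Z, X, Y), each with p(s|S) = 1/2.
vectors = [(1, 0), (0, 1), (1, 1), (1, -1), (1, 1j), (1, -1j)]
source_events = [(projector(v), 0.5) for v in vectors]

P0, P1 = projector((1, 0)), projector((0, 1))
effect_coin = np.eye(2) / 2          # outcome of a fair coin flip
effect_mixed = 0.5 * P0 + 0.5 * P1   # same operator, reached by post-processing a Z measurement
effect_sharp = P0                    # a projective outcome

print(operationally_equivalent(effect_coin, effect_mixed, source_events))  # True
print(operationally_equivalent(effect_coin, effect_sharp, source_events))  # False
```

The first pair differs only in its label (which procedure realized the effect), so it is an instance of what the text calls a distinction of labels; the second pair is distinguished already by the first source event.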
Similarly, two source events $[s|S]$ and $[s'|S']$ are said to be operationally equivalent, denoted $[s|S] \simeq [s'|S']$, if there exists no measurement event in the operational theory that can distinguish them, i.e.,
$$p(m, s|M, S) = p(m, s'|M, S') \quad \forall [m|M],\ m \in V_M,\ M \in \mathcal{M}. \tag{2}$$
Again, the statistical indistinguishability of $[s|S]$ and $[s'|S']$ must hold for all possible measurement settings $\mathcal{M}$, not merely those (i.e., $\mathbb{M}$) that are of direct interest in a particular prepare-and-measure experiment. Similar to measurement events, the “distinction of labels”, $[s|S]$ or $[s'|S']$, is empirically inconsequential since the two procedures are, in principle, indistinguishable by the lights of the operational theory.
Given this equivalence structure for preparation
and measurement procedures in the operational the-
ory, we can now formalize the notion of a context:
Definition 1. A context is any distinction of labels
between operationally equivalent procedures in the op-
erational theory.
To see concrete examples of the kinds of contexts
that will be of interest to us in this paper, con-
sider quantum theory. Any mixed quantum state ad-
mits multiple convex decompositions in terms of other
quantum states, i.e., it can be prepared by coarse-
graining over distinct ensembles of quantum states,
each ensemble denoted by a different label. In this
case, the “distinction of labels” between different de-
compositions denotes a distinction of preparation en-
sembles, which instantiates our notion of a prepara-
tion context. Similarly, a given positive operator can
be implemented by different positive operator-valued
measures (POVMs), and the distinction of labels de-
noting these different POVMs instantiates our notion
of a measurement context.
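A minimal numerical sketch of a preparation context, assuming a single qubit: two different ensembles that coarse-grain to the same mixed state. The helper name `ensemble_average` and the two ensembles are illustrative assumptions, not notation from the paper.

```python
import numpy as np

def ensemble_average(ensemble):
    """Coarse-grain an ensemble [(probability, state_vector), ...] into a density matrix."""
    rho = np.zeros((2, 2), dtype=complex)
    for prob, vec in ensemble:
        v = np.asarray(vec, dtype=complex)
        v = v / np.linalg.norm(v)          # normalize the state vector
        rho += prob * np.outer(v, v.conj())
    return rho

# Two distinct preparation ensembles (eigenstates of Z and of X, equal weights).
ensemble_z = [(0.5, (1, 0)), (0.5, (0, 1))]
ensemble_x = [(0.5, (1, 1)), (0.5, (1, -1))]

rho_z = ensemble_average(ensemble_z)
rho_x = ensemble_average(ensemble_x)

# Both ensembles yield the maximally mixed state, so no measurement can tell them apart.
same_state = np.allclose(rho_z, rho_x)
print(same_state)  # True
```

The labels `ensemble_z` and `ensemble_x` play the role of the "distinction of labels" in Definition 1: they name different procedures that are operationally equivalent.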
2.2 Ontological model
Given the operational description of the experiment
in terms of probabilities p(m, s|M, S), we want to
explore the properties of any underlying ontological
model for this operational description. Any such on-
tological model, defined within the ontological mod-
els framework [20], takes as primitive the physical
system (rather than operations on it) that passes
between the source and measurement devices, i.e.,
its basic objects are ontic states of the system, denoted $\lambda \in \Lambda$, that represent intrinsic properties of the physical system. When a preparation procedure $[s|S]$ is carried out, the source device samples from the space of ontic states $\Lambda$ according to a probability distribution $\{\mu(\lambda|S, s) \in [0, 1]\}_{\lambda \in \Lambda}$, where $\sum_{\lambda \in \Lambda} \mu(\lambda|S, s) = 1$, and the joint distribution over $s$ and $\lambda$ given $S$, i.e., $\{\mu(\lambda, s|S)\}_{\lambda \in \Lambda}$, is given by $\mu(\lambda, s|S) \equiv \mu(\lambda|S, s)\,p(s|S)$. On the other hand, when a system in ontic state $\lambda$ is input to the measurement device with measurement setting $M \in \mathbb{M}$, the probability distribution over the measurement outcomes is given by $\{\xi(m|M, \lambda) \in [0, 1]\}_{m \in V_M}$, where $\sum_{m \in V_M} \xi(m|M, \lambda) = 1$. The operational statistics
$$\{\{p(m, s|M, S) \in [0, 1]\}_{m \in V_M,\, s \in V_S}\}_{M \in \mathbb{M},\, S \in \mathbb{S}}$$
result from a coarse-graining over $\lambda$, i.e.,
$$p(m, s|M, S) = \sum_{\lambda \in \Lambda} \xi(m|M, \lambda)\,\mu(\lambda, s|S), \tag{3}$$
for all $m \in V_M$, $s \in V_S$, $M \in \mathbb{M}$, $S \in \mathbb{S}$.
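For a finite set of ontic states, Eq. (3) is a matrix product. The Python sketch below assumes a toy model with two ontic states, one source setting, and one measurement setting; the array layout, the function name `operational_statistics`, and the numerical values are assumptions made for illustration only.

```python
import numpy as np

def operational_statistics(xi, mu_joint):
    """Eq. (3): p(m, s | M, S) = sum_lambda xi(m|M, lambda) * mu(lambda, s|S).

    xi[m, lam]       : response function for one fixed measurement setting M
    mu_joint[s, lam] : joint distribution mu(lambda, s|S) for one fixed source setting S
    Returns the array p[m, s].
    """
    xi = np.asarray(xi, dtype=float)
    mu_joint = np.asarray(mu_joint, dtype=float)
    # Each ontic state must yield a normalized distribution over outcomes m.
    assert np.allclose(xi.sum(axis=0), 1.0)
    # The joint distribution over (s, lambda) must be normalized.
    assert np.isclose(mu_joint.sum(), 1.0)
    return xi @ mu_joint.T

# mu(lambda|S, s): source outcome s prepares ontic state lambda = s (rows s, columns lambda).
mu_cond = np.array([[1.0, 0.0],
                    [0.0, 1.0]])
p_s = np.array([0.5, 0.5])            # p(s|S)
mu_joint = mu_cond * p_s[:, None]     # mu(lambda, s|S) = mu(lambda|S, s) p(s|S)

# xi(m|M, lambda): a noisy readout of lambda (rows m, columns lambda).
xi = np.array([[0.9, 0.1],
               [0.1, 0.9]])

p = operational_statistics(xi, mu_joint)
print(p)
```

Here `p[m, s]` comes out as 0.45 on the diagonal and 0.05 off the diagonal, i.e., the source and measurement outcomes agree with probability 0.9.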
Note that the definition of an ontological model above extends to the definition of an ontological model of the operational theory (as opposed to a particular fragment of the theory representing the experiment) when we take $\mathbb{M} = \mathcal{M}$ and $\mathbb{S} = \mathcal{S}$.
2.3 Representation of coarse-graining
We will now specify how coarse-graining of procedures in a prepare-and-measure experiment is represented in its description, whether operational or ontological. Namely, if a procedure is defined as a coarse-graining of other procedures, then we require that the representation of such a procedure is defined by the same coarse-graining of the representation of the other procedures. [Footnote 4] Implicit in this discussion is the assumption that the operational theory allows one to define new procedures in the set $\mathcal{M}$ or $\mathcal{S}$ by coarse-graining other procedures in these sets, i.e., both $\mathcal{M}$ and $\mathcal{S}$ are closed under coarse-grainings. In particular, one can consider coarse-graining measurement and source settings (belonging to sets $\mathbb{M}$ and $\mathbb{S}$, respectively) actually implemented in the lab to define new measurement and source settings that belong to $\mathcal{M} \setminus \mathbb{M}$ and $\mathcal{S} \setminus \mathbb{S}$, respectively. [Footnote 5]
2.3.1 Coarse-graining of measurements
Let us see how this works for the case of measurement procedures: if a measurement procedure $M$ with measurement events $\{[m|M]\}_{m\in V_M}$ is defined as a coarse-graining of another measurement procedure $\tilde{M}$ with measurement events $\{[\tilde{m}|\tilde{M}]\}_{\tilde{m}\in V_{\tilde{M}}}$, symbolically denoted by
$$[m|M]\equiv\sum_{\tilde{m}}p(m|\tilde{m})[\tilde{m}|\tilde{M}],\quad\text{where }\forall m,\tilde{m}:\ p(m|\tilde{m})\in\{0,1\},\ \sum_{m}p(m|\tilde{m})=1,\tag{4}$$
$^{4}$ Quantum theory is an example of an operational theory
that satisfies this requirement because of the linearity of the
Born rule with respect to both preparations and measurements.
The same is true, more generally, of general probabilistic theo-
ries (GPTs) [1, 24]. We require this feature in any ontological
model as well, regardless of its (non)contextuality.
$^{5}$ Similarly, we also allow probabilistic mixtures of (preparation or measurement) procedures in the operational theory
to define new procedures, i.e., the theory is convex. See the
last paragraph of Section 2.5 for the role of this convexity in
experimental tests of contextuality and Section 2.6 for an ex-
ample where a probabilistic mixture of measurement settings
is required in a proof of contextuality.
then its representation in the operational description as well as in the ontological description satisfies this coarse-graining relation.$^{6}$ More explicitly, the coarse-graining relation of Eq. (4) denotes the following post-processing of $\tilde{M}$: for each $m\in V_M$, relabel each outcome $\tilde{m}\in V_{\tilde{M}}$ to outcome $m$ with probability $p(m|\tilde{m})\in\{0,1\}$; the logical disjunction of those $\tilde{m}$ which are relabelled to $m$ with probability 1 then defines the measurement event $[m|M]$. Now, in the operational theory, this post-processing is represented by
$$\forall[s|S],\text{ where }s\in V_S,\ S\in\mathbb{S}:\quad p(m,s|M,S)\equiv\sum_{\tilde{m}}p(m|\tilde{m})\,p(\tilde{m},s|\tilde{M},S),\tag{5}$$
and in the ontological model it is represented by
$$\forall\lambda\in\Lambda:\quad\xi(m|M,\lambda)\equiv\sum_{\tilde{m}}p(m|\tilde{m})\,\xi(\tilde{m}|\tilde{M},\lambda).\tag{6}$$
As an example, consider a three-outcome measurement $\tilde{M}$ with outcomes $\tilde{m}\in\{1,2,3\}$, which can be classically post-processed to obtain a two-outcome measurement $M$ with outcomes $m\in\{0,1\}$, such that $p(m=0|\tilde{m}=1)=p(m=0|\tilde{m}=2)=1$ and $p(m=1|\tilde{m}=3)=1$. The measurement events of $M$ are then just
$$[m=0|M]\equiv[\tilde{m}=1|\tilde{M}]+[\tilde{m}=2|\tilde{M}],\tag{7}$$
$$[m=1|M]\equiv[\tilde{m}=3|\tilde{M}],\tag{8}$$
where the $+$ sign denotes (just as the summation sign in the definition of $[m|M]$ in Eq. (4) did) logical disjunction, i.e., measurement event $[m=0|M]$ is said to occur when $[\tilde{m}=1|\tilde{M}]$ or $[\tilde{m}=2|\tilde{M}]$ occurs. The operational and ontological representations of these measurement events are then given by
$$\forall[s|S],\text{ where }s\in V_S,\ S\in\mathbb{S}:$$
$$p(m=0,s|M,S)\equiv\sum_{\tilde{m}=1}^{2}p(\tilde{m},s|\tilde{M},S),\tag{9}$$
$$p(m=1,s|M,S)\equiv p(\tilde{m}=3,s|\tilde{M},S),\tag{10}$$
$$\forall\lambda\in\Lambda:$$
$$\xi(m=0|M,\lambda)\equiv\sum_{\tilde{m}=1}^{2}\xi(\tilde{m}|\tilde{M},\lambda),\tag{11}$$
$$\xi(m=1|M,\lambda)\equiv\xi(\tilde{m}=3|\tilde{M},\lambda).\tag{12}$$
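The relabelling in Eqs. (7)-(12) can be sketched in a few lines of Python. The probabilities assigned to the parent measurement below are hypothetical; only the relabelling map is taken from the example in the text. The same function is applied to a response function and to operational statistics, which is the sense in which both descriptions respect the same coarse-graining.

```python
from fractions import Fraction as F

# Deterministic relabelling p(m | m_tilde) from the three-outcome example:
# outcomes 1 and 2 of M_tilde map to m = 0, outcome 3 maps to m = 1.
relabel = {1: 0, 2: 0, 3: 1}

def coarse_grain(parent, relabel):
    """Sum parent probabilities over outcomes relabelled to the same m."""
    child = {}
    for m_tilde, prob in parent.items():
        m = relabel[m_tilde]
        child[m] = child.get(m, 0) + prob
    return child

# Hypothetical response function xi(m_tilde | M_tilde, lam) at a fixed lam.
xi_parent = {1: F(1, 2), 2: F(1, 3), 3: F(1, 6)}
xi_child = coarse_grain(xi_parent, relabel)   # Eqs. (11) and (12)

# Hypothetical joint statistics p(m_tilde, s | M_tilde, S) at a fixed [s|S].
p_parent = {1: F(1, 4), 2: F(1, 8), 3: F(1, 8)}
p_child = coarse_grain(p_parent, relabel)     # Eqs. (9) and (10)
```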
This requirement on the representation of coarse-
graining of measurements is particularly important
(and often implicit) when the notion of a mea-
surement context is instantiated by compatibility
$^{6}$ Note that Eq. (4) is not an operational equivalence between
independent procedures. It is a definition of a new procedure
obtained by coarse-graining another procedure.
(or joint measurability), as in the case of KS-contextuality, where one needs to consider coarse-grainings of distinct measurements. For example, consider a measurement setting $M_{12}$ with outcomes $(m_1,m_2)\in V_1\times V_2$ that is coarse-grained over $m_2$ to define an effective measurement setting $M_1^{(2)}$ with measurement events $\{[m_1|M_1^{(2)}]\}_{m_1\in V_1}$. Symbolically, $[m_1|M_1^{(2)}]\equiv\sum_{m_2}[(m_1,m_2)|M_{12}]$, which is represented in the operational theory as $\forall[s|S]:\ p(m_1,s|M_1^{(2)},S)\equiv\sum_{m_2}p((m_1,m_2),s|M_{12},S)$ and in the ontological model as $\forall\lambda:\ \xi(m_1|M_1^{(2)},\lambda)\equiv\sum_{m_2}\xi((m_1,m_2)|M_{12},\lambda)$. Similarly, consider another measurement setting $M_{13}$ with outcomes $(m_1,m_3)\in V_1\times V_3$ that is coarse-grained over $m_3$ to define an effective measurement setting $M_1^{(3)}$ with measurement events $\{[m_1|M_1^{(3)}]\}_{m_1\in V_1}$. Symbolically, $[m_1|M_1^{(3)}]\equiv\sum_{m_3}[(m_1,m_3)|M_{13}]$, which is represented in the operational theory as $\forall[s|S]:\ p(m_1,s|M_1^{(3)},S)\equiv\sum_{m_3}p((m_1,m_3),s|M_{13},S)$ and in the ontological model as $\forall\lambda:\ \xi(m_1|M_1^{(3)},\lambda)\equiv\sum_{m_3}\xi((m_1,m_3)|M_{13},\lambda)$.
Now, imagine that the following operational equivalence holds at the operational level: $[m_1|M_1^{(2)}]\simeq[m_1|M_1^{(3)}]$. KS-noncontextuality is then the assumption that $\sum_{m_2}\xi((m_1,m_2)|M_{12},\lambda)=\sum_{m_3}\xi((m_1,m_3)|M_{13},\lambda)$ (i.e., $\xi(m_1|M_1^{(2)},\lambda)=\xi(m_1|M_1^{(3)},\lambda)$) for all $\lambda$ and that $\xi((m_1,m_2)|M_{12},\lambda),\ \xi((m_1,m_3)|M_{13},\lambda)\in\{0,1\}$ for all $\lambda$. This assumption applied to multiple (compatible) subsets of a set of carefully chosen measurements can then provide a proof of the KS theorem, i.e., there exist sets of measurements in quantum theory such that their operational statistics cannot be emulated by a KS-noncontextual ontological model.
The key point here is this: the requirement that
coarse-graining relations between measurements be
respected by their representations in the ontological
model is independent of the KS-(non)contextuality of
the ontological model.$^{7}$
However, this requirement is
necessary for the assumption of KS-noncontextuality
to produce a contradiction with quantum theory; on
the other hand, a KS-contextual ontological model
(while respecting the coarse-graining relations) can
always emulate quantum theory. In this sense, the
representation of coarse-grainings is baked into an on-
tological model from the beginning (just as it is baked
into an operational description), before any claims
about its (non)contextuality.$^{8}$
$^{7}$ In our example, this requirement has to do with the definitions of $\xi(m_1|M_1^{(2)},\lambda)$ and $\xi(m_1|M_1^{(3)},\lambda)$, not their ontological equivalence. The ontological equivalence only comes into play when invoking KS-noncontextuality.
$^{8}$ One could, of course, choose to not respect the coarse-graining relations and define a notion of an ontological model
without them. In such a model, one could treat every mea-
2.3.2 Coarse-graining of preparations
Let us now consider the representation of coarse-grainings for preparation procedures. This works in a way similar to the case of measurement procedures which we have already outlined. If an ensemble of source events $\{[s|S]\}_{s\in V_S}$ is defined as a coarse-graining of another ensemble, $\{[\tilde{s}|\tilde{S}]\}_{\tilde{s}\in V_{\tilde{S}}}$, symbolically denoted as
$$[s|S]\equiv\sum_{\tilde{s}}p(s|\tilde{s})[\tilde{s}|\tilde{S}],\quad\text{where }\forall s,\tilde{s}:\ p(s|\tilde{s})\in\{0,1\},\ \sum_{s}p(s|\tilde{s})=1,\tag{13}$$
then its representation should satisfy the same coarse-graining relation in any description, operational or ontological. More explicitly, this coarse-graining denotes the following post-processing: for any $s\in V_S$, relabel each outcome $\tilde{s}\in V_{\tilde{S}}$ to outcome $s$ with probability $p(s|\tilde{s})\in\{0,1\}$; the logical disjunction of those $\tilde{s}$ which are relabelled to $s$ with probability 1 then defines the source event $[s|S]$. Now, in the operational theory, this coarse-graining is represented by
$$\forall[m|M],\text{ where }m\in V_M,\ M\in\mathbb{M}:\quad p(m,s|M,S)\equiv\sum_{\tilde{s}}p(s|\tilde{s})\,p(m,\tilde{s}|M,\tilde{S}),\tag{14}$$
and in the ontological model it is represented by
$$\forall\lambda\in\Lambda:\quad\mu(\lambda,s|S)\equiv\sum_{\tilde{s}}p(s|\tilde{s})\,\mu(\lambda,\tilde{s}|\tilde{S}).\tag{15}$$
In this paper, we will focus on a specific type of coarse-graining: namely, completely coarse-graining over the outcomes of a source setting, say $\{[\tilde{s}|\tilde{S}]\}_{\tilde{s}\in V_{\tilde{S}}}$, to yield an effective one-outcome source setting, denoted $\tilde{S}_{\top}$, associated with a single source event $\{[\top|\tilde{S}_{\top}]\}$, where $[\top|\tilde{S}_{\top}]\equiv\sum_{\tilde{s}}[\tilde{s}|\tilde{S}]$. In the operational theory, this coarse-graining is represented by
$$\forall[m|M],\text{ where }m\in V_M,\ M\in\mathbb{M}:\quad p(m,\top|M,\tilde{S}_{\top})\equiv\sum_{\tilde{s}}p(m,\tilde{s}|M,\tilde{S}),\tag{16}$$
and in the ontological model it is represented by
$$\forall\lambda\in\Lambda:\quad\mu(\lambda,\top|\tilde{S}_{\top})\equiv\sum_{\tilde{s}}\mu(\lambda,\tilde{s}|\tilde{S}).\tag{17}$$
surement obtained by coarse-graining another (parent) mea-
surement as a fundamentally new measurement with response
functions not respecting the coarse-graining relations with the
parent measurement’s response functions, even if such coarse-
graining relations are respected in the operational description.
Such an ontological model, however, will not be able to ar-
ticulate the ingredients needed for a proof of the KS theorem
and we will not consider it here. Indeed, in the absence of
the requirement that coarse-graining relations be respected in
an ontological model, one can easily construct an ontological
model that is “KS-noncontextual” for any operational theory.
The interested reader may look at Appendix B for more details,
perhaps after looking at Section 2.5 for the relevant definitions
of noncontextuality.
Hence, we use the notation $[\top|\tilde{S}_{\top}]$ to denote the source event that “at least one of the source outcomes in the set $V_{\tilde{S}}$ occurs for source setting $\tilde{S}$” (i.e., the logical disjunction of $\tilde{s}\in V_{\tilde{S}}$), formally denoting the choice of $\tilde{S}$ and the subsequent coarse-graining over $\tilde{s}$ by the “source setting” $\tilde{S}_{\top}$ and the definite outcome of this source setting by “$\top$”. This source event always occurs, i.e., $p(\top|\tilde{S}_{\top})=1$, so $p(m,\top|M,\tilde{S}_{\top})=p(m|M,\tilde{S}_{\top},\top)$ and $\mu(\lambda,\top|\tilde{S}_{\top})=\mu(\lambda|\tilde{S}_{\top},\top)$.
This notion of coarse-graining over all the outcomes of a source setting allows us to define a notion of operational equivalence between the source settings themselves. More precisely, two source settings $S$ and $S'$ are said to be operationally equivalent, denoted $[\top|S_{\top}]\simeq[\top|S'_{\top}]$, if no measurement event can distinguish them once all their outcomes are coarse-grained over, i.e.,
$$\sum_{s\in V_S}p(m,s|M,S)=\sum_{s'\in V_{S'}}p(m,s'|M,S')\quad\forall[m|M],\ m\in V_M,\ M\in\mathcal{M}.\tag{18}$$
In quantum theory, this would correspond to the operational equivalence $\sum_{s}p(s|S)\rho_{[s|S]}=\sum_{s'}p(s'|S')\rho_{[s'|S']}$ for the density operator obtained by completely coarse-graining over two distinct ensembles of quantum states, $\{(p(s|S),\rho_{[s|S]})\}_{s\in V_S}$ and $\{(p(s'|S'),\rho_{[s'|S']})\}_{s'\in V_{S'}}$ on some Hilbert space $\mathcal{H}$.
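At the level of operational statistics, the condition in Eq. (18) can be checked mechanically. The sketch below is only illustrative: the two tables of joint probabilities are hypothetical, and the check runs only over the measurement settings listed in the tables, whereas Eq. (18) quantifies over all measurement events in the theory.

```python
from fractions import Fraction as F

def coarse_grained(stats):
    """Map (M, m) -> sum_s p(m, s | M, S) for one source setting S.

    `stats` maps (M, m, s) to p(m, s | M, S)."""
    totals = {}
    for (M, m, s), prob in stats.items():
        totals[(M, m)] = totals.get((M, m), 0) + prob
    return totals

def operationally_equivalent(stats_S, stats_S_prime):
    """Eq. (18), restricted to the measurement settings present in the tables."""
    return coarse_grained(stats_S) == coarse_grained(stats_S_prime)

# Hypothetical statistics for two binary-outcome source settings S and S'
# probed by two binary-outcome measurement settings M1 and M2.
quarter, half, zero = F(1, 4), F(1, 2), F(0)
stats_S = {
    ("M1", 0, 0): half, ("M1", 1, 0): zero, ("M1", 0, 1): zero, ("M1", 1, 1): half,
    ("M2", 0, 0): quarter, ("M2", 1, 0): quarter, ("M2", 0, 1): quarter, ("M2", 1, 1): quarter,
}
stats_S_prime = {
    ("M1", 0, 0): quarter, ("M1", 1, 0): quarter, ("M1", 0, 1): quarter, ("M1", 1, 1): quarter,
    ("M2", 0, 0): half, ("M2", 1, 0): zero, ("M2", 0, 1): zero, ("M2", 1, 1): half,
}

equivalent = operationally_equivalent(stats_S, stats_S_prime)
```

The two sources differ outcome by outcome, yet become indistinguishable once their outcomes are summed over, which is the situation Eq. (18) describes.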
2.4 Joint measurability (or compatibility)
A given measurement procedure, $\{[m|M]\}_{m\in V_M}$ for some $M\in\mathcal{M}$, in the operational description can be coarse-grained in many different ways to define new effective measurement procedures. The coarse-grained measurement procedures thus obtained from $\{[m|M]\}_{m\in V_M}$ are then said to be jointly measurable (or compatible), i.e., they can be jointly implemented by the same measurement procedure $\{[m|M]\}_{m\in V_M}$ which we refer to as their parent or joint measurement. Formally, a set $C$ of measurement procedures
$$\left\{\{[m_i|M_i]\}_{m_i\in V_{M_i}}\ \middle|\ i\in\{1,2,3,\ldots,|C|\}\right\}$$
is said to be jointly measurable (or compatible) if it arises from coarse-grainings of a single measurement procedure $M\in\mathcal{M}$, i.e., for all $\{[m_i|M_i]\}_{m_i\in V_{M_i}}\in C$
$$[m_i|M_i]\equiv\sum_{m\in V_M}p(m_i|m)[m|M],\tag{19}$$
where for all $i,m,m_i$: $p(m_i|m)\in\{0,1\}$ and $\sum_{m_i\in V_{M_i}}p(m_i|m)=1$. In terms of the operational probabilities, this means that
$$\forall[s|S],\ s\in V_S,\ S\in\mathbb{S}\text{ and }\forall\{[m_i|M_i]\}_{m_i\in V_{M_i}}\in C:\quad p(m_i,s|M_i,S)\equiv\sum_{m\in V_M}p(m_i|m)\,p(m,s|M,S).\tag{20}$$
If, on the other hand, a set of measurement proce-
dures cannot arise from coarse-grainings of any single
measurement procedure, then the measurement pro-
cedures in the set are said to be incompatible, i.e.,
they cannot be jointly implemented.
Note that we will also often refer to a measurement procedure $\{[m_i|M_i]\}_{m_i\in V_{M_i}}$ by just its measurement setting, $M_i$, and thus speak of the (in)compatibility of measurement settings. Another notion that we will need to refer to is the joint measurability of measurement events: a set of measurement events that arise as outcomes of a single measurement setting are said to be jointly measurable, e.g., all the measurement events in $\{[m|M]\}_{m\in V_M}$ are jointly measurable since they arise as outcomes of a single measurement setting $M$.
As a quantum example, consider a commuting pair of projective measurements, say $\{\Pi_1,I-\Pi_1\}$ and $\{\Pi_2,I-\Pi_2\}$, where $\Pi_1$ and $\Pi_2$ are projectors on some Hilbert space $\mathcal{H}$ such that $\Pi_1\Pi_2=\Pi_2\Pi_1$ and $I$ is the identity operator on $\mathcal{H}$. This pair is jointly implementable since they can be obtained by coarse-graining the outcomes of the joint projective measurement given by $\{\Pi_1\Pi_2,\ \Pi_1(I-\Pi_2),\ (I-\Pi_1)\Pi_2,\ (I-\Pi_1)(I-\Pi_2)\}$.
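The quantum example above can be checked on a small instance. The sketch below assumes a hypothetical pair of commuting projectors on a three-dimensional space (chosen with rational entries so that the arithmetic is exact), builds the four joint effects, and recovers each projector by summing over the outcome of the other measurement.

```python
from fractions import Fraction as F

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

identity = [[F(int(i == j)) for j in range(3)] for i in range(3)]
h = F(1, 2)

# Hypothetical commuting projectors: pi1 projects onto span{e1, e2},
# pi2 projects onto span{(e1 + e2)/sqrt(2), e3}.
pi1 = [[F(1), F(0), F(0)], [F(0), F(1), F(0)], [F(0), F(0), F(0)]]
pi2 = [[h, h, F(0)], [h, h, F(0)], [F(0), F(0), F(1)]]

# joint[(a, b)]: a = 0 selects pi1 and a = 1 selects I - pi1; likewise b for pi2.
first = {0: pi1, 1: sub(identity, pi1)}
second = {0: pi2, 1: sub(identity, pi2)}
joint = {(a, b): matmul(first[a], second[b]) for a in (0, 1) for b in (0, 1)}

# Coarse-graining over the second (respectively first) outcome label.
marginal_1 = add(joint[(0, 0)], joint[(0, 1)])
marginal_2 = add(joint[(0, 0)], joint[(1, 0)])
```

Here `marginal_1` and `marginal_2` reproduce `pi1` and `pi2`, so the joint measurement acts as a parent measurement in the sense of Eq. (19).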
2.5 Noncontextuality
It is always possible to build an ontological model reproducing the predictions of any operational theory, while respecting the coarse-graining relations.$^{9}$ A trivial example of such an ontological model is one where ontic states $\lambda$ are identified with the preparation procedures $P_{[s|S]}$ (where $s\in V_S$ and $S\in\mathcal{S}$) and we have $\mu(\lambda,s|S)\equiv\delta_{\lambda,\lambda_{[s|S]}}p(s|S)$, where ontic state $\lambda_{[s|S]}$ is the one deterministically sampled by the preparation procedure $P_{[s|S]}$. Further, the response functions are identified with operational probabilities as $\xi(m|M,\lambda_{[s|S]})\equiv p(m|M,S,s)$. Then we have $\sum_{\lambda\in\Lambda}\xi(m|M,\lambda)\mu(\lambda,s|S)=\xi(m|M,\lambda_{[s|S]})p(s|S)=p(m,s|M,S)$. Also, coarse-graining relations of the type $[\tilde{m}|\tilde{M}]\equiv\sum_{m}p(\tilde{m}|m)[m|M]$ and $[\tilde{s}|\tilde{S}]\equiv\sum_{s}p(\tilde{s}|s)[s|S]$ that are respected in the operational description are also respected in this ontological description: that is, we have $\forall\lambda\in\Lambda:\ \xi(\tilde{m}|\tilde{M},\lambda)\equiv\sum_{m}p(\tilde{m}|m)\xi(m|M,\lambda)$ and $\forall\lambda\in\Lambda:\ \mu(\lambda,\tilde{s}|\tilde{S})\equiv\sum_{s}p(\tilde{s}|s)\mu(\lambda,s|S)$.
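The trivial model just described can be written out explicitly. In the following sketch the operational data (`p_s`, `p_m`) are hypothetical numbers and the function names are illustrative; the construction itself (ontic states labelled by source events, response functions copied from operational probabilities) follows the text.

```python
from fractions import Fraction as F

# Hypothetical operational data: p_s[(S, s)] = p(s|S) and
# p_m[(M, m, S, s)] = p(m | M, S, s).
p_s = {("S1", 0): F(1, 2), ("S1", 1): F(1, 2)}
p_m = {
    ("M1", 0, "S1", 0): F(3, 4), ("M1", 1, "S1", 0): F(1, 4),
    ("M1", 0, "S1", 1): F(1, 5), ("M1", 1, "S1", 1): F(4, 5),
}

# Ontic states are labelled by the source events [s|S] themselves.
ontic_states = list(p_s)

def mu(lam, S, s):
    """mu(lam, s|S) = delta_{lam, lam_[s|S]} p(s|S)."""
    return p_s[(S, s)] if lam == (S, s) else F(0)

def xi(m, M, lam):
    """xi(m|M, lam_[s|S]) = p(m|M, S, s)."""
    S, s = lam
    return p_m[(M, m, S, s)]

def reproduced(m, s, M, S):
    """Right-hand side of Eq. (3) evaluated in the trivial model."""
    return sum(xi(m, M, lam) * mu(lam, S, s) for lam in ontic_states)
```

Because every source event gets its own ontic state, this model reproduces any statistics but makes no attempt to respect operational equivalences, which is why it says nothing about noncontextuality.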
Hence, it is only when additional assumptions are
imposed on an ontological model that deciding its ex-
istence becomes a nontrivial problem. Such additional
assumptions must, of course, play an explanatory role
to be worth investigating. The assumption we are in-
terested in is noncontextuality, applied to both prepa-
ration and measurement procedures. Motivated by
the methodological principle of the identity of indiscernibles [18], noncontextuality is an inference from
$^{9}$ Note that we will always assume coarse-graining relations
are respected in any ontological model. The exception is (some
of) the discussion in Section 2.3 and Appendix B where we
consider the alternative possibility.
the operational description to the ontological descrip-
tion of an experiment. It posits that the equivalence
structure in the operational description is preserved
in the ontological description, i.e., the reason one
cannot distinguish two operationally equivalent rep-
resentations of procedures based on their operational
statistics is that there is, ontologically, no difference
in their representations. We now formally define the
notion of noncontextuality in its generalized form due
to Spekkens [18].
Mathematically, the assumption of measurement noncontextuality entails that
$$[m|M]\simeq[m'|M']\ \Rightarrow\ \xi(m|M,\lambda)=\xi(m'|M',\lambda),\quad\forall\lambda\in\Lambda,\tag{21}$$
while the assumption of preparation noncontextuality entails that
$$[s|S]\simeq[s'|S']\ \Rightarrow\ \mu(\lambda,s|S)=\mu(\lambda,s'|S')\quad\forall\lambda\in\Lambda,$$
$$[\top|S_{\top}]\simeq[\top|S'_{\top}]\ \Rightarrow\ \mu(\lambda|S)=\mu(\lambda|S')\quad\forall\lambda\in\Lambda.\tag{22}$$
Here we denote $\mu(\lambda|S)\equiv\sum_{s\in V_S}\mu(\lambda,s|S)$, etc., for simplicity of notation, rather than use the notation $\mu(\lambda,\top|S_{\top})$, etc., for these coarse-grained probability distributions. Note that since coarse-grainings are respected in any ontological model we consider, we indeed have that $\mu(\lambda,\top|S_{\top})\equiv\sum_{s\in V_S}\mu(\lambda,s|S)$.
These are the assumptions of noncontextuality (termed universal noncontextuality) that form the basis of our approach to noise-robust noncontextuality inequalities [12-17, 45]. Note that the traditional notion of KS-noncontextuality entails, besides measurement noncontextuality above, the assumption of outcome-determinism, i.e., for any measurement event $[m|M]$, $\xi(m|M,\lambda)\in\{0,1\}$ for all $\lambda\in\Lambda$.
It is important to note that in order for our notion of operational equivalence to be experimentally testable, we need that each of sets $\mathbb{M}$ and $\mathbb{S}$ includes a tomographically complete set of measurements and preparations, respectively. That is, the prepare-and-measure experiment testing contextuality can probe a tomographically complete set of preparations and measurements. Of course, the set of all possible measurements in a theory ($\mathcal{M}$) is (by definition) tomographically complete for any preparation in the theory and, similarly, the set of all possible preparations ($\mathcal{S}$) in a theory is tomographically complete for any measurement in the theory. However, there may exist smaller (finite) sets of preparations and measurements in the theory that are tomographically complete and in that case we require that $\mathbb{S}$ and $\mathbb{M}$ include such tomographically complete sets, even if they don’t include all possible preparations and measurements in the theory. For example, when the operational theory is quantum theory for a qubit, the three spin measurements $\{\sigma_x,\sigma_y,\sigma_z\}$ are tomographically complete for any qubit preparation, so we require that $\mathbb{M}$ includes these three measurements even if it doesn’t include every other possible measurement on a qubit. While the requirement that $\mathbb{S}$ and $\mathbb{M}$ include tomographically complete sets doesn’t directly reflect in our theoretical derivation of the noise-robust noncontextuality inequalities later, it is crucial for experimentally verifying the operational equivalences (cf. Eqs. (1),(2),(18)) we need to even invoke the assumption of noncontextuality (cf. Eqs. (21),(22)). Further, this assumption on $\mathbb{M}$ and $\mathbb{S}$ has so far been necessary to be able to
implement an actual noise-robust contextuality exper-
iment [13], besides the requirement that the opera-
tional theory be convex, i.e., probabilistic mixtures
of procedures in the theory (whether preparations or
measurements) are also valid procedures in the the-
ory. We refer the reader to Refs. [13, 36, 37] for a
discussion of what tomographic completeness entails
for (convex) operational theories formalized as general
probabilistic theories (GPTs). Although we will not
discuss it in this paper, see Ref. [38] for some recent
work towards relaxing the tomographic completeness
requirement for the set of measurement settings.
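To illustrate what tomographic completeness means in the qubit example, the sketch below reconstructs a density matrix from the three Pauli expectation values via the standard Bloch-vector formula $\rho=\frac{1}{2}(I+\sum_k\langle\sigma_k\rangle\sigma_k)$. The particular Bloch vector and the function names are hypothetical; in an experiment the expectation values would be estimated from data rather than given exactly.

```python
# Pauli matrices as nested lists.
SIGMA = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}

def expectation(rho, pauli):
    """Tr(rho * sigma) for a 2x2 density matrix rho."""
    return sum(rho[r][k] * pauli[k][r] for r in range(2) for k in range(2)).real

def reconstruct(ex, ey, ez):
    """rho = (I + ex*sigma_x + ey*sigma_y + ez*sigma_z) / 2."""
    return [[(1 + ez) / 2, (ex - 1j * ey) / 2],
            [(ex + 1j * ey) / 2, (1 - ez) / 2]]

# Hypothetical qubit preparation with Bloch vector (0.6, 0.0, 0.8).
rho = reconstruct(0.6, 0.0, 0.8)
bloch = tuple(expectation(rho, SIGMA[k]) for k in "xyz")
```

The round trip from expectation values to `rho` and back recovers the same three numbers, reflecting that these three measurements fix the qubit state uniquely.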
2.6 An example of Spekkens contextuality: the
fair coin flip inequality
We recap here an example of Spekkens contextual-
ity that has been experimentally demonstrated [13]
to give the reader a flavour of the general approach
we are going to adopt in the rest of this paper with
regard to Kochen-Specker type scenarios. We call the
inequality tested in Ref. [13] the “fair coin flip” in-
equality.
Consider a prepare-and-measure scenario with three source settings, denoted $\mathbb{S}\equiv\{S_1,S_2,S_3\}$, such that $V_{S_i}\equiv\{0,1\}$ and we have $p(s_i=0|S_i)=p(s_i=1|S_i)=1/2$ for all $i\in\{1,2,3\}$. Each $S_i$ thus corresponds to the ensemble of preparation procedures $\{(p(s_i|S_i),P_{[s_i|S_i]})\}_{s_i\in V_{S_i}}$ and we have the following operational equivalence among the source settings after coarse-graining:
$$[\top|S_{1_\top}]\simeq[\top|S_{2_\top}]\simeq[\top|S_{3_\top}].\tag{23}$$
There are four measurement settings in this scenario, denoted $\mathbb{M}\equiv\{M_1,M_2,M_3,M_{\rm fcf}\}$, such that $V_{M_i}\equiv\{0,1\}$ for all $i\in\{1,2,3,{\rm fcf}\}$. The measurement setting $M_{\rm fcf}$ is a fair coin flip, i.e., it is insensitive to the preparation procedure preceding it and yields the outcome $m_{\rm fcf}=0$ or $1$ with equal probability for any preparation procedure $P_{[s|S]}$, i.e., $p(m_{\rm fcf}=0|M_{\rm fcf},S,s)=p(m_{\rm fcf}=1|M_{\rm fcf},S,s)=1/2$ for all $[s|S]$.
We also define a measurement procedure $M_{\rm mix}$ as a classical post-processing of $M_1,M_2,M_3$, i.e., its measurement events $\{[m_{\rm mix}|M_{\rm mix}]\}_{m_{\rm mix}=0}^{1}$ are defined by the classical post-processing relation
$$[m_{\rm mix}|M_{\rm mix}]\equiv\sum_{i=1}^{3}p(i)\sum_{m_i=0}^{1}p(m_{\rm mix}|m_i)[m_i|M_i],\tag{24}$$
which symbolically denotes the following post-processing of measurements $M_1,M_2,M_3$: consider a uniform probability distribution $\{p(i)=\frac{1}{3}\}_{i=1}^{3}$ over the measurement settings $\{M_i\}_{i=1}^{3}$ and relabel the respective measurement outcomes, i.e., $\{m_i\in\{0,1\}\}_{i=1}^{3}$, to a measurement outcome $m_{\rm mix}\in\{0,1\}$ according to the probability distributions $\{\{p(m_{\rm mix}|m_i)=\delta_{m_{\rm mix},m_i}\}_{m_{\rm mix}\in\{0,1\}}\}_{i=1}^{3}$; coarse-graining over $m_i$ and $i$ then yields the effective measurement setting $M_{\rm mix}$ with outcomes labelled by $m_{\rm mix}\in\{0,1\}$. In contrast to the kinds of coarse-graining (over measurement outcomes) that appear in KS-noncontextuality (which we discussed in Section 2.3), the (probabilistic) coarse-graining here is over the measurement settings themselves while retaining the outcome labels.$^{10}$ We require that this coarse-graining relation be respected in the operational as well as the ontological description. In the operational description, this coarse-graining is represented by
$$\forall[s|S],\ b\in\{0,1\}:\quad p(m_{\rm mix}=b,s|M_{\rm mix},S)\equiv\frac{1}{3}\sum_{i=1}^{3}p(m_i=b,s|M_i,S).\tag{25}$$
We require the following operational equivalence between measurement events of $M_{\rm mix}$ and $M_{\rm fcf}$ with respect to which we invoke the assumption of measurement noncontextuality:
$$\forall b\in\{0,1\}:\quad[m_{\rm mix}=b|M_{\rm mix}]\simeq[m_{\rm fcf}=b|M_{\rm fcf}].\tag{26}$$
If we then look at an operational quantity quantifying source-measurement correlations, namely,
$${\rm Corr}_{\rm fcf}\equiv\sum_{i=1}^{3}\frac{1}{3}\sum_{m_i,s_i}\delta_{m_i,s_i}\,p(m_i,s_i|M_i,S_i),\tag{27}$$
then the assumption of preparation noncontextuality applied to operational equivalence in Eq. (23) (so that $\mu(\lambda|S_1)=\mu(\lambda|S_2)=\mu(\lambda|S_3)$ for all $\lambda\in\Lambda$) and the assumption of measurement noncontextuality applied to the operational equivalence in Eq. (26) (so that $\frac{1}{3}\xi(0|M_1,\lambda)+\frac{1}{3}\xi(0|M_2,\lambda)+\frac{1}{3}\xi(0|M_3,\lambda)=\frac{1}{2}$ for all $\lambda\in\Lambda$) lead to the following constraint:
$${\rm Corr}_{\rm fcf}\leq\frac{5}{6}.\tag{28}$$
$^{10}$ We did not discuss these more general types of classical
post-processing in Section 2.3 because they are not relevant to
the treatment of Kochen-Specker type scenarios in the Spekkens
framework. The example we present here is from Ref. [13],
which is not of Kochen-Specker type. The general principle un-
derlying the representation of such classical post-processings is,
however, the same: they should be respected in the operational
as well as the ontological description.
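To see how this is obtained, note that
$$\begin{aligned}
&\sum_{i=1}^{3}\frac{1}{3}\sum_{m_i,s_i}\delta_{m_i,s_i}\,p(m_i,s_i|M_i,S_i)\\
&=\sum_{i=1}^{3}\frac{1}{3}\sum_{m_i,s_i}\delta_{m_i,s_i}\sum_{\lambda\in\Lambda}\xi(m_i|M_i,\lambda)\,\mu(\lambda,s_i|S_i)\\
&\leq\sum_{i=1}^{3}\frac{1}{3}\sum_{\lambda\in\Lambda}\max_{m_i}\xi(m_i|M_i,\lambda)\sum_{m_i,s_i}\delta_{m_i,s_i}\,\mu(\lambda,s_i|S_i)\\
&=\sum_{i=1}^{3}\frac{1}{3}\sum_{\lambda\in\Lambda}\zeta(M_i,\lambda)\sum_{s_i}\mu(\lambda,s_i|S_i)\\
&=\sum_{\lambda\in\Lambda}\sum_{i=1}^{3}\frac{1}{3}\zeta(M_i,\lambda)\,\nu(\lambda),
\end{aligned}\tag{29}$$
where we have that $\zeta(M_i,\lambda)\equiv\max_{m_i}\xi(m_i|M_i,\lambda)$ and that $\nu(\lambda)\equiv\mu(\lambda|S_i)$ for all $i\in\{1,2,3\}$. This allows us to put the upper bound
$${\rm Corr}_{\rm fcf}\leq\max_{\lambda\in\Lambda}\frac{1}{3}\sum_{i=1}^{3}\zeta(M_i,\lambda),\tag{30}$$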
which, subject to the constraint (from measurement noncontextuality) that $\frac{1}{3}\xi(0|M_1,\lambda)+\frac{1}{3}\xi(0|M_2,\lambda)+\frac{1}{3}\xi(0|M_3,\lambda)=\frac{1}{2}$, yields Eq. (28).$^{11}$ It turns out that in quantum theory the sources and measurements required for this scenario can be realized on a qubit and they can, in principle, achieve the value ${\rm Corr}=1$. This can be achieved by taking the three preparations to be the trine preparations on an equatorial plane (say, the Z-X plane) of the Bloch sphere and the measurements $\{M_i\}_{i=1}^{3}$ to be the trine measurements, i.e.,
$$\begin{aligned}
\rho_{[s_i=0|S_i]}&\equiv\tfrac{1}{2}(I+\vec{\sigma}\cdot\vec{n}_i)\equiv\Pi_i^0,\\
\rho_{[s_i=1|S_i]}&\equiv\tfrac{1}{2}(I-\vec{\sigma}\cdot\vec{n}_i)\equiv\Pi_i^1,\\
E_{[m_i=0|M_i]}&\equiv\Pi_i^0,\\
E_{[m_i=1|M_i]}&\equiv\Pi_i^1,
\end{aligned}\tag{31}$$
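where $\vec{n}_1\equiv(0,0,1)$, $\vec{n}_2\equiv(\frac{\sqrt{3}}{2},0,-\frac{1}{2})$, $\vec{n}_3\equiv(-\frac{\sqrt{3}}{2},0,-\frac{1}{2})$, and $\vec{\sigma}\equiv(\sigma_x,\sigma_y,\sigma_z)$ denotes the three Pauli matrices $\sigma_x=\begin{pmatrix}0&1\\1&0\end{pmatrix}$, $\sigma_y=\begin{pmatrix}0&-i\\i&0\end{pmatrix}$, and $\sigma_z=\begin{pmatrix}1&0\\0&-1\end{pmatrix}$. The operational equivalences are then easy to verify:
$$\rho_{[\top|S_{i_\top}]}=\frac{I}{2},\ \forall i\in\{1,2,3\},\qquad\frac{1}{3}\sum_{i=1}^{3}\Pi_i^0=\frac{I}{2}.\tag{32}$$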
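The trine realization can be verified numerically. The following sketch assumes the Born rule $p(m_i,s_i|M_i,S_i)=\frac{1}{2}{\rm Tr}(E_{[m_i|M_i]}\rho_{[s_i|S_i]})$ and the trine directions $\vec{n}_1,\vec{n}_2,\vec{n}_3$ of Eq. (31); the helper names are illustrative. It checks both operational equivalences of Eq. (32) and evaluates ${\rm Corr}_{\rm fcf}$ of Eq. (27).

```python
import math

def bloch_projector(n, sign):
    """(I + sign * sigma.n) / 2 for a unit vector n = (nx, ny, nz)."""
    nx, ny, nz = (sign * c for c in n)
    return [[(1 + nz) / 2, (nx - 1j * ny) / 2],
            [(nx + 1j * ny) / 2, (1 - nz) / 2]]

def trace_product(a, b):
    """Tr(a b) for 2x2 matrices."""
    return sum(a[r][k] * b[k][r] for r in range(2) for k in range(2))

r3 = math.sqrt(3) / 2
trine = [(0.0, 0.0, 1.0), (r3, 0.0, -0.5), (-r3, 0.0, -0.5)]

# proj[i][b] plays the role of both rho_[s_i=b|S_i] and E_[m_i=b|M_i].
proj = [[bloch_projector(n, +1), bloch_projector(n, -1)] for n in trine]

# Coarse-grained sources: sum_b (1/2) rho_[s_i=b|S_i] for each i.
mixed_sources = [
    [[0.5 * (p[0][r][c] + p[1][r][c]) for c in range(2)] for r in range(2)]
    for p in proj
]

# Effect [m_mix = 0 | M_mix]: (1/3) sum_i Pi_i^0.
mix_effect = [[sum(p[0][r][c] for p in proj) / 3 for c in range(2)]
              for r in range(2)]

# Corr_fcf of Eq. (27): only the terms with m_i = s_i contribute.
corr = sum(
    (1 / 3) * 0.5 * trace_product(p[b], p[b]).real
    for p in proj
    for b in (0, 1)
)
```

Up to floating-point error, `mixed_sources` and `mix_effect` equal $I/2$ and `corr` equals 1, in agreement with the quantum value quoted in the text.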
$^{11}$ The reader may look at Appendix B.1 of Ref. [13] to convince themselves that the maximum is achieved for an assignment of response functions of the type $\xi(0|M_1,\lambda)=1$, $\xi(0|M_2,\lambda)=\frac{1}{2}$ and $\xi(0|M_3,\lambda)=0$ for some $\lambda$.
We obtain ${\rm Corr}_{\rm fcf}=1$ from this quantum realization. The experimental violation of the noise-robust
noncontextuality inequality, Eq. (28), was demon-
strated in Ref. [13], where more details may be found.
Note that the fair coin flip inequality, Eq.(28), is not
inspired by the kinds of operational equivalences that
are relevant in a proof of the Kochen-Specker theo-
rem, but employs other kinds of operational equiva-
lences allowed in the Spekkens framework [18], i.e.,
the operational equivalences in Eqs. (23) and (26) do
not arise from the same measurement outcome being
shared by different measurements.
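The noncontextual bound of Eq. (28) can be explored numerically from Eq. (30). The sketch below is a coarse grid search rather than a proof (the argument itself is in Appendix B.1 of Ref. [13]): it maximizes $\frac{1}{3}\sum_i\zeta(M_i,\lambda)$ over response functions $x_i\equiv\xi(0|M_i,\lambda)$ subject to the measurement noncontextuality constraint $\frac{1}{3}\sum_i x_i=\frac{1}{2}$. The grid resolution and function names are arbitrary choices.

```python
from fractions import Fraction as F

def predictability(x):
    """zeta(M_i, lam) = max_m xi(m|M_i, lam) when xi(0|M_i, lam) = x."""
    return max(x, 1 - x)

def best_on_grid(steps):
    """Maximize (1/3) sum_i zeta_i subject to (1/3) sum_i xi(0|M_i, lam) = 1/2."""
    best, argbest = F(0), None
    for a in range(steps + 1):
        for b in range(steps + 1):
            x1, x2 = F(a, steps), F(b, steps)
            x3 = F(3, 2) - x1 - x2   # fixed by the constraint
            if not 0 <= x3 <= 1:
                continue
            value = sum(predictability(x) for x in (x1, x2, x3)) / 3
            if value > best:
                best, argbest = value, (x1, x2, x3)
    return best, argbest

bound, assignment = best_on_grid(20)
```

On this grid the maximum found is $5/6$, attained at an assignment that is a permutation of $(1,\frac{1}{2},0)$, consistent with the assignment mentioned in the footnote to Eq. (28).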
Our goal in the present paper is to provide a frame-
work for noise-robust noncontextuality inequalities
obtained from statistical proofs of the KS theorem,
in particular those that are covered by the CSW
framework [22], so that such inequalities can be put
to an experimental test along the lines of Ref. [13]
within the Spekkens framework. Hence, the opera-
tional equivalences between measurement events that
will be of interest to us in this paper are precisely
those which allow for a proof of the KS theorem, i.e.,
those which correspond to the same measurement out-
come (e.g., a projector) being shared by different mea-
surements (e.g., projective measurements).
2.7 Connection to Bell scenarios
As further motivation to study the questions we
are posing, note that one can also view the general
prepare-and-measure scenario we are considering in
this paper (Fig. 1) as arising on one wing of a two-
party Bell experiment: that is, given two parties
Alice and Bob sharing an entangled state and per-
forming local measurements in a Bell experiment, one
can view each choice of measurement setting on Al-
ice’s side as preparing an ensemble of states on Bob’s
side; on account of no-signalling, the reduced state
on Bob’s side will be the same regardless of Alice’s
choice of measurement setting, i.e., all the ensembles
corresponding to Alice’s measurement settings (hence,
Bob’s source settings) will be operationally equiva-
lent.
For example, consider a Bell experiment where Alice has two choices of measurement settings, $M^A_x\equiv\sigma_x$ or $M^A_z\equiv\sigma_z$, and she shares a Bell state with Bob: $|\psi\rangle=\frac{1}{\sqrt{2}}(|00\rangle+|11\rangle)$. Bob has access to some set of measurement settings $\mathbb{M}^B\equiv\{M^B_j\}_j$ on his system. When Alice measures $M^A_x$, she prepares the ensemble of states $S^A_x\equiv\{(1/2,\rho_{[s^A_x=0|S^A_x]}\equiv|+\rangle\langle+|),(1/2,\rho_{[s^A_x=1|S^A_x]}\equiv|-\rangle\langle-|)\}$ on Bob’s side and when she measures $M^A_z$ she prepares the ensemble of states $S^A_z\equiv\{(1/2,\rho_{[s^A_z=0|S^A_z]}\equiv|0\rangle\langle0|),(1/2,\rho_{[s^A_z=1|S^A_z]}\equiv|1\rangle\langle1|)\}$. These ensembles are operationally equivalent, yielding the maximally mixed state on coarse-graining, i.e.,
$$\frac{1}{2}|0\rangle\langle0|+\frac{1}{2}|1\rangle\langle1|=\frac{1}{2}|+\rangle\langle+|+\frac{1}{2}|-\rangle\langle-|=\frac{I}{2}.\tag{33}$$
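The ensembles prepared on Bob's side can be computed directly from the Bell state. The sketch below assumes ideal projective measurements by Alice and uses a hypothetical helper `steer` that returns the probability of Alice's outcome together with Bob's normalized conditional state; averaging each ensemble then reproduces Eq. (33).

```python
import math

h = 1 / math.sqrt(2)
# |psi> = (|00> + |11>)/sqrt(2), stored as psi[a][b] (Alice index a, Bob index b).
psi = [[h, 0.0], [0.0, h]]

def steer(psi, alice_ket):
    """Probability of Alice's outcome and Bob's normalized conditional state."""
    # Unnormalized Bob ket obtained by applying <alice_ket| on Alice's side.
    phi = [sum(complex(alice_ket[a]).conjugate() * psi[a][b] for a in range(2))
           for b in range(2)]
    prob = sum(abs(x) ** 2 for x in phi)
    rho = [[phi[r] * phi[c].conjugate() / prob for c in range(2)]
           for r in range(2)]
    return prob, rho

def average(ensemble):
    """Coarse-grain an ensemble [(prob, rho), ...] to a single density matrix."""
    return [[sum(p * rho[r][c] for p, rho in ensemble) for c in range(2)]
            for r in range(2)]

ensemble_x = [steer(psi, [h, h]), steer(psi, [h, -h])]          # Alice measures sigma_x
ensemble_z = [steer(psi, [1.0, 0.0]), steer(psi, [0.0, 1.0])]   # Alice measures sigma_z
avg_x, avg_z = average(ensemble_x), average(ensemble_z)
```

Both averages equal $I/2$ up to floating-point error, which is the no-signalling statement that Bob's reduced state does not depend on Alice's choice of setting.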
The quantity of interest in a Bell experiment, $p(m^A_i,m^B_j|M^A_i,M^B_j)$ ($i\in\{x,z\}$), is then formally the same as the quantity $p(s^A_i,m^B_j|S^A_i,M^B_j)$ that we are interested in for our prepare-and-measure scenario. In the ontological model describing the effective prepare-and-measure experiment on Bob’s system, we have the following:
$$\begin{aligned}
p(s^A_i,m^B_j|S^A_i,M^B_j)&=\sum_{\lambda}\Pr(m^B_j|M^B_j,\lambda)\Pr(\lambda,s^A_i|S^A_i)\\
&=\sum_{\lambda}\Pr(m^B_j|M^B_j,\lambda)\Pr(s^A_i|S^A_i,\lambda)\Pr(\lambda|S^A_i).
\end{aligned}\tag{34}$$
Assuming preparation noncontextuality relative to the operational equivalence $[\top|S^A_x]\simeq[\top|S^A_z]$, we have $\Pr(\lambda|S^A_x)=\Pr(\lambda|S^A_z)\equiv\Pr(\lambda)$, so that
$$p(s^A_i,m^B_j|S^A_i,M^B_j)=\sum_{\lambda}\Pr(s^A_i|S^A_i,\lambda)\Pr(m^B_j|M^B_j,\lambda)\Pr(\lambda),\tag{35}$$
which formally resembles the expression for local causality when applied to the corresponding two-party Bell experiment:
$$p(m^A_i,m^B_j|M^A_i,M^B_j)=\sum_{\lambda}\Pr(m^A_i|M^A_i,\lambda)\Pr(m^B_j|M^B_j,\lambda)\Pr(\lambda).\tag{36}$$
If no other assumption of noncontextuality is invoked besides the one applied to the operational equivalence of source settings on Bob’s system, then the constraints on $p(s^A_i,m^B_j|S^A_i,M^B_j)$ will be the same as the constraints on $p(m^A_i,m^B_j|M^A_i,M^B_j)$ from Bell inequalities.
Note, however, that the response functions $\Pr(m^B_j|M^B_j,\lambda)$ and $\Pr(m^A_j|M^A_j,\lambda)$ can be completely arbitrary in a locally causal ontological model for the Bell experiment and the same applies to the distributions $\Pr(s^A_i|S^A_i,\lambda)$ and $\Pr(m^B_j|M^B_j,\lambda)$ in a preparation noncontextual model of the corresponding prepare-and-measure scenario on Bob’s side. We will be interested in imposing additional constraints on the response functions $\Pr(m^B_j|M^B_j,\lambda)$ of the prepare-and-measure scenario (on Bob’s side) that follow from the assumption of measurement noncontextuality applied to operational equivalences between measurement events on Bob’s side. In particular, we are interested in those operational equivalences between measurement events that are required by any statistical proof of the Kochen-Specker theorem [16, 47]. We develop this approach more carefully in the following sections.
3 Hypergraph approach to Kochen-
Specker scenarios in the Spekkens
framework
Having set up the framework needed to articulate
the relevant notions in Section 2, we now proceed to
consider Kochen-Specker type experimental scenarios
in this framework. To do this, we will use the lan-
guage of hypergraphs and their subgraphs to repre-
sent the operational equivalences between measure-
ment events that are required in a Kochen-Specker
argument as well as the operational equivalences be-
tween source settings that we will invoke in our gen-
eralization. The (hyper)graph-theoretic ingredients of
our approach will represent those aspects of the gen-
eral framework of Section 2 that are necessary to go
from the CSW framework for KS-contextuality to a
hypergraph framework for Spekkens contextuality ap-
plied to Kochen-Specker type experimental scenarios.
Our presentation will be a hybrid one, discussing
features of the CSW framework [21, 22] in the nota-
tion of the AFLS framework [23], but extending both
in ways appropriate for the purpose of this paper. Our
goal is to demonstrate how the graph-theoretic invari-
ants of CSW [22] can be repurposed towards obtaining
noise-robust noncontextuality inequalities.
We do this in two parts: first, we define a rep-
resentation of measurement events in the manner of
Refs. [22, 23], and then we define a representation of
source events in the spirit of Ref. [12].
3.1 Measurements
The basic object for representing measurements is a hypergraph, $\Gamma$, with a finite set of vertices $V(\Gamma)$ such that each vertex $v\in V(\Gamma)$ denotes a measurement outcome, and a set of hyperedges $E(\Gamma)$ such that each hyperedge $e\in E(\Gamma)$ is a subset of $V(\Gamma)$ and denotes a measurement consisting of outcomes in $e$. Here, $E(\Gamma)\subseteq2^{V(\Gamma)}$ and $\bigcup_{e\in E(\Gamma)}e=V(\Gamma)$. Such a hypergraph satisfies the definition of a contextuality scenario à la AFLS [23]. We will further assume, unless specified otherwise, that the hypergraph is simple: that is, for all $e_1,e_2\in E(\Gamma)$, $e_1\subseteq e_2\Rightarrow e_1=e_2$, or that no hyperedge is a strict subset of another. Such hypergraphs are also called Sperner families [46]. Two measurement events are said to be (mutually) exclusive if the vertices denoting them appear in a common hyperedge, i.e., if they can be realized as outcomes of a single measurement setting.
The structure of a contextuality scenario Γ repre-
sents the operational equivalences between measure-
ment events that are of interest in a Kochen-Specker
argument. We emphasize here that we take the opera-
tional theory to be fundamental and the contextuality
scenario for a particular Kochen-Specker argument to
be derived from (and as a graphical representation of)
the operational equivalences in the operational theory
(cf. Section 2). In particular, depending on the oper-
ational equivalences that an operational theory can
exhibit (by virtue of (in)compatibility relations be-
tween measurements), it may or may not allow some
contextuality scenario to be realized by measurement
events in the theory. The fact that a given vertex, say $v\in V(\Gamma)$, appears in multiple hyperedges, say $E'\equiv\{e\in E(\Gamma)|v\in e\}$, means that the measurement events corresponding to this vertex, i.e., $\{[v|e]\}_{e\in E'}$, are operationally equivalent, and the equivalence class
of these measurement events is denoted by the vertex
v itself. In the case of quantum theory, for example,
v can represent a positive operator that appears in
different positive operator-valued measures (POVMs)
represented by the hyperedges.
A probabilistic model on $\Gamma$ is an assignment of probabilities to the vertices $v \in V(\Gamma)$ such that $p(v) \geq 0$ for all $v \in V(\Gamma)$ and $\sum_{v \in e} p(v) = 1$ for all $e \in E(\Gamma)$. As we have noted, every vertex $v$ represents an equivalence class of measurement events, denoted $[m|M]$, and every hyperedge $e$ represents an equivalence class of measurement procedures, denoted $M$.$^{12}$ The fact that each $v$ represents an equivalence class of measurement events means that
1. any probabilistic model $p$ on $\Gamma$, realized by operational probabilities for a given source event (that is, where for all $v \in V(\Gamma)$ and a given $[s|S]$, $p(v) \equiv p(v|S, s) \equiv p(m|M, S, s)$), is consistent with the operational equivalences represented by $\Gamma$, and
2. any probabilistic model on $\Gamma$, realized by ontological probabilities for a given ontic state (that is, where for all $v \in V(\Gamma)$ and a given ontic state $\lambda$, $p(v) \equiv p(v|\lambda) \equiv \xi(m|M, \lambda)$), respects (by definition) the assumption of measurement noncontextuality with respect to the presumed operational equivalences between measurement events.
We will therefore often write $p(m, s|M, S)$ as $p(v, s|S)$ and $p(m|M, S, s)$ as $p(v|S, s)$, where $[s|S]$ is a source event. Similarly, we will also write $\xi(m|M, \lambda)$ as $p(v|\lambda)$, where $\lambda$ is an ontic state.
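The two defining conditions of a probabilistic model (non-negativity and normalization on every hyperedge) can be checked mechanically. The following is a minimal sketch, not part of the original framework: the function name `is_probabilistic_model`, the encoding of hyperedges as Python `frozenset`s, and the three-vertex example scenario are all illustrative assumptions.

```python
from fractions import Fraction

def is_probabilistic_model(hyperedges, p):
    """Check p(v) >= 0 for all vertices and sum_{v in e} p(v) == 1 for all hyperedges."""
    vertices = set().union(*hyperedges)
    if set(p) != vertices:
        return False  # p must assign a probability to every vertex, and only to vertices
    if any(p[v] < 0 for v in vertices):
        return False
    return all(sum(p[v] for v in e) == 1 for e in hyperedges)

# Hypothetical scenario: two binary measurements sharing the outcome "b".
edges = [frozenset({"a", "b"}), frozenset({"b", "c"})]
half = Fraction(1, 2)
model = {"a": half, "b": half, "c": half}
not_a_model = {"a": half, "b": half, "c": Fraction(1, 4)}  # second hyperedge sums to 3/4
```

Exact rationals (`Fraction`) are used here so that the normalization test is not affected by floating-point rounding.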
Orthogonality graph of $\Gamma$, $O(\Gamma)$: Given the hypergraph $\Gamma$, we construct its orthogonality graph $O(\Gamma)$: that is, the vertices of $O(\Gamma)$ are given by $V(O(\Gamma)) \equiv V(\Gamma)$, and the edges of $O(\Gamma)$ are given by $E(O(\Gamma)) \equiv \{\{v, v'\} \mid v, v' \in e \text{ for some } e \in E(\Gamma)\}$.
$^{12}$ Note that two measurement procedures with measurement settings $M$ and $M'$ are operationally equivalent if every measurement event of one is operationally equivalent to a distinct measurement event of the other. That is, there is a bijective correspondence (of operational equivalence) between the two sets of measurement events. In quantum theory, for example, a given POVM (which is what a hyperedge would represent), say $\{E_k\}_k$, can be implemented in many possible ways, each such measurement procedure corresponding to a different quantum instrument. Mathematically, these different procedures can be represented by different sets of operators $\{O_k\}_k$ such that $E_k = O_k^{\dagger} O_k$ for all $k$ and $\sum_k O_k^{\dagger} O_k = I$.
Accepted in Quantum 2019-09-01, click title to verify. Published under CC-BY 4.0. 13
Figure 2: The KCBS scenario with 4-outcome joint measure-
ments, visualized as a hypergraph Γ [16, 22, 47].
Each edge of O(Γ) denotes the exclusivity of the two
measurement events it connects, i.e., the fact that
they can occur as outcomes of a single measurement.
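The construction of $O(\Gamma)$ from $\Gamma$ can be sketched in a few lines. This is only an illustration under assumed conventions: hyperedges are encoded as `frozenset`s, edges of the graph as two-element `frozenset`s, and the scenario `gamma` (three 3-outcome measurements that pairwise share one outcome) is a hypothetical example, not one taken from the text.

```python
from itertools import combinations

def orthogonality_graph(hyperedges):
    """Return (V, E) of O(Gamma): two vertices are adjacent iff they share a hyperedge."""
    vertices = set().union(*hyperedges)
    edges = set()
    for e in hyperedges:
        for v, w in combinations(sorted(e), 2):
            edges.add(frozenset({v, w}))
    return vertices, edges

# Hypothetical scenario: hyperedges {a,b,x}, {b,c,y}, {a,c,z}.
gamma = [frozenset("abx"), frozenset("bcy"), frozenset("acz")]
vertices, edges = orthogonality_graph(gamma)
```

In this example `a`, `b`, `c` are pairwise exclusive (each pair shares a hyperedge), whereas `x` and `y` are not exclusive because no hyperedge contains both.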
For any Bell-KS inequality constraining correlations between measurement events from $O(\Gamma)$ (when all measurements are implemented on a given source event), we construct a subgraph $G$ of $O(\Gamma)$ such that the vertices of $G$, i.e., $V(G)$, correspond to measurement events that appear in the inequality with nonzero coefficients, and two vertices share an edge in $G$ if and only if they share an edge in $O(\Gamma)$. More explicitly, consider a Bell-KS expression
$$R([s|S]) \equiv \sum_{v \in V(G)} w_v \, p(v|S, s), \tag{37}$$
where $w_v > 0$ for all $v \in V(G)$. A Bell-KS inequality imposes a constraint of the form $R([s|S]) \leq R_{\rm KS}$, where $R_{\rm KS}$ is the upper bound on the expression in any operational theory that admits a KS-noncontextual ontological model. Often, but not always, these inequalities are simply of the form where $w_v = 1$ for all $v \in V(G)$. In keeping with the CSW notation [22], we will denote the general situation by a weighted graph $(G, w)$, where $w$ is a function that maps vertices $v \in V(G)$ to weights $w_v > 0$. See Figures 2 and 3 for an example from the Klyachko-Can-Binicioğlu-Shumovsky (KCBS) scenario [22, 47].
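As an illustration of Eq. (37), the sketch below evaluates a weighted Bell-KS expression on a 5-cycle with unit weights, the graph usually associated with the KCBS inequality. It also brute-forces the largest total weight of a set of pairwise non-adjacent vertices; identifying this number with $R_{\rm KS}$ is the CSW result [22] and is taken here as an assumption rather than derived. The function names and the integer vertex labels are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def bell_ks_value(weights, p):
    """Evaluate R = sum_v w_v p(v) for given weights and probabilities."""
    return sum(w * p[v] for v, w in weights.items())

def max_weight_independent_set(weights, edges):
    """Brute-force the largest total weight of a set containing no edge of G."""
    vertices = list(weights)
    best = 0
    for r in range(len(vertices) + 1):
        for subset in combinations(vertices, r):
            s = set(subset)
            if any(e <= s for e in edges):
                continue  # two exclusive events cannot both be assigned 1
            best = max(best, sum(weights[v] for v in s))
    return best

# 5-cycle with unit weights (assumed encoding of the KCBS graph).
weights = {i: 1 for i in range(5)}
edges = [frozenset({i, (i + 1) % 5}) for i in range(5)]
uniform = {i: Fraction(2, 5) for i in range(5)}
```

The brute-force search is exponential in the number of vertices and is only meant for small graphs such as this one.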
Below, we make some remarks clarifying the scope
of the framework described above before we move to
the case of sources.
3.1.1 Classification of probabilistic models
We classify the probabilistic models on a hypergraph
Γ as follows:
Figure 3: A subgraph of KCBS hypergraph Γ, representing orthogonality relations of the events of interest in the KCBS inequality [22, 47].

• KS-noncontextual probabilistic models, $\mathcal{C}(\Gamma)$: a probabilistic model which is a convex combination of deterministic assignments $p : V(\Gamma) \to \{0, 1\}$, where $\sum_{v \in e} p(v) = 1$ for all $e \in E(\Gamma)$. In Ref. [23], this is referred to as a “classical model”.$^{13}$
Note that we call $\Gamma$ KS-colourable if $\mathcal{C}(\Gamma) \neq \emptyset$ and we call it KS-uncolourable if $\mathcal{C}(\Gamma) = \emptyset$. Our terminology here is inspired by the traditional usage of the term “Kochen-Specker colouring” to refer to an assignment of two colours to vectors satisfying some orthogonality relations under the colouring constraints of the KS theorem [48].
• Consistent exclusivity satisfying probabilistic models, $\mathcal{CE}^1(\Gamma)$: a probabilistic model on $\Gamma$, $p : V(\Gamma) \to [0, 1]$, such that (in addition to satisfying the definition of a probabilistic model), $\sum_{v \in c} p(v) \leq 1$ for all cliques $c$ in the orthogonality graph $O(\Gamma)$. This is the same as the set of E1 probabilistic models of Ref. [22].
Note that a clique in the orthogonality graph $O(\Gamma)$ is a set of vertices that are pairwise exclusive (i.e., every vertex in this set shares an edge with every other vertex).
• General probabilistic models, $\mathcal{G}(\Gamma)$: Any $p$ that satisfies the definition of a probabilistic model is a general probabilistic model, i.e., it can arise from measurements in some general probabilistic theory [1] that isn’t necessarily quantum.
The set of all probabilistic models $\mathcal{G}(\Gamma)$ (for any $\Gamma$) forms a polytope since it is defined by just the positivity and normalization constraints on the probabilities. The extremal points (or vertices) of this polytope fall into two categories that will interest us: deterministic and indeterministic. The deterministic extremal points are the $p : V(\Gamma) \to \{0, 1\}$ such that $\sum_{v \in e} p(v) = 1$ for all $e \in E(\Gamma)$ and we denote the set of these points by $\mathcal{G}(\Gamma)|_{\rm det}$. The indeterministic extremal points are the $p \in \mathcal{G}(\Gamma)$ which are not deterministic and which, furthermore, cannot be expressed as a convex mixture of other points in $\mathcal{G}(\Gamma)$. We denote the set of indeterministic extremal points by $\mathcal{G}(\Gamma)|_{\rm ind}$. Clearly, $\mathcal{G}(\Gamma)|_{\rm det} \subsetneq \mathcal{C}(\Gamma)$ and $\mathcal{G}(\Gamma)|_{\rm ind} \subseteq \mathcal{G}(\Gamma) \setminus \mathcal{C}(\Gamma)$.
Overall, we have
$$\mathcal{C}(\Gamma) \subseteq \mathcal{CE}^1(\Gamma) \subseteq \mathcal{G}(\Gamma) \tag{38}$$
for any hypergraph $\Gamma$.

$^{13}$ We use a different term because we are advocating a revision of the notion of classicality from KS-noncontextuality to generalized noncontextuality à la Spekkens.
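The three classes can be told apart on small examples by brute force. The sketch below enumerates the deterministic models of a hypergraph (whose convex hull is the KS-noncontextual set) and tests the consistent-exclusivity condition clique by clique. It is an illustrative assumption throughout: the function names, the `frozenset` encoding, and the choice of the three-vertex "triangle" scenario are not taken from the text.

```python
from fractions import Fraction
from itertools import combinations, product

def deterministic_models(hyperedges):
    """All 0/1 assignments with exactly one vertex set to 1 in every hyperedge."""
    vertices = sorted(set().union(*hyperedges))
    models = []
    for bits in product((0, 1), repeat=len(vertices)):
        p = dict(zip(vertices, bits))
        if all(sum(p[v] for v in e) == 1 for e in hyperedges):
            models.append(p)
    return models

def cliques(hyperedges):
    """Yield every clique of O(Gamma) (sets of pairwise exclusive vertices)."""
    vertices = sorted(set().union(*hyperedges))
    adjacent = {frozenset(pair) for e in hyperedges for pair in combinations(e, 2)}
    for r in range(1, len(vertices) + 1):
        for c in combinations(vertices, r):
            if all(frozenset(pair) in adjacent for pair in combinations(c, 2)):
                yield frozenset(c)

def satisfies_ce1(hyperedges, p):
    """Consistent exclusivity: probabilities on every clique sum to at most 1."""
    return all(sum(p[v] for v in c) <= 1 for c in cliques(hyperedges))

# Hypothetical examples: a triangle of three binary measurements, and a path.
triangle = [frozenset("ab"), frozenset("bc"), frozenset("ac")]
path = [frozenset("ab"), frozenset("bc")]
uniform = {v: Fraction(1, 2) for v in "abc"}
```

On the triangle, normalization forces every probability to be 1/2, so the only general probabilistic model is `uniform`; it has no deterministic models and `uniform` assigns total probability 3/2 to the clique {a, b, c}, so it lies outside the consistent-exclusivity set.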
3.1.2 Distinguishing two consequences of Specker’s
principle: Structural Specker’s principle vs. Statistical
Specker’s principle
The CSW framework [22] restricts the scope of prob-
abilistic models on a hypergraph to those satisfying
consistent exclusivity (the E1 probabilistic models),
motivated by what is sometimes called Specker’s prin-
ciple [35]: that is,
“if you have several questions and you can
answer any two of them, then you can also
answer all of them”
If by “questions” we understand measurement settings, then the principle says that a set of pairwise jointly implementable measurement settings is itself jointly implementable. Note that when we say a set of measurement settings is “jointly implementable”, “jointly measurable”, or “compatible”, we mean that there exists another choice of a single measurement setting in the theory such that this measurement setting can reproduce the statistics of all the measurement settings in the set by coarse-graining.$^{14}$ As such, in its application to measurement settings, Specker’s principle is a constraint on the measurements allowed in a physical theory that respects it, e.g., measurement settings that correspond to PVMs (projection valued measures) in quantum theory. This is, for example, the reading adopted in Ref. [49], where the failure of Specker’s principle in any almost quantum theory was demonstrated. On the other hand, we will often also refer to the “joint measurability” of a set of measurement events, by which we mean that this set of measurement events is a subset of the set of measurement outcomes for some choice of measurement setting. At the level of measurement events,$^{15}$ then, there are two distinct ways to read Specker’s principle that one needs to keep in mind, which we distinguish as structural Specker’s principle vs. statistical Specker’s principle. We define these two readings below:

$^{14}$ The reader may recall from Section 2.4 the general definition of compatibility. Also, see Ref. [44] for an overview of joint measurability in quantum theory.
$^{15}$ Recall that a measurement event is a measurement outcome given a choice of measurement setting, e.g., a projector that appears in a particular PVM in quantum theory.
• Structural Specker’s principle imposes a structural constraint on a contextuality scenario $\Gamma$.
This (strong) reading of Specker’s principle applies to any set of measurement events, say $M \subseteq V(\Gamma)$, where every pair of measurement events can arise as outcomes of a single measurement: that is, for each pair $\{v, v'\} \subseteq M$, there exists some $e \in E(\Gamma)$ such that $\{v, v'\} \subseteq e$. The principle then states:
Given a set $M$ of pairwise jointly measurable measurement events in some contextuality scenario $\Gamma$, all the measurement events in $M$ are jointly measurable, i.e., all the measurement events in the set can arise as outcomes of a single measurement: $M \subseteq e$ for some $e \in E(\Gamma)$.
Alternatively, the constraint of structural Specker’s principle can be restated as:
Every clique in the orthogonality graph of $\Gamma$, $O(\Gamma)$, is a subset of some hyperedge in $\Gamma$.
Note that we haven’t said anything directly about probabilities here: any $\Gamma$ satisfying the above property is said to satisfy structural Specker’s principle.
• Statistical Specker’s principle (or consistent exclusivity) imposes a statistical constraint on probabilistic models on any contextuality scenario $\Gamma$ representing measurement events in an operational theory.
This (weak) reading of Specker’s principle imposes an additional constraint on a probabilistic model $p \in \mathcal{G}(\Gamma)$ (thus defining $\mathcal{CE}^1(\Gamma) \subseteq \mathcal{G}(\Gamma)$), namely:
Given a set $M$ of pairwise jointly measurable measurement events, $p$ satisfies $\sum_{v \in M} p(v) \leq 1$.
This can also be expressed as:
A probabilistic model $p \in \mathcal{G}(\Gamma)$ is said to satisfy statistical Specker’s principle if the sum of probabilities it assigns to the vertices of every clique in the orthogonality graph of $\Gamma$, $O(\Gamma)$, does not exceed 1, i.e., $\sum_{v \in c} p(v) \leq 1$ for all cliques $c$ in $O(\Gamma)$.
All probabilistic models that satisfy this constraint define the set of probabilistic models $\mathcal{CE}^1(\Gamma)$ (or E1) for any contextuality scenario $\Gamma$ regardless of whether $\Gamma$ satisfies structural Specker’s principle. Clearly, $\mathcal{CE}^1(\Gamma) \subseteq \mathcal{G}(\Gamma)$.
Any probabilistic model $p$ on $\Gamma$ such that $p \in \mathcal{CE}^1(\Gamma)$ is said to satisfy statistical Specker’s principle or, equivalently, consistent exclusivity [23].
Probabilistic models on any hypergraph $\Gamma$ which satisfies the (strong) structural Specker’s principle obviously satisfy the (weak) statistical Specker’s principle. This holds simply on account of the structure of such $\Gamma$: that is, for all $\Gamma$ satisfying structural Specker’s principle, we have $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$. To see this, note that every clique $c$ in $O(\Gamma)$ is a subset of some hyperedge $e$ in $\Gamma$, hence for every clique $c$, $\sum_{v \in c} p(v) \leq \sum_{v \in e} p(v) = 1$ for all $p \in \mathcal{G}(\Gamma)$, i.e., $p \in \mathcal{CE}^1(\Gamma)$.$^{16}$ On the other hand, it remains an open question whether the converse is true:
That is, given that $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$ for some $\Gamma$, is it the case that $\Gamma$ must then necessarily satisfy structural Specker’s principle, namely, that every clique in $O(\Gamma)$ is a subset of some hyperedge in $\Gamma$?
A positive answer to this question would answer Problem 7.2.3 of Ref. [23] asking for a characterization of $\Gamma$ for which $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$.
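Structural Specker’s principle, in its restated form (every clique of the orthogonality graph is contained in some hyperedge), is a purely combinatorial property and can be tested directly on small scenarios. The following brute-force sketch is illustrative only; the function name and the two example scenarios are assumptions introduced here.

```python
from itertools import combinations

def satisfies_structural_specker(hyperedges):
    """True iff every clique of O(Gamma) is a subset of some hyperedge of Gamma."""
    vertices = sorted(set().union(*hyperedges))
    adjacent = {frozenset(pair) for e in hyperedges for pair in combinations(e, 2)}
    # Single vertices always lie in a hyperedge, so start from pairs.
    for r in range(2, len(vertices) + 1):
        for c in combinations(vertices, r):
            is_clique = all(frozenset(pair) in adjacent for pair in combinations(c, 2))
            if is_clique and not any(set(c) <= e for e in hyperedges):
                return False
    return True

# Hypothetical examples.
triangle = [frozenset("ab"), frozenset("bc"), frozenset("ac")]  # clique {a,b,c} is no hyperedge
path = [frozenset("ab"), frozenset("bc")]                       # every clique is inside a hyperedge
```

The check enumerates all vertex subsets, so it is practical only for scenarios with a small number of vertices.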
3.1.3 What does it mean for an operational theory to
satisfy structural/statistical Specker’s principle?
We have so far defined structural Specker’s principle
as a constraint on Γ and statistical Specker’s princi-
ple as a constraint on a probabilistic model on any Γ.
Any operational theory would typically allow many
possible Γ to be realized by its measurement events
as well as many possible probabilistic models to be re-
alized on any Γ representing its measurement events.
Note that when we say that a particular $\Gamma$ is “realizable” or “allowed” by an operational theory, we mean that there exist measurement events in the operational theory that satisfy the operational equivalences required by $\Gamma$.$^{17}$ Further, given such a $\Gamma$, the realizability of a probabilistic model on it by the operational theory means that there exists a source event in the operational theory that assigns probabilities to the measurement events in $\Gamma$ according to the probabilistic model. It will be useful for our discussion to define what it means for an operational theory, say $T$, to satisfy structural or statistical Specker’s principle. But before we do that, let us formally specify what it means for $T$ to satisfy Specker’s principle:
T satisfies Specker’s principle: An operational theory $T$ is said to satisfy Specker’s principle if, for any set of measurement settings in $T$ that are pairwise jointly implementable, it follows that they are all jointly implementable in $T$.$^{18}$

$^{16}$ This partially answers the open Problem 7.2.3 of Ref. [23].
$^{17}$ Realizability of a particular $\Gamma$ in an operational theory depends on the (in)compatibility relations that the operational theory allows between its measurements (cf. Section 2.4). Recall that incompatibility of measurements is necessary for KS-contextuality to be witnessed and the structure of $\Gamma$ depends on this incompatibility.
$^{18}$ Recall from Section 2.4 the definition of joint implementability (or joint measurability) of some set of measurement settings.
We denote by $\mathcal{T}(\Gamma)$ the set of probabilistic models achievable on $\Gamma$ by an operational theory $T$, i.e., for any $p \in \mathcal{T}(\Gamma)$, we have that $\forall v \in V(\Gamma) : p(v) = p(v|S, s)$ for some source event $[s|S]$ possible in the operational theory $T$.$^{19}$ Since an operational theory can only put further constraints on probabilistic models in $\mathcal{G}(\Gamma)$, we obviously have: $\mathcal{T}(\Gamma) \subseteq \mathcal{G}(\Gamma)$.
1. T satisfies statistical Specker’s principle:
We say an operational theory $T$ satisfies statistical Specker’s principle if $\mathcal{T}(\Gamma) \subseteq \mathcal{CE}^1(\Gamma) \subseteq \mathcal{G}(\Gamma)$ for all $\Gamma$.$^{20}$
Since the satisfaction of statistical Specker’s principle is a constraint on the statistical predictions of $T$, there must be some fact about the structure of theory $T$ that leads to this constraint. This fact enforcing statistical Specker’s principle could be some restriction arising from the structure of allowed measurement events and/or even the structure of allowed preparations in the operational theory $T$. For instance, this is the case for quantum theory when one only considers projective measurements implemented on an arbitrary quantum state, i.e., $\mathcal{Q}(\Gamma) \subseteq \mathcal{CE}^1(\Gamma) \subseteq \mathcal{G}(\Gamma)$, where $\mathcal{Q}(\Gamma)$ denotes the set of probabilistic models that can be obtained in this way. More generally, one could relax the no-restriction hypothesis [3] in some particular way in $T$ so that not all probabilistic models in $\mathcal{G}(\Gamma)$ are allowed in $\mathcal{T}(\Gamma)$. In the case of quantum theory, restricting attention to only projective measurements (as we just pointed out) rather than the more general case allowing arbitrary POVMs is one way of restricting the set of possible probabilistic models realizable with quantum states and measurements to a strict subset of $\mathcal{G}(\Gamma)$. Allowing arbitrary POVMs would lead to a violation of statistical Specker’s principle by probabilistic models arising from quantum theory.$^{21}$
Let us now define what it means for an operational theory $T$ to satisfy structural Specker’s principle.
2. T satisfies structural Specker’s principle:
An operational theory $T$ is said to satisfy structural Specker’s principle if for any set of measurement events that are pairwise jointly measurable, i.e., measurement events in each pair arise as outcomes of some measurement in the theory, it is the case that all the measurement events in the set are jointly measurable, i.e., all the measurement events in the set arise as outcomes of a single measurement in the theory.

$^{19}$ Note that if the operational theory does not admit measurement events (represented by vertices) exhibiting the operational equivalences represented by $\Gamma$ (that is, $T$ does not allow $\Gamma$), then we have that $\mathcal{T}(\Gamma)$ is an empty set.
$^{20}$ That is, instead of considering only a particular probabilistic model on a particular $\Gamma$, we now consider the satisfaction of statistical Specker’s principle by a whole set of probabilistic models, namely, $\mathcal{T}(\Gamma)$, for all $\Gamma$.
$^{21}$ See Appendices A (specifically A.1.2) and C for other consequences of allowing arbitrary POVMs, in particular the trivial ‘classical’ ones.
We now show that a theory T that satisfies
Specker’s principle also satisfies structural Specker’s
principle.
Theorem 1. If an operational theory T satisfies
Specker’s principle, then it also satisfies structural
Specker’s principle.
Proof. The argument here relies on the fact that the
operational theory T is such that measurement set-
tings can be coarse-grained to yield new measurement
settings with fewer outcomes. Operationally, this just
corresponds to binning some subsets of outcomes to-
gether in a measurement procedure. The operational
theories we consider in this paper satisfy this prop-
erty, as outlined in Section 2.3 on coarse-graining.
The argument proceeds, for any Γ realizable in T,
by constructing a set of binary-outcome measurement
settings for any given set of pairwise jointly mea-
surable vertices in Γ. These measurement settings
are, by construction, pairwise jointly measurable, so
Specker’s principle applied to them implies that they
are all jointly measurable. This in turn means that the
pairwise jointly measurable vertices in the given set
are also all realizable as outcomes of a single measure-
ment setting. Hence, the theory T satisfies structural
Specker’s principle. We detail the argument below.
Consider a contextuality scenario $\Gamma$ realizable in $T$. To each vertex $v \in V(\Gamma)$, we can associate a measurement setting $M_v$ with two possible outcomes labelled $\{0, 1\}$ such that $[1|M_v]$ denotes the occurrence of $v$ and $[0|M_v]$ denotes the non-occurrence of $v$, i.e., $p(v|S, s) = p(1|M_v, S, s)$ and $1 - p(v|S, s) = p(0|M_v, S, s)$ for any probabilistic model on $\Gamma$ induced by some source event $[s|S]$. The measurement setting $M_v$ can be obtained in various (operationally equivalent) ways from the hyperedges that $v \in V(\Gamma)$ appears in: for each hyperedge $e \in E(\Gamma)$ such that $v \in e$, we have that the binary-outcome measurement setting consisting of the vertices $\{v, e \setminus v\}$ (where $e \setminus v$ denotes a coarse-graining over all the measurement outcomes of $e$ except $v$) is operationally equivalent to $M_v$.
Now, for any pair of vertices $\{v, v'\}$ that appear in a common hyperedge of $\Gamma$, consider the two corresponding measurement settings $\{M_v, M_{v'}\}$ such that they are jointly measurable and their outcomes are mutually exclusive. The measurement events that can possibly occur in their joint measurement, denoted $M_{vv'}$, are $[10|M_{vv'}]$, $[01|M_{vv'}]$ and $[00|M_{vv'}]$. The probability of $[11|M_{vv'}]$ is always zero, reflecting the fact that $v$ and $v'$ are mutually exclusive. Here, the coarse-graining relations are: $[1|M_v] \equiv [10|M_{vv'}]$, $[1|M_{v'}] \equiv [01|M_{vv'}]$, $[0|M_v] \equiv [00|M_{vv'}] + [01|M_{vv'}]$, $[0|M_{v'}] \equiv [00|M_{vv'}] + [10|M_{vv'}]$.
The joint measurement $M_{vv'}$ can be constructed from any hyperedge that $v$ and $v'$ appear in: for any $e \in E(\Gamma)$ such that $\{v, v'\} \subseteq e$, we have that $[10|M_{vv'}]$ is a measurement event corresponding to $v$,$^{22}$ $[01|M_{vv'}]$ corresponds to $v'$, $[00|M_{vv'}]$ corresponds to $e \setminus \{v, v'\}$ (the coarse-graining of all measurement outcomes in $e$ except $v$ and $v'$), and $[11|M_{vv'}]$ denotes the null event $\emptyset \subseteq e$. This means $p(10|M_{vv'}, S, s) + p(01|M_{vv'}, S, s) + p(00|M_{vv'}, S, s) = p(v|S, s) + p(v'|S, s) + p(e \setminus \{v, v'\}|S, s) = 1$ and $p(11|M_{vv'}, S, s) = 0$ for any probabilistic model (induced by some source event $[s|S]$) on $\Gamma$.
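The coarse-grainings used in this part of the proof can be made concrete at the level of probabilities. The sketch below computes the statistics of the binary setting $M_v$ and of the joint setting $M_{vv'}$ from the probabilities on a single hyperedge; the function names, the dictionary encoding of outcomes, and the numerical example are illustrative assumptions.

```python
from fractions import Fraction

def binary_setting(p, e, v):
    """Statistics of M_v built from hyperedge e: outcome 1 is v, outcome 0 is the rest of e."""
    return {1: p[v], 0: sum(p[u] for u in e if u != v)}

def joint_setting(p, e, v, w):
    """Statistics of M_vv' built from hyperedge e containing both v and w (= v')."""
    rest = sum(p[u] for u in e if u not in (v, w))
    # "11" is the null event: v and w are exclusive, so it never occurs.
    return {"10": p[v], "01": p[w], "00": rest, "11": 0}

# Hypothetical hyperedge with three outcomes and a normalized assignment on it.
e = frozenset("abc")
p = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
m_a = binary_setting(p, e, "a")
m_b = binary_setting(p, e, "b")
m_ab = joint_setting(p, e, "a", "b")
```

The marginals of `m_ab` reproduce `m_a` and `m_b`, which is the sense in which $M_{vv'}$ is a joint measurement of $M_v$ and $M_{v'}$: for instance, outcome 0 of $M_v$ is the coarse-graining of the joint outcomes `"00"` and `"01"`.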
Consider now any set of vertices in $\Gamma$ that is pairwise jointly measurable, denoted $V_{\rm 2JM} \subseteq V(\Gamma)$. We need to show that any such set of vertices $V_{\rm 2JM}$ is jointly measurable, i.e., the theory $T$ realizing $\Gamma$ admits a single measurement such that all the vertices in $V_{\rm 2JM}$ arise as outcomes of this measurement.
Now, the two-outcome measurement settings $\{M_v \mid v \in V_{\rm 2JM}\}$ we have defined are pairwise jointly measurable and as such, following Specker’s principle, they should all be jointly measurable in theory $T$. The joint measurement corresponding to them can be defined as
$$M_{V_{\rm 2JM}} \equiv \left\{ [\vec{b}|M_{V_{\rm 2JM}}] \,\middle|\, \vec{b} \in \{0, 1\}^{V_{\rm 2JM}} \right\}, \tag{39}$$
where each event $[\vec{b}|M_{V_{\rm 2JM}}]$ in the joint measurement $M_{V_{\rm 2JM}}$ represents a particular set of outcomes for measurements in the set $\{M_v \mid v \in V_{\rm 2JM}\}$.
Denoting $V_{\rm 2JM} \equiv \{v_1, v_2, \ldots, v_{|V_{\rm 2JM}|}\}$, we have that
$$\begin{aligned}
[(10 \ldots 0)|M_{V_{\rm 2JM}}] &\equiv [1|M_{v_1}], \\
[(01 \ldots 0)|M_{V_{\rm 2JM}}] &\equiv [1|M_{v_2}], \\
&\;\;\vdots \\
[(00 \ldots 1)|M_{V_{\rm 2JM}}] &\equiv [1|M_{v_{|V_{\rm 2JM}|}}], \\
[(00 \ldots 0)|M_{V_{\rm 2JM}}] &\equiv [0|M_{v_1}] + [0|M_{v_2}] + \cdots + [0|M_{v_{|V_{\rm 2JM}|}}],
\end{aligned} \tag{40}$$
where $[0|M_{v_1}] + [0|M_{v_2}] + \cdots + [0|M_{v_{|V_{\rm 2JM}|}}]$ denotes the measurement event obtained by coarse-graining the measurement events in $\{[0|M_v] \mid v \in V_{\rm 2JM}\}$. All the other measurement events of $M_{V_{\rm 2JM}}$ are null events that never occur, i.e., they are assigned probability zero by every source event. Thus, using Specker’s principle applied to the binary-outcome measurement settings defined for the vertices in $V_{\rm 2JM}$, we have that the pairwise jointly measurable vertices in $V_{\rm 2JM}$ are all jointly measurable, appearing as outcomes of a single measurement $M_{V_{\rm 2JM}}$.

$^{22}$ Recall that every vertex $v \in V(\Gamma)$ is an equivalence class of measurement events $[v|e] \simeq [v|e']$ for all $e, e'$ such that $v \in e$ and $v \in e'$.
Having established Theorem 1, we now proceed to show that a theory which satisfies structural Specker’s principle also satisfies statistical Specker’s principle. To do this, we consider a contextuality scenario $\Gamma$ which may not satisfy structural Specker’s principle and from it construct a contextuality scenario $\Gamma'$ which does satisfy the principle. The construction proceeds as follows:
1. Construct $O(\Gamma)$.
2. Turn each clique in $O(\Gamma)$ that is a hyperedge in $\Gamma$ to a hyperedge in a new hypergraph $\Gamma'$. That is, $\Gamma'$ is such that $V(\Gamma) \subseteq V(\Gamma')$ and $E(\Gamma) \subseteq E(\Gamma')$.
3. Turn each maximal clique $c$ in $O(\Gamma)$ that is not a hyperedge in $\Gamma$ to a hyperedge in $\Gamma'$ and include an additional vertex $v_c$ in this hyperedge. Here, a maximal clique in a graph is a clique that is not a strict subset of another clique, i.e., there is no vertex outside the clique that shares an edge with each vertex in the clique.
We then have for the hyperedges of $\Gamma'$,
$$E(\Gamma') = E(\Gamma) \cup \{c \cup \{v_c\}\}_{c \in C}, \tag{41}$$
where $C$ is the set of maximal cliques in $O(\Gamma)$ that are not hyperedges in $\Gamma$.
Note that as long as a theory $T$ satisfies structural Specker’s principle, converting maximal cliques in $O(\Gamma)$ that are not hyperedges in $\Gamma$ to hyperedges in $\Gamma'$ is a valid move within the theory since the resulting hyperedge would indeed constitute a valid measurement in the theory.
If $C = \emptyset$ (i.e., $\Gamma$ satisfies structural Specker’s principle), then we just have $E(\Gamma') = E(\Gamma)$.
4. The resulting contextuality scenario $\Gamma'$ is thus given by: $V(\Gamma') = V(\Gamma) \cup \{v_c\}_{c \in C}$ and $E(\Gamma') = E(\Gamma) \cup \{c \cup \{v_c\}\}_{c \in C}$.
If $C = \emptyset$ we just have $V(\Gamma') = V(\Gamma)$ and $E(\Gamma') = E(\Gamma)$ so that $\Gamma' = \Gamma$ (i.e., the two hypergraphs are isomorphic).
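The four steps above can be sketched as a small routine that takes the hyperedges of $\Gamma$ and returns those of $\Gamma'$. This is a brute-force illustration under assumed conventions: hyperedges are `frozenset`s, each added vertex $v_c$ is given the hypothetical label `("v_c", ...)`, and the example scenario is not taken from the text.

```python
from itertools import combinations

def maximal_cliques(hyperedges):
    """Maximal cliques of O(Gamma), found by exhaustive search (small scenarios only)."""
    vertices = sorted(set().union(*hyperedges))
    adjacent = {frozenset(pair) for e in hyperedges for pair in combinations(e, 2)}
    found = [
        frozenset(c)
        for r in range(1, len(vertices) + 1)
        for c in combinations(vertices, r)
        if all(frozenset(pair) in adjacent for pair in combinations(c, 2))
    ]
    return [c for c in found if not any(c < d for d in found)]

def extend_scenario(hyperedges):
    """Return (hyperedges of Gamma', {clique c: label of the added vertex v_c})."""
    existing = set(hyperedges)
    new_vertex = {}
    extended = list(hyperedges)          # steps 1-2: keep every hyperedge of Gamma
    for c in maximal_cliques(hyperedges):
        if c not in existing:            # step 3: maximal clique that is not a hyperedge
            label = ("v_c", tuple(sorted(c)))
            new_vertex[c] = label
            extended.append(c | {label})
    return extended, new_vertex

# Hypothetical scenario: hyperedges {a,b,x}, {b,c,y}, {a,c,z}; {a,b,c} is a maximal clique.
gamma = [frozenset("abx"), frozenset("bcy"), frozenset("acz")]
gamma_prime, new_vertex = extend_scenario(gamma)
```

In this example the set $C$ contains the single clique {a, b, c}, so $\Gamma'$ has one extra vertex and one extra four-element hyperedge.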
Our construction of $\Gamma'$ leads to the following properties:
• $\Gamma'$ satisfies structural Specker’s principle (by construction) since every clique in $O(\Gamma')$ is a subset of some hyperedge in $\Gamma'$. Hence, it’s also the case that statistical Specker’s principle holds for probabilistic models on $\Gamma'$ as $\mathcal{CE}^1(\Gamma') = \mathcal{G}(\Gamma')$.
Note that the construction of $\Gamma'$ relied on the fact that the theory we are considering satisfies structural Specker’s principle. If the theory doesn’t satisfy this principle, but one goes ahead with the construction of $\Gamma'$, then the new hyperedges in $\Gamma'$ may not constitute valid measurements in the theory.
• Probabilistic models in $\mathcal{G}(\Gamma')$ are in bijective correspondence with probabilistic models in $\mathcal{CE}^1(\Gamma)$: for any probabilistic model $p_{\Gamma} \in \mathcal{CE}^1(\Gamma)$, there exists a unique probabilistic model $p_{\Gamma'} \equiv f(p_{\Gamma}) \in \mathcal{G}(\Gamma')$, where the function $f$ is given by $p_{\Gamma'}(v) \equiv f(p_{\Gamma})(v) = p_{\Gamma}(v)$ for all $v \in V(\Gamma)$ and $p_{\Gamma'}(v_c) \equiv f(p_{\Gamma})(v_c) = 1 - \sum_{v \in c} p_{\Gamma}(v)$ for all $c \in C$.$^{23}$ Similarly, for any $p_{\Gamma'} \in \mathcal{G}(\Gamma')$, there exists a unique probabilistic model $p_{\Gamma} \equiv g(p_{\Gamma'}) \in \mathcal{CE}^1(\Gamma)$ given by $p_{\Gamma}(v) \equiv g(p_{\Gamma'})(v) = p_{\Gamma'}(v)$ for all $v \in V(\Gamma)$, i.e., we simply ignore the probabilities assigned to the vertices $v_c \in V(\Gamma') \setminus V(\Gamma)$ which do not appear in $\Gamma$. Now note that the functions $f$ and $g$ are inverses of each other: $g(f(p_{\Gamma})) = g(p_{\Gamma'}) = p_{\Gamma}$ and $f(g(p_{\Gamma'})) = f(p_{\Gamma}) = p_{\Gamma'}$. Hence, there is a bijective correspondence between $\mathcal{G}(\Gamma')$ and $\mathcal{CE}^1(\Gamma)$.
Hence, the set of probabilistic models on $\Gamma$ that satisfy statistical Specker’s principle, i.e., $\mathcal{CE}^1(\Gamma)$, are in one-to-one correspondence with the set of probabilistic models on $\Gamma'$ which (by construction) satisfies structural Specker’s principle so that $\mathcal{CE}^1(\Gamma') = \mathcal{G}(\Gamma')$.
• We therefore have that $\mathcal{CE}^1(\Gamma) = \mathcal{CE}^1(\Gamma')|_{V(\Gamma)}$, where $\mathcal{CE}^1(\Gamma')|_{V(\Gamma)}$ denotes the probabilistic models induced on $\Gamma$ by those on $\Gamma'$ (ignoring the probabilities assigned to vertices in $V(\Gamma') \setminus V(\Gamma)$).
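The maps $f$ and $g$ described in the list above act on probability assignments in a simple way, which the following sketch makes explicit. It assumes a probabilistic model is stored as a dictionary from vertex labels to exact rationals and that the added vertices are recorded in a dictionary `new_vertex` from each clique $c \in C$ to the label of $v_c$; these encodings, the label `"v_abc"`, and the numerical model are illustrative assumptions.

```python
from fractions import Fraction

def f(p, new_vertex):
    """Extend p on Gamma to Gamma': each added vertex v_c gets 1 - sum_{v in c} p(v)."""
    q = dict(p)
    for c, label in new_vertex.items():
        q[label] = 1 - sum(p[v] for v in c)
    return q

def g(q, new_vertex):
    """Restrict q on Gamma' to Gamma by dropping the probabilities of the added vertices."""
    added = set(new_vertex.values())
    return {v: x for v, x in q.items() if v not in added}

# Hypothetical data: Gamma has hyperedges {a,b,x}, {b,c,y}, {a,c,z}; C = {{a,b,c}}.
new_vertex = {frozenset("abc"): "v_abc"}
quarter, half = Fraction(1, 4), Fraction(1, 2)
p_gamma = {"a": quarter, "b": quarter, "c": quarter, "x": half, "y": half, "z": half}
p_gamma_prime = f(p_gamma, new_vertex)
```

Here `p_gamma` assigns total probability 3/4 to the clique {a, b, c}, so $f$ gives the added vertex probability 1/4, which is non-negative precisely because `p_gamma` satisfies consistent exclusivity; applying $g$ afterwards recovers `p_gamma`.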
It is conceivable that a particular $\Gamma$ may not admit probabilistic models from an operational theory $T$, i.e., $\mathcal{T}(\Gamma) = \emptyset$. On the other hand, if $\Gamma$ admits a representation in terms of measurement events admissible in $T$, so that $\mathcal{T}(\Gamma) \neq \emptyset$, then two possibilities arise: $\Gamma$ satisfies structural Specker’s principle or it doesn’t. If $\Gamma$ satisfies structural Specker’s principle then any probabilistic model in $\mathcal{T}(\Gamma)$ will satisfy statistical Specker’s principle and we have $\Gamma' = \Gamma$. If $\Gamma$ does not satisfy structural Specker’s principle, we consider its relation with the contextuality scenario $\Gamma'$ constructed from it that does satisfy structural Specker’s principle. Such a $\Gamma'$ admits a representation in a theory $T$ satisfying structural Specker’s principle (that is, $\mathcal{T}(\Gamma') \neq \emptyset$) as long as $\Gamma$ admits such a representation (that is, $\mathcal{T}(\Gamma) \neq \emptyset$). Indeed, it’s the satisfaction of structural Specker’s principle in $T$ that renders the construction of $\Gamma'$ from $\Gamma$ physically allowed in $T$.
Thus, in a theory $T$ that satisfies structural Specker’s principle, the following holds: for every probabilistic model $p_{\Gamma} \in \mathcal{T}(\Gamma)$ $(\subseteq \mathcal{CE}^1(\Gamma))$, $\Gamma'$ admits a corresponding probabilistic model $p_{\Gamma'} \in \mathcal{T}(\Gamma')$ satisfying $p_{\Gamma'}(v) = p_{\Gamma}(v)$ for all $v \in V(\Gamma)$ and $p_{\Gamma'}(v_c) = 1 - \sum_{v \in c} p_{\Gamma}(v)$ for all $c \in C$, where $C$ is the set of maximal cliques in $O(\Gamma)$ such that none of them is a hyperedge in $\Gamma$. Similarly, given $p_{\Gamma'} \in \mathcal{T}(\Gamma')$ $(\subseteq \mathcal{CE}^1(\Gamma'))$, $p_{\Gamma} \in \mathcal{T}(\Gamma)$ is uniquely fixed: it’s obtained by just neglecting the probabilities assigned by $p_{\Gamma'}$ to the vertices in $V(\Gamma') \setminus V(\Gamma)$.

$^{23}$ Recall that $\{v_c\}_{c \in C} = V(\Gamma') \setminus V(\Gamma)$.
We must therefore have $\mathcal{T}(\Gamma) = \mathcal{T}(\Gamma')|_{V(\Gamma)}$ for any $\Gamma$, where $\mathcal{T}(\Gamma')|_{V(\Gamma)}$ denotes the set of probabilistic models induced on $\Gamma$ by the set of probabilistic models in $\mathcal{T}(\Gamma')$ under the correspondence we have already established above. We can now state and prove the following theorem:
Theorem 2. If an operational theory T satisfies
structural Specker’s principle, then it also satisfies
statistical Specker’s principle.
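Proof. For any $\Gamma$ that does not admit a probabilistic model in $T$, i.e., $\mathcal{T}(\Gamma) = \emptyset$, statistical Specker’s principle is trivially satisfied since $\mathcal{T}(\Gamma) = \emptyset \subseteq \mathcal{CE}^1(\Gamma) \subseteq \mathcal{G}(\Gamma)$.
For any $\Gamma$ that does admit a probabilistic model in $T$, i.e., $\mathcal{T}(\Gamma) \neq \emptyset$, we can have one of two possibilities: either it satisfies structural Specker’s principle, in which case $\mathcal{T}(\Gamma) \subseteq \mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$, or it doesn’t, in which case we consider the $\Gamma'$ constructed from it following the recipe we have already outlined so that we have:
$$\mathcal{T}(\Gamma')|_{V(\Gamma)} \subseteq \mathcal{G}(\Gamma')|_{V(\Gamma)} = \mathcal{CE}^1(\Gamma')|_{V(\Gamma)} = \mathcal{CE}^1(\Gamma).$$
Since $T$ satisfies structural Specker’s principle, we have $\mathcal{T}(\Gamma) = \mathcal{T}(\Gamma')|_{V(\Gamma)}$, which immediately implies that $\mathcal{T}(\Gamma) \subseteq \mathcal{CE}^1(\Gamma)$. That is, the theory $T$ satisfies statistical Specker’s principle on $\Gamma$: $\mathcal{T}(\Gamma) \subseteq \mathcal{CE}^1(\Gamma) \subseteq \mathcal{G}(\Gamma)$.
Overall, we have the desired result: $T$ satisfies structural Specker’s principle $\Rightarrow$ $\mathcal{T}(\Gamma) \subseteq \mathcal{CE}^1(\Gamma) \subseteq \mathcal{G}(\Gamma)$ for all $\Gamma$, i.e., $T$ satisfies statistical Specker’s principle.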
Thus, one way of enforcing that a particular operational theory $T$ satisfies statistical Specker’s principle (that is, $\mathcal{T}(\Gamma) \subseteq \mathcal{CE}^1(\Gamma) \subseteq \mathcal{G}(\Gamma)$ for all $\Gamma$) is to require that it satisfies structural Specker’s principle, a constraint on the structure of measurement events in $T$. This is, for example, what is achieved in Ref. [30] by invoking a notion of “sharpness” for measurement events in an operational theory such that any set of sharp measurement events that are pairwise jointly measurable are all jointly measurable. That is, structural Specker’s principle is satisfied in a theory with such sharp measurement events and, consequently, statistical Specker’s principle, or what is more conventionally called consistent exclusivity [23], is also satisfied. But it’s conceivable that there may be other ways to ensure that only a subset of $\mathcal{CE}^1(\Gamma)$ probabilistic models are allowed in $\mathcal{T}(\Gamma)$ for any $\Gamma$. What we wish to emphasize here is that it is by no means obvious (or at least, it needs to be proven) that the only way to restrict the set of probabilistic models $\mathcal{T}(\Gamma)$ to a subset of $\mathcal{CE}^1(\Gamma)$ for any $\Gamma$ is to require that the theory $T$ satisfy structural Specker’s principle.$^{24}$
Corollary 1. For any operational theory $T$, the following implications hold:
$$T \text{ satisfies Specker’s principle} \;\Rightarrow\; T \text{ satisfies structural Specker’s principle} \tag{42}$$
$$\Rightarrow\; T \text{ satisfies statistical Specker’s principle, i.e., consistent exclusivity.} \tag{43}$$
Proof. This follows from combining Theorems 1 and
2.
Note that statistical Specker’s principle (or consis-
tent exclusivity) is so intrinsic to the CSW approach
[22] that they do not consider probabilistic models
that do not satisfy this principle.
25
This will become
important when we consider the fact that nonprojec-
tive measurements in quantum theory do not satisfy
Specker’s principle, structural or statistical (at the
level of measurement events), and thus also fail to
satisfy the stronger statement of Specker’s principle
for measurement settings (cf. Ref. [49]). Indeed, such
measurements admit contextuality scenarios Γ that
are not possible with projective measurements, such
as the one from three binary-outcome POVMs that
are pairwise jointly measurable but not triplewise so
[3941], and the probabilistic models they give rise to
can only be accommodated in the most general set
of probabilistic models, G(Γ), since trivial POVMs
can realize any probabilistic model. Specker’s prin-
ciple, structural Specker’s principle, and statistical
Specker’s principle were all motivated by the fact that
projective measurements in quantum theory satisfy
them. In particular, consistent exclusivity (or sta-
tistical Specker’s principle) would be obeyed in any
theory where measurement events satisfy structural
Specker’s principle, and indeed, the more recent ap-
proach [29] is to restrict attention to “sharp” mea-
surements in such theories [30, 31], where the def-
inition of “sharp” ensures the property of pairwise
jointly measurable events being globally jointly mea-
surable. This property forms the motivational basis
24
Indeed, any putative theory yielding the set of almost quantum correlations (which satisfy statistical Specker’s principle) [50] cannot satisfy Specker’s principle (that pairwise jointly implementable measurement settings are all jointly implementable) for any notion of sharp measurements [49]. Whether structural Specker’s principle, which is defined at the level of measurement events, can be upheld for an almost quantum theory, so that it falls in the category of operational theories with sharp measurements envisaged in Ref. [30], remains an open question.
25
As we have already noted, a noise-robust noncontextuality
inequality of the type in Ref. [12] that is based on a logical
proof of the KS theorem is not even obtainable if one restricted
attention to probabilistic models satisfying $\mathcal{CE}^1$. The upper bound on that inequality comes from a probabilistic model that does not satisfy $\mathcal{CE}^1$.
Accepted in Quantum 2019-09-01, click title to verify. Published under CC-BY 4.0. 19
(and is sufficient) for statistical Specker’s principle to
hold (cf. Theorem 2). That is, this approach [29, 30]
regards statistical Specker’s principle as grounded in
(and physically justified by) structural Specker’s principle. Theorem 2 is a precise statement of this intuition in the hypergraph formalism à la AFLS [23].
The work of Refs. [30, 31] can be understood as bridg-
ing the gap between structural Specker’s principle and
statistical Specker’s principle by formally defining a
notion of sharp measurements in an operational the-
ory such that structural Specker’s principle holds for
these sharp measurements.
On the other hand, and this is the key point for
our purposes, if one wants to make no commitment
about the representation of measurements in the op-
erational theory (in particular, not requiring a notion
of “sharpness”), then Specker’s principle is not a nat-
ural constraint to impose on probabilistic models and,
indeed, one must deal with the full set of probabilistic
models G(Γ) on any contextuality scenario Γ rather
than restrict oneself to the set of probabilistic models
$\mathcal{CE}^1(\Gamma)$. It is for this reason that we are translating the notions from CSW [22] to the notational conventions of AFLS [23], the latter being a more natural choice for our purposes, allowing the language needed to articulate the difference between $\mathcal{CE}^1(\Gamma)$ and $\mathcal{G}(\Gamma)$
rather than excluding the latter by fiat or, perhaps, by
an appeal to structural Specker’s principle holding for
sharp measurements in the landscape of operational
theories under consideration (cf. Theorem 2). It is for
all these reasons that the “exclusivity principle” à la
CSW [22] is not enough to make sense of Spekkens
contextuality applied to Kochen-Specker type scenar-
ios. The framework we propose in this paper ad-
dresses this gap between the notions Spekkens con-
textuality (which applies to arbitrary measurements)
requires in a hypergraph framework and those that
the CSW framework [22] (which applies to “sharp”
measurements) can provide in its graph-theoretic for-
mulation.
3.1.4 Remark on the classification of probabilistic models: why we haven’t defined “quantum models” as those obtained from projective measurements
The reader may note that we haven’t tried to de-
fine any notion of a “quantum model” so far, hav-
ing only adopted the definitions of Ref. [23] for KS-
noncontextual models (C(Γ)), for models satisfying
consistent exclusivity ($\mathcal{CE}^1(\Gamma)$), and for general probabilistic models ($\mathcal{G}(\Gamma)$). The reason for this is that
we do not wish to restrict ourselves to projective mea-
surements in defining a “quantum model”, unlike the
traditional Kochen-Specker approaches [22, 23]. In
Ref. [23], a quantum model is defined as a probabilis-
tic model that can be realized in the following manner:
assign projectors $\{\Pi_v\}_{v \in V(\Gamma)}$ (defined on any Hilbert space) to all the vertices of Γ such that $\sum_{v \in e} \Pi_v = I$ for all $e \in E(\Gamma)$, and we have $p(v) = \mathrm{Tr}(\rho\, \Pi_v)$, for
some density operator ρ on the Hilbert space, I being
the identity operator.
On the other hand, allowing arbitrary positive
operator-valued measures (POVMs) in a definition
of a quantum model (as we would rather prefer)
means that, in fact, quantum models on a hyper-
graph Γ are as general as the general probabilistic
models G(Γ), rendering such a definition redundant.
This can be seen by noting that for any probabilistic
model $p \in \mathcal{G}(\Gamma)$, one can associate positive operators to the vertices of Γ given by $p(v)I$ such that for
any quantum state ρ on some Hilbert space, we have
p(v) = Tr(ρp(v)I), where I is the identity operator.
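The trivial-POVM argument above can be checked numerically. The following is a minimal sketch, assuming a hypothetical three-vertex "triangle" scenario (hyperedges {a,b}, {b,c}, {a,c}) and an arbitrarily chosen qubit state; neither appears in the text. The model p(a) = p(b) = p(c) = 1/2 is a valid probabilistic model on this hypergraph even though its three pairwise exclusive vertices have total probability 3/2, so it lies outside the consistent-exclusivity set, yet the operators p(v)I reproduce it for the chosen state.

```python
from fractions import Fraction as F

def matmul(A, B):
    """Multiply two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    """Trace of a square matrix given as nested lists."""
    return sum(A[i][i] for i in range(len(A)))

def trivial_povm(p, dim):
    """Map each vertex v to the positive operator p(v) * I (a trivial effect)."""
    return {v: [[pv if i == j else F(0) for j in range(dim)]
                for i in range(dim)] for v, pv in p.items()}

# Hypothetical scenario and model (illustration only, not from the paper).
hyperedges = [("a", "b"), ("b", "c"), ("a", "c")]
p = {"a": F(1, 2), "b": F(1, 2), "c": F(1, 2)}

# An arbitrary valid qubit density matrix (unit trace, positive determinant).
rho = [[F(3, 4), F(1, 4)], [F(1, 4), F(1, 4)]]
identity = [[F(1), F(0)], [F(0), F(1)]]

effects = trivial_povm(p, dim=2)

# Born-rule probabilities Tr(rho * p(v) I) for every vertex.
born = {v: trace(matmul(rho, effects[v])) for v in p}

# The effects in each hyperedge must sum to the identity (a valid POVM).
normalized = all(
    [[sum(effects[v][i][j] for v in e) for j in range(2)] for i in range(2)]
    == identity
    for e in hyperedges)
```

Here `born` coincides with `p` and `normalized` is true, independently of which density matrix is chosen, since Tr(ρ) = 1.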
Our focus in this paper is not on quantum the-
ory, in particular, even though the need to be able
to handle noisy measurements and preparations (par-
ticularly, trivial POVMs) in quantum theory can be
taken as a motivation for this work. Rather, our focus
is on delineating the boundary between operational
theories that admit noncontextual ontological mod-
els (for Kochen-Specker type experiments, suitably
augmented with multiple preparation procedures, as
outlined in this paper) and those that don’t by ob-
taining noise-robust noncontextuality inequalities. In
particular, we want these inequalities to indicate the
noise thresholds beyond which an experiment cannot
rule out the existence of a noncontextual ontological
model with respect to the quantities of interest. This
also means that making sense of quantum correlations
in this approach requires one to pay attention not only
to the measurements involved in an experiment but
also the preparations; indeed, this shift of focus from
measurements alone, to include multiple preparations
(or source settings), is a fundamental conceptual dif-
ference between our approach and that of traditional
Kochen-Specker contextuality frameworks [22, 23, 25].
3.1.5 Scope of this framework
Note that whenever we refer to the “CSW frame-
work”, we mean the framework of Ref. [22], which
often differs from the framework of Ref. [21] in some
respects, e.g., the normalization of probabilities in a
given hyperedge, assumed in [22], but not in [21]. In
Ref. [21], the authors write:
Notice that in all of the above we never
require that any particular context should be
associated to a complete measurement: the
conditions only make sure that each context
is a subset of outcomes of a measurement
and that they are mutually exclusive. Thus,
unlike the original KS theorem, it is clear
that every context hypergraph Γ has always
a classical noncontextual model, besides pos-
sibly quantum and generalized models.
On the other hand, in Ref. [22], they write:
The fact that the sum of probabilities
of outcomes of a test is 1 can be used to
Figure 4: The KS-uncolourable hypergraph from Ref. [51] that is not covered by our generalization of the CSW framework. We denote this hypergraph as $\Gamma_{18}$.
express these correlations as a positive linear combination of probabilities of events, $S = \sum_i w_i P(e_i)$, with $w_i > 0$.
The latter presentation [22] is more in line with the
“original KS theorem” [19], as well as the presenta-
tion in Ref. [23]. Since normalization of probabili-
ties is thus presumed in Ref. [22], in keeping with the
definition of a probabilistic model we have presented
(following [23]), the graph invariants of CSW [22] re-
fer, specifically, to subgraphs G of those hypergraphs
Γ on which the set of KS-noncontextual probabilistic
models is non-empty. In particular, our generaliza-
tion of the CSW framework [22] in this paper says
nothing about noise-robust noncontextuality inequal-
ities from logical proofs of the Kochen-Specker the-
orem [19], which rely on hypergraphs Γ that admit
no KS-noncontextual probabilistic models, i.e., KS-
uncolourable hypergraphs. It also says nothing for
the hypergraphs Γ that do not satisfy the property
$\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$. An example of such a hypergraph,
which is not covered by our generalization of the CSW
framework on both counts, is the 18 ray hypergraph
first presented in Ref. [51], denoted $\Gamma_{18}$ (see Fig. 4 and
Appendix D). Indeed, the study of noise-robust non-
contextuality inequalities from such KS-uncolourable
hypergraphs was initiated in Ref. [12], and a more ex-
haustive hypergraph-theoretic treatment of it is pre-
sented in Ref. [34]. In this paper, we will restrict
ourselves to KS-colourable hypergraphs, the study
of which was initiated in Ref. [16], and, of these,
only those KS-colourable hypergraphs Γ which satisfy
$\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$. Note that this is not a limitation of
our general approach, which is based on Ref. [16] and
applies to any KS-colourable hypergraph, but rather a
limitation we inherit from the CSW framework [22]
26
since we want to leverage their graph invariants in obtaining our noise-robust noncontextuality inequalities. The study of other KS-colourable hypergraphs, in particular those which arise only with nonprojective measurements in quantum theory [39–41] and are outside the scope of traditional frameworks [22, 23, 25], will be taken up in future work.
To summarize, the measurement events hypergraphs Γ where the present framework (and the CSW framework [22]) applies must satisfy two properties: $\mathcal{C}(\Gamma) \neq \varnothing$ (that is, KS-colourability) and $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$.
27
In the next subsection, we define additional notions necessary to obtain noise-robust noncontextuality inequalities that make use of graph invariants from the CSW framework. These notions correspond to source events that are an integral part of our framework.
3.2 Sources
Having introduced the (hyper)graph-theoretic elements that we need to talk about measurement events, we are now in a position to introduce features of source events that are relevant in the Spekkens framework. This part of our framework has no precedent in the literature on KS-noncontextuality, in particular the CSW framework [22]. We introduce these source events in order to benchmark the measurement events against them, i.e., for every measurement event, we seek to identify in the operational theory a corresponding source event that makes this measurement event as likely as possible. This helps us deal with cases where a measurement device may be implementing very noisy measurements by explicitly accounting for this noise in our noise-robust noncontextuality inequalities. Further, while we do not assume outcome determinism (which is essential to KS-noncontextuality), we will invoke preparation noncontextuality with respect to these source events in the Spekkens framework [18]. As an example of what we mean by “benchmarking” a measurement event against a source event, consider the case of quantum
26
Ref. [22] takes Specker’s principle to be fundamental and identifies $\mathcal{CE}^1(\Gamma_{18})$ as the most general set of probabilistic models, which is not the case for $\Gamma_{18}$ (for example). See Appendix D for a detailed discussion of this point.
27
As we have shown, when the operational theory T under consideration satisfies structural Specker’s principle, we can always turn a hypergraph Γ that doesn’t satisfy structural Specker’s principle into a hypergraph $\Gamma'$ that satisfies it and for which, therefore, $\mathcal{CE}^1(\Gamma') = \mathcal{G}(\Gamma')$ holds. This can be seen as justification for restricting oneself to probabilistic models satisfying consistent exclusivity in the CSW framework [22]: such a restriction is not really a restriction if the theory satisfies structural Specker’s principle. On the other hand, we restrict ourselves to hypergraphs for which $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$ without assuming that T satisfies structural Specker’s principle. The justification for this seemingly ad hoc restriction is simply that it is necessary in order to meaningfully leverage the graph invariants of CSW [22] (in particular, the fractional packing number) in our noise-robust noncontextuality inequalities. This will become clear when we obtain our noise-robust noncontextuality inequalities.
theory, where any measurement event represented by
a projector occurs with probability 1 for any source
event that is represented by an eigenstate of this
projector; on the other hand, a positive operator
that isn’t projective cannot occur with a probabil-
ity greater than its largest eigenvalue (< 1) for any
source event. We now proceed to describe the neces-
sary hypergraph-theoretic ingredients we need to ac-
commodate source events in our framework.
As we have argued previously, we require the measurement events hypergraph Γ to be such that $\mathcal{C}(\Gamma) \neq \varnothing$ and $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$ to be able to obtain noise-robust noncontextuality inequalities that use graph
invariants from the CSW framework [22]. Hence, we
will restrict ourselves to experiments that realize the
operational equivalences represented by this class of
Γ. Now, in the CSW framework [22], every Bell-KS
expression picks out a particular subgraph G of the
orthogonality graph O(Γ) of the contextuality sce-
nario Γ of interest. This amounts to focussing on
a restricted set of probabilities (for the vertices of
G) rather than probabilities for all the measurement
events (represented by vertices of Γ) in the experi-
ment. Hence, the vertices of G denote the measure-
ment events of interest in a given Bell-KS expression
and we have the following:
• A general probabilistic model $p \in \mathcal{G}(\Gamma)$ will assign probabilities to vertices in G such that: $p(v) \geq 0$ for all $v \in V(G)$ and $p(v) + p(v') \leq 1$ for every edge $\{v, v'\} \in E(G)$.
• A probabilistic model $p \in \mathcal{CE}^1(\Gamma)$ will assign probabilities to vertices in G such that: $p(v) \geq 0$ for all $v \in V(G)$ and
$$\sum_{v \in c} p(v) \leq 1, \tag{44}$$
for every clique $c \subseteq V(G)$.
• A probabilistic model $p \in \mathcal{C}(\Gamma)$ will assign probabilities to vertices in G such that: $p(v) = \sum_k \mathrm{Pr}(k)\, p_k(v)$, where $\mathrm{Pr}(k) \geq 0$, $\sum_k \mathrm{Pr}(k) = 1$, and for each k, $p_k$ is a deterministic assignment $p_k(v) \in \{0, 1\}$ for all $v \in V(G)$, and $p_k(v) + p_k(v') \leq 1$ for every edge $\{v, v'\} \in E(G)$.
Since Γ is such that $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$, the condition
$$\sum_{v \in c} p(v) \leq 1 \quad \text{for every clique } c \subseteq V(G)$$
on the probabilities assigned to vertices in G is redundant. We now obtain a simplified hypergraph, $\Gamma_G$, from G as follows: convert all maximal cliques in G to hyperedges and add an extra (no-detection) vertex to each such hyperedge.
28
28
Physically, a “no-detection” vertex denotes the case when
none of the measurement events of interest (here, the events in
G) for a given measurement setting occur.
Figure 5: The hypergraph $\Gamma_G$ obtained from G by adding a no-detection vertex (represented by a hollow circle) to every maximal clique in G.
This $\Gamma_G$, for any G, will satisfy the property that $\mathcal{CE}^1(\Gamma_G) = \mathcal{G}(\Gamma_G)$ and any probabilistic model on Γ assigning probabilities to measurement events in G will correspond to a probabilistic model on $\Gamma_G$ which also assigns the same probabilities to measurement events in G. Formally:
$$V(\Gamma_G) \equiv V(G) \sqcup \{v_c \,|\, c \text{ is a maximal clique in } G\},$$
and
$$E(\Gamma_G) \equiv \{c \sqcup \{v_c\} \,|\, c \text{ is a maximal clique in } G\},$$
where $v_c$ is the extra no-detection vertex added to the hyperedge corresponding to maximal clique c in G.
We have the following probabilistic model on $\Gamma_G$, given a probabilistic model $p \in \mathcal{G}(\Gamma)$: the probabilities assigned to the vertices in $V(G) \subset V(\Gamma_G)$ are the same as specified by $p \in \mathcal{G}(\Gamma)$ and the probabilities assigned to the remaining vertices in $V(\Gamma_G) \backslash V(G)$ are given by $p(v_c) = 1 - \sum_{v \in c} p(v)$, for every maximal clique c in G. Consider, for example, the KCBS scenario [16, 22, 47]: the 20-vertex Γ representing measurement events from five 4-outcome joint measurements (Fig. 2), the 5-vertex graph G involved in the KCBS inequality (Fig. 3), and the 10-vertex hypergraph $\Gamma_G$ constructed from G (Fig. 5).
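The construction of the simplified hypergraph from G can be sketched in a few lines. This is a minimal illustration, assuming G is given as a vertex list and an edge list; the function names and the tuple label used for the no-detection vertices are hypothetical choices, not notation from the paper. Maximal cliques are enumerated with the basic Bron–Kerbosch recursion, each clique receives one extra no-detection vertex, and a model on V(G) is extended by assigning the leftover probability to that vertex.

```python
from fractions import Fraction as F

def maximal_cliques(vertices, edges):
    """Enumerate maximal cliques with Bron-Kerbosch (no pivoting)."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    cliques = []

    def expand(R, P, X):
        if not P and not X:
            cliques.append(frozenset(R))
            return
        for v in list(P):
            expand(R | {v}, P & adj[v], X & adj[v])
            P = P - {v}
            X = X | {v}

    expand(set(), set(vertices), set())
    return cliques

def build_gamma_G(vertices, edges):
    """Turn each maximal clique c into the hyperedge c + {no-detection vertex}."""
    hyperedges = []
    for c in sorted(maximal_cliques(vertices, edges), key=sorted):
        no_detection = ("nd", tuple(sorted(c)))  # hypothetical label for v_c
        hyperedges.append(frozenset(c) | {no_detection})
    all_vertices = set(vertices) | {v for e in hyperedges for v in e}
    return all_vertices, hyperedges

def extend_model(p, hyperedges):
    """Assign p(v_c) = 1 - sum_{v in c} p(v) to every no-detection vertex."""
    extended = dict(p)
    for e in hyperedges:
        nd = next(v for v in e if v not in p)
        extended[nd] = 1 - sum(p[v] for v in e if v in p)
    return extended

# KCBS example: G is the 5-cycle, so its maximal cliques are its 5 edges.
V = list(range(5))
E = [(i, (i + 1) % 5) for i in range(5)]
gamma_vertices, gamma_edges = build_gamma_G(V, E)

# Extend the model assigning 1/2 to every vertex of G.
p_ext = extend_model({v: F(1, 2) for v in V}, gamma_edges)
```

For the 5-cycle this yields 10 vertices and 5 hyperedges of 3 vertices each, matching the vertex count quoted for the KCBS scenario, and the extended model is normalized on every hyperedge.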
Given $\Gamma_G$, constructed from G, we now require that the operational theory that realizes measurement events in $\Gamma_G$ also admits preparations that can be represented by a hypergraph $\Sigma_G$ of source events as follows: for every hyperedge $e \in E(\Gamma_G)$, corresponding to the choice of measurement setting $M_e$, we define a hyperedge $e \in E(\Sigma_G)$ denoting a corresponding choice of source setting $S_e$. And for every vertex $v \in e$ ($\in E(\Gamma_G)$), we define a vertex $v_e \in e$ ($\in E(\Sigma_G)$).
29
Hence, every measurement event
29
Recall from the discussion at the beginning of Section 3.2
that we seek to benchmark the measurement events against
those source events in the operational theory that (ideally)
make them as predictable as possible. The source setting
against which the predictability of a particular measurement
Figure 6: The source events hypergraph with the operational
equivalences between the source settings separately specified.
[v|e] in $\Gamma_G$ corresponds to a vertex $v_e$ of $\Sigma_G$, and the number of such vertices in $V(\Sigma_G)$ is $|V(\Gamma_G)||E(\Gamma_G)|$. This means that the operational equivalences between the measurement events that are implicit in $\Gamma_G$, such as [v|e] being operationally equivalent to [v|e′], where $e, e' \in E(\Gamma_G)$ are distinct hyperedges that share the vertex (representing an equivalence class of measurement events) $v \in V(\Gamma_G)$, are not carried over to the source events, where none is presumed to be operationally equivalent to any other, hence $v_e \in V(\Sigma_G)$ is a different vertex from $v_{e'} \in V(\Sigma_G)$. Here $v_e$ ($v_{e'}$) represents a source event $[s_e|S_e]$ ($[s_{e'}|S_{e'}]$), rather than an equivalence class of source events.
Besides these $|V(\Gamma_G)||E(\Gamma_G)|$ vertices in $V(\Sigma_G)$ and the associated hyperedges $e \in E(\Sigma_G)$, we require that the operational theory admits an additional hyperedge $e_* \in E(\Sigma_G)$, representing a source setting $S_{e_*}$, containing two new vertices $v^0_{e_*}, v^1_{e_*} \in V(\Sigma_G)$. Here $v^0_{e_*}$ represents the source event $[s_{e_*} = 0|S_{e_*}]$ and $v^1_{e_*}$ represents the source event $[s_{e_*} = 1|S_{e_*}]$. Hence, we have $|V(\Sigma_G)| = |V(\Gamma_G)||E(\Gamma_G)| + 2$ and $|E(\Sigma_G)| = |E(\Gamma_G)| + 1$.
The operational equivalence we do require for $\Sigma_G$ (in any operational theory that admits source events represented by $\Sigma_G$) applies to the source settings: all source settings, each represented by coarse-graining the source events in a hyperedge $e \in E(\Sigma_G)$, are operationally equivalent, i.e., $[\top|S_{e_\top}] \simeq [\top|S_{e'_\top}]$ for all $e, e' \in E(\Sigma_G)$, i.e.,
$$\forall [m|M]: \quad \sum_{s_e} p(m, s_e|M, S_e) = \sum_{s_{e'}} p(m, s_{e'}|M, S_{e'}), \quad \text{for all } e, e' \in E(\Sigma_G).$$
An example of such a source events hypergraph was
considered in Ref. [12], albeit without the additional
setting is tested, that is, the predictability of each measurement event (e.g., $v \in e$ ($\in E(\Gamma_G)$)) for this measurement setting (e.g., $M_e$) is benchmarked against some source event (e.g., $v_e \in e$ ($\in E(\Sigma_G)$)) for the source setting (e.g., $S_e$), is the “corresponding choice of source setting $S_e$”. In Section 5.2 we will see how these pairs of source and measurement settings are used to compute an operational quantity relevant for our noise-robust noncontextuality inequalities.
source labelled by $e_*$ here [16]. We illustrate it here
in Fig. 6 for the KCBS scenario.
4 A key hypergraph invariant: the
weighted max-predictability
We now define a hypergraph invariant that will be rel-
evant for our noise-robust noncontextuality inequali-
ties:
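$$\beta(\Gamma_G, q) \equiv \max_{p \in \mathcal{G}(\Gamma_G)|_{\rm ind}} \sum_{e \in E(\Gamma_G)} q_e\, \zeta(M_e, p), \tag{45}$$
where $q_e \geq 0$ for all $e \in E(\Gamma_G)$, $\sum_{e \in E(\Gamma_G)} q_e = 1$, and
$$\zeta(M_e, p) \equiv \max_{v \in e} p(v)$$
is the maximum probability assigned to a vertex in $e \in E(\Gamma_G)$ by an extremal indeterministic probabilistic model $p \in \mathcal{G}(\Gamma_G)|_{\rm ind}$.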
30
We call $\beta(\Gamma_G, q)$ the weighted max-predictability of the measurement settings (i.e., hyperedges) in $\Gamma_G$, where the hyperedges $e \in E(\Gamma_G)$ are weighted according to the probability distribution $q \equiv \{q_e\}_{e \in E(\Gamma_G)}$.
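For small scenarios the weighted max-predictability can be evaluated by brute force. The sketch below is an illustration under stated assumptions rather than the paper's procedure: it parametrizes a probabilistic model on the simplified hypergraph by its values on V(G) alone (the no-detection probabilities are then fixed by normalization), enumerates the extreme points of the resulting polytope by solving every square subsystem of tight constraints exactly, discards the deterministic (0/1-valued) extreme points, and maximizes the weighted sum of per-hyperedge maxima over the rest. All function names are hypothetical.

```python
from fractions import Fraction as F
from itertools import combinations

def solve(A, b):
    """Gauss-Jordan elimination over Fractions; returns None if A is singular."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def extreme_points(n, cliques):
    """Vertices of {p >= 0, sum_{v in c} p_v <= 1 for each maximal clique c}."""
    cons = []
    for i in range(n):  # -p_i <= 0
        cons.append(([F(-1) if j == i else F(0) for j in range(n)], F(0)))
    for c in cliques:   # sum over the clique <= 1
        cons.append(([F(1) if j in c else F(0) for j in range(n)], F(1)))
    points = set()
    for tight in combinations(cons, n):
        x = solve([row for row, _ in tight], [rhs for _, rhs in tight])
        if x is not None and all(
                sum(a * xi for a, xi in zip(row, x)) <= rhs
                for row, rhs in cons):
            points.add(tuple(x))
    return points

def weighted_max_predictability(n, cliques, q):
    """Maximize sum_e q_e * zeta(M_e, p) over indeterministic extreme points."""
    best = None
    for p in extreme_points(n, cliques):
        if all(x in (0, 1) for x in p):
            continue  # deterministic extreme point: zeta = 1 on every hyperedge
        value = sum(
            # zeta also ranges over the no-detection vertex of the hyperedge
            qe * max(max(p[v] for v in c), 1 - sum(p[v] for v in c))
            for qe, c in zip(q, cliques))
        best = value if best is None else max(best, value)
    return best

# KCBS example: G is the 5-cycle, maximal cliques are its edges, q is uniform.
cliques = [(i, (i + 1) % 5) for i in range(5)]
q = [F(1, 5)] * 5
beta = weighted_max_predictability(5, cliques, q)
```

For the 5-cycle the only indeterministic extreme point found this way assigns 1/2 to every vertex of G, so `beta` evaluates to 1/2 for any choice of q. The enumeration scales combinatorially and is meant only for scenarios of this size.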
We now outline how this quantity is related to properties of an operational theory T admitting a measurement noncontextual ontological model. $\Gamma_G$ represents a particular configuration of operational equivalences that a set of measurement events in T may realize. The probabilistic models on $\Gamma_G$ that can be realized by T are, as earlier, denoted by $T(\Gamma_G)$. Since
T admits a measurement noncontextual ontological
model,
31
its predictions for the specific case of $\Gamma_G$ can be reproduced by such a model. But since, in keeping with the CSW approach [22], we will look at witnesses of contextuality tailored to particular experiments ($\Gamma_G$ representing features of one such experiment), we do not need an ontological model for the full theory T to reproduce its predictions for a particular experiment. Indeed, to construct a measurement noncontextual ontological model for the set of probabilistic models $T(\Gamma_G)$, it suffices to assume (without loss of generality) that the extremal probabilistic models on $\Gamma_G$ – given by $\mathcal{G}(\Gamma_G)|_{\rm det} \sqcup \mathcal{G}(\Gamma_G)|_{\rm ind}$
30
An extremal indeterministic probabilistic model refers to those extremal $p \in \mathcal{G}(\Gamma_G)$ for which $\zeta(M_e, p) < 1$ for some $e \in E(\Gamma_G)$.
31
This will always be the case for any operational theory we
consider: the assumption of measurement noncontextuality on
its own can always be satisfied by a trivial ontological model
of the type we outlined in Section 2.5. Indeed, quantum theory satisfies it, the Beltrametti-Bugajski model [52] that was discussed in Ref. [18] being an example of a measurement noncontextual ontological model of quantum theory. It is only when this assumption is supplemented with something else (outcome determinism in the case of KS-noncontextuality, and preparation noncontextuality in the case of generalized noncontextuality [18]) that it can produce a contradiction with the predictions of an operational theory.
– are in bijective correspondence with the ontic states
(Λ) of the physical system on which the measure-
ments are carried out. This is because, firstly, any
probabilistic model in $\mathcal{G}(\Gamma_G)$ can be expressed as a convex mixture of extremal probabilistic models in $\mathcal{G}(\Gamma_G)|_{\rm det} \sqcup \mathcal{G}(\Gamma_G)|_{\rm ind}$, and, secondly, associating each ontic state in the ontological model with an extremal probabilistic model
32
in $\mathcal{G}(\Gamma_G)|_{\rm det} \sqcup \mathcal{G}(\Gamma_G)|_{\rm ind}$ means that any probabilistic model in $\mathcal{G}(\Gamma_G)$ corresponding to predictions of an operational theory (in particular, any $p \in T(\Gamma_G) \subseteq \mathcal{G}(\Gamma_G)$) can be obtained by an appropriate probability distribution over this set of ontic states. Denoting the set of ontic states corresponding to $\mathcal{G}(\Gamma_G)|_{\rm det}$ by $\Lambda_{\rm det}$ and the set of ontic states corresponding to $\mathcal{G}(\Gamma_G)|_{\rm ind}$ by $\Lambda_{\rm ind}$, we have that the measurement noncontextual ontological model given by $\Lambda \equiv \Lambda_{\rm det} \sqcup \Lambda_{\rm ind}$ reproduces the predictions $T(\Gamma_G)$ of any operational theory T that admits a measurement noncontextual ontological model: that is, for every $p \in \mathcal{G}(\Gamma_G)$ (and therefore also $p \in T(\Gamma_G)$),
$$p(v) = \sum_{\lambda \in \Lambda} \xi(v|\lambda)\, \mu(\lambda)$$
for all $v \in V(\Gamma_G)$, for some probability distribution $\mu: \Lambda \to [0, 1]$ such that $\sum_{\lambda \in \Lambda} \mu(\lambda) = 1$.
33
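We can also then rewrite $\beta(\Gamma_G, q)$ as
$$\beta(\Gamma_G, q) = \max_{\lambda \in \Lambda_{\rm ind}} \sum_{e \in E(\Gamma_G)} q_e\, \zeta(M_e, \lambda), \tag{46}$$
where $\zeta(M_e, \lambda) \equiv \max_{m_e} \xi(m_e|M_e, \lambda)$.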
5 Noise-robust noncontextuality inequalities
We will now proceed to obtain our noise-robust non-
contextuality inequalities following the ideas outlined
in Ref. [16].
5.1 Key notions from CSW
We first recall some key notions from the CSW frame-
work [22] before obtaining our inequalities.
Consider the positive linear combination of the
probabilities of measurement events,
$$R([s|S]) \equiv \sum_{v \in V(G)} w_v\, p(v|S, s), \tag{47}$$
32
Representing response functions for the ontic state, i.e.,
$p(v) = \xi(v|\lambda)$, $\forall v \in V(\Gamma_G)$.
33
As a corollary, note that as long as the polytope $\mathcal{G}(\Gamma_G)$ has a finite number of extreme points, we can take the ontic state space to consist of a finite number of ontic states (as we have done) without any loss of generality. The hypergraphs $\Gamma_G$ we study (representing the measurement events of interest in a contextuality experiment) have this property because of their finiteness.
where $w_v > 0$ for all $v \in V(G)$.
The fundamental result of CSW is that this quantity is bounded for different sets of correlations (KS-noncontextual, those realizable by projective quantum measurements, and those satisfying consistent exclusivity) by graph-theoretic invariants as follows:
$$\forall [s|S]: \quad R([s|S]) \overset{\rm KS}{\leq} \alpha(G, w) \overset{\mathcal{Q}}{\leq} \theta(G, w) \overset{\mathcal{CE}^1}{\leq} \alpha^*(G, w), \tag{48}$$
where KS denotes operational theories that admit KS-noncontextual ontological models and thus realize probabilistic models on $\Gamma_G$ that fall in the set $\mathcal{C}(\Gamma_G)$, $\mathcal{Q}$ denotes quantum theory with projective measurements which assigns probabilistic models on $\Gamma_G$ denoted by $\mathcal{Q}(\Gamma_G)$, and $\mathcal{CE}^1$ denotes operational theories satisfying consistent exclusivity and thus realizing the set of probabilistic models $\mathcal{CE}^1(\Gamma_G)$ on $\Gamma_G$. The graph invariants of the weighted graph (G, w), namely, $\alpha(G, w)$, $\theta(G, w)$, and $\alpha^*(G, w)$ are defined as follows:
1. Independence number $\alpha(G, w)$:
$$\alpha(G, w) \equiv \max_{I} \sum_{v \in I} w_v, \tag{49}$$
where $I \subseteq V(G)$ is an independent set of vertices of G, i.e., a set of nonadjacent vertices of G, so that none of the vertices in this set shares an edge with any other vertex in the set.
2. Lovász theta number $\theta(G, w)$:
$$\theta(G, w) \equiv \max_{\{|u_v\rangle\}_{v \in V(G)},\, |\psi\rangle} \sum_{v \in V(G)} w_v\, |\langle \psi | u_v \rangle|^2, \tag{50}$$
where $\{|u_v\rangle\}_{v \in V(G)} = \{|u_v\rangle\}_{v \in V(\bar{G})}$ (each $|u_v\rangle$ a unit vector in $\mathbb{R}^d$) is an orthonormal representation (OR) of the complement of G, namely, $\bar{G}$, and the unit vector $|\psi\rangle \in \mathbb{R}^d$ is called a handle. Here $V(\bar{G}) \equiv V(G)$ and $E(\bar{G}) \equiv \{(v, v') \,|\, v, v' \in V(G), (v, v') \notin E(G)\}$, and we have in an orthonormal representation that $\langle u_{v''} | u_{v'''} \rangle = 0$ for all pairs of nonadjacent vertices, $(v'', v''')$, in $\bar{G}$, or equivalently, for all $(v'', v''') \in E(G)$.
3. Fractional packing number $\alpha^*(G, w)$:
$$\alpha^*(G, w) \equiv \max_{\{p_v\}_{v \in V(G)}} \sum_{v \in V(G)} w_v\, p_v, \tag{51}$$
where $\{p_v\}_{v \in V(G)}$ is such that $p_v \geq 0$ for all $v \in V(G)$ and $\sum_{v \in c} p_v \leq 1$ for all cliques c in G.
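As a concrete instance of these three invariants, the sketch below evaluates them for the 5-cycle with unit weights, the graph underlying the KCBS scenario. It is an illustration under stated assumptions: the independence number is found by brute force, the fractional packing number is certified through linear-programming duality (a feasible assignment and a feasible fractional clique cover with equal value, using only the maximal cliques, which suffices because the constraints of smaller cliques are implied), and the Lovász theta number is not computed but taken from its well-known closed form √5 for the 5-cycle. Function names are hypothetical.

```python
import math
from fractions import Fraction as F
from itertools import combinations

def independence_number(vertices, edges, w):
    """Brute-force maximum total weight of an independent set."""
    edge_set = {frozenset(e) for e in edges}
    best = F(0)
    for r in range(len(vertices) + 1):
        for subset in combinations(vertices, r):
            if all(frozenset(pair) not in edge_set
                   for pair in combinations(subset, 2)):
                best = max(best, sum(w[v] for v in subset))
    return best

def packing_certificate(vertices, cliques, w, p, y):
    """Return alpha* if p (primal) and y (dual) are feasible with equal value."""
    primal_ok = (all(p[v] >= 0 for v in vertices)
                 and all(sum(p[v] for v in c) <= 1 for c in cliques))
    dual_ok = (all(yc >= 0 for yc in y)
               and all(sum(yc for yc, c in zip(y, cliques) if v in c) >= w[v]
                       for v in vertices))
    primal = sum(w[v] * p[v] for v in vertices)
    dual = sum(y)
    return primal if primal_ok and dual_ok and primal == dual else None

# 5-cycle with unit weights; its maximal cliques are its edges.
V = list(range(5))
E = [(i, (i + 1) % 5) for i in range(5)]
w = {v: F(1) for v in V}

alpha = independence_number(V, E, w)

# Primal candidate: 1/2 on every vertex. Dual candidate: 1/2 on every edge.
alpha_star = packing_certificate(
    V, E, w, p={v: F(1, 2) for v in V}, y=[F(1, 2)] * 5)

# Known closed-form value of the Lovasz theta number of the 5-cycle.
theta = math.sqrt(5)
```

The values obtained, 2, √5 ≈ 2.236 and 5/2, respect the ordering of the three invariants in Eq. (48).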
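Note that since we are always considering $\Gamma_G$ such that $\mathcal{CE}^1(\Gamma_G) = \mathcal{G}(\Gamma_G)$, we, in fact, have the bounds
$$\forall [s|S]: \quad R([s|S]) \overset{\rm KS}{\leq} \alpha(G, w) \overset{\mathcal{Q}}{\leq} \theta(G, w) \overset{\rm GPT}{\leq} \alpha^*(G, w), \tag{52}$$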
where “GPT” denotes the full set of probabilistic models on $\Gamma_G$, i.e., $\mathcal{G}(\Gamma_G)$.
In terms of the notation we have already introduced, where $R([s|S]) \leq R_{\rm KS}$ was a Bell-KS inequality, we now have from CSW [22] that
$$R_{\rm KS} = \alpha(G, w).$$
5.2 Key notion not from CSW:
source-measurement correlation, Corr
We need to define a new quantity not in the CSW
framework, namely,
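$$\mathrm{Corr} \equiv \sum_{e \in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\, p(m_e, s_e|M_e, S_e), \tag{53}$$
where $\{q_e\}_{e \in E(\Gamma_G)}$ is a probability distribution, i.e., $q_e \geq 0$ for all $e \in E(\Gamma_G)$ and $\sum_{e \in E(\Gamma_G)} q_e = 1$, such that $\beta(\Gamma_G, q) < 1$ holds.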
34
In previous work [12, 16], we have taken q to be the uniform distribution $q_e = \frac{1}{|E(\Gamma_G)|}$, but the derivation of the noncontextuality inequalities is independent of that choice (as we’ll see here). Also, note that we have chosen the following labelling convention for outcomes of source setting $S_e$ (namely, $s_e$) and measurement setting $M_e$ (namely, $m_e$): the source outcomes $s_e$ for source setting $S_e$ take values in the same set as measurement outcomes $m_e$ for measurement setting $M_e$, i.e., $V_{S_e} = V_{M_e}$ (recalling notation from Section 2). In particular, outcomes corresponding to the measurement event [v|e] (representing $[m_e|M_e]$) and its corresponding source event $v_e$ (representing $[s_e|S_e]$) are both denoted by the same label, so that $m_e = s_e$ for them. An example of this from Figs. 5 and 6 would be to, say, denote the outcomes of a particular $e \in E(\Gamma_G)$ (measurement setting $M_e$) by $m_e \in V_{M_e} \equiv \{0, 1, 2\}$ and corresponding outcomes of $e \in E(\Sigma_G)$ (source setting $S_e$) by $s_e \in V_{S_e} \equiv \{0, 1, 2\}$; so if [v|e] denotes $[m_e = 0|M_e]$, then $v_e$ will denote $[s_e = 0|S_e]$, etc.
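The quantity Corr of Eq. (53) is straightforward to evaluate from a table of joint source-measurement statistics. The sketch below is illustrative only: the data layout (one dictionary of joint probabilities per setting pair) and the noise model, which mixes perfect source-measurement correlation with uniformly random outcomes with visibility `eta`, are hypothetical assumptions and not taken from the paper.

```python
from fractions import Fraction as F

def corr(q, joint):
    """Corr = sum_e q_e * sum_{m,s} delta_{m,s} p(m, s | M_e, S_e)."""
    return sum(
        q[e] * sum(pr for (m, s), pr in joint[e].items() if m == s)
        for e in q)

def noisy_joint(eta, outcomes=(0, 1, 2)):
    """Hypothetical noise model: perfect correlation with weight eta,
    independent uniform outcomes with weight 1 - eta."""
    k = len(outcomes)
    return {
        (m, s): eta * (F(1, k) if m == s else F(0)) + (1 - eta) * F(1, k * k)
        for m in outcomes for s in outcomes}

# Five setting pairs with three outcomes each, as in the KCBS example,
# weighted uniformly.
settings = range(5)
q = {e: F(1, 5) for e in settings}

ideal = {e: noisy_joint(F(1)) for e in settings}
noisy = {e: noisy_joint(F(1, 2)) for e in settings}

corr_ideal = corr(q, ideal)
corr_noisy = corr(q, noisy)
```

With perfect correlation Corr equals 1, while for this noise model at visibility 1/2 it drops to 1/2 + (1/2)(1/3) = 2/3.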
5.3 Obtaining the noise-robust noncontextuality inequalities
5.3.1 Expressing operational quantities in ontological
terms
We begin with expressing the operational quantities of
interest in terms of a noncontextual ontological model.
In an ontological model, R([s|S]) is given by
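$$R([s|S]) = \sum_{\lambda \in \Lambda} \sum_{v \in V(G)} w_v\, p(v|\lambda)\, \mu(\lambda|S, s). \tag{54}$$
Defining $R(\lambda) \equiv \sum_{v \in V(G)} w_v\, p(v|\lambda)$, we have that
$$R([s|S]) = \sum_{\lambda \in \Lambda} R(\lambda)\, \mu(\lambda|S, s). \tag{55}$$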
34
Indeed, for the strongest possible constraint on Corr, one must pick q such that $\beta(\Gamma_G, q)$ is minimized.
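Similarly, Corr is given by
$$\begin{aligned}
\mathrm{Corr} &= \sum_{\lambda \in \Lambda} \sum_{e \in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\, \xi(m_e|M_e, \lambda)\, \mu(\lambda, s_e|S_e) \\
&= \sum_{\lambda \in \Lambda} \sum_{e \in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\, \xi(m_e|M_e, \lambda)\, \mu(s_e|S_e, \lambda)\, \mu(\lambda|S_e).
\end{aligned} \tag{56}$$
Here, we have used the fact that
$$\mu(\lambda, s_e|S_e) = \mu(s_e|S_e, \lambda)\, \mu(\lambda|S_e)$$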
to express Corr in a way that treats sources and mea-
surements similarly.
Using preparation noncontextuality (cf. Eq. (22)),
we have that
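$$\forall e, e' \in E(\Sigma_G): \quad [\top|S_{e_\top}] \simeq [\top|S_{e'_\top}] \;\Rightarrow\; \mu(\lambda|S_e) = \mu(\lambda|S_{e'}) \equiv \nu(\lambda), \quad \forall \lambda \in \Lambda. \tag{57}$$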
Then we can rewrite Corr as
$$\mathrm{Corr} = \sum_{\lambda \in \Lambda} \sum_{e \in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\, \xi(m_e|M_e, \lambda)\, \mu(s_e|S_e, \lambda)\, \nu(\lambda). \tag{58}$$
Note that the only λ that contribute to Corr are
those for which $\nu(\lambda) > 0$. Also, $\mu(s_e|S_e, \lambda)$ and $\mu(\lambda|S_e, s_e)$ satisfy the condition $\mu(s_e|S_e, \lambda)\, \nu(\lambda) = \mu(\lambda|S_e, s_e)\, p(s_e|S_e)$, so that $\mu(s_e|S_e, \lambda)$ is well-defined whenever $\nu(\lambda) > 0$.
Defining
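$$\mathrm{Corr}(\lambda) \equiv \sum_{e \in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\, \xi(m_e|M_e, \lambda)\, \mu(s_e|S_e, \lambda), \tag{59}$$
we have that
$$\mathrm{Corr} = \sum_{\lambda \in \Lambda} \mathrm{Corr}(\lambda)\, \nu(\lambda). \tag{60}$$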
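Recalling that $\zeta(M_e, \lambda) = \max_{m_e} \xi(m_e|M_e, \lambda)$, note that Corr(λ) is upper bounded as follows (for any $\lambda \in \Lambda$):
$$\begin{aligned}
\mathrm{Corr}(\lambda) &\equiv \sum_{e \in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\, \xi(m_e|M_e, \lambda)\, \mu(s_e|S_e, \lambda) \\
&\leq \sum_{e \in E(\Gamma_G)} q_e\, \zeta(M_e, \lambda) \sum_{s_e} \mu(s_e|S_e, \lambda) \\
&= \sum_{e \in E(\Gamma_G)} q_e\, \zeta(M_e, \lambda).
\end{aligned} \tag{61}$$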
If $\lambda \in \Lambda_{\rm det}$, then this upper bound is trivial, i.e., $\mathrm{Corr}(\lambda) \leq 1$, since every measurement has deterministic response functions. On the other hand, for all $\lambda \in \Lambda_{\rm ind}$, we have (from Eq. (46))
$$\mathrm{Corr}(\lambda) \leq \beta(\Gamma_G, q). \tag{62}$$
Similarly, for $\lambda \in \Lambda_{\rm det}$ we have $R(\lambda) \leq \alpha(G, w)$, while for $\lambda \in \Lambda_{\rm ind}$ we have $R(\lambda) \leq \alpha^*(G, w)$.
Using the fact that
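$$\nu(\lambda) = \mu(\lambda|S) = \sum_{s} \mu(\lambda|S, s)\, p(s|S),$$
for any $S \equiv S_e$, $e \in E(\Sigma_G)$, we have
$$\begin{aligned}
\mathrm{Corr} &= \sum_{s} \left( \sum_{\lambda} \mathrm{Corr}(\lambda)\, \mu(\lambda|S, s) \right) p(s|S) \\
&= \sum_{s} \mathrm{Corr}_s\, p(s|S),
\end{aligned} \tag{63}$$
where we have defined $\mathrm{Corr}_s \equiv \sum_{\lambda} \mathrm{Corr}(\lambda)\, \mu(\lambda|S, s)$.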
5.3.2 Derivation of the noncontextual tradeoff for any
graph G
We are now in a position to express our general noise-robust noncontextuality inequality as a tradeoff between three operational quantities: Corr, $R([s_{e_*} = 0|S_{e_*}])$, and $p(s_{e_*} = 0|S_{e_*})$.
First, note that KS-contextuality is witnessed when
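for some choice of [s|S], here given by $[s_{e_*} = 0|S_{e_*}]$, we have
$$R([s_{e_*} = 0|S_{e_*}]) > \alpha(G, w).$$
This means that for some set of ontic states in the support of $[s_{e_*} = 0|S_{e_*}]$, i.e.,
$$\lambda \in \mathrm{Supp}\{\mu(\cdot|S_{e_*}, s_{e_*} = 0)\} \equiv \{\lambda \in \Lambda : \mu(\lambda|S_{e_*}, s_{e_*} = 0) > 0\}, \tag{64}$$
we have $R(\lambda) > \alpha(G, w)$. For such a set of ontic states one must then have $\mathrm{Corr}(\lambda) < 1$ (because these $\lambda \in \Lambda_{\rm ind}$ and we have Eq. (62)), which in turn implies that $\mathrm{Corr}_{s_{e_*} = 0} < 1$. On the other hand, for $s_{e_*} = 1$, we have no constraints: $\mathrm{Corr}_{s_{e_*} = 1} \leq 1$. Thus,
$$\begin{aligned}
\mathrm{Corr} &= \mathrm{Corr}_{s_{e_*} = 0}\, p(s_{e_*} = 0|S_{e_*}) + \mathrm{Corr}_{s_{e_*} = 1}\, p(s_{e_*} = 1|S_{e_*}) \\
&\leq p_0\, \mathrm{Corr}_{s_{e_*} = 0} + 1 - p_0,
\end{aligned} \tag{65}$$
where $p_0 \equiv p(s_{e_*} = 0|S_{e_*})$.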
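Defining $\mu_{\rm det} \equiv \sum_{\lambda \in \Lambda_{\rm det}} \mu(\lambda|S_{e_*}, s_{e_*} = 0)$ and $\mu_{\rm ind} \equiv \sum_{\lambda \in \Lambda_{\rm ind}} \mu(\lambda|S_{e_*}, s_{e_*} = 0)$, we now have
$$\mu_{\rm det} + \mu_{\rm ind} = 1, \tag{66}$$
$$\mathrm{Corr}_{s_{e_*} = 0} \leq \mu_{\rm det} + \beta(\Gamma_G, q)\, \mu_{\rm ind}, \tag{67}$$
$$R \leq \alpha(G, w)\, \mu_{\rm det} + \alpha^*(G, w)\, \mu_{\rm ind}. \tag{68}$$
Note that assuming $\mu_{\rm det} = 1$ would reduce these constraints to a standard Bell-KS inequality, $R \leq \alpha(G, w)$. However, since we are not assuming this, simply eliminating $\mu_{\rm det}$ and $\mu_{\rm ind}$ from these constraints leads us to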
35
$$\mathrm{Corr}_{s_{e_*} = 0} \leq 1 - (1 - \beta(\Gamma_G, q))\, \frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)} \tag{69}$$
where the upper bound is nontrivial if and only if $\beta(\Gamma_G, q) < 1$ and $R - \alpha(G, w) > 0$.
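The elimination behind Eq. (69) can be written out in three lines. The following is a worked sketch of that step, using the shorthand $\mathrm{Corr}_0$ for Corr conditioned on outcome 0 of the additional source setting and suppressing the arguments of $\alpha$, $\alpha^*$ and $\beta$ (both shorthands are introduced here for brevity). It assumes $\alpha^* > \alpha$ and uses $\beta \leq 1$.

```latex
\begin{align*}
\mathrm{Corr}_0 &\leq \mu_{\rm det} + \beta\,\mu_{\rm ind}
   = 1 - (1-\beta)\,\mu_{\rm ind}
   && \text{(Eqs. (66) and (67))} \\
R &\leq \alpha\,\mu_{\rm det} + \alpha^{*}\mu_{\rm ind}
   = \alpha + (\alpha^{*}-\alpha)\,\mu_{\rm ind}
   \;\Rightarrow\; \mu_{\rm ind} \geq \frac{R-\alpha}{\alpha^{*}-\alpha}
   && \text{(Eqs. (66) and (68))} \\
\mathrm{Corr}_0 &\leq 1 - (1-\beta)\,\frac{R-\alpha}{\alpha^{*}-\alpha}
   && \text{(since } 1-\beta \geq 0\text{)}
\end{align*}
```

The last line uses the lower bound on $\mu_{\rm ind}$ from the second line inside the first, which is legitimate because the coefficient $(1-\beta)$ is nonnegative.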
If we are given that $\beta(\Gamma_G, q) < 1$, then we have a trivial upper bound on $\mathrm{Corr}_{s_{e_*} = 0}$ for the remaining cases: the upper bound is 1 for $R = \alpha(G, w)$ and greater than 1 for $R < \alpha(G, w)$.
Thus, our noise-robust noncontextuality inequality now reads:
$$\mathrm{Corr} \leq 1 - p_0\, (1 - \beta(\Gamma_G, q))\, \frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)}, \tag{70}$$
which can be rewritten as
$$R \leq \alpha(G, w) + \frac{\alpha^*(G, w) - \alpha(G, w)}{p_0}\, \frac{1 - \mathrm{Corr}}{1 - \beta(\Gamma_G, q)}. \tag{71}$$
Note that Eq. (70) expresses the constraint from noncontextuality as an upper bound on the source-measurement correlations Corr, reminiscent of the noise-robust noncontextuality inequality first derived in Ref. [12] (and later treated in hypergraph-theoretic terms in Ref. [34]), except here the upper bound on Corr depends not only on the hypergraph invariant $\beta(\Gamma_G, q)$ but also two of the graph invariants from the CSW framework [22], namely, $\alpha(G, w)$ and $\alpha^*(G, w)$, besides also the operational quantity R, which is the figure-of-merit for KS-contextuality ($R > \alpha(G, w)$ witnesses KS-contextuality) in the CSW framework. Eq. (70) indicates that the source-measurement correlations would fail to be perfect (i.e., Corr < 1) in an operational theory admitting a noncontextual ontological model if and only if $R > \alpha(G, w)$ and $\beta(\Gamma_G, q) < 1$. Contextuality would be witnessed when the source-measurement correlations are stronger than the constraint from Eq. (70). For $R \leq \alpha(G, w)$, in particular, there is no constraint from noncontextuality on Corr.
On the other hand, rewriting the constraint from
noncontextuality as Eq. (71), one is reminded of the
CSW framework [22], where R is taken to be the quan-
tity that is upper bounded by KS-noncontextuality.
Here, instead, we have that R is upper bounded by
a term that includes the source-measurement correla-
tions Corr that can be achieved for the measurements
and thus penalizes for measurements that cannot be
made highly predictable with respect to some preparations, i.e., $\mathrm{Corr} < 1$ makes it harder to violate the upper bound on $R$. When the upper bound reaches $\alpha^*(G, w)$, it becomes trivial and $R$ is no longer constrained by noncontextuality on account of noise in the measurements. Indeed, trivial POVMs (cf. Appendices A.1.2 and C) never violate such a noncontextuality inequality because of the penalty incurred via Corr, as we later show in Section 6.3.

Footnote 35: To see this explicitly, just use Eq. (66) to make the substitution $\mu_{\rm ind} = 1 - \mu_{\rm det}$ in Eqs. (67) and (68), then eliminate $\mu_{\rm det}$ from Eq. (67) by using the upper bound on it from Eq. (68).

Accepted in Quantum 2019-09-01, click title to verify. Published under CC-BY 4.0. 26
5.3.3 When is the noncontextual tradeoff violated?
The inequality of Eq. (71) can be rewritten as the following tradeoff between Corr, $p_0$, and $R$:
\[ \mathrm{Corr} + p_0\, (1 - \beta(\Gamma_G, q))\, \frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)} \leq 1. \tag{72} \]
Writing the constraint from noncontextuality in the
form of Eq. (72) (in contrast to Eqs. (70) and (71))
makes it more even-handed in its treatment of the
two operational quantities R (which is key in the
CSW framework [22]) and Corr (which is key in
noise-robust noncontextuality inequalities inspired by
logical proofs of the KS theorem [12, 34]) and em-
phasizes that noise-robust noncontextuality inequal-
ities inspired by statistical proofs of the KS theo-
rem [16] are tradeoffs between R (which is about
the strength of correlations between measurements)
and Corr (which is about the predictability of mea-
surements) that must be satisfied by any operational
theory admitting a noncontextual ontological model.
Roughly speaking, a high degree of predictability for
measurements (e.g., Corr = 1) cannot coexist with
very strong correlations between the measurements
(e.g., $R = \alpha^*(G, w)$) when the operational theory admits a noncontextual ontological model.
For a nontrivial constraint (and hence the possibility of witnessing contextuality via violation of the inequality of Eq. (72)), the upper bound on Corr (the right-hand side of Eq. (70)) should be strictly bounded above by 1, and the upper bound on $R$ (the right-hand side of Eq. (71)) should be strictly bounded above by $\alpha^*(G, w)$ (the algebraic upper bound on $R$), that is,
\[ \begin{aligned} & p_0 > 0 \ \text{and}\ \beta(\Gamma_G, q) < 1, \\ & R > \alpha(G, w), \\ & \mathrm{Corr} > 1 - p_0\, (1 - \beta(\Gamma_G, q)). \end{aligned} \tag{73} \]
These are the minimal benchmarks necessary (besides the requirement of tomographic completeness of a finite set of procedures and the possibility of inferring secondary procedures with exact operational equivalences using convexity of the operational theory [13]) to witness contextuality in a Kochen-Specker type experiment adapted to our framework following Spekkens [18].
Suppose one achieves, by some means, a value of $R = \theta(G, w)$, the upper bound on the quantum value with projective measurements. When would this value be evidence of contextuality? For this to be the case, we must have:
\[ \mathrm{Corr} > 1 - p_0\, (1 - \beta(\Gamma_G, q))\, \frac{\theta(G, w) - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)}. \tag{74} \]
Now, for the ideal quantum realization where mea-
surement events are projectors, and the corresponding
source events are eigenstates, it is always the case that
Corr = 1, hence contextuality is witnessed. However,
it’s possible to witness contextuality even if Corr < 1,
as long as it exceeds the lower bound specified above.
In a sense, for quantum theory, this allows for a quan-
titative accounting of the effect of nonprojectiveness
in the measurements (or mixedness in preparations)
on the possibility of witnessing contextuality, a feature that is absent in traditional Kochen-Specker approaches [21–23, 25]. Indeed, as long as one achieves any value of $R > \alpha(G, w)$, it is possible to witness contextuality for a sufficiently high value of Corr (see Eq. (70)).
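The bounds of Eqs. (70) to (72) and the benchmark of Eq. (74) are simple enough to evaluate numerically. The following is a minimal sketch, not code from the paper: the function names are hypothetical, and the numerical inputs are the KCBS values that Section 5.4 quotes ($\alpha = 2$, $\alpha^* = 5/2$, $\beta = 1/2$, $p_0 = 1/3$, $\theta = \sqrt{5}$).

```python
import math

def corr_upper_bound(R, p0, beta, alpha, alpha_star):
    """Right-hand side of Eq. (70): noncontextual upper bound on Corr."""
    return 1.0 - p0 * (1.0 - beta) * (R - alpha) / (alpha_star - alpha)

def r_upper_bound(corr, p0, beta, alpha, alpha_star):
    """Right-hand side of Eq. (71): noncontextual upper bound on R.

    Assumes p0 > 0 and beta < 1, as required by Eq. (73).
    """
    return alpha + (alpha_star - alpha) / p0 * (1.0 - corr) / (1.0 - beta)

def violates(R, corr, p0, beta, alpha, alpha_star):
    """True when the tradeoff of Eq. (72) is violated."""
    lhs = corr + p0 * (1.0 - beta) * (R - alpha) / (alpha_star - alpha)
    return lhs > 1.0

# KCBS values quoted in Section 5.4 (assumed here as illustrative inputs).
alpha, alpha_star, beta, p0 = 2.0, 2.5, 0.5, 1.0 / 3.0
theta = math.sqrt(5.0)

# Eq. (74): the smallest Corr that still witnesses contextuality at R = theta.
corr_min = corr_upper_bound(theta, p0, beta, alpha, alpha_star)
print(round(corr_min, 4))  # roughly 0.92

print(violates(theta, 1.0, p0, beta, alpha, alpha_star))  # ideal case, Corr = 1
print(violates(theta, 0.9, p0, beta, alpha, alpha_star))  # Corr below the benchmark
```

With these inputs the benchmark of Eq. (74) evaluates to $1 - (\sqrt{5} - 2)/3$, so perfect source-measurement correlations are not needed for a violation at $R = \sqrt{5}$, only correlations above that value.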
5.4 Example: KCBS scenario
We will now illustrate our hypergraph framework by
applying it to the KCBS scenario to make differences
with respect to the CSW graph-theoretic framework
[22] explicit.
The graph $G$ for the KCBS scenario is given in Fig. 3, the measurement events hypergraph $\Gamma_G$ is given in Fig. 5, and the source events hypergraph $\Sigma_G$ is given in Fig. 6. We then have
\[ R([s|S]) = \sum_{v \in V(G)} p(v|S, s), \tag{75} \]
where the (vertex) weights $w_v = 1$ for all $v \in V(G)$, i.e., it's an unweighted graph and we will use $\alpha(G)$ and $\alpha^*(G)$ to denote its independence number and the fractional packing number, respectively. These are given by
\[ \alpha(G) = 2 \quad \text{and} \quad \alpha^*(G) = 5/2. \tag{76} \]
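The two values in Eq. (76) can be checked directly for the 5-cycle. The sketch below is illustrative only and rests on the standard CSW definition of the fractional packing number for unit weights (maximise $\sum_v p_v$ subject to $p_v \geq 0$ and $\sum_{v \in c} p_v \leq 1$ for every clique $c$), which is assumed here rather than quoted from this section. It finds $\alpha(G)$ by brute force and certifies $\alpha^*(G)$ by exhibiting a feasible primal point and a feasible dual point with the same value.

```python
from fractions import Fraction
from itertools import combinations

# 5-cycle: vertex i is adjacent to vertex i+1 (mod 5).
n = 5
edges = [(i, (i + 1) % n) for i in range(n)]

def is_independent(subset):
    """True if no edge of the 5-cycle has both endpoints in `subset`."""
    return not any(a in subset and b in subset for a, b in edges)

# Independence number: size of the largest independent vertex subset.
alpha = max(len(s)
            for k in range(n + 1)
            for s in combinations(range(n), k)
            if is_independent(set(s)))

# Fractional packing number. The maximal cliques of the 5-cycle are its edges.
# Primal candidate: weight 1/2 on every vertex.
p = [Fraction(1, 2)] * n
primal_feasible = all(p[a] + p[b] <= 1 for a, b in edges)
primal_value = sum(p)

# Dual candidate: weight 1/2 on every edge, so that each vertex is covered
# with total weight at least 1. By weak LP duality, any feasible dual value
# upper-bounds any feasible primal value.
y = {e: Fraction(1, 2) for e in edges}
dual_feasible = all(sum(w for e, w in y.items() if v in e) >= 1
                    for v in range(n))
dual_value = sum(y.values())

print(alpha, primal_value, dual_value)
```

Since the primal and dual candidates are both feasible and their values coincide, that common value is the optimum, which reproduces $\alpha^*(G) = 5/2$ without calling an LP solver.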
The source-measurement correlation term is given by
\[ \mathrm{Corr} = \sum_{e \in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\, p(m_e, s_e|M_e, S_e) \tag{77} \]
for any choice of probability distribution $q \equiv \{q_e\}_{e \in E(\Gamma_G)}$. For simplicity, we will just take this probability distribution to be uniform, i.e., $q_e = \frac{1}{5}$ for all $e \in E(\Gamma_G)$. Note that the only extremal probabilistic model on $\Gamma_G$ corresponding to an indeterministic assignment (in $\Lambda_{\rm ind}$) assigns $\xi(v|\lambda) = \frac{1}{2}$ for all $v \in V(G)$. This means
\[ \beta(\Gamma_G, q) = \frac{1}{2} \quad \forall q. \tag{78} \]
Figure 7: Geometric configuration of the vectors appearing
in the KCBS construction [47].
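The noncontextuality inequality of Eq. (71)
\[ R \leq \alpha(G, w) + \frac{\alpha^*(G, w) - \alpha(G, w)}{p_0}\, \frac{1 - \mathrm{Corr}}{1 - \beta(\Gamma_G, q)} \tag{79} \]
then becomes (in the KCBS scenario)
\[ R \leq 2 + \frac{1/2}{p_0}\, \frac{1 - \mathrm{Corr}}{1/2}, \tag{80} \]
or
\[ R \leq 2 + \frac{1 - \mathrm{Corr}}{p_0}. \tag{81} \]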
Recall that the KCBS inequality [22, 47] reads $R \leq 2$ and it would be a valid noncontextuality inequality in our framework if and only if one can find measurements and preparations such that Corr = 1. In the standard KCBS construction [47] that violates the inequality $R \leq 2$, we have the five vertices in $G$ (say $v_i$, $i \in \{1, 2, 3, 4, 5\}$, labelled cyclically) associated with five projectors $\Pi_i = |l_i\rangle\langle l_i|$, $i \in \{1, 2, 3, 4, 5\}$, on a qutrit Hilbert space, given by the vectors $|l_i\rangle = (\sin\theta \cos\phi_i, \sin\theta \sin\phi_i, \cos\theta)$, $\phi_i = \frac{4\pi i}{5}$, and $\cos\theta = \frac{1}{\sqrt[4]{5}}$. The special source event $[s_{e_*} = 0|S_{e_*}]$ is associated with the quantum state $|\psi\rangle = (0, 0, 1)$, so that
\[ R = \sum_{i=1}^{5} |\langle l_i|\psi\rangle|^2 = \sqrt{5} > 2. \tag{82} \]
See Fig. 7 for a depiction of the geometric configuration of these vectors.
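The geometry of the KCBS vectors can be verified numerically. The following is a small sketch (the helper names are hypothetical, not from the paper) that builds the five vectors from the formula above, checks that cyclically adjacent vectors are orthogonal, and evaluates $R$ on $|\psi\rangle = (0, 0, 1)$.

```python
import math

# Polar angle of the KCBS vectors: cos(theta) = 5**(-1/4).
cos_theta = 5 ** -0.25
sin_theta = math.sqrt(1.0 - cos_theta ** 2)

def kcbs_vector(i):
    """|l_i> = (sin t cos phi_i, sin t sin phi_i, cos t), phi_i = 4*pi*i/5."""
    phi = 4.0 * math.pi * i / 5.0
    return (sin_theta * math.cos(phi), sin_theta * math.sin(phi), cos_theta)

def dot(u, v):
    """Real inner product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))

vectors = [kcbs_vector(i) for i in range(1, 6)]
psi = (0.0, 0.0, 1.0)

# Cyclically adjacent vectors (i, i+1 mod 5) should be orthogonal, so that
# the corresponding projectors can appear in a common measurement.
adjacent_overlaps = [dot(vectors[i], vectors[(i + 1) % 5]) for i in range(5)]

# R of Eq. (82): sum of |<l_i|psi>|^2 over the five vectors.
R = sum(dot(l, psi) ** 2 for l in vectors)

print(max(abs(x) for x in adjacent_overlaps))  # numerically zero
print(R, math.sqrt(5))                         # the two values agree
```

Orthogonality of adjacent vectors is what allows $|l_i\rangle\langle l_i|$ and $|l_{i+1}\rangle\langle l_{i+1}|$ to be completed to a three-outcome projective measurement on the qutrit.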
To turn this KCBS construction into an argument against noncontextuality in our approach, we need additional ingredients beyond the graph $G$. Firstly, for both the measurement events hypergraph $\Gamma_G$ and the source events hypergraph $\Sigma_G$, we denote the hyperedges by $e_i$, $i \in \{1, 2, 3, 4, 5\}$. In $\Gamma_G$, the measurement events for the setting $M_{e_i}$ are given by $\{[m_{e_i} = 0|M_{e_i}] = |l_i\rangle\langle l_i|$, $[m_{e_i} = 1|M_{e_i}] = I - |l_i\rangle\langle l_i| - |l_{i+1}\rangle\langle l_{i+1}|$, $[m_{e_i} = 2|M_{e_i}] = |l_{i+1}\rangle\langle l_{i+1}|\}$, where for $i = 5$, $i + 1 = 1$ (addition modulo 5). Similarly, in $\Sigma_G$, the source events corresponding to source setting $S_{e_i}$ are given by $\{[s_{e_i} = 0|S_{e_i}] = |l_i\rangle\langle l_i|$, $[s_{e_i} = 2|S_{e_i}] = |l_{i+1}\rangle\langle l_{i+1}|$, and $[s_{e_i} = 1|S_{e_i}] = I - |l_i\rangle\langle l_i| - |l_{i+1}\rangle\langle l_{i+1}|\}$, where $p(s_{e_i} = b|S_{e_i}) = \frac{1}{3}$ for all $b \in \{0, 1, 2\}$. The special source setting $S_{e_*}$ consists of source events $\{[s_{e_*} = 0|S_{e_*}] = |\psi\rangle\langle\psi|$, $[s_{e_*} = 1|S_{e_*}] = \frac{I - |\psi\rangle\langle\psi|}{2}\}$, where $p(s_{e_*} = 0|S_{e_*}) = \frac{1}{3}$ and $p(s_{e_*} = 1|S_{e_*}) = \frac{2}{3}$. We thus have the operational equivalences we need between the source settings:
\[ [\top|S_{e_\top}] \simeq [\top|S_{e'_\top}] = \frac{I}{3}, \quad \forall e, e' \in E(\Sigma_G). \tag{83} \]
This choice of representation for $\Gamma_G$ and $\Sigma_G$ yields $p_0 = \frac{1}{3}$, Corr = 1, and $R([s_{e_*} = 0|S_{e_*}]) = \sqrt{5}$, so that the inequality
\[ R \leq 2 + \frac{1 - \mathrm{Corr}}{p_0} \tag{84} \]
is violated. However, note that this is an idealization (under which Corr = 1) and, typically, the source events and measurement events will not be perfectly correlated ($\mathrm{Corr} < 1$) and the operational equivalences between the source settings need not correspond to the maximally mixed state. All that is required for a test of noncontextuality using this inequality is that the operational equivalences hold for some choice of preparations and measurements which need not be the same as that in the ideal KCBS construction.
To illustrate what happens when $\mathrm{Corr} < 1$, we consider the effect of a depolarizing channel on the states and measurements in the ideal KCBS construction. The channel is given by
\[ \mathcal{D}_r(\cdot) = r\, I(\cdot)I + (1 - r)\, \frac{1}{3}\, I\, \mathrm{Tr}(\cdot), \quad r \in [0, 1]. \tag{85} \]
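The action of this channel with parameter $r_1 \in [0, 1]$, say on the pure states $\{\{|l_i\rangle\langle l_i|\}_{i=1}^{5}, |\psi\rangle\langle\psi|\}$ yields the noisy states given by
\[ \mathcal{D}_{r_1}(|l_i\rangle\langle l_i|) = r_1 |l_i\rangle\langle l_i| + (1 - r_1)\frac{I}{3}, \quad \forall i \in [5], \tag{86} \]
\[ \mathcal{D}_{r_1}(|\psi\rangle\langle\psi|) = r_1 |\psi\rangle\langle\psi| + (1 - r_1)\frac{I}{3}, \tag{87} \]
and the action of its adjoint with parameter $r_2 \in [0, 1]$, say on the ideal projectors, $\{|l_i\rangle\langle l_i|\}_{i=1}^{5}$, involved in the measurements correspondingly yields the POVM elements given by
\[ \mathcal{D}_{r_2}(|l_i\rangle\langle l_i|) = r_2 |l_i\rangle\langle l_i| + (1 - r_2)\frac{I}{3}, \quad \forall i \in [5]. \tag{88} \]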
Hence, we are imagining a situation where the preparation procedures are affected by depolarizing noise with parameter $r_1$ and measurement procedures are affected by depolarizing noise with parameter $r_2$, similar to the situation considered previously in Section II of the Supplemental material of Ref. [12]. The operational equivalences required for our argument from preparation and measurement noncontextuality are satisfied by these noisy preparations and measurements. That is, in $\Gamma_G$, we can represent the measurement events for the setting $M_{e_i}$ (where $i \in [5]$ and, for $i = 5$, $i + 1 = 1$, i.e., addition modulo 5) by
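\[ [m_{e_i} = 0|M_{e_i}] = \mathcal{D}_{r_2}(|l_i\rangle\langle l_i|), \tag{89} \]
\[ [m_{e_i} = 1|M_{e_i}] = \mathcal{D}_{r_2}(I - |l_i\rangle\langle l_i| - |l_{i+1}\rangle\langle l_{i+1}|), \tag{90} \]
\[ [m_{e_i} = 2|M_{e_i}] = \mathcal{D}_{r_2}(|l_{i+1}\rangle\langle l_{i+1}|). \tag{91} \]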
It is easy to verify that these form elements of a valid POVM denoted by the measurement setting $M_{e_i}$ and that the operational equivalences between the measurement events (represented by $\Gamma_G$) are indeed respected. On the other hand, in $\Sigma_G$, the source events corresponding to source setting $S_{e_i}$ can be represented by
\[ [s_{e_i} = 0|S_{e_i}] = \mathcal{D}_{r_1}(|l_i\rangle\langle l_i|), \tag{92} \]
\[ [s_{e_i} = 1|S_{e_i}] = \mathcal{D}_{r_1}(I - |l_i\rangle\langle l_i| - |l_{i+1}\rangle\langle l_{i+1}|), \tag{93} \]
\[ [s_{e_i} = 2|S_{e_i}] = \mathcal{D}_{r_1}(|l_{i+1}\rangle\langle l_{i+1}|), \tag{94} \]
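where $p(s_{e_i} = b|S_{e_i}) = \frac{1}{3}$ for all $b \in \{0, 1, 2\}$, while the source events for source setting $S_{e_*}$ can be represented by
\[ [s_{e_*} = 0|S_{e_*}] = \mathcal{D}_{r_1}(|\psi\rangle\langle\psi|), \tag{95} \]
\[ [s_{e_*} = 1|S_{e_*}] = \mathcal{D}_{r_1}\!\left(\frac{I - |\psi\rangle\langle\psi|}{2}\right), \tag{96} \]
where $p(s_{e_*} = 0|S_{e_*}) = \frac{1}{3}$ and $p(s_{e_*} = 1|S_{e_*}) = \frac{2}{3}$. These satisfy the operational equivalences
\[ [\top|S_{e_\top}] \simeq [\top|S_{e'_\top}] = \frac{I}{3}, \quad \forall e, e' \in E(\Sigma_G). \tag{97} \]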
We then have
\[ \mathrm{Corr} = \frac{1}{5} \sum_{e \in E(\Gamma_G)} \frac{1}{3} \sum_{b \in \{0,1,2\}} p(m_e = b|M_e, S_e, s_e = b). \tag{98} \]
Noting that for any qutrit pure state $|\phi\rangle$ and its corresponding projector $|\phi\rangle\langle\phi|$, each affected by depolarizing noise with parameters $r_1$ and $r_2$, respectively, we have
\[ \mathrm{Tr}\big(\mathcal{D}_{r_1}(|\phi\rangle\langle\phi|)\, \mathcal{D}_{r_2}(|\phi\rangle\langle\phi|)\big) = \frac{1}{3} + \frac{2}{3} r_1 r_2. \tag{99} \]
Now, each term in the summation defining Corr, namely, $p(m_e = b|M_e, S_e, s_e = b)$, is obtained from a calculation of the type in Eq. (99). Hence, we have for each such term,
\[ p(m_e = b|M_e, S_e, s_e = b) = \frac{1}{3} + \frac{2}{3} r_1 r_2, \tag{100} \]
so that
\[ \mathrm{Corr} = \frac{1}{3} + \frac{2}{3} r_1 r_2. \tag{101} \]
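In the noiseless regime, i.e., $r_1 = r_2 = 1$, this reduces to the ideal KCBS scenario. On the other hand, we have
\[ \begin{aligned} R([s_{e_*} = 0|S_{e_*}]) &= \sum_{v \in V(G)} p(v|S_{e_*}, s_{e_*} = 0) \\ &= \sum_{i=1}^{5} \mathrm{Tr}\big(\mathcal{D}_{r_2}(|l_i\rangle\langle l_i|)\, \mathcal{D}_{r_1}(|\psi\rangle\langle\psi|)\big) \\ &= \sum_{i=1}^{5} \left( r_1 r_2 |\langle l_i|\psi\rangle|^2 + \frac{r_2 (1 - r_1)}{3} + \frac{r_1 (1 - r_2)}{3} + \frac{(1 - r_1)(1 - r_2)}{3} \right) \\ &= r_1 r_2 \sum_{i=1}^{5} |\langle l_i|\psi\rangle|^2 + \frac{5}{3}(1 - r_1 r_2). \end{aligned} \tag{102} \]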
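Recall that violation of the noncontextuality inequality requires that
\[ R > 2 + \frac{1 - \mathrm{Corr}}{p_0}. \tag{103} \]
That is,
\[ r_1 r_2 \sum_{i=1}^{5} |\langle l_i|\psi\rangle|^2 + \frac{5}{3}(1 - r_1 r_2) > 2 + 3\left(1 - \left(\frac{1}{3} + \frac{2}{3} r_1 r_2\right)\right). \tag{104} \]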
Given that $\sum_{i=1}^{5} |\langle l_i|\psi\rangle|^2 = \sqrt{5}$, this becomes
\[ r_1 r_2 \sqrt{5} + \frac{5}{3}(1 - r_1 r_2) > 2 + 2(1 - r_1 r_2). \tag{105} \]
Rewriting this, we obtain
\[ r_1 r_2 > 1 - \frac{\sqrt{5} - 2}{\sqrt{5} + \frac{1}{3}} \approx 0.908, \tag{106} \]
that is, the noncontextuality inequality can be violated only when the depolarizing noise is below a certain threshold given by $r_1 r_2 > 0.908$. In terms of Corr, this requires $\mathrm{Corr} > 0.939$. The noiseless case $r_1 = r_2 = 1$ takes us back to the Corr = 1 regime that we previously discussed.
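The noisy KCBS calculation can be reproduced with explicit $3 \times 3$ matrices. The sketch below is an illustration rather than the author's code: the helper names (`outer`, `depolarize`, `trace_product`, `corr_and_R`, `violated`) are hypothetical, matrices are plain nested lists, and $p_0 = 1/3$ is taken from the construction above. It evaluates Corr and $R$ from the traces in Eqs. (99) and (102) and compares the outcome with the threshold of Eq. (106).

```python
import math

def outer(v):
    """Rank-1 projector |v><v| for a real unit 3-vector v."""
    return [[a * b for b in v] for a in v]

def depolarize(x, r):
    """D_r(X) = r X + (1 - r) Tr(X) I/3 on a qutrit, as in Eq. (85)."""
    tr = sum(x[k][k] for k in range(3))
    return [[r * x[i][j] + ((1.0 - r) * tr / 3.0 if i == j else 0.0)
             for j in range(3)] for i in range(3)]

def trace_product(a, b):
    """Tr(A B) for 3x3 matrices."""
    return sum(a[i][j] * b[j][i] for i in range(3) for j in range(3))

cos_t = 5 ** -0.25
sin_t = math.sqrt(1.0 - cos_t ** 2)
ls = [(sin_t * math.cos(4 * math.pi * i / 5),
       sin_t * math.sin(4 * math.pi * i / 5),
       cos_t) for i in range(1, 6)]
psi = (0.0, 0.0, 1.0)

def corr_and_R(r1, r2):
    """Corr and R for noisy states (parameter r1) and effects (parameter r2)."""
    # Every term of Corr has the form Tr(D_r1(P) D_r2(P)) for a rank-1
    # projector P (Eq. (99)), so one representative term gives the average.
    proj = outer(ls[0])
    corr = trace_product(depolarize(proj, r1), depolarize(proj, r2))
    # R sums the noisy effects D_r2(|l_i><l_i|) on the noisy state D_r1(|psi><psi|).
    noisy_psi = depolarize(outer(psi), r1)
    big_r = sum(trace_product(depolarize(outer(l), r2), noisy_psi) for l in ls)
    return corr, big_r

def violated(r1, r2, p0=1.0 / 3.0):
    """True when R > 2 + (1 - Corr)/p0, i.e. Eq. (103) holds."""
    corr, big_r = corr_and_R(r1, r2)
    return big_r > 2.0 + (1.0 - corr) / p0

# Threshold on r1*r2 from Eq. (106) and the matching value of Corr, Eq. (101).
threshold = 1.0 - (math.sqrt(5) - 2.0) / (math.sqrt(5) + 1.0 / 3.0)
corr_threshold = 1.0 / 3.0 + 2.0 / 3.0 * threshold

print(round(threshold, 3), round(corr_threshold, 3))
print(violated(0.96, 0.96), violated(0.95, 0.95))
```

For example, $r_1 = r_2 = 0.96$ gives $r_1 r_2 \approx 0.92$, which lies above the threshold, whereas $r_1 = r_2 = 0.95$ gives $r_1 r_2 \approx 0.90$, which lies below it.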
6 Discussion
6.1 Measurement-measurement correlations
vs. source-measurement correlations
Note that the usual Kochen-Specker experiment, as
conceptualized in Refs. [21–23, 25], for example, involves only the quantity $R([s|S])$, representing correlations between various measurement events when all
the measurements are implemented on a system pre-
pared according to the same preparation procedure,
denoted by the source event [s|S]. Thus, R represents
measurement-measurement correlations on a system
prepared according to a fixed choice of preparation
procedure.
On the other hand, the experiment we have concep-
tualized in this paper involves, besides the quantity
R, a quantity Corr representing source-measurement
correlations, characterizing the quality of the mea-
surements in terms of their response to corresponding
preparations.
Our noncontextuality inequalities represent a
trade-off relation that must hold between R and Corr
in an operational theory that admits a noncontextual
ontological model. Here we note that the first exam-
ple of such a tradeoff relation, albeit only for the case
of operational quantum theory with unsharp measure-
ments, appeared in Ref. [39] as the Liang-Spekkens-
Wiseman (LSW) inequality [40] which has been shown
to be experimentally violated in Ref. [53] (footnote 36). And, indeed, the developments reported in Ref. [16] and the present paper have their origins in the idea of such a trade-off relation that first appeared in Ref. [39].

Footnote 36: This experiment, however, is not in a position to make
claims about contextuality without presuming the operational
theory is quantum theory simply because the LSW inequality
presumes operational quantum theory. The noncontextuality
inequalities in this paper do not require the operational the-
ory to be quantum theory and can therefore be experimentally
tested using techniques from Refs. [13, 37, 54].
6.2 Can our noise-robust noncontextuality inequalities be saturated by a noncontextual ontological model?
A natural question concerns the tightness of these noncontextuality inequalities, i.e., can Eq. (72) be saturated by a noncontextual ontological model? This requires one to specify a noncontextual ontological model reproducing the operational equivalences between the measurement events and between the source settings, such that
\[ \mathrm{Corr} + p_0\, (1 - \beta(\Gamma_G, q))\, \frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)} = 1. \tag{107} \]
The assumption of measurement noncontextuality is already implicit in our characterization of the response functions $\xi(m_e|M_e, \lambda)$, and for this reason it is, indeed, trivial to satisfy measurement noncontextuality while saturating these noncontextuality inequalities. Measurement noncontextuality, alone, in fact even allows a violation of the inequality (when no preparation noncontextuality is imposed), the extreme case being $R = \alpha^*(G, w)$ and $1 \geq \mathrm{Corr} > 1 - p_0\, (1 - \beta(\Gamma_G, q))$. It’s the assumption of preparation noncontextuality that is nontrivial to satisfy and we do not know if there exists a general construction of a noncontextual ontological model saturating our noncontextuality inequalities. We outline the general situation below.
6.2.1 The special case of facet-defining Bell-KS inequalities: Corr=1
If outcome determinism is presumed (as in traditional Bell-KS type treatments), then we know that there exists a necessary and sufficient set of Bell-KS inequalities (each corresponding to a particular choice of $R([s|S])$) that are satisfied by any operational theory admitting a KS-noncontextual ontological model. In particular, each such (facet) Bell-KS inequality can be saturated by KS-noncontextual ontological models that yield probabilities (from $\mathcal{G}(\Gamma_G)$) corresponding to the facet-defining Bell-KS inequality, i.e., which satisfy $R([s|S]) = \alpha(G, w)$ for such a Bell-KS inequality. Indeed, our noise-robust noncontextuality inequalities corresponding to these choices of $R([s|S])$ (i.e., facet-defining Bell-KS inequalities of the Bell-KS polytope which is given by the convex hull of points in $\mathcal{G}(\Gamma_G)|_{\rm det}$) can always be saturated when Corr = 1, because in that case outcome determinism is justified by preparation noncontextuality (cf. Ref. [16]) and our inequalities are identical to the Bell-KS inequalities (saturated by $R = \alpha(G, w)$).
6.2.2 The general case: Corr < 1
Since we do not want to assume outcome determinism, nor necessarily the idealization of Corr = 1, what is at stake here is the assumption of preparation noncontextuality. This assumption must be satisfied while saturating the noise-robust noncontextuality inequality in order for a measurement noncontextual ontological model to be universally noncontextual. Constructing such a noncontextual ontological model amounts to specifying the distributions $\mu(s_e|S_e, \lambda)$ and $\nu(\lambda)$ such that
\[ \forall \lambda \in \Lambda : \mu(\lambda|S_e) = \nu(\lambda), \quad \forall e \in E(\Sigma_G), \tag{108} \]
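i.e., preparation noncontextuality holds, and we have (rewriting the saturation condition from Eq. (107))
\[ (\alpha^*(G, w) - \alpha(G, w))\, \mathrm{Corr} + p_0\, (1 - \beta(\Gamma_G, q))\, R = (\alpha^*(G, w) - \alpha(G, w)) + p_0\, \alpha(G, w)\, (1 - \beta(\Gamma_G, q)), \tag{109} \]
where
\[ \mathrm{Corr} = \sum_{s_{e_*}} p(s_{e_*}|S_{e_*})\, \mathrm{Corr}_{s_{e_*}}, \tag{110} \]
\[ \mathrm{Corr}_{s_{e_*}} = \sum_{\lambda \in \Lambda} \mathrm{Corr}(\lambda)\, \mu(\lambda|S_{e_*}, s_{e_*}), \tag{111} \]
\[ \mathrm{Corr}(\lambda) \equiv \sum_{e \in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\, \xi(m_e|M_e, \lambda)\, \mu(s_e|S_e, \lambda), \tag{112} \]
and
\[ R = \sum_{\lambda \in \Lambda} R(\lambda)\, \mu(\lambda|S_{e_*}, s_{e_*} = 0). \tag{113} \]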
Unfortunately, we do not have a general construction
that can show this to be possible for any noise-robust
noncontextuality inequality obtained according to the
approach we have outlined. We therefore leave it as
an open question whether such an inequality can (al-
ways?) be saturated by a noncontextual ontological
model.
6.3 Can trivial POVMs ever violate these noncontextuality inequalities?
No.
Recall that a trivial POVM is defined as an assignment of positive operators $p(v)I$ to the vertices of $\Gamma_G$, where $I$ is the identity operator on some Hilbert space and $p : V(\Gamma_G) \to [0, 1]$, such that $\sum_{v \in e} p(v) = 1$ for all $e \in E(\Gamma_G)$, is a probabilistic model on $\Gamma_G$.
6.3.1 The case $p \in \mathcal{C}(\Gamma_G)$
Consider trivial POVMs corresponding to any KS-noncontextual probabilistic model, i.e., $p \in \mathcal{C}(\Gamma_G)$ is a convex mixture of deterministic vertices, $\mathcal{G}(\Gamma_G)|_{\rm det}$, or equivalently, of ontic states in $\Lambda_{\rm det}$. In other words, $\mathcal{C}(\Gamma_G) \equiv \mathrm{ConvHull}(\mathcal{G}(\Gamma_G)|_{\rm det})$, the convex hull of points in $\mathcal{G}(\Gamma_G)|_{\rm det}$. The largest value Corr can take in this case is less than or equal to 1. This means that the upper bound on $R$ from our noncontextuality inequality, Eq. (71), will be greater than or equal to $\alpha(G, w)$, whereas we know that for a KS-noncontextual probabilistic model, $R \leq \alpha(G, w)$. Hence, there is no violation of our noncontextuality inequality for such trivial POVMs.
6.3.2 The case $p \in \mathrm{ConvHull}(\mathcal{G}(\Gamma_G)|_{\rm ind})$
Now consider trivial POVMs that correspond to the indeterministic vertices, $\mathcal{G}(\Gamma_G)|_{\rm ind}$ (correspondingly, $\Lambda_{\rm ind}$), or their convex mixtures. We know that for these trivial POVMs, $\mathrm{Corr} \leq \beta(\Gamma_G, q)$. For any $R \leq \alpha^*(G, w)$ that is achieved by these trivial POVMs, our noncontextuality inequality reads
\[ \mathrm{Corr} \leq 1 - p_0\, (1 - \beta(\Gamma_G, q))\, \frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)}. \tag{114} \]
A sufficient condition for this inequality to be satisfied is that
\[ \beta(\Gamma_G, q) \leq 1 - p_0\, (1 - \beta(\Gamma_G, q))\, \frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)}, \tag{115} \]
which reduces, for $R > \alpha(G, w)$, to
\[ p_0 \leq \frac{\alpha^*(G, w) - \alpha(G, w)}{R - \alpha(G, w)}, \tag{116} \]
where the upper bound is greater than or equal to 1, since $\alpha(G, w) < R \leq \alpha^*(G, w)$. This is trivially satisfied since $p_0 \leq 1$.
For $R < \alpha(G, w)$, the sufficient condition of Eq. (115) is again trivially satisfied since it reduces to
\[ p_0 \geq -\frac{\alpha^*(G, w) - \alpha(G, w)}{\alpha(G, w) - R}, \tag{117} \]
and we must anyway have $p_0 \geq 0$.
For $R = \alpha(G, w)$, the sufficient condition reduces to $\beta(\Gamma_G, q) \leq 1$, which is again trivially satisfied since $\beta(\Gamma_G, q) < 1$ by definition.
6.3.3 The general case $p \in \mathcal{G}(\Gamma_G)$
In general, a probabilistic model achieved by trivial POVMs can be in the convex hull of both deterministic ($\Lambda_{\rm det}$) and indeterministic ($\Lambda_{\rm ind}$) ontic states, with the total weight on deterministic ontic states denoted by $\Pr(\Lambda_{\rm det})$ and that on indeterministic ontic states by $\Pr(\Lambda_{\rm ind})$, so that $\Pr(\Lambda_{\rm det}) + \Pr(\Lambda_{\rm ind}) = 1$. We then have
\[ \begin{aligned} \mathrm{Corr} &\leq \Pr(\Lambda_{\rm det}) + \Pr(\Lambda_{\rm ind})\, \beta(\Gamma_G, q), \\ R &\leq \Pr(\Lambda_{\rm det})\, \alpha(G, w) + \Pr(\Lambda_{\rm ind})\, \alpha^*(G, w). \end{aligned} \tag{118} \]
A sufficient condition for satisfaction of the noncontextuality inequality is then
\[ 1 - \Pr(\Lambda_{\rm ind})\, (1 - \beta(\Gamma_G, q)) \leq 1 - p_0\, (1 - \beta(\Gamma_G, q))\, \frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)}, \tag{119} \]
which becomes
\[ p_0 \leq \frac{\alpha^*(G, w) - \alpha(G, w)}{R - \alpha(G, w)}\, \Pr(\Lambda_{\rm ind}) \tag{120} \]
when $R > \alpha(G, w)$. Noting that
\[ R \leq \alpha(G, w) + \Pr(\Lambda_{\rm ind})\, (\alpha^*(G, w) - \alpha(G, w)), \]
we have
\[ \Pr(\Lambda_{\rm ind}) \geq \frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)}, \tag{121} \]
so that the sufficient condition for satisfaction of the noncontextuality inequality becomes $p_0 \leq 1$, which is trivially satisfied.
When $R = \alpha(G, w)$, the sufficient condition becomes $\beta(\Gamma_G, q) \leq 1$, which is again trivially satisfied. Finally, when $R < \alpha(G, w)$, the sufficient condition becomes
\[ p_0 \geq -\frac{\alpha^*(G, w) - \alpha(G, w)}{\alpha(G, w) - R}\, \Pr(\Lambda_{\rm ind}), \tag{122} \]
which is again trivially satisfied since $p_0 \geq 0$.
Hence trivial POVMs cannot yield a violation of
our noncontextuality inequalities. This is the sense in
which trivial POVMs cannot lead to nonclassicality in
our approach, unlike the case of traditional Kochen-Specker approaches [21–23, 25] applied to the case of POVMs [73]. To violate our noncontextuality inequalities, the POVMs must necessarily have some
nontrivial projective component (that is not the iden-
tity operator or zero) but they need not be projec-
tors. Further, we do not rely on restricting the notion
of joint measurability [44] (cf. Section 2.4) to com-
mutativity for POVMs. Taking joint measurability
to be just commutativity is the approach adopted in,
for example, Ref. [25]. We refer to Appendix A and
Appendix C for more discussion on these issues, in
particular Appendix C for the role of commutativity
vs. joint measurability.
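The argument of Section 6.3.3 can be sanity-checked numerically. The sketch below is illustrative only: it plugs the largest values of Corr and $R$ allowed by Eq. (118) into the left-hand side of Eq. (72) over a grid of weights $\Pr(\Lambda_{\rm ind})$ and probabilities $p_0$, using the KCBS invariants ($\alpha = 2$, $\alpha^* = 5/2$, $\beta = 1/2$) as assumed inputs. Because the left-hand side of Eq. (72) is increasing in both Corr and $R$, checking these extremal values suffices.

```python
def tradeoff_lhs(corr, big_r, p0, beta, alpha, alpha_star):
    """Left-hand side of the tradeoff in Eq. (72)."""
    return corr + p0 * (1.0 - beta) * (big_r - alpha) / (alpha_star - alpha)

# KCBS invariants, used here as illustrative inputs.
alpha, alpha_star, beta = 2.0, 2.5, 0.5

steps = 20
grid = [k / steps for k in range(steps + 1)]

# w plays the role of Pr(Lambda_ind); 1 - w is Pr(Lambda_det).
worst = max(
    tradeoff_lhs(
        (1.0 - w) + w * beta,                # largest Corr allowed by Eq. (118)
        (1.0 - w) * alpha + w * alpha_star,  # largest R allowed by Eq. (118)
        p0, beta, alpha, alpha_star)
    for w in grid
    for p0 in grid
)

# The inequality is never violated: the largest value found does not
# exceed 1 (up to floating-point rounding).
print(worst <= 1.0 + 1e-12)
```

Algebraically, the quantity being maximised equals $1 - (1 - p_0)(1 - \beta)\Pr(\Lambda_{\rm ind})$ for these extremal values, which makes it clear why it can reach but never exceed 1.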
7 Conclusions
We have obtained a hypergraph framework for ob-
taining noise-robust noncontextuality inequalities cor-
responding to KS-colourable scenarios, suitably aug-
mented with preparation procedures in the spirit of
Spekkens contextuality [18]. The inequalities take the form of a noncontextual tradeoff between the three operational quantities Corr, $R$, and $p_0$, cf. Eq. (72).
This framework leverages the graph invariants from
the graph-theoretic framework of CSW for doing this,
in addition to a new hypergraph invariant (Eq. (45))
that we call the weighted max-predictability. Our ap-
proach is general enough to be applicable to any situ-
ation involving noisy preparations and measurements
that arises from a KS-colourable contextuality sce-
nario.
We conclude with a list of open questions raised in
this paper and other directions for future research:
1. Characterizing structural Specker’s principle from probabilistic models on a hypergraph $\Gamma$:
Given that $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$ for some $\Gamma$, is it the case that $\Gamma$ must then necessarily satisfy structural Specker’s principle, namely, that every clique in $O(\Gamma)$ is a subset of some hyperedge in $\Gamma$? Or is it the case that there exists a hypergraph $\Gamma'$ for which $\mathcal{CE}^1(\Gamma') = \mathcal{G}(\Gamma')$ but structural Specker’s principle fails?
More generally, is there any characterization of a hypergraph satisfying structural Specker’s principle entirely in terms of the probabilistic models on it?
As already pointed out earlier, this open question relates to the open Problem 7.2.3 of Ref. [23] of characterizing $\Gamma$ for which $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$. It is known that $\Gamma$ representing bipartite Bell scenarios [55] satisfy the property $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$ and we have provided a generic recipe for converting any $\Gamma$ that does not satisfy structural Specker’s principle to a $\Gamma'$ that does satisfy it so that $\mathcal{CE}^1(\Gamma') = \mathcal{G}(\Gamma')$. The question is if there are any other $\Gamma$ that also satisfy $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$.
2. Almost quantum theory: We know that an almost
quantum theory cannot satisfy Specker’s princi-
ple [49] but it satisfies statistical Specker’s princi-
ple (or consistent exclusivity). An open question
that remains is:
Can an almost quantum theory satisfy structural
Specker’s principle?
If not, this would render the satisfaction of con-
sistent exclusivity by an almost quantum theory
unexplained by a natural structural feature of
measurements in the theory, namely, the satisfac-
tion of structural Specker’s principle, i.e., almost
quantum theory would not fall in the category of
operational theories envisaged in Ref. [30].
3. Conditions for saturating the noise-robust non-
contextuality inequalities:
As mentioned in Section 6.2, it is an open ques-
tion whether the noise-robust noncontextuality
inequalities of Eq. (72) based on our generaliza-
tion of the CSW framework [22] can be saturated
by a noncontextual ontological model.
More generally, the status of these noise-robust
noncontextuality inequalities vis-à-vis the algorithmic approach of Ref. [17] for finding necessary
and sufficient conditions for noncontextuality in
a general prepare-and-measure scenario remains
to be explored. One would suspect that the al-
gorithmic approach of Ref. [17] when adapted
to the kind of situation considered in this paper
would yield nontrivial noncontextuality inequali-
ties that aren’t merely generalizations of the ones
obtained in the CSW framework [22]. It would be
interesting to investigate the full structure of this
set of inequalities and compare it with the facet-
defining Bell-KS inequalities of the CSW frame-
work.
4. Properties of the weighted max-predictability, $\beta(\Gamma_G, q)$:
Since the crucial new hypergraph-theoretic ingre-
dient in our inequalities is the weighted max-
predictability, it would be interesting to under-
stand properties of this hypergraph invariant on
both counts: as a new mathematical object in
its own right, one we haven’t been able to find
a reference to in the hypergraph theory litera-
ture, as well as an important parameter of a hy-
pergraph relevant for noise-robustness of a noise-
robust noncontextuality inequality. Indeed, as we
point out in Footnote 34, identifying a distribution $q$ (in the definition of Corr, Eq. (53)) that minimizes $\beta(\Gamma_G, q)$ for a given $\Gamma_G$ would lead to better noise-robustness in the inequalities of Eqs. (70) or (71).
5. Noise-robust applications of quantum protocols
based on KS-contextuality:
A general research direction is to construct noise-
robust versions of applications that have previ-
ously been suggested for KS-contextuality. Our
approach provides a recipe for doing this for
any Bell-KS inequality appearing in such applications. Besides serving as a witness for strong nonclassicality [56] (i.e., Spekkens contextuality; see footnote 37), noise-robust versions of these applications can help benchmark the experiments in terms of the noise that can be tolerated while still witnessing nonclassicality. Examples of such applications include those from Refs. [58–63].

Footnote 37: As opposed to weak nonclassicality that can arise in epistemically restricted classical theories [57]. See also the talk at Ref. [56], 41:43 minutes, for a short discussion.

Acknowledgments
I would like to thank Andreas Winter for his comments on an earlier version of some of these ideas, Tobias Fritz for the ping-pong and the sing-song in which we often talked about hypergraphs, Rob Spekkens for the often argumentative but always productive conversations over lunch, and participants at the Contextuality conference (CCIOSA) at Perimeter Institute, during July 24 - 28, 2017, for very stimulating discussions that fed into the narrative of this paper. I would also like to thank David Schmid, Ana Belén Sainz, Elie Wolfe, and Tomáš Gonda for helping me better articulate the difference between structural vs. statistical readings of Specker’s principle, and Eric Cavalcanti for comments on the manuscript. Theorem 1 owes its origin to a discussion with Tomáš Gonda. I would also like to thank anonymous referees for suggestions that immensely improved the presentation of these results. Research at Perimeter Institute is supported by the Government of Canada through the Department of Innovation, Science and Economic Development Canada, and by the Province of Ontario through the Ministry of Research, Innovation and Science.
A Status of KS-contextuality as an experimentally testable notion of nonclassicality for POVMs in quantum theory
The purpose of this section is to emphasize how the
progression from KS-contextuality to Spekkens con-
textuality for KS-type contextuality experiments is a
natural one rather than an ad hoc move from one
framework to another. That is, Spekkens contex-
tuality is not just another notion of nonclassical-
ity that is incomparable with KS-contextuality, but
is indeed intimately connected in its motivations to
the limitations of KS-contextuality [18]. In partic-
ular, we will focus on the role of KS-contextuality
with respect to POVMs and why allowing arbitrary
POVMs poses a difficulty for KS-contextuality as
a notion of nonclassicality that is experimentally
testable, i.e., a notion that applies to noisy measurements (POVMs) typically implemented in a laboratory experiment (footnote 38). While one may be tempted to reject this premise for assessing the suitability of KS-contextuality as a notion of nonclassicality (claiming instead that KS-contextuality was never meant for POVMs and applies only to “purified” experiments, namely, ones with only PVMs and pure states), the reasons for doing so are rooted in the literature on KS-contextuality where POVMs have indeed been considered and (at least) two kinds of conclusions drawn: one, that there exists a Kochen-Specker contradiction for POVMs, even on a qubit, so KS-contextuality for POVMs is interesting [64] and two, that allowing arbitrary POVMs in assessing nonclassicality would make the research program of identifying device-independent principles for quantum correlations in KS-contextuality experiments ill-defined, so quantum correlations allowing arbitrary POVMs are “pathological” [73]. We will look at these arguments in turn and use the latter, in particular, to segue into our motivations for the framework proposed in this paper.

Footnote 38: And how a rather compelling way to arrive at a notion that is experimentally testable is Spekkens contextuality.
A.1 Limitations of KS-contextuality vis-à-vis POVMs
A.1.1 KS-contextuality for POVMs in the literature
The first paper that applied KS-contextuality to the
case of POVMs was by Cabello [64] where a KS-
uncolourability argument for POVMs on a single
qubit was proposed. This was motivated by the
Gleason-type derivation of the Born rule starting with
the structure of POVMs due to Busch [66] and Caves
et al. [67], analogous to the case of the Kochen-
Specker theorem [19] which can be seen as motivated
by Gleason’s theorem [68]. Insofar as there exists a
Gleason-type theorem for POVMs [66, 67], one could
motivate KS-contextuality as a reasonable notion of
nonclassicality for POVMs, as was presumably the
case in Ref. [64]. The role of this notion of nonclas-
sicality is then just to argue using a finite set of
POVM elements that no KS-noncontextual assign-
ment of outcomes is possible for certain finite sets
of POVMs in quantum theory. Should we, however,
assume that it is reasonable to demand determinis-
tic assignment of outcomes to POVM elements in an
ontological model, just as we do for PVM elements?
The argument of Ref. [64] was later criticized on var-
ious counts [18, 28, 65] and we refer the reader to
Ref. [28] for criticisms pertinent to this paper, namely,
that outcome determinism for all unsharp measure-
ments (ODUM in Ref. [28]) in quantum theory is un-
tenable.
39
Other works in the literature where KS-contextuality
for POVMs has been explored include
Refs. [69–72].
Besides, doubts about the experimental testability
of the KS theorem were raised in the late 1990s in a series
of papers by Meyer, Clifton, and Kent [74–76]. A
review can be found in Ref. [77]. These doubts were
premised on the idea that the set of KS-colourable
projectors (or PVMs) on any given Hilbert space is
dense in the set of all projectors (or PVMs) on that
Hilbert space. That is, for any given set of PVMs
yielding a KS contradiction, it is always possible to
find PVMs which are arbitrarily “close” to the PVMs
required for a KS contradiction (for any finite preci-
sion) but which do not themselves lead to a KS contra-
diction. The property of denseness of KS-colourable
39
Ref. [28] is also a good resource for a detailed analysis of
arguments concerning dilations of POVMs, which we will not
get into here. Besides, it also provides a principled recipe for
assigning response functions to POVMs.
sets of measurements in the set of all measurements
in fact extends to even the most general case when
the measurements are POVMs on any Hilbert space.
So, even a KS contradiction for POVMs (such as the
one in Ref. [64]) falls prey to the Meyer-Clifton-Kent
argument [77]. As Ref. [77] notes:
Dealing with projective measurements is
arguably not enough. One quite popular
view of quantum theory holds that a cor-
rect version of the measurement rules would
take POV measurements as fundamental,
with projective measurements either as spe-
cial cases or as idealisations which are never
precisely realised in practice. In order to de-
fine an NCHV theory catering for this line of
thought, Kent constructed a KS-colourable
dense set of positive operators in a complex
Hilbert space of arbitrary dimension, with
the feature that it gives rise to a dense set of
POV decompositions of the identity (Kent,
1999). Clifton and Kent constructed a dense
set of positive operators in complex Hilbert
space of arbitrary dimension with the special
feature that no positive operator in the set
belongs to more than one decomposition of
the identity (Clifton & Kent, 2000). Again,
the resulting set of POV decompositions is
dense, and the special feature ensures that
one can average over hidden states to recover
quantum predictions.
Hence, in any finite precision experiment it would
be impossible to test the Kochen-Specker theorem,
i.e., such an experimental test would require an in-
finitely precise measurement and measurements in a
real-world laboratory are never infinitely precise. Al-
though there was a lively debate along these lines
(see the references in [77]), the resolutions that were
proposed all involved modifying the notion of KS-
noncontextuality by adding auxiliary assumptions
that seek to exclude the Meyer-Clifton-Kent type ar-
guments. A recent attempt in this direction can be
found in Ref. [78] where a notion of “ontological faith-
fulness” is proposed. As such, it was already recog-
nized for reasons independent of Spekkens contex-
tuality [18] that the notion of KS-noncontextuality
needs to be revised if one is to make it experimen-
tally testable.
40
What Spekkens brought to the fore
[18], besides generalizing the notion of contextuality
to all experimental procedures rather than measure-
ments alone, was the idea that an experimental test of
noncontextuality should not rely on inequalities that
presume outcome determinism, just as a test of local
causality does not require the assumption of outcome
40
Of course, this takes nothing away from the importance of
the Kochen-Specker theorem [19] as a no-go theorem concerning
the logical structure of quantum theory and the constraints it
places on the ontological models possible for the theory.
determinism. Indeed, the assumption of outcome de-
terminism for sharp measurements in quantum the-
ory is derived in the Spekkens framework from the as-
sumption of preparation noncontextuality rather than
being assumed independently.
We will now consider the more modern approach
to KS-contextuality along the lines of the frameworks
in Refs. [22, 23, 25] to segue into our framework for
Spekkens contextuality, which we develop in this paper.
A.1.2 Classifying probabilistic models: restriction of
quantum models to PVMs
Research on KS-contextuality took a different turn
with the advent of the graph-theoretic framework of
Cabello, Severini and Winter in 2010 [21] (revised
slightly in 2014 [22]), the sheaf-theoretic framework of
Abramsky and Brandenburger in 2011 [25], and the
hypergraph-based formalism of Acín, Fritz, Leverrier,
and Sainz in 2012 [23]. The unifying theme of these
contributions was that they took the key mathemat-
ical idea underlying KS-noncontextuality and Bell-
locality — namely, that both are instances of the clas-
sical marginal problem [26, 32, 33] — and built frame-
works that sought to distinguish between classical the-
ories (namely, those admitting KS-noncontextual on-
tological models), quantum theory, and post-quantum
general probabilistic theories by classifying their em-
pirical predictions relative to a Kochen-Specker ex-
periment into these categories. All these frameworks,
motivated by the device-independence paradigm, es-
chewed the erstwhile restriction of the notion of KS-
noncontextuality to quantum theory and sought to
make their analysis theory-independent, relying only
on empirical predictions relative to a KS experiment
to classify theories. They separated the assumption
of KS-noncontextuality from the operational theory
(namely, quantum theory) to which it was originally
meant to apply, allowing arbitrary operational theories
in their analysis. However, there was a key distinction
between Bell scenarios and KS-contextuality
scenarios that was lost in this formal unification:
namely, that while the definition of a quantum probabilistic
model in a Bell scenario need not be restricted
to (local) PVMs (and arbitrary local POVMs can be
allowed without changing the set of quantum models),
the same is not true of a KS-contextuality scenario.
Indeed, as Henson and Sainz note in their work
[73],
41
reflecting on the question of allowing arbitrary
POVMs in the definition of a quantum probabilistic
model:
...if we allow general POVMs rather than
projective measurements then no principle
that places a non-trivial restriction on cor-
relations will be respected. Thus, this kind
of “quantum model” is clearly pathological.
41
Proposing a principle bounding the KS-contextuality possible
in quantum theory, namely, “Macroscopic Noncontextuality”.
One way to motivate the present work is as a re-
sponse to the pathology that Henson and Sainz allude
to: that trivial POVMs can realize any probabilistic
model, hence allowing arbitrary POVMs makes the
problem of finding principles to identify quantum cor-
relations in KS-contextuality scenarios trivial, i.e., all
probabilistic models are quantum and there is nothing
to be learnt about post-quantum probabilistic mod-
els. This is because any set of probabilities satisfying
the “no-disturbance” or “no-signalling” condition (of
which the E1 correlations of CSW [22] are a subset,
in general) can be achieved by (trivial) POVMs by
simply multiplying an identity operator with every
probability in such an assignment of probabilities.
42
By the lights of KS-noncontextuality as one’s notion
of classicality, then, trivial POVMs saturating the
general probabilistic bound on the correlations would
seem to be maximally nonclassical (i.e., maximally
KS-contextual). To avoid such “pathological” quan-
tum models, they restrict the definition of a quantum
model to allow only projective measurements. Indeed,
with recent work on a sensible notion of “sharp” mea-
surement in a general probabilistic theory [30, 31],
an appeal to the “fundamental sharpness” of all mea-
surements (see, e.g., [29]) is made to restrict attention
to sharp measurements in both quantum theory and
general probabilistic theories.
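The observation that trivial POVMs can realize any assignment of probabilities can be checked in a few lines. The following is only a minimal sketch: the qubit dimension, the probability vector `probs` and the two states are illustrative assumptions that do not appear in the text. It builds the POVM whose elements are the identity multiplied by each probability and evaluates the Born rule on two different states.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def trivial_povm(probs, dim):
    """POVM elements p * I, one for each entry p of the probability vector."""
    return [[[p if i == j else 0 for j in range(dim)] for i in range(dim)]
            for p in probs]

def born(rho, povm):
    """Outcome probabilities Tr(rho E) for each POVM element E."""
    return [trace(matmul(rho, e)) for e in povm]

# Hypothetical target distribution (an assumption made for illustration).
probs = [Fraction(1, 10), Fraction(3, 10), Fraction(6, 10)]
povm = trivial_povm(probs, 2)

half = Fraction(1, 2)
rho_zero = [[1, 0], [0, 0]]              # the state |0><0|
rho_plus = [[half, half], [half, half]]  # the state |+><+|

stats_zero = born(rho_zero, povm)
stats_plus = born(rho_plus, povm)
```

Both `stats_zero` and `stats_plus` coincide with `probs`: the statistics are fixed by the chosen distribution and carry no information about the state.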
On the other hand, the approach in this paper is
different. In particular, we want our approach to cap-
ture the intuition that trivial POVMs are “classical”
(and not pathological), so we must go beyond KS-
noncontextuality. A simple operational sense in which
trivial POVMs are “classical” is that they reveal noth-
ing about the quantum state on which they are mea-
sured, being incapable of distinguishing any pair of
states whatsoever.
43
The correlations (denoted by
R([s|S])) usually examined in a KS-contextuality experiment
do not allow such experiments to witness
the “triviality” of trivial POVMs, i.e., the fact that
they correspond to a fixed probability distribution
that doesn’t vary even as the choice of preparation
is varied. Moreover, since all nonprojective measurements
are excluded by fiat in traditional Kochen-Specker
type approaches [22, 23] for reasons alluded
to by Henson and Sainz [73], one loses out on the potential
to explore the possibilities that nontrivial and
nonprojective measurements offer with respect to contextuality.
44
Our approach, therefore, is to allow arbitrary
POVMs when considering probabilistic models
arising from quantum theory (and not to restrict to
any notion of “sharp measurements” in general probabilistic
theories) but to examine more quantities than
are examined in traditional approaches, i.e., besides
the quantity R typical in a KS-contextuality scenario,
we invoke the quantity Corr to account for noise in the
measurements.
42
Trivial POVMs are, therefore, trivial resolutions of the
identity, where every POVM element is proportional to identity,
i.e., $\{a I\}_a$, such that $a \in [0, 1]$ and $\sum_a a = 1$.
43
Indeed, any trivial POVM can be realized in the following
operational manner: take the quantum system prepared in
some state, throw it in the garbage, and then sample from the
classical probability distribution corresponding to the trivial
POVM.
If one restricts attention to operational theories
that can always achieve Corr = 1 for any KS-
contextuality scenario, then the usual classification
of probabilistic models following Refs. [22, 23] holds
(Eq. (72)). What is of interest in our framework, how-
ever, is the tradeoff between R and Corr: how large
can both R and Corr be in an operational theory?
(See Eq. (72).)
A.2 Robustness of Bell nonlocality vis-à-vis POVMs
Note that whenever we refer to “Bell-KS” functionals
or inequalities for Kochen-Specker type experiments,
we are not thinking of experiments that are Bell experiments
[4, 5, 7, 9–11], which have spacelike separation
between multiple parties, each performing local
measurements on a shared multipartite preparation.
For the case of Bell experiments, trivial local
POVMs assigned to each party in a Bell experiment
do not lead to Bell violations for a simple reason: the
trivial POVMs for each party are all compatible with
each other, thereby admitting a joint probability dis-
tribution over their outcomes for each party; taking a
product of these local joint probability distributions
(one for each party) results in a joint distribution over
all measurements of all parties, hence satisfying Bell
inequalities. The fact that the POVMs are trivial
ensures that the Bell inequalities are satisfied regard-
less of the choice of shared quantum state. On the
other hand, forgetting the constraint of local POVMs,
there always exist global trivial POVMs that can vi-
olate Bell inequalities: e.g., just take the Popescu-
Rohrlich (PR) box distribution [43], and multiply an
identity operator (on the joint Hilbert space of Al-
ice and Bob) with each probability in the PR-box;
this results in four trivial POVMs, defined over the
joint Hilbert space, that together violate the CHSH
inequality maximally. But, of course, this violation
is uninteresting because it doesn’t obey the locality
constraint on the measurements in a Bell experiment.
This is mathematically reflected in the fact that the
PR-box distribution cannot be written as a convex
mixture of product distributions, one for each party,
hence the corresponding trivial POVM cannot be understood
in terms of trivial local POVMs.
44
All trivial POVMs are nonprojective, but not all nonprojective
POVMs are trivial. Indeed, see Refs. [39–42] for examples
of generalized contextuality [18] with nonprojective measurements,
albeit assuming operational quantum theory.
Hence, it is
the locality of the trivial POVMs in a Bell experiment
that prevents them from violating a Bell inequality
and renders them non-pathological, unlike in the case
of KS-contextuality. The fact that they are “trivial”
in the sense of being unable to distinguish two quan-
tum states plays a role in the sense that, regardless of
the shared quantum state, these POVMs yield fixed
distributions over the measurement outcomes, thus
always allowing the construction of a fixed (that is,
independent of the quantum state) global joint prob-
ability distribution over all measurements in a Bell
scenario. Since there are no such locality constraints
on the form of the POVM elements in a Kochen-
Specker experiment, they can easily violate any KS-
noncontextuality inequality, e.g., the two-party CHSH
experiment considered as a Kochen-Specker experi-
ment with four observables in a 4-cycle where ad-
jacent pairs are jointly measurable allows for trivial
POVMs (like the PR-box trivial POVM above) violat-
ing the CHSH-type Bell-KS inequality in this scenario
maximally. By the lights of KS-noncontextuality, this
violation would indicate the maximum possible KS-
contextuality with respect to this CHSH-type inequal-
ity.
45
For all these reasons, our discussion of KS-
noncontextuality as a notion of classicality in an
experiment with no locality constraints on the mea-
surements does not extend to the case of Bell-
locality (or local causality) as a notion of classicality
in a Bell experiment, where the experiment must re-
spect locality constraints on the measurements for a
Bell inequality violation to be meaningful.
The unification of Bell nonlocality and KS-
contextuality à la Refs. [22, 23, 25] forces a certain
dichotomy in these approaches: while in Bell scenar-
ios, one need not restrict to any notion of a “sharp”
measurement in the definition of probabilistic models
(and thus claim “theory independence”), in Kochen-
Specker scenarios, one must make some statement
about the nature of the measurements (concerning
their presumed sharpness [29], or that their joint mea-
surability [42, 44] is restricted to commutativity [25]),
rendering any putative “theory independence” claim
(on a level at par with Bell nonlocality) unfounded.
46
45
See Appendix C for more discussion.
46
See Ref. [27] for how this lack of locality of measurements in
a Kochen-Specker type experiment translates, at the ontolog-
ical level, to the unreasonableness of assuming factorizability
in the ontological model; this factorizability (or the stronger
condition of outcome determinism) is invoked to justify the re-
sulting derivation of Bell-KS inequalities as constraints from a
classical marginal problem.
B Ontological models without respecting coarse-graining relations
Here we will construct explicit examples where the
coarse-graining relations are not respected in an on-
tological model, in contrast to the requirement on the
representation of coarse-grainings that we invoked in
Section II.C of the main text. The goal is to empha-
size that the requirement of Section II.C is necessary
not only for the treatment of Spekkens contextuality
but also for Kochen-Specker contextuality. Below, we
first demonstrate how a “KS-noncontextual” model
can be constructed for any scenario that proves the
KS theorem by using the example of the KCBS setup
[47]. We then proceed to demonstrate how a “preparation
and measurement noncontextual” model can be
constructed in a similar way when considering generalized
noncontextuality [18].
B.1 How to construct a “KS-noncontextual”
ontological model of the KCBS experiment [47]
without coarse-graining relations
Here we have that $\mathbb{M}$ contains at least the following
measurement settings: $\{M_i\}_{i=1}^{5}$, each with three
possible outcomes, $m_i \in \{0, 1, 2\}$. The measurement
events for each measurement setting $M_i$ can be coarse-grained
in two different ways, defining new measurement
settings $M'_i$ (with outcomes $m'_i \in \{0, \bar{0}\}$) and
$M''_i$ (with outcomes $m''_i \in \{2, \bar{2}\}$), where the
coarse-graining relations are given by
$[0|M'_i] \equiv [0|M_i]$, (123)
$[\bar{0}|M'_i] \equiv [1|M_i] + [2|M_i]$, (124)
$[2|M''_i] \equiv [2|M_i]$, (125)
$[\bar{2}|M''_i] \equiv [0|M_i] + [1|M_i]$. (126)
In the operational theory, these coarse-graining relations
are respected, i.e., for all $[s|S]$, $s \in V_S$, $S \in \mathbb{S}$,
$p(0, s|M'_i, S) \equiv p(0, s|M_i, S)$, (127)
$p(\bar{0}, s|M'_i, S) \equiv p(1, s|M_i, S) + p(2, s|M_i, S)$, (128)
$p(2, s|M''_i, S) \equiv p(2, s|M_i, S)$, (129)
$p(\bar{2}, s|M''_i, S) \equiv p(0, s|M_i, S) + p(1, s|M_i, S)$. (130)
However, we do not require that these relations be
respected in an ontological model. Now, the KCBS
argument requires the following operational equivalences,
$[2|M''_i] \simeq [0|M'_{i+1}]$, (131)
for all $i \in \{1, 2, 3, 4, 5\}$, where addition is modulo 5,
so that $i + 1 = 1$ for $i = 5$. A KS-noncontextual
ontological model for this experiment requires that
$\xi(2|M''_i, \lambda) = \xi(0|M'_{i+1}, \lambda) \in \{0, 1\}, \quad \forall \lambda \in \Lambda$. (132)
Constructing such a model requires one to specify
response functions for the measurements
$\{M_i, M'_i, M''_i\}_{i=1}^{5}$. However, since there are no
constraints from coarse-graining relations on these
response functions, there is no obstruction to
the construction of a “KS-noncontextual model”
of this type for any set of operational statistics.
In particular, since we do not require that
$\forall \lambda \in \Lambda: \xi(0|M'_{i+1}, \lambda) \equiv \xi(0|M_{i+1}, \lambda)$, nor that
$\forall \lambda \in \Lambda: \xi(2|M''_i, \lambda) \equiv \xi(2|M_i, \lambda)$, we can assign
arbitrary response functions to $\{M'_i, M''_i\}_{i=1}^{5}$, subject
only to the condition from KS-noncontextuality that
$\forall \lambda \in \Lambda: \xi(2|M''_i, \lambda) = \xi(0|M'_{i+1}, \lambda) \in \{0, 1\}$.
47
Note that, because coarse-graining relations
are not respected, this does not imply that
$\forall \lambda \in \Lambda: \xi(2|M_i, \lambda) = \xi(0|M_{i+1}, \lambda) \in \{0, 1\}$, which is
the usual constraint we would have presumed from
KS-noncontextuality when coarse-graining relations
are respected in the ontological model. In the absence
of any such constraints on the response functions for
$\{M_i\}_{i=1}^{5}$, one can always reproduce their operational
statistics, in particular the operational equivalences
of the type $[2|M_i] \simeq [0|M_{i+1}]$, which follow from
Eqs. (123), (125), and (131).
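To make the construction above concrete, the following is a minimal sketch of one such model for a single fixed preparation. Everything specific in it is an assumption made for illustration rather than something taken from the text: the statistics `p`, the choice of ontic states as pairs `(c, d)` of independent hidden variables, and all function names. The variables `d` fix the outcomes of the fine-grained settings $M_i$, while the variables `c` fix the outcomes of $M'_i$ and $M''_i$ so that Eq. (132) holds by construction.

```python
from itertools import product
from math import isclose, sqrt

N = 5  # settings M_1, ..., M_5, indexed 0..4 here

# Hypothetical statistics for one fixed preparation: p[i][m] = p(m | M_i).
# They satisfy p(2 | M_i) = p(0 | M_{i+1}), as the equivalences (131) require.
w = 1 / sqrt(5)
p = [[w, 1 - 2 * w, w] for _ in range(N)]

# Ontic states lambda = (c, d): c_i in {0, 1} and d_i in {0, 1, 2}, all independent.
states = [(c, d)
          for c in product((0, 1), repeat=N)
          for d in product((0, 1, 2), repeat=N)]

def mu(c, d):
    """Probability of lambda = (c, d) for the fixed preparation."""
    weight = 1.0
    for i in range(N):
        weight *= p[i][2] if c[i] == 1 else 1 - p[i][2]
        weight *= p[i][d[i]]
    return weight

weights = [mu(c, d) for c, d in states]

def xi_fine(m, i, c, d):
    """xi(m | M_i, lambda): deterministic, read off from d_i only."""
    return 1 if d[i] == m else 0

def xi_double_prime(i, c, d):
    """xi(2 | M''_i, lambda): deterministic, read off from c_i only."""
    return c[i]

def xi_prime(i, c, d):
    """xi(0 | M'_i, lambda), set equal to xi(2 | M''_{i-1}, lambda) as in Eq. (132)."""
    return c[(i - 1) % N]

def average(response):
    """Average a response function over the ontic states."""
    return sum(wt * response(c, d) for wt, (c, d) in zip(weights, states))

stats_fine = [[average(lambda c, d, i=i, m=m: xi_fine(m, i, c, d))
               for m in range(3)] for i in range(N)]
stats_double_prime = [average(lambda c, d, i=i: xi_double_prime(i, c, d))
                      for i in range(N)]
stats_prime = [average(lambda c, d, i=i: xi_prime(i, c, d)) for i in range(N)]

# The coarse-graining relation is broken at the ontological level:
# some lambda has xi(2 | M''_1, lambda) different from xi(2 | M_1, lambda).
violates_coarse_graining = any(
    xi_double_prime(0, c, d) != xi_fine(2, 0, c, d) for c, d in states)

kcbs_sum = sum(stats_double_prime)
```

In this sketch all response functions are outcome-deterministic and satisfy Eq. (132), yet `kcbs_sum` exceeds 2, the largest value that deterministic assignments can reach in the KCBS setup once coarse-graining relations are respected. The independence of `c` and `d` is exactly where the coarse-graining relations are abandoned.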
B.2 How to construct a “preparation and
measurement noncontextual” ontological model
without coarse-graining relations
Just as for measurements in the case of KS-
noncontextuality, abandoning the coarse-graining re-
lations for preparations in the case of generalized non-
contextuality [18] makes possible the existence of a
“preparation and measurement noncontextual” on-
tological model for any set of operational statistics.
For the kinds of proofs of contextuality relevant to
this article, the relevant notion of coarse-graining is
that of complete coarse-graining: that is, consider
two source settings $S$ and $S'$ with (respective) source
events $\{[s|S]\}_{s \in V_S}$ and $\{[s'|S']\}_{s' \in V_{S'}}$, that can be
completely coarse-grained to yield the operational equivalence
$[\top|S_\top] \simeq [\top|S'_\top]$, cf. Eq. (18). In the operational
description, where we assume the coarse-graining relation
is respected, this is represented by
$\forall [m|M],\ m \in V_M,\ M \in \mathbb{M}: \quad \sum_{s} p(m, s|M, S) = \sum_{s'} p(m, s'|M, S')$. (133)
In the ontological description, however, we do not
impose the coarse-graining relations
$\mu(\lambda, \top|S_\top) \equiv \sum_{s} \mu(\lambda, s|S)$ and
$\mu(\lambda, \top|S'_\top) \equiv \sum_{s'} \mu(\lambda, s'|S')$, which
makes it trivial to write down probability distributions
$\mu(\lambda, \top|S_\top)$ and $\mu(\lambda, \top|S'_\top)$ such that
$\mu(\lambda, \top|S_\top) = \mu(\lambda, \top|S'_\top)$ (as required by preparation
noncontextuality applied to $[\top|S_\top] \simeq [\top|S'_\top]$)
but where we do not require that
$\sum_{s} \mu(\lambda, s|S) = \sum_{s'} \mu(\lambda, s'|S')$ (which is not required by preparation
noncontextuality). Note how the refusal to respect
the coarse-graining relations, i.e., the refusal to identify
$\mu(\lambda, \top|S_\top)$ with $\sum_{s} \mu(\lambda, s|S)$ and $\mu(\lambda, \top|S'_\top)$ with
$\sum_{s'} \mu(\lambda, s'|S')$, lifts the constraint from preparation
noncontextuality that would have been in place if the
coarse-graining relations were respected. The same
refusal for the case of measurements lifts any constraints
(just as in the case of KS-noncontextuality
above) from measurement noncontextuality on the
ontological model. It thus becomes trivial to construct
a “preparation and measurement noncontextual”
ontological model without coarse-graining relations.
47
This “KS-noncontextual” ontological model will thus reproduce
operational equivalences of the type $[2|M''_i] \simeq [0|M'_{i+1}]$
(cf. Eq. (131)).
C Trivial POVMs
C.1 Bell-CHSH scenario
We have the Hilbert space $\mathcal{H}_A \otimes \mathcal{H}_B$ for Alice
($\mathcal{H}_A$) and Bob ($\mathcal{H}_B$). Consider four binary-outcome
POVMs, $\{A^{(0)}, A^{(1)}, B^{(0)}, B^{(1)}\}$, where
$A^{(0)} \equiv \{A^{(0)}_0, A^{(0)}_1\}$,
$A^{(1)} \equiv \{A^{(1)}_0, A^{(1)}_1\}$,
$B^{(0)} \equiv \{B^{(0)}_0, B^{(0)}_1\}$,
$B^{(1)} \equiv \{B^{(1)}_0, B^{(1)}_1\}$, (134)
$0 \leq A^{(0)}_0, A^{(1)}_0 \leq I_{\mathcal{H}_A}$, $0 \leq B^{(0)}_0, B^{(1)}_0 \leq I_{\mathcal{H}_B}$,
$A^{(0)}_0 + A^{(0)}_1 = A^{(1)}_0 + A^{(1)}_1 = I_{\mathcal{H}_A}$, and
$B^{(0)}_0 + B^{(0)}_1 = B^{(1)}_0 + B^{(1)}_1 = I_{\mathcal{H}_B}$.
The quantum probability, given a shared
quantum state $\rho_{AB}$ defined on $\mathcal{H}_A \otimes \mathcal{H}_B$, is given by
$p(a, b|x, y) = \mathrm{Tr}(\rho_{AB}\, A^{(x)}_a \otimes B^{(y)}_b)$, (135)
for $a, b, x, y \in \{0, 1\}$. Here $A^{(x)} \otimes I_{\mathcal{H}_B}$ is jointly
measurable with $I_{\mathcal{H}_A} \otimes B^{(y)}$, just because of the
commutativity of their respective POVM elements. The joint
observable being measured is $A^{(x)} \otimes B^{(y)}$. Now, consider
the case when all the POVM elements are trivial,
i.e., $A^{(x)}_a = q^{(x)}_a I_{\mathcal{H}_A}$ and $B^{(y)}_b = r^{(y)}_b I_{\mathcal{H}_B}$, for some
$q^{(x)}_a, r^{(y)}_b \in [0, 1]$ for all $a, b, x, y \in \{0, 1\}$. We then
have
$p(a, b|x, y) = q^{(x)}_a r^{(y)}_b, \quad \forall a, b, x, y \in \{0, 1\}$. (136)
A global joint probability distribution which reproduces
the above as marginals is simply given by their
product:
$p(a^{(0)}, a^{(1)}, b^{(0)}, b^{(1)}) \equiv q^{(0)}_{a^{(0)}} q^{(1)}_{a^{(1)}} r^{(0)}_{b^{(0)}} r^{(1)}_{b^{(1)}}$. (137)
Hence, trivial POVMs never violate any Bell-CHSH
inequality for this scenario.
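The argument of Eqs. (136) and (137) can be checked numerically. The following is a minimal sketch in which the weights `q` and `r` are hypothetical values chosen for illustration. It verifies that the product distribution reproduces the pairwise statistics as marginals and evaluates the eight relabelled versions of the CHSH expression, each written as a success probability in the form of Eq. (142).

```python
from fractions import Fraction
from itertools import product

# Hypothetical weights: q[x][a] for Alice, r[y][b] for Bob (each row sums to 1).
q = [[Fraction(1, 3), Fraction(2, 3)], [Fraction(3, 4), Fraction(1, 4)]]
r = [[Fraction(1, 5), Fraction(4, 5)], [Fraction(1, 2), Fraction(1, 2)]]

def p_trivial(a, b, x, y):
    """Eq. (136): statistics of trivial local POVMs, for any shared state."""
    return q[x][a] * r[y][b]

def p_global(a0, a1, b0, b1):
    """Eq. (137): product distribution over the outcomes of all four POVMs."""
    return q[0][a0] * q[1][a1] * r[0][b0] * r[1][b1]

def marginal(a, b, x, y):
    """Marginal of the global distribution on settings (x, y)."""
    total = Fraction(0)
    for a0, a1, b0, b1 in product((0, 1), repeat=4):
        if (a0, a1)[x] == a and (b0, b1)[y] == b:
            total += p_global(a0, a1, b0, b1)
    return total

marginals_match = all(
    marginal(a, b, x, y) == p_trivial(a, b, x, y)
    for a, b, x, y in product((0, 1), repeat=4))

def chsh_value(alpha, beta, gamma):
    """Success probability for the condition a XOR b = xy XOR alpha*x XOR beta*y XOR gamma,
    with uniformly weighted settings."""
    return sum(
        Fraction(1, 4) * p_trivial(a, b, x, y)
        for a, b, x, y in product((0, 1), repeat=4)
        if (a ^ b) == ((x & y) ^ (alpha & x) ^ (beta & y) ^ gamma))

max_chsh = max(chsh_value(al, be, ga) for al, be, ga in product((0, 1), repeat=3))
```

Here `marginals_match` is true and `max_chsh` does not exceed 3/4, so none of the eight CHSH expressions is violated for these weights.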
C.2 CHSH-type contextuality scenario: 4-cycle
We now consider the Bell-CHSH scenario without the
constraint of spacelike separation. What the lack of
spacelike separation means from the quantum per-
spective is that one no longer needs to model this
spacelike separation by requiring a tensor product
structure, or (more generally) by requiring the com-
mutativity of the observables that are jointly mea-
sured [25, 79, 80]. That is, there is no physical justi-
fication for imposing the tensor product structure or
the commutativity of jointly measured observables.
48
Thus, we have the Hilbert space $\mathcal{H}$ and we consider
four binary-outcome POVMs, $\{A^{(0)}, A^{(1)}, B^{(0)}, B^{(1)}\}$,
on $\mathcal{H}$, where
$A^{(0)} \equiv \{A^{(0)}_0, A^{(0)}_1\}$,
$A^{(1)} \equiv \{A^{(1)}_0, A^{(1)}_1\}$,
$B^{(0)} \equiv \{B^{(0)}_0, B^{(0)}_1\}$,
$B^{(1)} \equiv \{B^{(1)}_0, B^{(1)}_1\}$, (138)
$0 \leq A^{(0)}_0, A^{(1)}_0, B^{(0)}_0, B^{(1)}_0 \leq I_{\mathcal{H}}$, and
$A^{(0)}_0 + A^{(0)}_1 = A^{(1)}_0 + A^{(1)}_1 = B^{(0)}_0 + B^{(0)}_1 = B^{(1)}_0 + B^{(1)}_1 = I_{\mathcal{H}}$.
Further, the following sets of POVMs are jointly
measurable: $\{A^{(0)}, B^{(0)}\}$, $\{A^{(0)}, B^{(1)}\}$, $\{A^{(1)}, B^{(0)}\}$,
$\{A^{(1)}, B^{(1)}\}$. The most general joint observable for a
pair of compatible POVMs $\{A^{(x)}, B^{(y)}\}$ is given by
a POVM $G^{(xy)} \equiv \{G^{(xy)}_{00}, G^{(xy)}_{01}, G^{(xy)}_{10}, G^{(xy)}_{11}\}$ (that
isn’t necessarily unique [42]) such that:
$G^{(xy)}_{00} + G^{(xy)}_{01} = A^{(x)}_0$, $G^{(xy)}_{10} + G^{(xy)}_{11} = A^{(x)}_1$,
$G^{(xy)}_{00} + G^{(xy)}_{10} = B^{(y)}_0$, $G^{(xy)}_{01} + G^{(xy)}_{11} = B^{(y)}_1$.
In particular, if (and
only if) the POVMs $A^{(x)}$ and $B^{(y)}$ commute, we can
construct the joint POVM as a product:
$G^{(xy)}_{ab} = A^{(x)}_a B^{(y)}_b$ for all $a, b, x, y \in \{0, 1\}$. In the absence of
such commutativity, the joint POVM cannot be written
as a product.
48
On the other hand, what this lack of spacelike separation
means from the perspective of an ontological model is that one
no longer has a justification for assuming factorizability [25]
and, consequently, the generalization of Fine’s theorem [26] fails
to prove that there is no loss of generality in assuming outcome
determinism in discussions of KS-contextuality (unlike the case
of Bell scenarios, where factorizability is justified by spacelike
separation); there is a definite loss of generality, in that
measurement noncontextual and outcome-indeterministic ontological
models that are non-factorizable are not empirically equivalent
to measurement noncontextual and outcome-deterministic
(or KS-noncontextual) ontological models. See Ref. [27] for a
discussion of this aspect.
The quantum probability, given a quantum state $\rho$
on $\mathcal{H}$, is given by
$p(a, b|x, y) = \mathrm{Tr}(\rho\, G^{(xy)}_{ab})$, (139)
for $a, b, x, y \in \{0, 1\}$. Note that this probability depends
on the joint measurement $G^{(xy)}$ implementing
$A^{(x)}$ and $B^{(y)}$ together, and that, in general, there
may be multiple choices of $G^{(xy)}$ possible. This is
easy to see since there is one undetermined positive
operator in the joint measurement that is not fixed by
$A^{(x)}$ or $B^{(y)}$, i.e., we can write the POVM elements of
$G^{(xy)}$ as:
$G^{(xy)}_{01} = A^{(x)}_0 - G^{(xy)}_{00}$,
$G^{(xy)}_{10} = B^{(y)}_0 - G^{(xy)}_{00}$,
$G^{(xy)}_{11} = I - A^{(x)}_0 - B^{(y)}_0 + G^{(xy)}_{00}$,
where $G^{(xy)}_{00}$ is a positive semidefinite operator satisfying
$A^{(x)}_0 + B^{(y)}_0 - I_{\mathcal{H}} \leq G^{(xy)}_{00} \leq A^{(x)}_0, B^{(y)}_0$.
Here $G^{(xy)}_{00}$ represents the freedom
in the choice of how the joint measurement might be
implemented within quantum theory. This freedom
reflects the fact that since the jointly measured observables
are no longer spacelike separated, it is possible
to introduce correlations between them that are
stronger than what is allowed in the corresponding
Bell scenario in quantum theory. The strength of
these correlations is only limited by the constraints on
$G^{(xy)}_{00}$ imposed by the marginal observables $A^{(x)}$ and
$B^{(y)}$. This is in contrast to the case where $A^{(x)}$ and
$B^{(y)}$ are spacelike separated observables and the only
choice of joint POVM consistent with spacelike separation
is fixed by $G^{(xy)}_{00} = A^{(x)}_0 B^{(y)}_0$, i.e., the strength
of correlations between $A^{(x)}$ and $B^{(y)}$ is fixed entirely
by them and there is no freedom in choosing $G^{(xy)}$.
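The freedom carried by $G^{(xy)}_{00}$ is easiest to see when everything is proportional to the identity. The sketch below makes that simplifying assumption explicitly: the marginals are trivial, $A^{(x)}_0 = q_0 I$ and $B^{(y)}_0 = r_0 I$, and the joint POVM elements are themselves taken to be scalar multiples of the identity, which is a restriction adopted here for illustration and not the most general joint POVM. The function name `scalar_joint_povm` is hypothetical.

```python
from fractions import Fraction

def scalar_joint_povm(q0, r0, g00):
    """Coefficients (g00, g01, g10, g11) of a joint POVM G_ab = g_ab * I
    for trivial marginals A_0 = q0 * I and B_0 = r0 * I.

    Assumption: the joint POVM elements are proportional to the identity,
    so the operator constraints on G_00 reduce to inequalities on g00.
    """
    lower = max(Fraction(0), q0 + r0 - 1)  # from G_00 >= 0 and G_11 >= 0
    upper = min(q0, r0)                    # from G_01 >= 0 and G_10 >= 0
    if not lower <= g00 <= upper:
        raise ValueError("g00 is outside the range allowed by the marginals")
    return (g00, q0 - g00, r0 - g00, 1 - q0 - r0 + g00)

half = Fraction(1, 2)

product_choice = scalar_joint_povm(half, half, Fraction(1, 4))
correlated = scalar_joint_povm(half, half, half)
anticorrelated = scalar_joint_povm(half, half, Fraction(0))
```

With $q_0 = r_0 = 1/2$, the choice `g00 = 1/4` gives the product POVM with all four coefficients equal to 1/4, whereas the endpoints `g00 = 1/2` and `g00 = 0` give perfectly correlated and perfectly anticorrelated outcomes from the same pair of marginals.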
Thus, we have that $A^{(x)}$ is jointly measurable with
$B^{(y)}$ and $G^{(xy)}$ denotes a joint POVM of $A^{(x)}$ and
$B^{(y)}$. Now, consider the case when all the POVM
elements are trivial, i.e., $A^{(x)}_a = q^{(x)}_a I_{\mathcal{H}}$ and
$B^{(y)}_b = r^{(y)}_b I_{\mathcal{H}}$, for some $q^{(x)}_a, r^{(y)}_b \in [0, 1]$ for all
$a, b, x, y \in \{0, 1\}$.
In particular, consider the case where $q^{(x)}_a = r^{(y)}_b = \frac{1}{2}$
for all $a, b, x, y \in \{0, 1\}$. A possible joint POVM for
these trivial POVMs is then the product POVM:
$G^{(xy)}_{ab} = A^{(x)}_a B^{(y)}_b = \frac{1}{4} I_{\mathcal{H}}$. (140)
If one restricted joint measurability of $A^{(x)}$ and $B^{(y)}$
to just commutativity (a sufficient but not necessary
condition for joint measurability
49
[44]), we would
take the above choice of the product POVM as a “natural”
one. Being a product of trivial POVMs, this
choice will never lead to a violation of the CHSH-type
inequality for this scenario. Indeed, the structure
of a Bell scenario, requiring the decomposition
of the Hilbert space as $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$ (tensor
product paradigm) or, more generally, imposing the
commutativity requirement $[A^{(x)}_a, B^{(y)}_b] = 0$ (commutativity
paradigm), is such that the only possible
choice of joint measurement that can be implemented
by spacelike separated parties is the one that corresponds
to the product POVM, given by operators
$G^{(xy)}_{ab} = A^{(x)}_a B^{(y)}_b$.
However, this is not the only allowed joint measurement
for these trivial POVMs, particularly when
there is no locality constraint on the measurements
from spacelike separation.
50
49
Particularly in the absence of spacelike separation. It is the
need to model spacelike separation in a quantum Bell experiment
that makes commutativity a necessary (and sufficient)
condition for joint measurability of spacelike separated observables
in a Bell scenario.
50
To incorporate such a constraint, spacelike separation
needs to be modelled via either the tensor product paradigm
An extreme choice of joint POVM is the following:
$G^{PR(xy)}_{ab} = \frac{I_{\mathcal{H}}}{2}\, \delta_{a \oplus b, xy}$, (141)
which leads to the probability distribution
$p(a, b|x, y) = \frac{1}{2} \delta_{a \oplus b, xy}$ for any choice of quantum
state. Hence, this joint POVM $G^{PR(xy)}$ always
yields statistics corresponding to the PR-box, maximally
violating the CHSH-type inequality for this
scenario, namely,
$\sum_{a, b, x, y:\ a \oplus b = xy} \frac{1}{4}\, p(a, b|x, y) \leq \frac{3}{4}$. (142)
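The two properties of the joint POVM of Eq. (141) used here, namely that its marginals are the trivial POVMs with weight 1/2 and that it saturates the algebraic maximum of the expression in Eq. (142), can be checked directly. The sketch below works with the coefficient multiplying the identity, since $\mathrm{Tr}(\rho\, c I_{\mathcal{H}}) = c$ for any normalized state $\rho$; the function and variable names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def g_pr(a, b, x, y):
    """Coefficient of the identity in Eq. (141); it equals p(a, b | x, y) for every state."""
    return Fraction(1, 2) if (a ^ b) == (x & y) else Fraction(0)

# Marginal coefficients obtained by summing over the other party's outcome.
alice_marginals = {(a, x, y): sum(g_pr(a, b, x, y) for b in (0, 1))
                   for a, x, y in product((0, 1), repeat=3)}
bob_marginals = {(b, x, y): sum(g_pr(a, b, x, y) for a in (0, 1))
                 for b, x, y in product((0, 1), repeat=3)}

# Left-hand side of Eq. (142).
chsh_value = sum(
    Fraction(1, 4) * g_pr(a, b, x, y)
    for a, b, x, y in product((0, 1), repeat=4)
    if (a ^ b) == (x & y))
```

Every marginal coefficient equals 1/2, independently of the other setting, so the marginals are the trivial POVMs considered above, while `chsh_value` equals 1 and therefore exceeds the bound 3/4.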
Physically, it’s possible to implement this (without
requiring any quantum resources) by providing a box
that always produces these correlations between measurement
settings denoted by $(xy) \in \{0, 1\}^2$, regardless
of the input state. Such a black-box would maximally
violate the CHSH-type inequality (viewed as a
Bell-KS inequality witnessing KS-contextuality), but
that shouldn’t be surprising in the absence of spacelike
separation. Also, the trivial PR-box joint POVM
$G^{PR(xy)}_{ab}$ is a perfectly valid way to implement the
joint measurement of trivial POVMs $A^{(x)}$ and $B^{(y)}$
within the standard paradigm of operational quantum
theory.
51
To summarize, we note the following:
Within the traditional framework of KS-noncontextuality,
if one wants to go beyond projective
measurements to arbitrary POVMs in a
contextuality scenario, then one must, in order
to avoid the pathology of trivial POVMs violating
the Bell-KS inequalities maximally, restrict
by fiat the notion of joint measurability to merely
commutativity. This is, for example, the attitude
adopted in Ref. [25].
or the commutativity paradigm. Both these ways of modelling
spacelike separation lead to the same set of quantum corre-
lations for any finite-dimensional Hilbert space H [79]. The
question of whether the two paradigms lead to the same set of
correlations in the case of infinite dimensional Hilbert spaces
is the subject of Tsirelson’s problem [79, 80]. Most studies
of Bell-nonlocality are primarily concerned with finite dimen-
sional Hilbert spaces; should one encounter infinite dimensional
Hilbert spaces, the commutativity paradigm is the proper way
to model spacelike separation.
^51 Note that the point of this demonstration is to show how, in the absence of spacelike separation justifying commutativity or a promise that the measurements are sharp, arbitrary correlations are achievable in quantum theory if unsharp measurements are allowed. All trivial POVMs are unsharp, but the converse is not true. That is, one can consider nontrivial POVMs that don’t violate the CHSH-type inequality maximally, but which violate it (arbitrarily) more than is allowed by sharp measurements in quantum theory. One could construct them, for example, by just taking a convex combination of the PR-box trivial POVM with some sharp (and thus product) joint POVM.
However, if one is going beyond projective measurements, we know that commutativity is only a sufficient condition for joint measurability, not a necessary one [44].
This brings us to our observation that the traditional notion of KS-noncontextuality is pathological once the most general situation in quantum theory is considered: arbitrary POVMs with the general notion of joint measurability (see, e.g., Ref. [44] for this notion and its relation to commutativity). In particular, in the absence of spacelike separation, there is no physical justification to restrict the notion of joint measurability to mere commutativity.
A similar consideration applies at the level of a KS-noncontextual ontological model: there, factorizability is not justified in the absence of spacelike separation. So, on those grounds alone, one should go beyond KS-noncontextuality as one’s notion of classicality, particularly if one wants a notion of classicality that does not presume outcome determinism, just as local causality doesn’t presume it. This was argued in Ref. [27]: imagine an adversarial setting where, because of the absence of spacelike separation in a KS-contextuality experiment, two measurement settings on the same system can exhibit correlations that are independent of those induced by the system on which the measurements are being implemented, thus allowing them to exhibit stronger correlations than are possible in a KS-noncontextual model. We use trivial POVMs only to drive home that this can be done arbitrarily well (achieving PR-box type correlations, in fact) if there is no constraint on the strength of correlations the measurement settings can exhibit. Such constraints on the correlations between the measurement settings show up in our analysis within the Spekkens framework in terms of the quantity Corr: if Corr is high, the measurements in a noncontextual ontological model cannot be arbitrarily strongly correlated, i.e., R cannot be arbitrarily high (cf. Eq. (72)).
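As a hedged numerical illustration of the trivial-POVM point above, the following sketch builds the PR-box trivial joint POVM on a qubit and evaluates the standard CHSH expression $E_{00} + E_{01} + E_{10} - E_{11}$. The choice of this particular CHSH form, the qubit state `rho`, the sharp joint POVM (both settings measuring $\sigma_z$), and all function names are assumptions made for illustration; they are not taken from the paper.

```python
from itertools import product

IDENTITY = ((1.0, 0.0), (0.0, 1.0))
ZERO = ((0.0, 0.0), (0.0, 0.0))
# Projectors onto the two sigma_z eigenstates of a qubit.
PROJECTORS = (((1.0, 0.0), (0.0, 0.0)), ((0.0, 0.0), (0.0, 1.0)))


def scale(matrix, c):
    return tuple(tuple(c * entry for entry in row) for row in matrix)


def add(m1, m2):
    return tuple(tuple(p + q for p, q in zip(r1, r2)) for r1, r2 in zip(m1, m2))


def pr_box_effect(a, b, x, y):
    """Trivial joint POVM element: p_PR(a, b | x, y) times the identity."""
    weight = 0.5 if (a ^ b) == (x & y) else 0.0
    return scale(IDENTITY, weight)


def sharp_effect(a, b, x, y):
    """Assumed sharp joint POVM: every setting measures sigma_z, so a == b."""
    return PROJECTORS[a] if a == b else ZERO


def mixed_effect(lam):
    """lam * (PR-box trivial POVM) + (1 - lam) * (sharp joint POVM)."""
    def effect(a, b, x, y):
        return add(scale(pr_box_effect(a, b, x, y), lam),
                   scale(sharp_effect(a, b, x, y), 1.0 - lam))
    return effect


def born(rho, effect):
    """Tr(rho E) for 2x2 real matrices."""
    return sum(rho[i][j] * effect[j][i] for i in range(2) for j in range(2))


def chsh(effect, rho):
    """E_00 + E_01 + E_10 - E_11, with E_xy = sum_ab (-1)^(a+b) p(a, b | x, y)."""
    total = 0.0
    for x, y in product((0, 1), repeat=2):
        corr = sum((-1) ** (a ^ b) * born(rho, effect(a, b, x, y))
                   for a, b in product((0, 1), repeat=2))
        total += (-1) ** (x & y) * corr
    return total


rho = ((0.75, 0.0), (0.0, 0.25))  # an arbitrary (assumed) qubit state
chsh_pr = chsh(pr_box_effect, rho)
chsh_mixed = chsh(mixed_effect(0.5), rho)
print(chsh_pr, chsh_mixed)
```

In this sketch the trivial POVM reaches the algebraic maximum of 4 for any state, and the equal mixture with the sharp POVM gives 3, which still exceeds $2\sqrt{2}$, the Tsirelson value for this CHSH expression.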
D The KS-uncolourable hypergraph $\Gamma_{18}$
It is instructive to consider the KS-uncolourable hypergraph $\Gamma_{18}$, originally appearing in Ref. [51], and studied in the light of Spekkens contextuality in Ref. [12]. This hypergraph fails both criteria for the hypergraphs $\Gamma$ considered in this paper, namely, $\mathcal{C}(\Gamma) \neq \varnothing$ (KS-colourability) and $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$. For probabilistic models on $\Gamma_{18}$, the following hold: $\mathcal{C}(\Gamma_{18}) = \varnothing \subsetneq \mathcal{CE}^1(\Gamma_{18}) \subsetneq \mathcal{G}(\Gamma_{18})$. This was considered in Ref. [12], where $\mathcal{CE}^1(\Gamma_{18})$ excludes the extremal probabilistic model in $\mathcal{G}(\Gamma_{18})$ that corresponds to the upper bound on the noise-robust noncontextuality inequality of Ref. [12]. As argued in Ref. [12], this noise-robust noncontextuality inequality is the appropriate operational generalization (to possibly noisy measurements) of the Kochen-Specker contradiction first demonstrated in Ref. [51]; this generalization cannot be accommodated in our generalization of the CSW framework [22].

Figure 8: The hypergraph $\Gamma_{27}$ and its subhypergraphs, i.e., $\Gamma_{18}$ and $\Gamma_{3}$, appearing in the three Bell-KS expressions of Eq. (143). The probabilistic model p considered in Eq. (143) is a probabilistic model on $\Gamma_{27}$, and not on the subhypergraphs. We have illustrated the subhypergraphs separately only for clarity regarding the subsets of vertices to which the Bell-KS expressions refer: the probabilities assigned to these vertices are obtained from probabilistic models on $\Gamma_{27}$.
If one extends the KS-uncolourable $\Gamma_{18}$ to a KS-colourable hypergraph $\Gamma_{27}$ with 9 “no-detection” events, one for each hyperedge, then we have $\mathcal{C}(\Gamma_{27}) \neq \varnothing$, but it’s still the case that $\mathcal{C}(\Gamma_{27}) \subsetneq \mathcal{CE}^1(\Gamma_{27}) \subsetneq \mathcal{G}(\Gamma_{27})$ for this hypergraph.^52 Hence, $\Gamma_{27}$ cannot be understood in our generalization of the CSW framework either.^53
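The two colourability claims above can be checked by brute force. The sketch below is a minimal illustration, assuming one particular labelling (0 to 17) of the 18 vertices and 9 hyperedges; only the incidence structure matters, namely that each vertex lies in exactly two hyperedges. A KS-colouring assigns 0 or 1 to every vertex so that each hyperedge contains exactly one vertex valued 1. For the extension with one no-detection vertex per hyperedge, that extra vertex is forced to take the value 1 exactly when no other vertex in its hyperedge does, so colourings of the extended hypergraph correspond to assignments on the original 18 vertices with at most one 1 per hyperedge.

```python
from collections import Counter

# Assumed labelling of the 9 hyperedges of Gamma_18 (vertices 0..17).
GAMMA_18 = [
    (0, 1, 2, 3), (0, 4, 5, 6), (2, 7, 8, 9),
    (6, 7, 10, 11), (1, 4, 12, 13), (8, 10, 13, 14),
    (3, 9, 15, 16), (5, 11, 15, 17), (12, 14, 16, 17),
]


def count_assignments(hyperedges, n_vertices, allowed_ones):
    """Count 0/1 assignments whose number of 1s in every hyperedge is allowed."""
    masks = [sum(1 << v for v in edge) for edge in hyperedges]
    return sum(
        all(bin(assignment & mask).count("1") in allowed_ones for mask in masks)
        for assignment in range(1 << n_vertices)
    )


# Sanity check on the assumed labelling: every vertex lies in two hyperedges.
incidence = Counter(v for edge in GAMMA_18 for v in edge)

# KS-colourings of Gamma_18: exactly one 1 per hyperedge.
n_colourings_18 = count_assignments(GAMMA_18, 18, {1})

# KS-colourings of Gamma_27: at most one 1 per hyperedge among the original
# vertices; the no-detection vertex of each hyperedge takes up the slack.
n_colourings_27 = count_assignments(GAMMA_18, 18, {0, 1})

print(n_colourings_18, n_colourings_27 > 0)
```

The first count is zero for a simple parity reason: summing the "exactly one" condition over the 9 hyperedges gives an odd total, whereas each vertex is counted twice, giving an even total.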
^52 This follows from noting that extremal probabilistic models on $\Gamma_{18}$ are still extremal probabilistic models on $\Gamma_{27}$: ones where the no-detection events are assigned zero probabilities. See Theorem 2.5.3 of Ref. [23].

^53 Note that adding these no-detection events is equivalent to allowing subnormalized probabilities (i.e., the sum of probabilities assigned to measurement events in a hyperedge can be less than 1) on $\Gamma_{18}$. Hence, even allowing for subnormalization on $\Gamma_{18}$, which means that one is looking at probabilistic models on the hypergraph $\Gamma_{27}$, does not eliminate the gap between $\mathcal{CE}^1$ probabilistic models and general probabilistic models, so that any upper bound on a Bell-KS expression given by probabilistic models in $\mathcal{CE}^1(\Gamma_{27})$ is not always the same as the general probabilistic upper bound from probabilistic models in $\mathcal{G}(\Gamma_{27})$. The CSW framework only considers the upper bound given by $\mathcal{CE}^1(\Gamma_{27})$ probabilistic models.

Figure 9: Going from the orthogonality graph, $G$, of $\Gamma_{18}$ to the hypergraph $\Gamma_G$ (on the right) to which our noise-robust noncontextuality inequality pertains.

Indeed, if one “blindly” writes down a CSW classical bound for some Bell-KS expression defined on $O(\Gamma_{18})$, then such a bound is equivalently a bound for the same Bell-KS expression defined on $\Gamma_{27}$ (where normalization is restored). Further, the E1 bound on $\Gamma_{18}$ is a $\mathcal{CE}^1$ bound on $\Gamma_{27}$. The GPT bound happens to agree with the $\mathcal{CE}^1$ bound for a particular Bell-KS expression (sum of all probabilities) but differs for some other Bell-KS expressions defined on this hypergraph. Consider, for example, the following three expressions (see Fig. 8):
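$$
\begin{aligned}
\mathrm{Expr}_1 &\equiv \sum_{v \in V(\Gamma_{18})} p(v),\\
\mathrm{Expr}_2 &\equiv \sum_{v \in V(\Gamma_{3})} p(v),\\
\mathrm{Expr}_3 &\equiv \sum_{v \in V(\Gamma_{18})} p(v) + \sum_{v \in V(\Gamma_{3})} p(v).
\end{aligned}
\tag{143}
$$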
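We have:
$$
\begin{aligned}
\mathrm{Expr}_1 &\overset{\mathcal{C}(\Gamma_{27})}{\leq} 8 \overset{\mathcal{CE}^1(\Gamma_{27})}{<} 9 \overset{\mathcal{G}(\Gamma_{27})}{=} 9,\\
\mathrm{Expr}_2 &\overset{\mathcal{C}(\Gamma_{27})}{\leq} 1 \overset{\mathcal{CE}^1(\Gamma_{27})}{=} 1 \overset{\mathcal{G}(\Gamma_{27})}{<} \frac{3}{2},\\
\mathrm{Expr}_3 &\overset{\mathcal{C}(\Gamma_{27})}{\leq} 9 \overset{\mathcal{CE}^1(\Gamma_{27})}{<} 10 \overset{\mathcal{G}(\Gamma_{27})}{<} 10.5.
\end{aligned}
\tag{144}
$$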
Thus, $\mathrm{Expr}_3$ is a Bell-KS expression that discriminates between probabilistic models at all three levels of the hierarchy. Indeed, the upper bound on $\mathrm{Expr}_3$ for $\mathcal{CE}^1(\Gamma_{27})$ models can be saturated by projective quantum realizations of the hypergraph, in particular the standard realization with 18 rays, with the zero operator for the no-detection events [51]. The fact that there exists such a Bell-KS expression as $\mathrm{Expr}_3$ means that the $\mathcal{CE}^1$ upper bounds from the CSW approach can be violated by a general probabilistic model, i.e., the upper bounds for $\mathcal{CE}^1$ models and general probabilistic models don’t agree, and we cannot take the graph-theoretic upper bounds of CSW for granted in our noise-robust noncontextuality inequalities. Indeed, the general probabilistic upper bound for any Bell-KS expression defined on a contextuality scenario is a hypergraph invariant, in the sense that it is a property shared by all hypergraphs isomorphic to each other, that may or may not be expressible as a graph invariant à la CSW.
What, then, do the bounds given by graph invariants of CSW for $O(\Gamma_{18})$ mean in our generalization of the CSW framework? Following our approach, outlined in Sec. III.B, we can go from $G = O(\Gamma_{18})$ to the hypergraph $\Gamma_G = \Gamma_{O(\Gamma_{18})}$ (see Fig. 9) for which we have (by construction) $\mathcal{C}(\Gamma_{O(\Gamma_{18})}) \neq \varnothing$ (so that the underlying hypergraph is no longer KS-uncolourable) and $\mathcal{CE}^1(\Gamma_{O(\Gamma_{18})}) = \mathcal{G}(\Gamma_{O(\Gamma_{18})})$ (so that, for any Bell-KS expression, the upper bound given by the fractional packing number $\alpha^*(G, w)$ in the CSW framework agrees with the general probabilistic upper bound). Since this construction proceeds by converting all maximal cliques in $\Gamma_{18}$ to hyperedges in $\Gamma_{O(\Gamma_{18})}$ and adding a new vertex to each such hyperedge, it achieves both purposes: firstly, adding a (no-detection) vertex to every maximal clique that is a hyperedge in $\Gamma_{18}$ ensures the KS-colourability of $\Gamma_{O(\Gamma_{18})}$, i.e., $\mathcal{C}(\Gamma_{O(\Gamma_{18})}) \neq \varnothing$, and secondly, adding a vertex to every maximal clique that is not a hyperedge in $\Gamma_{18}$ ensures that $\mathcal{CE}^1(\Gamma_{O(\Gamma_{18})}) = \mathcal{G}(\Gamma_{O(\Gamma_{18})})$. Once these two properties are satisfied, the graph invariants of CSW [22] become applicable to any Bell-KS expression defined for any set of vertices in the subhypergraph $\Gamma_{18}$ of $\Gamma_{O(\Gamma_{18})}$.
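The construction just described can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's implementation: the vertex labelling of $\Gamma_{18}$ is an assumed one, the orthogonality graph is taken to join two vertices whenever they share a hyperedge, and the function names are hypothetical. Maximal cliques are enumerated with the Bron-Kerbosch algorithm (without pivoting), and each one is turned into a hyperedge with one fresh vertex appended.

```python
from itertools import combinations

# Assumed labelling of the 9 hyperedges of Gamma_18 (vertices 0..17).
GAMMA_18 = [
    (0, 1, 2, 3), (0, 4, 5, 6), (2, 7, 8, 9),
    (6, 7, 10, 11), (1, 4, 12, 13), (8, 10, 13, 14),
    (3, 9, 15, 16), (5, 11, 15, 17), (12, 14, 16, 17),
]


def orthogonality_graph(hyperedges):
    """Adjacency sets: two vertices are adjacent iff they share a hyperedge."""
    adj = {}
    for edge in hyperedges:
        for u, v in combinations(edge, 2):
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    return adj


def maximal_cliques(adj):
    """Bron-Kerbosch enumeration (no pivoting) of all maximal cliques."""
    found = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            found.append(frozenset(clique))
            return
        for v in list(candidates):
            expand(clique | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adj), set())
    return found


def hypergraph_from_graph(adj):
    """One hyperedge per maximal clique, each extended by one fresh vertex."""
    first_new = max(adj) + 1
    cliques = sorted(maximal_cliques(adj), key=sorted)
    return [tuple(sorted(c)) + (first_new + i,) for i, c in enumerate(cliques)]


adj = orthogonality_graph(GAMMA_18)
gamma_g = hypergraph_from_graph(adj)
originals = {frozenset(edge) for edge in GAMMA_18}
# Hyperedges of the new hypergraph that come from cliques not in Gamma_18.
extra = [edge for edge in gamma_g if frozenset(edge[:-1]) not in originals]
print(len(gamma_g), len(extra))
```

Because every fresh vertex lies in exactly one hyperedge, assigning 1 to all fresh vertices and 0 elsewhere is a valid KS-colouring of the resulting hypergraph, which is the first of the two purposes mentioned above.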
Our noise-robust noncontextuality inequality then applies to the KS-colourable hypergraph $\Gamma_{O(\Gamma_{18})}$, where the graph invariants of CSW make sense, rather than the KS-uncolourable hypergraph $\Gamma_{18}$. On the other hand, an appropriate noise-robust noncontextuality inequality for the KS-uncolourable hypergraph $\Gamma_{18}$ is, then, the one reported in Ref. [12].^54

^54 The approach for KS-uncolourable hypergraphs will be further developed in hypergraph-theoretic terms in forthcoming work [34].

References
[1] L. Hardy, “Quantum Theory From Five Reasonable Axioms”, arXiv:quant-ph/0101012 (2001).
[2] L. Masanes and M. P. Mueller, “A derivation of quantum theory from physical requirements”, New J. Phys. 13, 063001 (2011).
[3] G. Chiribella, G. M. D’Ariano, and P. Perinotti, “Probabilistic theories with purification”, Phys. Rev. A 81, 062348 (2010).
[4] J. S. Bell, “On the Einstein-Podolsky-Rosen paradox”, Physics 1, 195 (1964). Reprinted in Ref. [6], Chapter 2.
[5] J. S. Bell, “On the problem of hidden variables
in quantum mechanics”, Rev. Mod. Phys. 38, 447
(1966). Reprinted in Ref. [6], Chapter 1.
[6] J. S. Bell, “Speakable and Unspeakable in Quan-
tum Mechanics”, 2nd Edition, Cambridge Univer-
sity Press, 2004.
[7] J. F. Clauser, M. A. Horne, A. Shimony, and
R. A. Holt, “Proposed Experiment to Test Local
Hidden-Variable Theories”, Phys. Rev. Lett. 23,
880 (1969).
[8] N. Brunner, D. Cavalcanti, S. Pironio, V. Scarani,
and S. Wehner, “Bell nonlocality”, Rev. Mod.
Phys. 86, 419 (2014).
[9] B. Hensen et al., “Loophole-free Bell inequality vi-
olation using electron spins separated by 1.3 kilo-
metres”, Nature 526, 682 - 686 (2015).
[10] Lynden K. Shalm et al., “Strong Loophole-Free
Test of Local Realism”, Phys. Rev. Lett. 115,
250402 (2015).
[11] M. Giustina et al., “Significant-Loophole-Free
Test of Bell’s Theorem with Entangled Photons”,
Phys. Rev. Lett. 115, 250401 (2015).
[12] R. Kunjwal and R. W. Spekkens, “From the
Kochen-Specker Theorem to Noncontextuality In-
equalities without Assuming Determinism”, Phys.
Rev. Lett. 115, 110403 (2015).
[13] M. D. Mazurek, M. F. Pusey, R. Kunjwal, K.
J. Resch, R. W. Spekkens, “An experimental test
of noncontextuality without unphysical idealiza-
tions”, Nat. Commun. 7, 11780 (2016).
[14] A. Krishna, R. W. Spekkens, and E. Wolfe, “De-
riving robust noncontextuality inequalities from
algebraic proofs of the Kochen-Specker theorem:
the Peres-Mermin square”, New J. Phys 19,
123031 (2017).
[15] D. Schmid and R. W. Spekkens, “Contextual Ad-
vantage for State Discrimination”, Phys. Rev. X
8, 011015 (2018).
[16] R. Kunjwal and R. W. Spekkens, “From sta-
tistical proofs of the Kochen-Specker theorem
to noise-robust noncontextuality inequalities”,
Phys. Rev. A 97, 052110 (2018).
[17] D. Schmid, R. W. Spekkens, and E. Wolfe,
“All the noncontextuality inequalities for arbi-
trary prepare-and-measure experiments with re-
spect to any fixed set of operational equivalences”,
Phys. Rev. A 97, 062103 (2018).
[18] R. W. Spekkens, “Contextuality for prepara-
tions, transformations, and unsharp measure-
ments”, Phys. Rev. A 71, 052108 (2005).
[19] S. Kochen and E. P. Specker, “The Problem of Hidden Variables in Quantum Mechanics”, J. Math. Mech. 17, 59 (1967). Also available at JSTOR.
[20] N. Harrigan and R. W. Spekkens, “Einstein, Incompleteness, and the Epistemic View of Quantum States”, Found. Phys. 40, 125 (2010).
[21] A. Cabello, S. Severini, and A. Winter, “(Non-)Contextuality of Physical Theories as an Axiom”, arXiv:1010.2163 [quant-ph] (2010).
[22] A. Cabello, S. Severini, and A. Winter, “Graph-
Theoretic Approach to Quantum Correlations”,
Phys. Rev. Lett. 112, 040401 (2014).
[23] A. Acín, T. Fritz, A. Leverrier, and A. B. Sainz, “A Combinatorial Approach to Nonlocality and Contextuality”, Comm. Math. Phys. 334(2), 533-628 (2015).
[24] J. Barrett, “Information processing in general-
ized probabilistic theories”, Phys. Rev. A 75,
032304 (2007).
[25] S. Abramsky and A. Brandenburger, “The sheaf-
theoretic structure of non-locality and contextual-
ity”, New J. Phys. 13, 113036 (2011).
[26] A. Fine, “Hidden Variables, Joint Probability,
and the Bell Inequalities”, Phys. Rev. Lett. 48,
291 (1982).
[27] R. Kunjwal, “Fine’s theorem, noncontextuality,
and correlations in Specker’s scenario”, Phys. Rev.
A 91, 022108 (2015).
[28] R. W. Spekkens, “The Status of Determinism
in Proofs of the Impossibility of a Noncontextual
Model of Quantum Theory”, Found. Phys. 44,
1125-1155 (2014).
[29] A. Cabello, “What do we learn about quantum
theory from Kochen-Specker quantum contextual-
ity?”, PIRSA:17070034 (2017).
[30] G. Chiribella and X. Yuan, “Measurement sharp-
ness cuts nonlocality and contextuality in ev-
ery physical theory”, arXiv:1404.3348 [quant-ph]
(2014).
[31] G. Chiribella and X. Yuan, “Bridging the gap
between general probabilistic theories and the
device-independent framework for nonlocality and
contextuality”, Information and Computation,
250, 15-49 (2016).
[32] R. Chaves and T. Fritz, “Entropic approach to
local realism and noncontextuality”, Phys. Rev. A
85, 032113 (2012).
[33] Tobias Fritz and Rafael Chaves, “Entropic In-
equalities and Marginal Problems”, IEEE Trans.
on Information Theory, vol. 59, pages 803 - 817
(2013).
[34] R. Kunjwal, “Hypergraph framework for irre-
ducible noncontextuality inequalities from log-
ical proofs of the Kochen-Specker theorem”,
arXiv:1805.02083 [quant-ph] (2018).
[35] A. Cabello, “Specker’s fundamental principle of
quantum mechanics”, arXiv:1212.1756 [quant-ph]
(2012).
[36] R. W. Spekkens, “Noncontextuality: how we
should define it, why it is natural, and what to
do about its failure”, PIRSA:17070035 (2017).
[37] M. D. Mazurek, M. F. Pusey, K. J. Resch, and R. W. Spekkens, “Experimentally bounding deviations from quantum theory in the landscape of generalized probabilistic theories”, arXiv:1710.05948 [quant-ph] (2017).
[38] M. F. Pusey, L. del Rio, and B. Meyer, “Contex-
tuality without access to a tomographically com-
plete set”, arXiv:1904.08699 (2019).
[39] Y. C. Liang, R. W. Spekkens, H. M. Wiseman,
“Specker’s parable of the overprotective seer: A
road to contextuality, nonlocality and complemen-
tarity”, Phys. Rep. 506, 1 (2011).
[40] R. Kunjwal and S. Ghosh, “Minimal state-
dependent proof of measurement contextuality for
a qubit”, Phys. Rev. A 89, 042118 (2014).
[41] R. Kunjwal, C. Heunen, and T. Fritz, “Quantum
realization of arbitrary joint measurability struc-
tures”, Phys. Rev. A 89, 052126 (2014).
[42] R. Kunjwal, “A note on the joint measurability of POVMs and its implications for contextuality”, arXiv:1403.0470 [quant-ph] (2014).
[43] S. Popescu and D. Rohrlich, “Quantum nonlocal-
ity as an axiom”, Found. Phys. 24, 379-385 (1994).
[44] T. Heinosaari, D. Reitzner, and P. Stano, “Notes
on Joint Measurability of Quantum Observables”,
Found. Phys. 38, 1133-1147 (2008).
[45] R. Kunjwal, “How to go from the KS theorem to
experimentally testable noncontextuality inequal-
ities”, PIRSA:17070059 (2017).
[46] Konrad Engel, “Sperner theory: Encyclopedia of
Mathematics and its Applications”, Vol. 65, Cam-
bridge University Press, Cambridge (1997).
[47] A. A. Klyachko, M. A. Can, S. Binicioğlu, and A. S. Shumovsky, “Simple Test for Hidden Variables in Spin-1 Systems”, Phys. Rev. Lett. 101, 020403 (2008).
[48] C. Held, “The Kochen-Specker Theorem”, The
Stanford Encyclopedia of Philosophy (Spring 2018
Edition), Edward N. Zalta (ed.).
[49] T. Gonda, R. Kunjwal, D. Schmid, E. Wolfe, and
A. B. Sainz, “Almost Quantum Correlations are
Inconsistent with Specker’s Principle”, Quantum
2, 87 (2018).
[50] M. Navascués, Y. Guryanova, M. J. Hoban, and A. Acín, “Almost quantum correlations”, Nat. Commun. 6, 6288 (2015).
[51] A. Cabello, J. Estebaranz, and G. Garcia-Alcaine, “Bell-Kochen-Specker theorem: A proof with 18 vectors”, Phys. Lett. A 212, 183 (1996).
[52] E. G. Beltrametti and S. Bugajski, “A classical
extension of quantum mechanics”, J. Phys. A 28,
3329 (1995).
[53] X. Zhan, E. G. Cavalcanti, J. Li, Z. Bian,
Y. Zhang, H. M. Wiseman, and P. Xue, “Ex-
perimental generalized contextuality with single-
photon qubits”, Optica 4, 966-971 (2017).
[54] R. Kunjwal, “Contextuality beyond the Kochen-
Specker theorem”, arXiv:1612.07250 [quant-ph]
(2016).
[55] T. Fritz, A. B. Sainz, R. Augusiak, J. B. Brask, R. Chaves, A. Leverrier, and A. Acín, “Local orthogonality: a multipartite principle for correlations”, Nat. Commun. 4, 2263 (2013).
[56] R. W. Spekkens, “Nonclassicality as the failure of
noncontextuality”, PIRSA:15050081 (2015) (see
the slide at 41:43 minutes).
[57] R. W. Spekkens, “Quasi-Quantization: Classi-
cal Statistical Theories with an Epistemic Re-
striction”, In: Chiribella G., Spekkens R. (eds)
Quantum Theory: Informational Foundations and
Foils. Fundamental Theories of Physics, vol 181.
Springer, Dordrecht.
[58] T. Vidick and S. Wehner, “Does Ignorance of the
Whole Imply Ignorance of the Parts? Large Vio-
lations of Noncontextuality in Quantum Theory”,
Phys. Rev. Lett. 107, 030402 (2011).
[59] R. Raussendorf, “Contextuality in measurement-
based quantum computation”, Phys. Rev. A 88,
022322 (2013).
[60] M. Howard, J. Wallman, V. Veitch, and J. Emer-
son, “Contextuality supplies the ‘magic’ for quan-
tum computation”, Nature 510, 351 (2014).
[61] N. Delfosse, P. A. Guerin, J. Bian, and
R. Raussendorf, “Wigner Function Negativity
and Contextuality in Quantum Computation on
Rebits”, Phys. Rev. X 5, 021003 (2015).
[62] J. Bermejo-Vega, N. Delfosse, D. E. Browne,
C. Okay, R. Raussendorf, “Contextuality as a re-
source for qubit quantum computation”, Phys.
Rev. Lett. 119, 120505 (2017).
[63] J. Singh, K. Bharti, and Arvind, “Quantum
key distribution protocol based on contextuality
monogamy”, Phys. Rev. A 95, 062333 (2017).
[64] A. Cabello, “Kochen-Specker Theorem for a Sin-
gle Qubit using Positive Operator-Valued Mea-
sures”, Phys. Rev. Lett. 90, 190401 (2003).
[65] A. Grudka and P. Kurzyński, “Is There Contextuality for a Single Qubit?”, Phys. Rev. Lett. 100, 160401 (2008).
[66] P. Busch, “Quantum States and Generalized Ob-
servables: A Simple Proof of Gleason’s Theorem”,
Phys. Rev. Lett. 91, 120403 (2003).
[67] C. M. Caves, C. A. Fuchs, K. Manne, and
J. M. Renes, “Gleason-Type Derivations of the
Quantum Probability Rule for Generalized Mea-
surements”, Found. Phys. 34, 193 (2004).
[68] A. M. Gleason, “Measures on the closed sub-
spaces of a Hilbert space”, J. Math. Mech. 6, 885
(1957). Also available at JSTOR.
[69] P. K. Aravind, “The generalized Kochen-Specker
theorem”, Phys. Rev. A 68, 052104 (2003).
[70] A. A. Methot, “Minimal Bell-Kochen-Specker
proofs with POVMs on qubits”, Int. J. Quantum
Inf. 5, 353 (2007).
[71] Q. Zhang, H. Li, T. Yang, J. Yin, J. Du,
J. W. Pan, “Experimental Test of the Kochen-
Specker Theorem for Single Qubits using Pos-
itive Operator-Valued Measures”, arXiv:quant-
ph/0412049 (2004).
[72] L. Mancinska, G. Scarpa, and S. Severini,
“New Separations in Zero-Error Channel Capac-
ity Through Projective Kochen Specker Sets and
Quantum Coloring”, IEEE Transactions on Infor-
mation Theory 59, 4025 (2013).
[73] J. Henson and A. B. Sainz, “Macroscopic non-
contextuality as a principle for almost-quantum
correlations”, Phys. Rev. A 91, 042114 (2015).
[74] D. A. Meyer, “Finite Precision Measurement
Nullifies the Kochen-Specker Theorem”, Phys.
Rev. Lett. 83, 3751 (1999).
[75] A. Kent, “Noncontextual Hidden Variables and
Physical Measurements”, Phys. Rev. Lett. 83,
3755 (1999).
[76] R. Clifton and A. Kent, “Simulating quantum
mechanics by non-contextual hidden variables”,
Proc. R. Soc. Lond. A: Vol. 456, 2101-2114 (2000).
[77] J. Barrett and A. Kent, “Non-contextuality,
finite precision measurement and the
Kochen-Specker theorem”, Stud. Hist. Phi-
los. Mod. Phys. 35, 151 (2004).
[78] A. Winter, “What does an experimental test
of quantum contextuality prove or disprove?”, J.
Phys. A: Math. Theor. 47, 424031 (2014).
[79] V. B. Scholz and R. F. Werner, “Tsirelson’s
Problem”, arXiv:0812.4305 [math-ph] (2008).
[80] T. Fritz, “Tsirelson’s problem and Kirchberg’s
conjecture”, Rev. Math. Phys. 24 (5), 1250012
(2012).