Beyond the Cabello-Severini-Winter framework: Making

sense of contextuality without sharpness of measurements

Ravi Kunjwal

Perimeter Institute for Theoretical Physics,

31 Caroline Street North, Waterloo, Ontario, Canada, N2L 2Y5.

September 4, 2019

We develop a hypergraph-theoretic frame-

work for Spekkens contextuality applied to

Kochen-Specker (KS) type scenarios that goes

beyond the Cabello-Severini-Winter (CSW)

framework. To do this, we add new

hypergraph-theoretic ingredients to the CSW

framework. We then obtain noise-robust non-

contextuality inequalities in this generalized

framework by applying the assumption of

(Spekkens) noncontextuality to both prepara-

tions and measurements. The resulting frame-

work goes beyond the CSW framework in both

senses, conceptual and technical. On the con-

ceptual level: 1) as in any treatment based on

the generalized notion of noncontextuality

à la

Spekkens, we relax the assumption of outcome

determinism inherent to the Kochen-Specker

theorem but retain measurement noncontex-

tuality, besides introducing preparation non-

contextuality, 2) we do not require the exclu-

sivity principle – that pairwise exclusive mea-

surement events must all be mutually exclu-

sive – as a fundamental constraint on mea-

surement events of interest in an experimen-

tal test of contextuality, given that this prop-

erty is not true of general quantum measure-

ments, and 3) as a result, we do not need to

presume that measurement events of interest

are “sharp” (for any deﬁnition of sharpness),

where this notion of sharpness is meant to im-

ply the exclusivity principle. On the techni-

cal level, we go beyond the CSW framework

in the following senses: 1) we introduce a

source events hypergraph – besides the mea-

surement events hypergraph usually consid-

ered – and deﬁne a new operational quantity

Corr that appears in our inequalities, 2) we de-

ﬁne a new hypergraph invariant – the weighted

max-predictability – that is necessary for our

analysis and appears in our inequalities, and 3)

our noise-robust noncontextuality inequalities

quantify tradeoff relations between three operational quantities – Corr, R, and p_0 – only one

of which (namely, R) corresponds to the Bell-

Ravi Kunjwal: rkunjwal@perimeterinstitute.ca

Kochen-Specker functionals appearing in the

CSW framework; when Corr = 1, the inequal-

ities formally reduce to CSW type bounds on

R. Along the way, we also consider in detail

the scope of our framework vis-à-vis the CSW

framework, particularly the role of Specker’s

principle in the CSW framework, i.e., what the

principle means for an operational theory sat-

isfying it and why we don’t impose it in our

framework.

Contents

1 Introduction
2 Spekkens framework
2.1 Operational theory
2.2 Ontological model
2.3 Representation of coarse-graining
2.3.1 Coarse-graining of measurements
2.3.2 Coarse-graining of preparations
2.4 Joint measurability (or compatibility)
2.5 Noncontextuality
2.6 An example of Spekkens contextuality: the fair coin flip inequality
2.7 Connection to Bell scenarios
3 Hypergraph approach to Kochen-Specker scenarios in the Spekkens framework
3.1 Measurements
3.1.1 Classification of probabilistic models
3.1.2 Distinguishing two consequences of Specker’s principle: Structural Specker’s principle vs. Statistical Specker’s principle
3.1.3 What does it mean for an operational theory to satisfy structural/statistical Specker’s principle?

Accepted in Quantum 2019-09-01, click title to verify. Published under CC-BY 4.0. 1

arXiv:1709.01098v4 [quant-ph] 3 Sep 2019

3.1.4 Remark on the classification of probabilistic models: why we haven’t defined “quantum models” as those obtained from projective measurements
3.1.5 Scope of this framework
3.2 Sources
4 A key hypergraph invariant: the weighted max-predictability
5 Noise-robust noncontextuality inequalities
5.1 Key notions from CSW
5.2 Key notion not from CSW: source-measurement correlation, Corr
5.3 Obtaining the noise-robust noncontextuality inequalities
5.3.1 Expressing operational quantities in ontological terms
5.3.2 Derivation of the noncontextual tradeoff for any graph G
5.3.3 When is the noncontextual tradeoff violated?
5.4 Example: KCBS scenario
6 Discussion
6.1 Measurement-measurement correlations vs. source-measurement correlations
6.2 Can our noise-robust noncontextuality inequalities be saturated by a noncontextual ontological model?
6.2.1 The special case of facet-defining Bell-KS inequalities: Corr = 1
6.2.2 The general case: Corr < 1
6.3 Can trivial POVMs ever violate these noncontextuality inequalities?
6.3.1 The case p ∈ C(Γ )
6.3.2 The case p ∈ ConvHull(G(Γ ind )
6.3.3 The general case p ∈ G(Γ )
7 Conclusions
Acknowledgments
A Status of KS-contextuality as an experimentally testable notion of nonclassicality for POVMs in quantum theory
A.1 Limitations of KS-contextuality vis-à-vis POVMs
A.1.1 KS-contextuality for POVMs in the literature
A.1.2 Classifying probabilistic models: restriction of quantum models to PVMs
A.2 Robustness of Bell nonlocality vis-à-vis POVMs
B Ontological models without respecting coarse-graining relations
B.1 How to construct a “KS-noncontextual” ontological model of the KCBS experiment [47] without coarse-graining relations
B.2 How to construct a “preparation and measurement noncontextual” ontological model without coarse-graining relations
C Trivial POVMs
C.1 Bell-CHSH scenario
C.2 CHSH-type contextuality scenario: 4-cycle
D The KS-uncolourable hypergraph Γ
References

1 Introduction

To say that quantum theory is counterintuitive, or

that it requires a revision of our classical intuitions,

requires us to be mathematically precise in our def-

inition of these classical intuitions. Once we have a

precise formulation of such classicality, we can begin

to investigate those features of quantum theory that

power its nonclassicality, i.e., its departure from our

classical intuitions, and thus prove theorems about

such nonclassicality. To the extent that a physical

theory is provisional, likely to be replaced by a better

theory in the future, it also makes sense to articu-

late such notions of classicality in as operational a

manner as possible. By ‘operational’, we refer to a

formulation of the theory that takes the operations –

preparations, measurements, transformations – that

can be carried out in an experiment as primitives and

which speciﬁes the manner in which these operations

combine to produce the data in the experiment. Such

an operational formulation often suggests generaliza-

tions of the theory that can then be used to better

understand its axiomatics [1–3]. At the same time,

an operational formulation also lets us articulate our

notions of nonclassicality in a manner that is experi-

mentally testable and thus allows us to leverage this

nonclassicality in applications of the theory. Indeed,

a key area of research in quantum foundations and

quantum information is the development of methods

to assess nonclassicality in an experiment under min-

imal assumptions on the operational theory describ-

ing it. The paradigmatic example of this is the case of

Bell’s theorem and Bell experiments [4–11], where any

operational theory that is non-signalling between the

diﬀerent spacelike separated wings of the experiment

is allowed. The notion of classicality at play in Bell’s

theorem is the assumption of local causality: any non-

signalling theory that violates the assumption of local

causality is said to exhibit nonclassicality by the lights

of Bell’s theorem.

More recently, much work [12–17] has been devoted

to obtaining constraints on operational statistics that

follow from a generalized notion of noncontextuality

proposed by Spekkens [18]. This notion of classicality

[18] has its roots in the Kochen-Specker (KS) theo-

rem [19], a no-go theorem that rules out the possibility

that a deterministic underlying ontological model [20]

could reproduce the operational statistics of (projec-

tive) quantum measurements in a manner that sat-

isﬁes the assumption of KS-noncontextuality. KS-

noncontextuality is the notion of classicality at play

in the Kochen-Specker theorem. The Spekkens frame-

work abandons the assumption of outcome determin-

ism [18] – the idea that the ontic state of a system

ﬁxes the outcome of any measurement deterministi-

cally – that is intrinsic to KS-noncontextuality. It also

applies to general operational theories and extends

the notion of noncontextuality to general experimen-

tal procedures – preparations, transformations, and

measurements – rather than measurements alone.

Parallel to work along the lines of Spekkens [18],

work seeking to directly operationalize the Kochen-

Specker theorem (rather than revising the notion

of noncontextuality at play) culminated in two re-

cent approaches that classify theories by the de-

gree to which they violate the assumption of KS-

noncontextuality: the graph-theoretic framework of

Cabello, Severini, and Winter (CSW) [21, 22], where a

general approach to obtaining graph-theoretic bounds

on linear Bell-KS functionals was proposed, and the

related hypergraph framework of Acín, Fritz, Leverrier, and Sainz (AFLS) [23], where an approach

to characterizing sets of correlations was proposed.

The CSW framework relates well-known graph in-

variants to: 1) upper bounds on Bell-KS inequali-

ties that follow from KS-noncontextuality, 2) upper

bounds on maximum quantum violations of these in-

equalities that can be obtained from projective mea-

surements, and 3) upper bounds on their violation

in general probabilistic theories [24] – denoted E1 –

which satisfy the “exclusivity principle” [22]. Com-

plementary to this, the AFLS framework uses graph

invariants in the service of deciding whether a given

assignment of probabilities to measurement outcomes

in a KS-contextuality experiment belongs to a partic-

ular set of correlations; they showed that membership

in the quantum set of correlations (deﬁned only for

projective measurements in quantum theory) cannot

be witnessed by a graph invariant, cf. Theorem 5.1.3

of Ref. [23]. Another recent approach due to Abram-

sky and Brandenburger [25] employs sheaf-theoretic

ideas to formulate KS-contextuality.

A key achievement of the frameworks of Refs. [22,

23, 25] is a formal uniﬁcation of Bell scenarios

with KS-contextuality scenarios, treating them on

the same footing. Indeed, the perspective there is

to consider Bell scenarios as a special case of KS-

contextuality scenarios. What is lost in this math-

ematical uniﬁcation, however, is the fact that Bell-

locality and KS-noncontextuality have physically dis-

tinct, if related, motivations. The physical situation

that Bell’s theorem refers to requires (at least) two

spacelike separated labs (where local measurements

are carried out) so that the assumption of local causal-

ity (or Bell-locality) can be applied.

On the other

hand, the physical situation that the Kochen-Specker

theorem refers to does not require spacelike separa-

tion as a necessary ingredient and one can there-

fore consider experiments in a single lab. However,

the assumption of KS-noncontextuality entails out-

come determinism [18], something not required by

local causality in Bell scenarios.

This diﬀerence

in the physical situation for the two kinds of ex-

periments is one of the reasons for generalizing KS-

noncontextuality to the notion of noncontextuality in

the Spekkens framework [18] (so that outcome deter-

minism is not assumed) while leaving Bell’s notion of

local causality untouched.

Footnote: What do we mean by whether an assumption “can be applied”? Of course, mathematically, one can “apply” any assumption one wants in the service of proving a theorem. But insofar as the mathematics here is trying to model a real experiment, the consistency of those assumptions with some essential facts of the experiment is the minimal requirement for any no-go theorem derived from such assumptions to be physically interesting. Hence, in the presence of signalling (implying the absence of spacelike separation), it makes no sense to assume local causality in a Bell experiment and derive the resulting Bell inequalities: such an assumption on the ontological model is already in conflict with the fact of signalling across the labs and no Bell inequalities are needed to witness this fact. Bell inequalities only become physically interesting when the theories being compared relative to them are all non-signalling: if the experiment itself is signalling, any non-signalling description – locally causal, quantum, or in a general probabilistic theory (GPT) – is ipso facto ruled out.

Footnote: Note that this assumption of outcome determinism doesn’t affect the conclusions in a Bell scenario even if one adopted it, because of Fine’s theorem [26]: a locally deterministic ontological model entails the same set of (Bell-local) correlations as a locally causal ontological model. Relaxing outcome determinism, however, doesn’t mean the same thing for the kinds of experiments envisaged by the Kochen-Specker theorem – in particular, it doesn’t mean that models satisfying factorizability à la Ref. [25] are the most general outcome-indeterministic models – and thus considerations parallel to Fine’s theorem [26] do not apply, cf. [27, 28].

In the present paper we build a bridge from the CSW approach, where KS-noncontextual correlations are bounded by Bell-KS inequalities, to noise-robust noncontextuality inequalities in the Spekkens framework [18]. That is, we show how the constraints from KS-noncontextuality in the framework of Ref. [22] translate to constraints from generalized noncontextuality in the framework of Ref. [18]. The resulting operational criteria for contextuality à la Spekkens are noise-robust and therefore applicable to arbitrary positive operator-valued measures (POVMs) and mixed states in quantum theory. Note that the insights gleaned from frameworks such as those of Refs. [22, 23, 25] regarding Bell nonlocality require no revision in our approach. It is only in the application of such frameworks (in particular, the CSW framework) to the question of contextuality that we seek to propose an alternative hypergraph framework (formalizing Spekkens contextuality [18]) that is more operationally motivated for experimental situations where one cannot appeal to spacelike separation to justify locality of the measurements.

For Kochen-

Specker type experimental scenarios, we will con-

sider the twin notions of preparation noncontextu-

ality and measurement noncontextuality – taken to-

gether as a notion of classicality – to obtain noise-

robust noncontextuality inequalities that generalize

the KS-noncontextuality inequalities of CSW. These

inequalities witness nonclassicality even when quan-

tum correlations arising from arbitrary (i.e., possibly

nonprojective) quantum measurements on any quan-

tum state are allowed. A key innovation of this ap-

proach is that it treats all measurements in an oper-

ational theory on an equal footing. No deﬁnition of

“sharpness” [29–31] is needed to justify or derive non-

contextuality inequalities in this approach. Further-

more, if certain idealizations are presumed about the

operational statistics, then these inequalities formally

recover the usual Bell-KS inequalities à la CSW. The

Bell-KS inequalities can be viewed as an instance of

the classical marginal problem [25–27, 32, 33], i.e.,

as constraints on the (marginal) probability distributions over subsets of a set of observables that follow from requiring the existence of a global joint probability distribution over the set of all observables.

Since the Bell-KS inequalities are only recovered un-

der certain idealizations, but not otherwise, the noise-

robust noncontextuality inequalities we obtain can-

not in general be viewed as arising from a classical

marginal problem. Hence, they cannot be understood

within existing frameworks that rely on this (reduc-

tion to the classical marginal problem) property to

formally unify the treatment of Bell-nonlocality and

KS-contextuality [22, 23, 25]. This is a crucial dis-

tinction relative to the usual Bell-KS inequality type

witnesses of KS-contextuality.

Footnote (to the remark above that one cannot appeal to spacelike separation to justify locality of the measurements): Nor can one appeal to the sharpness of the measurements to justify outcome determinism. We discuss these issues in detail – in particular, the physical basis of KS-noncontextuality vis-à-vis Bell-locality and how that influences our framework – in Appendix A for the interested reader.

This paper is based on a previous contribution [16] that laid the conceptual groundwork for the progress we make here. Besides the noise-robust noncontextuality inequalities that generalize constraints from KS-noncontextuality in the CSW framework, leveraging the graph invariants of CSW [22] (cf. Section 5), the contributions of this paper also include:

• An exposition of Specker’s principle and how dif-

ferent implications of it (e.g., consistent exclu-

sivity [23]) for a given operational theory arise in

the hypergraph framework (cf. Sections 3.1.2 and

3.1.3), in particular the results in Theorems 1, 2,

and Corollary 1.

• Introduction of a hypergraph invariant – the

weighted max-predictability – that is key to

our noise-robust noncontextuality inequalities,

cf. Section 4. This invariant is also key to the

hypergraph framework of Ref. [34] which is com-

plementary to the present framework.

• A detailed discussion of how KS-

noncontextuality for POVMs has been previously

treated in the literature and the limitations of

those treatments, cf. Appendices A and C.

Also, unlike for the case of KS-noncontextuality

inequalities, we show that trivial POVMs can

never violate our noise-robust noncontextuality

inequalities, cf. Section 6.3.

• A discussion of coarse-graining relations in Sec-

tion 2.3 and their importance for contextual-

ity no-go theorems, in particular a discussion of

ontological models that do not respect coarse-

graining relations in Appendix B. We show,

in Appendix B, how relaxing the constraint

from coarse-graining relations on an ontological

model renders either notion of noncontextuality

– whether Kochen-Specker [19] or Spekkens [18]

– vacuous.

• A discussion, by example, of why our generaliza-

tion of the CSW framework cannot accommodate

contextuality scenarios that are KS-uncolourable

in Appendix D and why one needs a distinct

framework, i.e., the framework of Ref. [34], to

treat KS-uncolourable scenarios.

The structure of this paper is as follows: Section 2 reviews

the Spekkens framework for generalized noncontextu-

ality [18]. Section 3 introduces a hypergraph frame-

work that shares features of traditional frameworks

for KS-contextuality [22, 23] but is also augmented

(relative to these traditional frameworks) with the in-

gredients necessary for obtaining noise-robust noncon-

textuality inequalities. In particular, its subsections

3.1.2 and 3.1.3 discuss Specker’s principle [35] and

define its different implications for contextuality scenarios à la Ref. [23]. Section 4 defines a new hypergraph invariant – the weighted max-predictability –

that we need later on as a crucial new ingredient in

our inequalities. Section 5 obtains noise-robust non-

contextuality inequalities within the framework de-

ﬁned in Section 3 and using the hypergraph invariant

of Section 4 in addition to two graph invariants from

the CSW framework [22]. These inequalities can be

seen as special cases of the general approach outlined

in Ref. [16]. In Section 6, we include discussions on

various features of our noise-robust noncontextuality

inequalities, in particular the fact that trivial POVMs

can never violate them. Section 7 concludes with some

open questions and directions for future research.

2 Spekkens framework

We concern ourselves with prepare-and-measure ex-

periments. A schematic of such an experiment is

shown in Figure 1 where, for the sake of simplicity, we

imagine a single source device that can perform any

preparation procedure of interest (rather than a col-

lection of source devices, each implementing a partic-

ular preparation procedure) and a single measurement

device that can perform any measurement procedure

of interest (rather than a collection of measurement

devices, each implementing a particular measurement

procedure). Note that this is just a conceptual abstraction: in particular, the various possible measurement settings on the measurement device may, for example, correspond to incompatible measurement procedures in quantum theory. The fact that we represent the different measurement settings by choices of knob settings $M \in \mathbb{M}$ on a single measurement device does not mean that it’s physically possible to implement all the measurement procedures represented by $\mathbb{M}$ jointly; it only means that the experimenter can choose to implement any of the measurements in the set $\mathbb{M}$ in a particular prepare-and-measure experiment. The same is true for our abstraction of preparation procedures to knob settings ($S \in \mathbb{S}$) and outcomes ($s \in V_S$) of a single source device: it’s not that the same device can physically implement all possible preparation procedures; it’s just that an experimenter can choose to implement any procedure in the set $\mathbb{S}$ in a particular prepare-and-measure experiment.

We will consider two levels of description of

prepare-and-measure experiments represented by

Fig. 1: operational and ontological. The operational

description will be speciﬁed by an operational theory

that takes source and measurement devices as primi-

tives and describes the experiment solely in terms of

the probabilities associated to their input/output be-

haviour. The ontological description will be speciﬁed

by an ontological model that takes the system that

passes between the source and measurement devices

as primitive and describes the experiment in terms of

probabilities associated to properties of this system,

deriving the operational description as a consequence

of coarse-graining over these properties. Let us look

at each description in turn.

2.1 Operational theory

We now describe the various components of Fig. 1 in more detail.

Figure 1: A prepare-and-measure experiment (device labels: Source, Measurement).

The source device has a source setting labelled by $S$ that can be chosen from a set $\mathbb{S}$. The set $\mathbb{S}$ represents, in general, some subset of the set of all source settings, $\mathcal{S}$, that are admissible in the operational theory, i.e., $\mathbb{S} \subseteq \mathcal{S}$. In a particular prepare-and-measure experiment, $\mathbb{S}$ will typically be a finite set of source settings. Choosing the setting $S$ prepares a system according to an ensemble of preparation procedures, denoted $\{(p(s|S), P_{[s|S]})\}_{s \in V_S}$, where $\{p(s|S)\}_{s \in V_S}$ is a probability distribution over the preparation procedures $\{P_{[s|S]}\}_{s \in V_S}$ in the ensemble. This means that the source device has one classical input $S$ and two outputs: one output is a classical label $s \in V_S$ identifying the preparation procedure (in the ensemble $\{(p(s|S), P_{[s|S]})\}_{s \in V_S}$) that is carried out when source outcome $s$ is observed for source setting $S$ (this source event is denoted $[s|S]$), and the other output is a system prepared according to the source event $[s|S]$, i.e., preparation procedure $P_{[s|S]}$ with probability $p(s|S)$. Thus, the assemblage of possible ensembles that the source device can prepare can be denoted by $\{\{(p(s|S), P_{[s|S]})\}_{s \in V_S}\}_{S \in \mathbb{S}}$.

On the other hand, the measurement device has two inputs: a classical input $M \in \mathbb{M}$ specifying the choice of measurement setting to be implemented, and an input that receives the system prepared according to preparation procedure $P_{[s|S]}$, on which this measurement $M$ is carried out. The measurement device has one classical output $m \in V_M$ denoting the outcome of the measurement $M$ implemented on a system prepared according to $P_{[s|S]}$, and which occurs with probability $p(m|M, S, s)$. The set $\mathbb{M}$ represents, in general, some subset of the set of all measurement settings, $\mathcal{M}$, that are admissible in the operational theory, i.e., $\mathbb{M} \subseteq \mathcal{M}$. In a particular prepare-and-measure experiment, $\mathbb{M}$ will typically be a finite set of measurement settings.

We will be interested in the operational joint probability $p(m, s|M, S) \equiv p(m|M, S, s)\,p(s|S)$ for this prepare-and-measure experiment for various choices of $M \in \mathbb{M}$, $S \in \mathbb{S}$. Note how this operational description takes as primitive the operations carried out in the lab and restricts itself to specifying the probabilities of classical outcomes (i.e., $m$, $s$) given some interventions (i.e., classical inputs, $M$, $S$). So far, we haven’t assumed any structure on the operational theory describing the schematic of Fig. 1 beyond the fact that it is a catalogue of input/output probabilities

$$\{\{p(m, s|M, S) \in [0, 1]\}_{m \in V_M,\, s \in V_S}\}_{M \in \mathbb{M},\, S \in \mathbb{S}}$$

for various interventions $S \in \mathbb{S}$ and $M \in \mathbb{M}$ that we will consider in a prepare-and-measure experiment. We now require more structure in the operational theory underlying this experiment, beyond a mere specification of these probabilities.

We require that the operational theory admits

equivalence relations that partition experimental pro-

cedures of any type, whether preparations or measure-

ments, into equivalence classes of that type. These

equivalence relations are deﬁned relative to the op-

erational probabilities (not necessarily restricted to a

particular prepare-and-measure experiment) that are

admissible in the theory. We will call these equiv-

alence relations “operational equivalences”, in keep-

ing with standard terminology [18]. This means that

any distinctions of labels between procedures in an

equivalence class of procedures do not aﬀect the oper-

ational probabilities associated with the procedures.

We specify these equivalence relations for measure-

ment and preparation procedures below.

Two measurement events $[m|M]$ and $[m'|M']$ are said to be operationally equivalent, denoted $[m|M] \simeq [m'|M']$, if there exists no source event in the operational theory that can distinguish them, i.e.,

$$p(m, s|M, S) = p(m', s|M', S) \quad \forall [s|S],\ s \in V_S,\ S \in \mathcal{S}. \tag{1}$$

Note that the statistical indistinguishability of $[m|M]$ and $[m'|M']$ must hold for all possible source settings $\mathcal{S}$ in the operational theory, not merely the source settings $\mathbb{S}$ that are of direct interest in a particular prepare-and-measure experiment. Hence, the “distinction of labels”, $[m|M]$ or $[m'|M']$, is empirically inconsequential since the two procedures are, in principle, indistinguishable by the lights of the operational theory.

Similarly, two source events $[s|S]$ and $[s'|S']$ are said to be operationally equivalent, denoted $[s|S] \simeq [s'|S']$, if there exists no measurement event in the operational theory that can distinguish them, i.e.,

$$p(m, s|M, S) = p(m, s'|M, S') \quad \forall [m|M],\ m \in V_M,\ M \in \mathcal{M}. \tag{2}$$

Again, the statistical indistinguishability of $[s|S]$ and $[s'|S']$ must hold for all possible measurement settings $\mathcal{M}$, not merely those (i.e., $\mathbb{M}$) that are of direct interest in a particular prepare-and-measure experiment. Similar to measurement events, the “distinction of labels”, $[s|S]$ or $[s'|S']$, is empirically inconsequential since the two procedures are, in principle, indistinguishable by the lights of the operational theory.
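As a rough illustration of Eqs. (1) and (2), the following Python sketch checks operational equivalence on a finite catalogue of probabilities. The array layout `p[M, m, S, s]`, the function names, and the toy numbers are assumptions made for this example and are not part of the framework. A finite table can only stand in for the full set of source (or measurement) events admissible in the theory, over which the equivalences are actually defined.

```python
import numpy as np

def measurement_events_equivalent(p, event_a, event_b, atol=1e-12):
    """Eq. (1) on a finite table: p[M, m, S, s] = p(m, s | M, S).

    event_a and event_b are (m, M) index pairs; the events are equivalent
    if no source event [s|S] in the table distinguishes them.
    """
    (m_a, M_a), (m_b, M_b) = event_a, event_b
    return bool(np.allclose(p[M_a, m_a], p[M_b, m_b], atol=atol))

def source_events_equivalent(p, event_a, event_b, atol=1e-12):
    """Eq. (2) on a finite table; event_a and event_b are (s, S) index pairs."""
    (s_a, S_a), (s_b, S_b) = event_a, event_b
    return bool(np.allclose(p[:, :, S_a, s_a], p[:, :, S_b, s_b], atol=atol))

# Hypothetical toy data: 2 measurement settings with 2 outcomes each,
# 1 source setting with 2 outcomes. Setting M = 1 is a relabelled copy of M = 0.
p = np.zeros((2, 2, 1, 2))
p[0, :, 0, :] = [[0.4, 0.1], [0.1, 0.4]]  # rows: m, columns: s
p[1, :, 0, :] = [[0.4, 0.1], [0.1, 0.4]]

same_event = measurement_events_equivalent(p, (0, 0), (0, 1))       # [0|0] vs [0|1]
different_event = measurement_events_equivalent(p, (0, 0), (1, 0))  # [0|0] vs [1|0]
different_source = source_events_equivalent(p, (0, 0), (1, 0))      # [0|0] vs [1|0]
```

In this toy table the first comparison succeeds and the other two fail; with real data one would also need a tolerance that reflects finite statistics.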

Given this equivalence structure for preparation

and measurement procedures in the operational the-

ory, we can now formalize the notion of a context:

Deﬁnition 1. A context is any distinction of labels

between operationally equivalent procedures in the op-

erational theory.

To see concrete examples of the kinds of contexts

that will be of interest to us in this paper, con-

sider quantum theory. Any mixed quantum state ad-

mits multiple convex decompositions in terms of other

quantum states, i.e., it can be prepared by coarse-

graining over distinct ensembles of quantum states,

each ensemble denoted by a diﬀerent label. In this

case, the “distinction of labels” between diﬀerent de-

compositions denotes a distinction of preparation en-

sembles, which instantiates our notion of a prepara-

tion context. Similarly, a given positive operator can

be implemented by diﬀerent positive operator-valued

measures (POVMs), and the distinction of labels de-

noting these diﬀerent POVMs instantiates our notion

of a measurement context.
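A small numerical sketch of a preparation context in quantum theory, assuming a single qubit: the maximally mixed state is prepared either by an equal mixture of the computational-basis states or by an equal mixture of the |+⟩ and |−⟩ states. The helper names `ket` and `projector` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def ket(a, b):
    """Normalized qubit state vector with amplitudes proportional to (a, b)."""
    v = np.array([a, b], dtype=complex)
    return v / np.linalg.norm(v)

def projector(v):
    """Rank-one projector |v><v|."""
    return np.outer(v, v.conj())

zero, one = ket(1, 0), ket(0, 1)
plus, minus = ket(1, 1), ket(1, -1)

# Two different ensembles (two labels, i.e. two preparation contexts) ...
rho_z = 0.5 * projector(zero) + 0.5 * projector(one)
rho_x = 0.5 * projector(plus) + 0.5 * projector(minus)

# ... that coarse-grain to the same density operator, so no measurement
# can distinguish them.
same_state = bool(np.allclose(rho_z, rho_x))
```

Both mixtures equal half the identity, so the two labels denote operationally equivalent preparations that differ only in context.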

2.2 Ontological model

Given the operational description of the experiment

in terms of probabilities p(m, s|M, S), we want to

explore the properties of any underlying ontological

model for this operational description. Any such on-

tological model, deﬁned within the ontological mod-

els framework [20], takes as primitive the physical

system (rather than operations on it) that passes

between the source and measurement devices, i.e.,

its basic objects are ontic states of the system, de-

noted λ ∈ Λ, that represent intrinsic properties of

the physical system. When a preparation procedure $[s|S]$ is carried out, the source device samples from the space of ontic states $\Lambda$ according to a probability distribution $\{\mu(\lambda|S, s) \in [0, 1]\}_{\lambda \in \Lambda}$, where $\sum_{\lambda \in \Lambda} \mu(\lambda|S, s) = 1$, and the joint distribution over $s$ and $\lambda$ given $S$, i.e., $\{\mu(\lambda, s|S)\}_{\lambda \in \Lambda}$, is given by $\mu(\lambda, s|S) \equiv \mu(\lambda|S, s)\,p(s|S)$. On the other hand, when a system in ontic state $\lambda$ is input to the measurement device with measurement setting $M \in \mathbb{M}$, the probability distribution over the measurement outcomes is given by $\{\xi(m|M, \lambda) \in [0, 1]\}_{m \in V_M}$, where $\sum_{m \in V_M} \xi(m|M, \lambda) = 1$. The operational statistics $\{\{p(m, s|M, S) \in [0, 1]\}_{m \in V_M,\, s \in V_S}\}_{M \in \mathbb{M},\, S \in \mathbb{S}}$ result from a coarse-graining over $\lambda$, i.e.,

$$p(m, s|M, S) = \sum_{\lambda \in \Lambda} \xi(m|M, \lambda)\,\mu(\lambda, s|S), \tag{3}$$

for all $m \in V_M$, $s \in V_S$, $M \in \mathbb{M}$, $S \in \mathbb{S}$.

Note that the definition of an ontological model above extends to the definition of an ontological model of the operational theory (as opposed to a particular fragment of the theory representing the experiment) when we take $\mathbb{M} = \mathcal{M}$ and $\mathbb{S} = \mathcal{S}$.
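The coarse-graining over ontic states in Eq. (3) can be sketched numerically for a finite ontic state space. The example below is only an illustration under stated assumptions: the sizes, the random distributions, and the array names `mu_joint` and `xi` are hypothetical, and no noncontextuality constraint is imposed on the model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_lambda, n_S, n_s, n_M, n_m = 4, 2, 2, 3, 2  # hypothetical sizes

# p(s|S): one distribution over source outcomes per source setting.
p_s = rng.dirichlet(np.ones(n_s), size=n_S)                  # shape (n_S, n_s)

# mu(lambda|S, s): one distribution over ontic states per source event.
mu_cond = rng.dirichlet(np.ones(n_lambda), size=(n_S, n_s))  # (n_S, n_s, n_lambda)

# mu(lambda, s|S) = mu(lambda|S, s) p(s|S)
mu_joint = mu_cond * p_s[:, :, None]

# xi(m|M, lambda): one distribution over outcomes per (setting, ontic state).
xi = rng.dirichlet(np.ones(n_m), size=(n_M, n_lambda))       # (n_M, n_lambda, n_m)
xi = np.transpose(xi, (0, 2, 1))                             # (n_M, n_m, n_lambda)

# Eq. (3): p(m, s|M, S) = sum_lambda xi(m|M, lambda) mu(lambda, s|S),
# stored as p[M, m, S, s].
p = np.einsum("Mml,Ssl->MmSs", xi, mu_joint)
```

By construction, summing `p` over `m` recovers `p(s|S)` for every measurement setting, and summing over both `m` and `s` gives 1.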

2.3 Representation of coarse-graining

We will now specify how coarse-graining of procedures

in a prepare-and-measure experiment is represented

in its description, whether operational or ontological.

Namely, if a procedure is deﬁned as a coarse-graining

of other procedures, then we require that the repre-

sentation of such a procedure is deﬁned by the same

coarse-graining of the representation of the other pro-

cedures.

Implicit in this discussion is the assumption that the operational theory allows one to define new procedures in the set $\mathcal{M}$ or $\mathcal{S}$ by coarse-graining other procedures in these sets, i.e., both $\mathcal{M}$ and $\mathcal{S}$ are closed under coarse-grainings. In particular, one can consider coarse-graining measurement and source settings (belonging to sets $\mathbb{M}$ and $\mathbb{S}$, respectively) actually implemented in the lab to define new measurement and source settings that belong to $\mathcal{M}\setminus\mathbb{M}$ and $\mathcal{S}\setminus\mathbb{S}$, respectively.

2.3.1 Coarse-graining of measurements

Let us see how this works for the case of measurement procedures: if a measurement procedure $M$ with measurement events $\{[m|M]\}_{m\in V_M}$ is defined as a coarse-graining of another measurement procedure $\tilde{M}$ with measurement events $\{[\tilde{m}|\tilde{M}]\}_{\tilde{m}\in V_{\tilde{M}}}$, symbolically denoted by

$$[m|M] \equiv \sum_{\tilde{m}} p(m|\tilde{m})\,[\tilde{m}|\tilde{M}], \quad \text{where } \forall m, \tilde{m}: p(m|\tilde{m}) \in \{0,1\},\ \sum_{m} p(m|\tilde{m}) = 1, \qquad (4)$$

then its representation in the operational description as well as in the ontological description satisfies this coarse-graining relation. More explicitly, the coarse-graining relation of Eq. (4) denotes the following post-processing of $\tilde{M}$: for each $m \in V_M$, relabel each outcome $\tilde{m} \in V_{\tilde{M}}$ to outcome $m$ with probability $p(m|\tilde{m}) \in \{0,1\}$; the logical disjunction of those $\tilde{m}$ which are relabelled to $m$ with probability 1 then defines the measurement event $[m|M]$. Now, in the operational theory, this post-processing is represented by

$$\forall [s|S], \text{ where } s \in V_S,\ S \in \mathbb{S}: \quad p(m,s|M,S) \equiv \sum_{\tilde{m}} p(m|\tilde{m})\,p(\tilde{m},s|\tilde{M},S), \qquad (5)$$

and in the ontological model it is represented by

$$\forall \lambda \in \Lambda: \quad \xi(m|M,\lambda) \equiv \sum_{\tilde{m}} p(m|\tilde{m})\,\xi(\tilde{m}|\tilde{M},\lambda). \qquad (6)$$

Quantum theory is an example of an operational theory that satisfies this requirement because of the linearity of the Born rule with respect to both preparations and measurements. The same is true, more generally, of general probabilistic theories (GPTs) [1, 24]. We require this feature in any ontological model as well, regardless of its (non)contextuality.

Similarly, we also allow probabilistic mixtures of (preparation or measurement) procedures in the operational theory to define new procedures, i.e., the theory is convex. See the last paragraph of Section 2.5 for the role of this convexity in experimental tests of contextuality and Section 2.6 for an example where a probabilistic mixture of measurement settings is required in a proof of contextuality.

As an example, consider a three-outcome measurement $\tilde{M}$ with outcomes $\tilde{m} \in \{1,2,3\}$, which can be classically post-processed to obtain a two-outcome measurement $M$ with outcomes $m \in \{0,1\}$, such that $p(m=0|\tilde{m}=1) = p(m=0|\tilde{m}=2) = 1$ and $p(m=1|\tilde{m}=3) = 1$. The measurement events of $M$ are then just

$$[m=0|M] \equiv [\tilde{m}=1|\tilde{M}] + [\tilde{m}=2|\tilde{M}], \qquad (7)$$

$$[m=1|M] \equiv [\tilde{m}=3|\tilde{M}], \qquad (8)$$

where the “+” sign denotes (just as the summation in Eq. (4)) logical disjunction, i.e., measurement event $[m=0|M]$ is said to occur when $[\tilde{m}=1|\tilde{M}]$ or $[\tilde{m}=2|\tilde{M}]$ occurs. The operational and ontological representations of these measurement events are then given by

$\forall [s|S]$, where $s \in V_S$, $S \in \mathbb{S}$:

$$p(m=0,s|M,S) \equiv \sum_{\tilde{m}=1}^{2} p(\tilde{m},s|\tilde{M},S), \qquad (9)$$

$$p(m=1,s|M,S) \equiv p(\tilde{m}=3,s|\tilde{M},S), \qquad (10)$$

$\forall \lambda \in \Lambda$:

$$\xi(m=0|M,\lambda) \equiv \sum_{\tilde{m}=1}^{2} \xi(\tilde{m}|\tilde{M},\lambda), \qquad (11)$$

$$\xi(m=1|M,\lambda) \equiv \xi(\tilde{m}=3|\tilde{M},\lambda). \qquad (12)$$
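The three-outcome example above can be sketched numerically as a 0/1 relabelling matrix acting on the fine-grained response functions. This is only an illustration: the response-function values and the helper name `coarse_grain` are hypothetical, and the same matrix would act identically on operational probabilities.

```python
import numpy as np

# Deterministic relabelling p(m|m~): rows m in {0, 1}, columns m~ in {1, 2, 3}.
relabel = np.array([[1, 1, 0],
                    [0, 0, 1]])

def coarse_grain(relabel, fine):
    """Apply sum_{m~} p(m|m~) fine[m~, ...] to probabilities or response functions."""
    relabel = np.asarray(relabel)
    if not np.isin(relabel, (0, 1)).all():
        raise ValueError("p(m|m~) must take values in {0, 1}")
    if not np.all(relabel.sum(axis=0) == 1):
        raise ValueError("each fine-grained outcome must map to exactly one m")
    return relabel @ np.asarray(fine)

# Hypothetical response functions xi(m~|M~, lambda); columns are two ontic states.
xi_fine = np.array([[0.2, 0.0],
                    [0.3, 0.5],
                    [0.5, 0.5]])
xi_coarse = coarse_grain(relabel, xi_fine)
print(xi_coarse)
```

Because the relabelling is fixed before any ontic state or source event is considered, the coarse-grained response functions are determined by the fine-grained ones and cannot be chosen independently.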

Note that Eq. (4) is not an operational equivalence between independent procedures. It is a definition of a new procedure obtained by coarse-graining another procedure.

This requirement on the representation of coarse-graining of measurements is particularly important (and often implicit) when the notion of a measurement context is instantiated by compatibility (or joint measurability), as in the case of KS-contextuality, where one needs to consider coarse-grainings of distinct measurements. For example,

consider a measurement setting $M_{12}$ with outcomes $(m_1, m_2) \in V_{M_1} \times V_{M_2}$ that is coarse-grained over $m_2$ to define an effective measurement setting $M_1^{(2)}$ with measurement events $\{[m_1|M_1^{(2)}]\}_{m_1\in V_{M_1}}$. Symbolically, $[m_1|M_1^{(2)}] \equiv \sum_{m_2} [(m_1,m_2)|M_{12}]$, which is represented in the operational theory as $\forall [s|S]: p(m_1,s|M_1^{(2)},S) \equiv \sum_{m_2} p((m_1,m_2),s|M_{12},S)$ and in the ontological model as $\forall\lambda: \xi(m_1|M_1^{(2)},\lambda) \equiv \sum_{m_2} \xi((m_1,m_2)|M_{12},\lambda)$. Similarly, consider another measurement setting $M_{13}$ with outcomes $(m_1, m_3) \in V_{M_1} \times V_{M_3}$ that is coarse-grained over $m_3$ to define an effective measurement setting $M_1^{(3)}$ with measurement events $\{[m_1|M_1^{(3)}]\}_{m_1\in V_{M_1}}$. Symbolically, $[m_1|M_1^{(3)}] \equiv \sum_{m_3} [(m_1,m_3)|M_{13}]$, which is represented in the operational theory as $\forall [s|S]: p(m_1,s|M_1^{(3)},S) \equiv \sum_{m_3} p((m_1,m_3),s|M_{13},S)$ and in the ontological model as $\forall\lambda: \xi(m_1|M_1^{(3)},\lambda) \equiv \sum_{m_3} \xi((m_1,m_3)|M_{13},\lambda)$.

Now, imagine that the following operational equivalence holds: $[m_1|M_1^{(2)}] \simeq [m_1|M_1^{(3)}]$. KS-noncontextuality is then the assumption that $\sum_{m_2} \xi((m_1,m_2)|M_{12},\lambda) = \sum_{m_3} \xi((m_1,m_3)|M_{13},\lambda)$ (i.e., $\xi(m_1|M_1^{(2)},\lambda) = \xi(m_1|M_1^{(3)},\lambda)$) for all λ and that $\xi((m_1,m_2)|M_{12},\lambda),\ \xi((m_1,m_3)|M_{13},\lambda) \in \{0,1\}$ for all λ. This assumption applied to multiple (compatible) subsets of a set of carefully chosen measurements can then provide a proof of the KS theorem, i.e., there exist sets of measurements in quantum theory such that their operational statistics cannot be emulated by a KS-noncontextual ontological model.

The key point here is this: the requirement that

coarse-graining relations between measurements be

respected by their representations in the ontological

model is independent of the KS-(non)contextuality of

the ontological model.

However, this requirement is

necessary for the assumption of KS-noncontextuality

to produce a contradiction with quantum theory; on

the other hand, a KS-contextual ontological model

(while respecting the coarse-graining relations) can

always emulate quantum theory. In this sense, the

representation of coarse-grainings is baked into an on-

tological model from the beginning (just as it is baked

into an operational description), before any claims

about its (non)contextuality.

In our example, this requirement has to do with the definitions of $\xi(m_1|M_1^{(2)},\lambda)$ and $\xi(m_1|M_1^{(3)},\lambda)$, not their ontological equivalence. The ontological equivalence only comes into play when invoking KS-noncontextuality.

One could, of course, choose to not respect the coarse-

graining relations and deﬁne a notion of an ontological model

without them. In such a model, one could treat every mea-

2.3.2 Coarse-graining of preparations

Let us now consider the representation of coarse-

grainings for preparation procedures. This works

in a way similar to the case of measurement proce-

dures which we have already outlined. If an ensemble

of source events $\{[s|S]\}_{s\in V_S}$ is defined as a coarse-graining of another ensemble, $\{[\tilde{s}|\tilde{S}]\}_{\tilde{s}\in V_{\tilde{S}}}$, symbolically denoted as

$$[s|S] \equiv \sum_{\tilde{s}} p(s|\tilde{s})\,[\tilde{s}|\tilde{S}], \quad \text{where } \forall s, \tilde{s}: p(s|\tilde{s}) \in \{0,1\},\ \sum_{s} p(s|\tilde{s}) = 1, \qquad (13)$$

then its representation should satisfy the same coarse-graining relation in any description, operational or ontological. More explicitly, this coarse-graining denotes the following post-processing: for any $s \in V_S$, relabel each outcome $\tilde{s} \in V_{\tilde{S}}$ to outcome $s$ with probability $p(s|\tilde{s}) \in \{0,1\}$; the logical disjunction of those $\tilde{s}$ which are relabelled to $s$ with probability 1 then defines the source event $[s|S]$. Now, in the operational theory, this coarse-graining is represented by

$$\forall [m|M], \text{ where } m \in V_M,\ M \in \mathbb{M}: \quad p(m,s|M,S) \equiv \sum_{\tilde{s}} p(s|\tilde{s})\,p(m,\tilde{s}|M,\tilde{S}), \qquad (14)$$

and in the ontological model it is represented by

$$\forall \lambda \in \Lambda: \quad \mu(\lambda,s|S) \equiv \sum_{\tilde{s}} p(s|\tilde{s})\,\mu(\lambda,\tilde{s}|\tilde{S}). \qquad (15)$$

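In this paper, we will focus on a specific type of coarse-graining: namely, completely coarse-graining over the outcomes of a source setting, say $\{[\tilde{s}|\tilde{S}]\}_{\tilde{s}\in V_{\tilde{S}}}$, to yield an effective one-outcome source setting, denoted $\tilde{S}_\top$, associated with a single source event $\{[\top|\tilde{S}_\top]\}$, where $[\top|\tilde{S}_\top] \equiv \sum_{\tilde{s}} [\tilde{s}|\tilde{S}]$. In the operational theory, this coarse-graining is represented by

$$\forall [m|M], \text{ where } m \in V_M,\ M \in \mathbb{M}: \quad p(m,\top|M,\tilde{S}_\top) \equiv \sum_{\tilde{s}} p(m,\tilde{s}|M,\tilde{S}), \qquad (16)$$

and in the ontological model it is represented by

$$\forall \lambda \in \Lambda: \quad \mu(\lambda,\top|\tilde{S}_\top) \equiv \sum_{\tilde{s}} \mu(\lambda,\tilde{s}|\tilde{S}). \qquad (17)$$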

surement obtained by coarse-graining another (parent) mea-

surement as a fundamentally new measurement with response

functions not respecting the coarse-graining relations with the

parent measurement’s response functions, even if such coarse-

graining relations are respected in the operational description.

Such an ontological model, however, will not be able to ar-

ticulate the ingredients needed for a proof of the KS theorem

and we will not consider it here. Indeed, in the absence of

the requirement that coarse-graining relations be respected in

an ontological model, one can easily construct an ontological

model that is “KS-noncontextual” for any operational theory.

The interested reader may look at Appendix B for more details,

perhaps after looking at Section 2.5 for the relevant deﬁnitions

of noncontextuality.


Hence, we use the notation $[\top|\tilde{S}_\top]$ to denote the source event that “at least one of the source outcomes in the set $V_{\tilde{S}}$ occurs for source setting $\tilde{S}$” (i.e., the logical disjunction of $\tilde{s} \in V_{\tilde{S}}$), formally denoting the choice of $\tilde{S}$ and the subsequent coarse-graining over $\tilde{s}$ by the “source setting” $\tilde{S}_\top$ and the definite outcome of this source setting by “$\top$”. This source event always occurs, i.e., $p(\top|\tilde{S}_\top) = 1$, so $p(m,\top|M,\tilde{S}_\top) = p(m|M,\tilde{S}_\top,\top)$ and $\mu(\lambda,\top|\tilde{S}_\top) = \mu(\lambda|\tilde{S}_\top,\top)$.

This notion of coarse-graining over all the outcomes

of a source setting allows us to deﬁne a notion of

operational equivalence between the source settings

themselves. More precisely, two source settings $S$ and $S'$ are said to be operationally equivalent, denoted $[\top|S_\top] \simeq [\top|S'_\top]$, if no measurement event can distinguish them once all their outcomes are coarse-grained over, i.e.,

$$\sum_{s\in V_S} p(m,s|M,S) = \sum_{s'\in V_{S'}} p(m,s'|M,S') \quad \forall [m|M],\ m \in V_M,\ M \in \mathcal{M}. \qquad (18)$$

In quantum theory, this would correspond to the operational equivalence $\sum_{s} p(s|S)\,\rho_{[s|S]} = \sum_{s'} p(s'|S')\,\rho_{[s'|S']}$ for the density operator obtained by completely coarse-graining over two distinct ensembles of quantum states, $\{(p(s|S), \rho_{[s|S]})\}_{s\in V_S}$ and $\{(p(s'|S'), \rho_{[s'|S']})\}_{s'\in V_{S'}}$, on some Hilbert space $\mathcal{H}$.

2.4 Joint measurability (or compatibility)

A given measurement procedure, $\{[m|M]\}_{m\in V_M}$ for some $M \in \mathcal{M}$, in the operational description can be coarse-grained in many different ways to define new effective measurement procedures. The coarse-grained measurement procedures thus obtained from $\{[m|M]\}_{m\in V_M}$ are then said to be jointly measurable (or compatible), i.e., they can be jointly implemented by the same measurement procedure $\{[m|M]\}_{m\in V_M}$, which we refer to as their parent or joint measurement. Formally, a set $C$ of measurement procedures

$$\{\{[m_i|M_i]\}_{m_i\in V_{M_i}} \,|\, i \in \{1, 2, 3, \dots, |C|\}\}$$

is said to be jointly measurable (or compatible) if it arises from coarse-grainings of a single measurement procedure $M \in \mathcal{M}$, i.e., for all $\{[m_i|M_i]\}_{m_i\in V_{M_i}} \in C$,

$$[m_i|M_i] \equiv \sum_{m\in V_M} p(m_i|m)\,[m|M], \qquad (19)$$

where for all $i, m, m_i$: $p(m_i|m) \in \{0,1\}$ and $\sum_{m_i\in V_{M_i}} p(m_i|m) = 1$. In terms of the operational probabilities, this means that

$$\forall [s|S],\ s \in V_S,\ S \in \mathbb{S} \text{ and } \forall \{[m_i|M_i]\}_{m_i\in V_{M_i}} \in C: \quad p(m_i,s|M_i,S) \equiv \sum_{m\in V_M} p(m_i|m)\,p(m,s|M,S). \qquad (20)$$

If, on the other hand, a set of measurement proce-

dures cannot arise from coarse-grainings of any single

measurement procedure, then the measurement pro-

cedures in the set are said to be incompatible, i.e.,

they cannot be jointly implemented.

Note that we will also often refer to a measurement procedure $\{[m_i|M_i]\}_{m_i\in V_{M_i}}$ by just its measurement setting, $M_i$, and thus speak of the (in)compatibility of measurement settings. Another notion that we will need to refer to is the joint measurability of measurement events: a set of measurement events that arise as outcomes of a single measurement setting are said to be jointly measurable, e.g., all the measurement events in $\{[m|M]\}_{m\in V_M}$ are jointly measurable since they arise as outcomes of a single measurement setting $M$.

As a quantum example, consider a commuting pair of projective measurements, say $\{\Pi_1, I - \Pi_1\}$ and $\{\Pi_2, I - \Pi_2\}$, where $\Pi_1$ and $\Pi_2$ are projectors on some Hilbert space $\mathcal{H}$ such that $\Pi_1\Pi_2 = \Pi_2\Pi_1$ and $I$ is the identity operator on $\mathcal{H}$. This pair is jointly implementable since they can be obtained by coarse-graining the outcomes of the joint projective measurement given by $\{\Pi_1\Pi_2,\ \Pi_1(I - \Pi_2),\ (I - \Pi_1)\Pi_2,\ (I - \Pi_1)(I - \Pi_2)\}$.
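The quantum example can be checked numerically. The sketch below assumes a hypothetical pair of commuting projectors on a four-dimensional space (any commuting pair would do), builds the four-outcome joint measurement, and recovers each two-outcome measurement by coarse-graining over the other outcome label. The helper `marginal` is an illustrative name, not notation from the paper.

```python
import numpy as np

I = np.eye(4)
# Hypothetical commuting projectors, diagonal in a common basis.
P1 = np.diag([1.0, 1.0, 0.0, 0.0])
P2 = np.diag([1.0, 0.0, 1.0, 0.0])
assert np.allclose(P1 @ P2, P2 @ P1)

# Joint (parent) measurement; outcome (a, b) with a, b = 0 meaning "projector clicks".
joint = {
    (0, 0): P1 @ P2,
    (0, 1): P1 @ (I - P2),
    (1, 0): (I - P1) @ P2,
    (1, 1): (I - P1) @ (I - P2),
}

def marginal(joint, index):
    """Coarse-grain the joint measurement over the outcome label not at `index`."""
    effects = {}
    for outcome, effect in joint.items():
        key = outcome[index]
        effects[key] = effects.get(key, 0) + effect
    return effects

first = marginal(joint, 0)    # should reproduce {P1, I - P1}
second = marginal(joint, 1)   # should reproduce {P2, I - P2}
```

The relabelling used here is deterministic, which is exactly the kind of coarse-graining allowed in Eq. (19).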

2.5 Noncontextuality

It is always possible to build an ontological model

reproducing the predictions of any operational the-

ory, while respecting the coarse-graining relations.

A trivial example of such an ontological model is one where ontic states λ are identified with the preparation procedures $P_{[s|S]}$ (where $s \in V_S$ and $S \in \mathcal{S}$) and we have $\mu(\lambda,s|S) \equiv \delta_{\lambda,\lambda_{[s|S]}}\,p(s|S)$, where ontic state $\lambda_{[s|S]}$ is the one deterministically sampled by the preparation procedure $P_{[s|S]}$. Further, the response functions are identified with operational probabilities as $\xi(m|M,\lambda_{[s|S]}) \equiv p(m|M,S,s)$. Then we have $\sum_{\lambda\in\Lambda} \xi(m|M,\lambda)\,\mu(\lambda,s|S) = \xi(m|M,\lambda_{[s|S]})\,p(s|S) = p(m,s|M,S)$. Also, coarse-graining relations of the type $[\tilde{m}|\tilde{M}] \equiv \sum_{m} p(\tilde{m}|m)\,[m|M]$ and $[\tilde{s}|\tilde{S}] \equiv \sum_{s} p(\tilde{s}|s)\,[s|S]$ that are respected in the operational description are also respected in this ontological description: that is, we have $\forall \lambda \in \Lambda: \xi(\tilde{m}|\tilde{M},\lambda) \equiv \sum_{m} p(\tilde{m}|m)\,\xi(m|M,\lambda)$ and $\forall \lambda \in \Lambda: \mu(\lambda,\tilde{s}|\tilde{S}) \equiv \sum_{s} p(\tilde{s}|s)\,\mu(\lambda,s|S)$.
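The trivial model described above can be sketched in a few lines. The operational data below are randomly generated and purely hypothetical; the point is only that assigning one ontic state to each source event [s|S] reproduces p(m, s|M, S) exactly, with no constraint from noncontextuality involved.

```python
import numpy as np

rng = np.random.default_rng(0)
n_S, n_s, n_M, n_m = 2, 2, 2, 2   # hypothetical numbers of settings and outcomes

# Hypothetical operational data: p_s[S, s] = p(s|S), p_m[M, S, s, m] = p(m|M, S, s).
p_s = rng.dirichlet(np.ones(n_s), size=n_S)
p_m = rng.dirichlet(np.ones(n_m), size=(n_M, n_S, n_s))

# One ontic state per source event [s|S].
n_lam = n_S * n_s
mu = np.zeros((n_lam, n_S, n_s))   # mu[lam, S, s] = mu(lam, s|S)
xi = np.zeros((n_M, n_m, n_lam))   # xi[M, m, lam] = xi(m|M, lam)
for S in range(n_S):
    for s in range(n_s):
        lam = S * n_s + s
        mu[lam, S, s] = p_s[S, s]          # delta_{lam, lam_[s|S]} p(s|S)
        xi[:, :, lam] = p_m[:, S, s, :]    # xi(m|M, lam_[s|S]) = p(m|M, S, s)

# p(m, s|M, S) = sum_lam xi(m|M, lam) mu(lam, s|S), indexed [M, m, S, s].
reproduced = np.einsum("Mml,lSs->MmSs", xi, mu)
target = np.einsum("MSsm,Ss->MmSs", p_m, p_s)
```

In this model the distributions µ for distinct source events have disjoint supports, so it will generally violate preparation noncontextuality whenever nontrivial operational equivalences hold.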

Hence, it is only when additional assumptions are

imposed on an ontological model that deciding its ex-

istence becomes a nontrivial problem. Such additional

assumptions must, of course, play an explanatory role

to be worth investigating. The assumption we are in-

terested in is noncontextuality, applied to both prepa-

ration and measurement procedures. Motivated by

the methodological principle of the identity of indiscernibles [18], noncontextuality is an inference from the operational description to the ontological description of an experiment. It posits that the equivalence structure in the operational description is preserved in the ontological description, i.e., the reason one cannot distinguish two operationally equivalent representations of procedures based on their operational statistics is that there is, ontologically, no difference in their representations. We now formally define the notion of noncontextuality in its generalized form due to Spekkens [18].

Note that we will always assume coarse-graining relations are respected in any ontological model. The exception is (some of) the discussion in Section 2.3 and Appendix B where we consider the alternative possibility.

Mathematically, the assumption of measurement noncontextuality entails that

$$[m|M] \simeq [m'|M'] \Rightarrow \xi(m|M,\lambda) = \xi(m'|M',\lambda),\ \forall \lambda \in \Lambda, \qquad (21)$$

while the assumption of preparation noncontextuality entails that

$$[s|S] \simeq [s'|S'] \Rightarrow \mu(\lambda,s|S) = \mu(\lambda,s'|S')\ \forall \lambda \in \Lambda,$$

$$[\top|S_\top] \simeq [\top|S'_\top] \Rightarrow \mu(\lambda|S) = \mu(\lambda|S')\ \forall \lambda \in \Lambda. \qquad (22)$$

Here we denote $\mu(\lambda|S) \equiv \sum_{s\in V_S} \mu(\lambda,s|S)$, etc., for simplicity of notation, rather than use the notation $\mu(\lambda,\top|S_\top)$, etc., for these coarse-grained probability distributions. Note that since coarse-grainings are respected in any ontological model we consider, we indeed have that $\mu(\lambda,\top|S_\top) \equiv \sum_{s\in V_S} \mu(\lambda,s|S)$.

These are the assumptions of noncontextuality –

termed universal noncontextuality – that form the ba-

sis of our approach to noise-robust noncontextuality

inequalities [12–17, 45]. Note that the traditional

notion of KS-noncontextuality entails, besides mea-

surement noncontextuality above, the assumption of

outcome-determinism, i.e., for any measurement event

[m|M], ξ(m|M, λ) ∈ {0, 1} for all λ ∈ Λ.

It is important to note that in order for our no-

tion of operational equivalence to be experimentally

testable, we need that each of sets M and S includes

a tomographically complete set of measurements and

preparations, respectively. That is, the prepare-and-

measure experiment testing contextuality can probe

a tomographically complete set of preparations and

measurements. Of course, the set of all possible mea-

surements in a theory (M ) is (by deﬁnition) tomo-

graphically complete for any preparation in the the-

ory and, similarly, the set of all possible preparations

(S ) in a theory is tomographically complete for any

measurement in the theory. However, there may exist

smaller (ﬁnite) sets of preparations and measurements

in the theory that are tomographically complete and

in that case we require that S and M include such

tomographically complete sets, even if they don’t in-

clude all possible preparations and measurements in

the theory. For example, when the operational theory

is quantum theory for a qubit, the three spin measurements $\{\sigma_x, \sigma_y, \sigma_z\}$ are tomographically complete for

any qubit preparation, so we require that M includes

these three measurements even if it doesn’t include ev-

ery other possible measurement on a qubit. While the

requirement that S and M include tomographically

complete sets doesn’t directly reﬂect in our theoreti-

cal derivation of the noise-robust noncontextuality in-

equalities later, it is crucial for experimentally verify-

ing the operational equivalences (cf. Eqs. (1),(2),(18))

we need to even invoke the assumption of noncontex-

tuality (cf. Eqs. (21),(22)). Further, this assumption

on M and S has so far been necessary to be able to

implement an actual noise-robust contextuality exper-

iment [13], besides the requirement that the opera-

tional theory be convex, i.e., probabilistic mixtures

of procedures in the theory (whether preparations or

measurements) are also valid procedures in the the-

ory. We refer the reader to Refs. [13, 36, 37] for a

discussion of what tomographic completeness entails

for (convex) operational theories formalized as general

probabilistic theories (GPTs). Although we will not

discuss it in this paper, see Ref. [38] for some recent

work towards relaxing the tomographic completeness

requirement for the set of measurement settings.

2.6 An example of Spekkens contextuality: the fair coin flip inequality

We recap here an example of Spekkens contextual-

ity that has been experimentally demonstrated [13]

to give the reader a ﬂavour of the general approach

we are going to adopt in the rest of this paper with

regard to Kochen-Specker type scenarios. We call the

inequality tested in Ref. [13] the “fair coin ﬂip” in-

equality.

Consider a prepare-and-measure scenario with three source settings, denoted $\mathbb{S} \equiv \{S_1, S_2, S_3\}$, such that $V_{S_i} \equiv \{0,1\}$ and we have $p(s_i=0|S_i) = p(s_i=1|S_i) = 1/2$ for all $i \in \{1,2,3\}$. Each $S_i$ thus corresponds to the ensemble of preparation procedures $\{(p(s_i|S_i), P_{[s_i|S_i]})\}_{s_i\in V_{S_i}}$ and we have the following operational equivalence among the source settings after coarse-graining:

$$[\top|S_{1\top}] \simeq [\top|S_{2\top}] \simeq [\top|S_{3\top}]. \qquad (23)$$

There are four measurement settings in this scenario, denoted $\mathbb{M} \equiv \{M_1, M_2, M_3, M_{\rm fcf}\}$, such that $m_i \in \{0,1\}$ for all $i \in \{1, 2, 3, {\rm fcf}\}$. The measurement setting $M_{\rm fcf}$ is a fair coin flip, i.e., it is insensitive to the preparation procedure preceding it and yields the outcome $m_{\rm fcf} = 0$ or 1 with equal probability for any preparation procedure $P_{[s|S]}$, i.e., $p(m_{\rm fcf}=0|M_{\rm fcf},S,s) = p(m_{\rm fcf}=1|M_{\rm fcf},S,s) = 1/2$ for all $[s|S]$.

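We also define a measurement procedure $M_{\rm mix}$ as a classical post-processing of $M_1, M_2, M_3$, i.e., its measurement events $\{[m_{\rm mix}|M_{\rm mix}]\}_{m_{\rm mix}}$ are defined by the classical post-processing relation

$$[m_{\rm mix}|M_{\rm mix}] \equiv \sum_{i=1}^{3} p(i) \sum_{m_i} p(m_{\rm mix}|m_i)\,[m_i|M_i], \qquad (24)$$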


which symbolically denotes the following post-processing of measurements $M_1, M_2, M_3$: consider a uniform probability distribution $\{p(i) = \frac{1}{3}\}_{i=1}^{3}$ over the measurement settings $\{M_i\}_{i=1}^{3}$ and relabel the respective measurement outcomes, i.e., $\{m_i \in \{0,1\}\}_{i=1}^{3}$, to a measurement outcome $m_{\rm mix} \in \{0,1\}$ according to the probability distributions $\{\{p(m_{\rm mix}|m_i) = \delta_{m_{\rm mix},m_i}\}_{m_{\rm mix}\in\{0,1\}}\}_{i=1}^{3}$; coarse-graining over $m_i$ and $i$ then yields the effective measurement setting $M_{\rm mix}$ with outcomes labelled by $m_{\rm mix} \in \{0,1\}$. In contrast to the kinds of coarse-graining (over measurement outcomes) that appear in KS-noncontextuality (which we discussed in Section 2.3), the (probabilistic) coarse-graining here is over the measurement settings themselves while retaining the outcome labels. We require that this coarse-graining relation be respected in the operational as well as the ontological description. In the operational description, this coarse-graining is represented by

$$\forall [s|S],\ b \in \{0,1\}: \quad p(m_{\rm mix}=b,s|M_{\rm mix},S) \equiv \frac{1}{3}\sum_{i=1}^{3} p(m_i=b,s|M_i,S). \qquad (25)$$

We require the following operational equivalence between measurement events of $M_{\rm mix}$ and $M_{\rm fcf}$ with respect to which we invoke the assumption of measurement noncontextuality:

$$\forall b \in \{0,1\}: \quad [m_{\rm mix}=b|M_{\rm mix}] \simeq [m_{\rm fcf}=b|M_{\rm fcf}] \qquad (26)$$

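If we then look at an operational quantity quantifying source-measurement correlations, namely,

$${\rm Corr}_{\rm fcf} \equiv \frac{1}{3}\sum_{i=1}^{3}\sum_{m_i,s_i} \delta_{m_i,s_i}\,p(m_i,s_i|M_i,S_i), \qquad (27)$$

then the assumption of preparation noncontextuality applied to the operational equivalence in Eq. (23) (so that $\mu(\lambda|S_1) = \mu(\lambda|S_2) = \mu(\lambda|S_3)$ for all $\lambda \in \Lambda$) and the assumption of measurement noncontextuality applied to the operational equivalence in Eq. (26) (so that $\frac{1}{3}\xi(0|M_1,\lambda) + \frac{1}{3}\xi(0|M_2,\lambda) + \frac{1}{3}\xi(0|M_3,\lambda) = \frac{1}{2}$ for all $\lambda \in \Lambda$) lead to the following constraint:

$${\rm Corr}_{\rm fcf} \le \frac{5}{6}. \qquad (28)$$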

We did not discuss these more general types of classical

post-processing in Section 2.3 because they are not relevant to

the treatment of Kochen-Specker type scenarios in the Spekkens

framework. The example we present here is from Ref. [13],

which is not of Kochen-Specker type. The general principle un-

derlying the representation of such classical post-processings is,

however, the same: they should be respected in the operational

as well as the ontological description.

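To see how this is obtained, note that

$$\begin{aligned}
&\frac{1}{3}\sum_{i=1}^{3}\sum_{m_i,s_i} \delta_{m_i,s_i}\,p(m_i,s_i|M_i,S_i) \\
&= \frac{1}{3}\sum_{i=1}^{3}\sum_{m_i,s_i} \delta_{m_i,s_i}\sum_{\lambda\in\Lambda} \xi(m_i|M_i,\lambda)\,\mu(\lambda,s_i|S_i) \\
&\le \frac{1}{3}\sum_{i=1}^{3}\sum_{\lambda\in\Lambda} \max_{m_i}\xi(m_i|M_i,\lambda)\sum_{s_i}\mu(\lambda,s_i|S_i) \\
&= \frac{1}{3}\sum_{i=1}^{3}\sum_{\lambda\in\Lambda} \zeta(M_i,\lambda)\sum_{s_i}\mu(\lambda,s_i|S_i) \\
&= \sum_{\lambda\in\Lambda} \frac{1}{3}\sum_{i=1}^{3} \zeta(M_i,\lambda)\,\nu(\lambda), \qquad (29)
\end{aligned}$$

where we have that $\zeta(M_i,\lambda) \equiv \max_{m_i}\xi(m_i|M_i,\lambda)$ and that $\nu(\lambda) \equiv \mu(\lambda|S_i)$ for all $i \in \{1,2,3\}$. This allows us to put the upper bound

$${\rm Corr}_{\rm fcf} \le \max_{\lambda\in\Lambda} \frac{1}{3}\sum_{i=1}^{3} \zeta(M_i,\lambda), \qquad (30)$$

which, subject to the constraint (from measurement noncontextuality) that $\frac{1}{3}\xi(0|M_1,\lambda) + \frac{1}{3}\xi(0|M_2,\lambda) + \frac{1}{3}\xi(0|M_3,\lambda) = \frac{1}{2}$, yields Eq. (28).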
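The final maximization can be checked by brute force. The sketch below assumes binary outcomes, so that ξ(1|M_i, λ) = 1 − ξ(0|M_i, λ), writes x_i for ξ(0|M_i, λ), and searches a grid over (x_1, x_2) with x_3 fixed by the measurement-noncontextuality constraint x_1 + x_2 + x_3 = 3/2. The grid search and the function names are illustrative choices, not the argument used in Ref. [13].

```python
import itertools
import numpy as np

def zeta_average(x):
    """(1/3) sum_i max(x_i, 1 - x_i), with x_i standing for xi(0|M_i, lambda)."""
    x = np.asarray(x, dtype=float)
    return float(np.mean(np.maximum(x, 1.0 - x)))

def noncontextual_bound(steps=101):
    """Grid search over x_1, x_2 in [0, 1]; x_3 is fixed by x_1 + x_2 + x_3 = 3/2."""
    grid = np.linspace(0.0, 1.0, steps)
    best, best_x = -1.0, None
    for x1, x2 in itertools.product(grid, repeat=2):
        x3 = 1.5 - x1 - x2
        if -1e-12 <= x3 <= 1.0 + 1e-12:      # keep only valid probabilities
            value = zeta_average([x1, x2, x3])
            if value > best + 1e-12:
                best, best_x = value, (x1, x2, x3)
    return best, best_x

bound, argmax = noncontextual_bound()
print(bound, argmax)
```

Since the objective is convex in (x_1, x_2, x_3), its maximum over the constraint set is attained at a vertex, and the vertices are the permutations of (1, 1/2, 0), which is why a coarse grid containing those points already finds the bound.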

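It turns out that in quantum theory the sources and measurements required for this scenario can be realized on a qubit and they can, in principle, achieve the value ${\rm Corr}_{\rm fcf} = 1$. This can be achieved by taking the three preparations to be the trine preparations on an equatorial plane (say, the Z-X plane) of the Bloch sphere and the measurements $\{M_i\}_{i=1}^{3}$ to be the trine measurements, i.e.,

$$\rho_{[s_i=0|S_i]} \equiv \tfrac{1}{2}(I + \vec{\sigma}\cdot\vec{n}_i) \equiv \Pi_i^0, \quad \rho_{[s_i=1|S_i]} \equiv \tfrac{1}{2}(I - \vec{\sigma}\cdot\vec{n}_i) \equiv \Pi_i^1, \quad E_{[m_i=0|M_i]} \equiv \Pi_i^0, \quad E_{[m_i=1|M_i]} \equiv \Pi_i^1, \qquad (31)$$

where $\vec{n}_1 \equiv (0, 0, 1)$, $\vec{n}_2 \equiv (\frac{\sqrt{3}}{2}, 0, -\frac{1}{2})$, $\vec{n}_3 \equiv (-\frac{\sqrt{3}}{2}, 0, -\frac{1}{2})$, and $\vec{\sigma} \equiv (\sigma_x, \sigma_y, \sigma_z)$ denotes the three Pauli matrices $\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $\sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$, and $\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$. The operational equivalences are then easy to verify:

$$\rho_{[\top|S_{i\top}]} = \tfrac{1}{2}\Pi_i^0 + \tfrac{1}{2}\Pi_i^1 = \tfrac{I}{2},\ \forall i \in \{1,2,3\}, \qquad \tfrac{1}{3}\sum_{i=1}^{3} \Pi_i^b = \tfrac{I}{2},\ \forall b \in \{0,1\}. \qquad (32)$$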

The reader may look at Appendix B.1 of Ref. [13] to convince themselves that the maximum is achieved for an assignment of response functions of the type $\xi(0|M_1,\lambda) = 1$, $\xi(0|M_2,\lambda) = \frac{1}{2}$ and $\xi(0|M_3,\lambda) = 0$ for some λ.


The quantity ${\rm Corr}_{\rm fcf}$ equals 1 for this quantum realization. The experimental violation of the noise-robust

noncontextuality inequality, Eq. (28), was demon-

strated in Ref. [13], where more details may be found.
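The ideal qubit realization can be verified directly. The sketch below assumes noiseless trine states and projective trine measurements in the Z-X plane, with p(s_i|S_i) = 1/2 and a weight 1/3 per setting, and evaluates the correlation as the weighted sum of p(m_i = s_i, s_i|M_i, S_i). Variable names are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# Trine directions in the Z-X plane of the Bloch sphere.
directions = [(0.0, 0.0, 1.0),
              (np.sqrt(3) / 2, 0.0, -0.5),
              (-np.sqrt(3) / 2, 0.0, -0.5)]

def projector(n, sign):
    """(I + sign * sigma.n) / 2 for a unit vector n."""
    nx, ny, nz = n
    return 0.5 * (I2 + sign * (nx * sx + ny * sy + nz * sz))

corr = 0.0
for n in directions:
    for sign in (+1, -1):                 # outcome 0 -> +n, outcome 1 -> -n
        rho = projector(n, sign)          # state prepared when s_i takes this value
        effect = projector(n, sign)       # effect for m_i equal to s_i
        corr += (1 / 3) * 0.5 * np.trace(effect @ rho).real

# Operational equivalences: each ensemble averages to I/2, and so does the
# uniform mixture of the outcome-0 effects of the three trine measurements.
ensemble_averages = [0.5 * projector(n, +1) + 0.5 * projector(n, -1) for n in directions]
mixed_effect = sum(projector(n, +1) for n in directions) / 3
print(corr)
```

Any experimental noise lowers this value below 1, which is why a bound strictly less than 1 in Eq. (28) leaves room for a noise-robust violation.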

Note that the fair coin ﬂip inequality, Eq.(28), is not

inspired by the kinds of operational equivalences that

are relevant in a proof of the Kochen-Specker theo-

rem, but employs other kinds of operational equiva-

lences allowed in the Spekkens framework [18], i.e.,

the operational equivalences in Eqs. (23) and (26) do

not arise from the same measurement outcome being

shared by diﬀerent measurements.

Our goal in the present paper is to provide a frame-

work for noise-robust noncontextuality inequalities

obtained from statistical proofs of the KS theorem,

in particular those that are covered by the CSW

framework [22], so that such inequalities can be put

to an experimental test along the lines of Ref. [13]

within the Spekkens framework. Hence, the opera-

tional equivalences between measurement events that

will be of interest to us in this paper are precisely

those which allow for a proof of the KS theorem, i.e.,

those which correspond to the same measurement out-

come (e.g., a projector) being shared by diﬀerent mea-

surements (e.g., projective measurements).

2.7 Connection to Bell scenarios

As further motivation to study the questions we

are posing, note that one can also view the general

prepare-and-measure scenario we are considering in

this paper (Fig. 1) as arising on one wing of a two-

party Bell experiment: that is, given two parties –

Alice and Bob – sharing an entangled state and per-

forming local measurements in a Bell experiment, one

can view each choice of measurement setting on Al-

ice’s side as preparing an ensemble of states on Bob’s

side; on account of no-signalling, the reduced state

on Bob’s side will be the same regardless of Alice’s

choice of measurement setting, i.e., all the ensembles

corresponding to Alice’s measurement settings (hence,

Bob’s source settings) will be operationally equiva-

lent.

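For example, consider a Bell experiment where Alice has two choices of measurement settings, $M^A_x \equiv \sigma_x$ or $M^A_z \equiv \sigma_z$, and she shares a Bell state with Bob: $|\psi\rangle = \frac{1}{\sqrt{2}}(|00\rangle + |11\rangle)$. Bob has access to some set of measurement settings $\mathbb{M}_B \equiv \{M_B\}$ on his system. When Alice measures $M^A_x$, she prepares the ensemble of states $S_x \equiv \{(1/2, \rho_{[s_x=0|S_x]} \equiv |+\rangle\langle+|), (1/2, \rho_{[s_x=1|S_x]} \equiv |-\rangle\langle-|)\}$ on Bob’s side and when she measures $M^A_z$ she prepares the ensemble of states $S_z \equiv \{(1/2, \rho_{[s_z=0|S_z]} \equiv |0\rangle\langle 0|), (1/2, \rho_{[s_z=1|S_z]} \equiv |1\rangle\langle 1|)\}$. These ensembles are operationally equivalent, yielding the maximally mixed state on coarse-graining, i.e.,

$$\tfrac{1}{2}|0\rangle\langle 0| + \tfrac{1}{2}|1\rangle\langle 1| = \tfrac{1}{2}|+\rangle\langle+| + \tfrac{1}{2}|-\rangle\langle-| = \tfrac{I}{2}. \qquad (33)$$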
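The steering picture above can be reproduced with a short calculation. The sketch below stores the Bell state as a 2×2 amplitude matrix, projects Alice's side onto each basis vector, and collects the resulting ensemble on Bob's side. The function names are hypothetical, and the outcome labelling is a convention chosen here.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)
ket1 = np.array([0, 1], dtype=complex)
plus = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

# Bell state (|00> + |11>)/sqrt(2) as an amplitude matrix psi[a, b]
# (a: Alice's basis index, b: Bob's basis index).
psi = np.eye(2, dtype=complex) / np.sqrt(2)

def steered_ensemble(alice_basis):
    """Return [(p(s), rho_s)]: Bob's ensemble when Alice measures in alice_basis."""
    ensemble = []
    for a in alice_basis:
        bob = a.conj() @ psi                    # unnormalised Bob vector (<a| x I)|psi>
        prob = float(np.vdot(bob, bob).real)    # probability of Alice's outcome
        rho = np.outer(bob, bob.conj()) / prob  # Bob's normalised conditional state
        ensemble.append((prob, rho))
    return ensemble

def coarse_grained_state(ensemble):
    """Average state obtained by coarse-graining over the source outcomes."""
    return sum(p * rho for p, rho in ensemble)

S_x = steered_ensemble([plus, minus])
S_z = steered_ensemble([ket0, ket1])
```

Both ensembles average to the maximally mixed state, which is the no-signalling statement that makes the two source settings operationally equivalent on Bob's side.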

The quantity of interest in a Bell experiment, $p(m_A, m_B|M^A_i, M_B)$ ($i \in \{x, z\}$), is then formally the same as the quantity $p(s_i, m_B|S_i, M_B)$ that is of interest in our prepare-and-measure scenario. In the ontological model describing the effective prepare-and-measure experiment on Bob’s system, we have the following:

$$p(s_i, m_B|S_i, M_B) = \sum_{\lambda} \Pr(m_B|M_B,\lambda)\Pr(\lambda, s_i|S_i) = \sum_{\lambda} \Pr(m_B|M_B,\lambda)\Pr(s_i|S_i,\lambda)\Pr(\lambda|S_i). \qquad (34)$$

Assuming preparation noncontextuality relative to the operational equivalence $[\top|S_{x\top}] \simeq [\top|S_{z\top}]$, we have $\Pr(\lambda|S_x) = \Pr(\lambda|S_z) \equiv \Pr(\lambda)$, so that

$$p(s_i, m_B|S_i, M_B) = \sum_{\lambda} \Pr(s_i|S_i,\lambda)\Pr(m_B|M_B,\lambda)\Pr(\lambda), \qquad (35)$$

which formally resembles the expression for local causality when applied to the corresponding two-party Bell experiment:

$$p(m_A, m_B|M^A_i, M_B) = \sum_{\lambda} \Pr(m_A|M^A_i,\lambda)\Pr(m_B|M_B,\lambda)\Pr(\lambda). \qquad (36)$$

If no other assumption of noncontextuality is in-

voked besides the one applied to the operational

equivalence of source settings on Bob’s system, then

the constraints on $p(s_i, m_B|S_i, M_B)$ will be the same as the constraints on $p(m_A, m_B|M^A_i, M_B)$ from Bell inequalities.

Note, however, that the response functions $\Pr(m_A|M^A_i,\lambda)$ and $\Pr(m_B|M_B,\lambda)$ can be completely arbitrary in a locally causal ontological model for the Bell experiment and the same applies to the distributions $\Pr(s_i|S_i,\lambda)$ and $\Pr(m_B|M_B,\lambda)$ in a preparation noncontextual model of the corresponding prepare-and-measure scenario on Bob’s side. We will be interested in imposing additional constraints on the response functions $\Pr(m_B|M_B,\lambda)$ of the prepare-and-measure scenario (on Bob’s side) that follow from

the assumption of measurement noncontextuality ap-

plied to operational equivalences between measure-

ment events on Bob’s side. In particular, we are inter-

ested in those operational equivalences between mea-

surement events that are required by any statistical

proof of the Kochen-Specker theorem [16, 47]. We

develop this approach more carefully in the following

sections.


3 Hypergraph approach to Kochen-Specker scenarios in the Spekkens framework

Having set up the framework needed to articulate

the relevant notions in Section 2, we now proceed to

consider Kochen-Specker type experimental scenarios

in this framework. To do this, we will use the lan-

guage of hypergraphs and their subgraphs to repre-

sent the operational equivalences between measure-

ment events that are required in a Kochen-Specker

argument as well as the operational equivalences be-

tween source settings that we will invoke in our gen-

eralization. The (hyper)graph-theoretic ingredients of

our approach will represent those aspects of the gen-

eral framework of Section 2 that are necessary to go

from the CSW framework for KS-contextuality to a

hypergraph framework for Spekkens contextuality ap-

plied to Kochen-Specker type experimental scenarios.

Our presentation will be a hybrid one, discussing

features of the CSW framework [21, 22] in the nota-

tion of the AFLS framework [23], but extending both

in ways appropriate for the purpose of this paper. Our

goal is to demonstrate how the graph-theoretic invari-

ants of CSW [22] can be repurposed towards obtaining

noise-robust noncontextuality inequalities.

We do this in two parts: ﬁrst, we deﬁne a rep-

resentation of measurement events in the manner of

Refs. [22, 23], and then we deﬁne a representation of

source events in the spirit of Ref. [12].

3.1 Measurements

The basic object for representing measurements is a

hypergraph, Γ, with a ﬁnite set of vertices V (Γ) such

that each vertex v ∈ V (Γ) denotes a measurement

outcome, and a set of hyperedges E(Γ) such that

each hyperedge e ∈ E(Γ) is a subset of V (Γ) and

denotes a measurement consisting of outcomes in e.

Here, $E(\Gamma) \subseteq 2^{V(\Gamma)}$ and $\bigcup_{e\in E(\Gamma)} e = V(\Gamma)$. Such a hypergraph satisfies the definition of a contextuality scenario à la AFLS [23]. We will further assume, unless specified otherwise, that the hypergraph is simple: that is, for all $e_1, e_2 \in E(\Gamma)$, $e_1 \subseteq e_2 \Rightarrow e_1 = e_2$, or

that no hyperedge is a strict subset of another. Such

hypergraphs are also called Sperner families [46]. Two

measurement events are said to be (mutually) exclu-

sive if the vertices denoting them appear in a common

hyperedge, i.e., if they can be realized as outcomes of

a single measurement setting.

The structure of a contextuality scenario Γ repre-

sents the operational equivalences between measure-

ment events that are of interest in a Kochen-Specker

argument. We emphasize here that we take the opera-

tional theory to be fundamental and the contextuality

scenario for a particular Kochen-Specker argument to

be derived from (and as a graphical representation of)

the operational equivalences in the operational theory

(cf. Section 2). In particular, depending on the oper-

ational equivalences that an operational theory can

exhibit (by virtue of (in)compatibility relations be-

tween measurements), it may or may not allow some

contextuality scenario to be realized by measurement

events in the theory. The fact that a given vertex,

say $v \in V(\Gamma)$, appears in multiple hyperedges, say $E_v \equiv \{e \in E(\Gamma)\,|\,v \in e\}$, means that the measurement events corresponding to this vertex, i.e., $\{[v|e]\}_{e\in E_v}$, are operationally equivalent, and the equivalence class

of these measurement events is denoted by the vertex

v itself. In the case of quantum theory, for example,

v can represent a positive operator that appears in

diﬀerent positive operator-valued measures (POVMs)

represented by the hyperedges.

A probabilistic model on Γ is an assignment of probabilities
to the vertices v ∈ V(Γ) such that p(v) ≥ 0
for all v ∈ V(Γ) and Σ_{v∈e} p(v) = 1 for all e ∈ E(Γ).

As we have noted, every vertex v represents an equiv-

alence class of measurement events, denoted [m|M],

and every hyperedge e represents an equivalence class

of measurement procedures, denoted M.

The fact

that each v represents an equivalence class of mea-

surement events means that

1. any probabilistic model p on Γ, realized by op-

erational probabilities for a given source event –

that is, where for all v ∈ V (Γ) and a given [s|S],

p(v) ≡ p(v|S, s) ≡ p(m|M, S, s) – is consistent

with the operational equivalences represented by

Γ, and

2. any probabilistic model on Γ, realized by ontolog-

ical probabilities for a given ontic state – that is,

where for all v ∈ V (Γ) and a given ontic state λ,

p(v) ≡ p(v|λ) ≡ ξ(m|M, λ) – respects (by deﬁni-

tion) the assumption of measurement noncontex-

tuality with respect to the presumed operational

equivalences between measurement events.

We will therefore often write p(m, s|M, S) as

p(v, s|S) and p(m|M , S, s) as p(v|S, s), where [s|S] is

a source event. Similarly, we will also write ξ(m|M, λ)

as p(v|λ), where λ is an ontic state.
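As a minimal sketch of the two defining constraints of a probabilistic model (positivity and normalization on every hyperedge), one can check a candidate assignment as follows. The function name and the numerical values are hypothetical, and exact fractions are used only to avoid floating-point rounding in the normalization test.

```python
from fractions import Fraction

def is_probabilistic_model(p, hyperedges):
    """Check p(v) >= 0 for all v and sum_{v in e} p(v) = 1 for every hyperedge e."""
    nonnegative = all(prob >= 0 for prob in p.values())
    normalized = all(sum(p[v] for v in e) == 1 for e in hyperedges)
    return nonnegative and normalized

# Hypothetical toy scenario: two three-outcome measurements sharing outcome "b".
E = [{"a", "b", "c"}, {"b", "d", "e"}]
p = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4),
     "d": Fraction(1, 2), "e": Fraction(1, 4)}

print(is_probabilistic_model(p, E))  # True

# The shared vertex "b" gets a single probability, independent of the hyperedge
# it is read from; changing "d" alone breaks normalization of the second edge.
q = dict(p, d=Fraction(1, 4))
print(is_probabilistic_model(q, E))  # False
```

Because the dictionary assigns one number per vertex rather than per (vertex, hyperedge) pair, the operational equivalences encoded in Γ are respected by construction.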

Orthogonality graph of Γ, O(Γ): Given the hy-

pergraph Γ, we construct its orthogonality graph

O(Γ): that is, the vertices of O(Γ) are given by

V (O(Γ)) ≡ V (Γ), and the edges of O(Γ) are given

by E(O(Γ)) ≡ {{v, v′} | v, v′ ∈ e for some e ∈ E(Γ)}.

Note that two measurement procedures with measurement
settings M and M′ are operationally equivalent if every
measurement event of one is operationally equivalent to a distinct
measurement event of the other. That is, there is a bijective
correspondence (of operational equivalence) between the two
sets of measurement events. In quantum theory, for example,
a given POVM (which is what a hyperedge would represent),
say {E_k}_k, can be implemented in many possible ways, each
such measurement procedure corresponding to a different quantum
instrument. Mathematically, these different procedures
can be represented by different sets of operators {O_k}_k such
that E_k = O_k† O_k for all k and Σ_k O_k† O_k = I.

Accepted in Quantum 2019-09-01, click title to verify. Published under CC-BY 4.0. 13

Figure 2: The KCBS scenario with 4-outcome joint measure-

ments, visualized as a hypergraph Γ [16, 22, 47].

Each edge of O(Γ) denotes the exclusivity of the two

measurement events it connects, i.e., the fact that

they can occur as outcomes of a single measurement.
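The construction of O(Γ) can be sketched in a few lines, assuming hyperedges are given as sets of vertex labels. The function name and the toy scenario are hypothetical.

```python
from itertools import combinations

def orthogonality_graph(hyperedges):
    """Return the edge set of O(Γ): all pairs of distinct vertices sharing a hyperedge."""
    edges = set()
    for e in hyperedges:
        for u, v in combinations(sorted(e), 2):
            edges.add(frozenset((u, v)))
    return edges

# Hypothetical toy scenario: two three-outcome measurements sharing outcome "b".
E = [{"a", "b", "c"}, {"b", "d", "e"}]
O = orthogonality_graph(E)

print(len(O))                      # 6
print(frozenset(("a", "b")) in O)  # True
print(frozenset(("a", "d")) in O)  # False
```

Each three-outcome hyperedge contributes three pairs, and no pair is shared between the two hyperedges, which gives six edges in total.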

For any Bell-KS inequality constraining correla-

tions between measurement events from O(Γ) (when

all measurements are implemented on a given source

event), we construct a subgraph G of O(Γ) such that

the vertices of G, i.e., V (G), correspond to mea-

surement events that appear in the inequality with

nonzero coeﬃcients, and two vertices share an edge

in G if and only if they share an edge in O(Γ). More

explicitly, consider a Bell-KS expression

R([s|S]) ≡ Σ_{v∈V(G)} w_v p(v|S, s), (37)

where w_v > 0 for all v ∈ V(G). A Bell-KS inequality
imposes a constraint of the form R([s|S]) ≤ R_KS,
where R_KS is the upper bound on the expression
in any operational theory that admits a KS-noncontextual
ontological model. Often, but not always,
these inequalities are simply of the form where
w_v = 1 for all v ∈ V(G). In keeping with the CSW
notation [22], we will denote the general situation by
a weighted graph (G, w), where w is a function that
maps vertices v ∈ V(G) to weights w_v > 0. See Figures
2 and 3 for an example from the Klyachko-Can-Binicioğlu-Shumovsky
(KCBS) scenario [22, 47].
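A Bell-KS expression of the form (37) and its KS-noncontextual bound can be computed by brute force on small scenarios. The sketch below assumes a hypothetical 5-cycle scenario with three-outcome measurements {v_i, v_{i+1}, u_i}; it is not claimed to be the hypergraph of Figure 2. Since the expression is linear in p, its maximum over convex mixtures of deterministic assignments is attained at a deterministic assignment, so enumerating those suffices.

```python
from itertools import product

def deterministic_models(vertices, hyperedges):
    """Yield every p: V -> {0, 1} with exactly one 1 in each hyperedge."""
    vertices = sorted(vertices)
    for bits in product((0, 1), repeat=len(vertices)):
        p = dict(zip(vertices, bits))
        if all(sum(p[v] for v in e) == 1 for e in hyperedges):
            yield p

def bell_ks_value(p, weights):
    """Evaluate R = sum_v w_v p(v) over the vertices carrying a weight."""
    return sum(w * p[v] for v, w in weights.items())

# Hypothetical scenario: five events v0..v4 on a cycle, each adjacent pair
# completed to a three-outcome measurement by an extra outcome u_i.
V = [f"v{i}" for i in range(5)] + [f"u{i}" for i in range(5)]
E = [{f"v{i}", f"v{(i + 1) % 5}", f"u{i}"} for i in range(5)]
weights = {f"v{i}": 1 for i in range(5)}  # unit weights, w_v = 1

r_ks = max(bell_ks_value(p, weights) for p in deterministic_models(V, E))
print(r_ks)  # 2
```

The value 2 reflects the fact that at most two mutually non-adjacent vertices of a 5-cycle can be assigned the value 1.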

Below, we make some remarks clarifying the scope

of the framework described above before we move to

the case of sources.

3.1.1 Classiﬁcation of probabilistic models

We classify the probabilistic models on a hypergraph

Γ as follows:

Figure 3: A subgraph of KCBS hypergraph Γ, representing
orthogonality relations of the events of interest in the KCBS
inequality [22, 47].

• KS-noncontextual probabilistic models, C(Γ): a
probabilistic model which is a convex combination
of deterministic assignments p : V(Γ) → {0, 1},
where Σ_{v∈e} p(v) = 1 for all e ∈ E(Γ).
In Ref. [23], this is referred to as a “classical
model”.

Note that we call Γ KS-colourable if C(Γ) ≠ ∅

and we call it KS-uncolourable if C(Γ) = ∅. Our

terminology here is inspired by the traditional

usage of the term “Kochen-Specker colouring” to

refer to an assignment of two colours to vectors

satisfying some orthogonality relations under the

colouring constraints of the KS theorem [48].

• Consistent exclusivity satisfying probabilistic
models, CE^1(Γ): a probabilistic model on Γ,
p : V(Γ) → [0, 1], such that (in addition to satisfying
the definition of a probabilistic model),
Σ_{v∈c} p(v) ≤ 1 for all cliques c in the orthogonality
graph O(Γ). This is the same as the set of
E1 probabilistic models of Ref. [22].

Note that a clique in the orthogonality graph

O(Γ) is a set of vertices that are pairwise exclu-

sive (i.e., every vertex in this set shares an edge

with every other vertex).

• General probabilistic models, G(Γ): Any p that

satisﬁes the deﬁnition of a probabilistic model is

a general probabilistic model, i.e., it can arise

from measurements in some general probabilistic

theory [1] that isn’t necessarily quantum.

The set of all probabilistic models G(Γ) (for any

Γ) forms a polytope since it is deﬁned by just

the positivity and normalization constraints on

the probabilities. The extremal points (or ver-

tices) of this polytope fall into two categories

that will interest us: deterministic and indeterministic.
The deterministic extremal points are
the p : V(Γ) → {0, 1} such that Σ_{v∈e} p(v) = 1
for all e ∈ E(Γ) and we denote the set of these
points by G(Γ)|_det. The indeterministic extremal
points are the p ∈ G(Γ) which are not deterministic
and which, furthermore, cannot be expressed
as a convex mixture of other points in G(Γ).
We denote the set of indeterministic extremal
points by G(Γ)|_ind. Clearly, G(Γ)|_det ⊆ C(Γ)
(with strict inclusion whenever there is more than one
deterministic point) and G(Γ)|_ind ⊆ G(Γ)\C(Γ).

We use a different term from the “classical model” of Ref. [23]
because we are advocating a revision of the notion of classicality
from KS-noncontextuality to generalized noncontextuality à la Spekkens.

Overall, we have

C(Γ) ⊆ CE^1(Γ) ⊆ G(Γ) (38)

for any hypergraph Γ.
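The classification above can be probed numerically on small scenarios. The following is a brute-force sketch, assuming a hypothetical three-vertex scenario in which every pair of events forms a two-outcome measurement. It exhibits a probabilistic model that lies in G(Γ) but not in CE^1(Γ), on a scenario that is KS-uncolourable. All function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations, product

def is_model(p, hyperedges):
    """Positivity plus normalization on every hyperedge."""
    return (all(x >= 0 for x in p.values())
            and all(sum(p[v] for v in e) == 1 for e in hyperedges))

def satisfies_ce1(p, hyperedges):
    """Check sum_{v in c} p(v) <= 1 for every clique c of O(Γ) (p assumed a model)."""
    def exclusive(u, v):
        return any(u in e and v in e for e in hyperedges)
    vertices = sorted(p)
    for size in range(2, len(vertices) + 1):
        for c in combinations(vertices, size):
            is_clique = all(exclusive(u, v) for u, v in combinations(c, 2))
            if is_clique and sum(p[v] for v in c) > 1:
                return False
    return True

def is_ks_colourable(vertices, hyperedges):
    """True if some deterministic assignment exists, i.e. C(Γ) is non-empty."""
    vertices = sorted(vertices)
    for bits in product((0, 1), repeat=len(vertices)):
        p = dict(zip(vertices, bits))
        if all(sum(p[v] for v in e) == 1 for e in hyperedges):
            return True
    return False

# Hypothetical scenario: three events, pairwise exclusive, no common hyperedge.
E = [{"a", "b"}, {"b", "c"}, {"a", "c"}]
p = {v: Fraction(1, 2) for v in "abc"}

print(is_model(p, E))            # True:  p is in G(Γ)
print(satisfies_ce1(p, E))       # False: the clique {a, b, c} sums to 3/2
print(is_ks_colourable("abc", E))  # False: C(Γ) is empty
```

In this example the chain (38) reads ∅ = C(Γ) ⊆ CE^1(Γ) ⊊ G(Γ), with p witnessing the strictness of the second inclusion.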

3.1.2 Distinguishing two consequences of Specker’s principle: Structural Specker’s principle vs. Statistical Specker’s principle

The CSW framework [22] restricts the scope of prob-

abilistic models on a hypergraph to those satisfying

consistent exclusivity (the E1 probabilistic models),

motivated by what is sometimes called Specker’s prin-

ciple [35]: that is,

“if you have several questions and you can

answer any two of them, then you can also

answer all of them”

If by “questions” we understand measurement set-

tings, then the principle says that a set of pairwise

jointly implementable measurement settings is itself

jointly implementable. Note that when we say a set

of measurement settings is “jointly implementable”,

“jointly measurable”, or “compatible”, we mean that

there exists another choice of a single measurement

setting in the theory such that this measurement set-

ting can reproduce the statistics of all the measure-

ment settings in the set by coarse-graining.

As such,

in its application to measurement settings, Specker’s

principle is a constraint on the measurements allowed

in a physical theory that respects it, e.g., measure-

ment settings that correspond to PVMs (projection

valued measures) in quantum theory. This is, for ex-

ample, the reading adopted in Ref. [49], where the

failure of Specker’s principle in any almost quantum

theory was demonstrated. On the other hand, we will

often also refer to the “joint measurability” of a set of

measurement events, by which we mean that this set

of measurement events is a subset of the set of mea-

surement outcomes for some choice of measurement

setting.

The reader may recall from Section 2.4 the general definition
of compatibility. Also, see Ref. [44] for an overview of joint
measurability in quantum theory.

Recall that a measurement event is a measurement outcome
given a choice of measurement setting, e.g., a projector that
appears in a particular PVM in quantum theory.

At the level of measurement events, then,
there are two distinct ways to read Specker’s principle
that one needs to keep in mind, which we distinguish as
structural Specker’s principle vs. statistical Specker’s
principle. We define these two readings below:

• Structural Specker’s principle imposes a struc-

tural constraint on a contextuality scenario Γ.

This (strong) reading of Specker’s principle ap-

plies to any set of measurement events, say M ⊆

V (Γ), where every pair of measurement events

can arise as outcomes of a single measurement:

that is, for each pair {v, v′} ⊆ M, there exists
some e ∈ E(Γ) such that {v, v′} ⊆ e. The principle
then states:

Given a set M of pairwise jointly measurable

measurement events in some contextuality sce-

nario Γ, all the measurement events in M are

jointly measurable, i.e., all the measurement

events in the set can arise as outcomes of a single

measurement: M ⊆ e for some e ∈ E(Γ).

Alternatively, the constraint of structural

Specker’s principle can be restated as:

Every clique in the orthogonality graph of Γ,

O(Γ), is a subset of some hyperedge in Γ.

Note that we haven’t said anything directly

about probabilities here: any Γ satisfying the

above property is said to satisfy structural

Specker’s principle.

• Statistical Specker’s principle (or consistent ex-

clusivity) imposes a statistical constraint on prob-

abilistic models on any contextuality scenario Γ

representing measurement events in an opera-

tional theory.

This (weak) reading of Specker’s principle imposes
an additional constraint on a probabilistic
model p ∈ G(Γ) (thus defining CE^1(Γ) ⊆ G(Γ)),
namely:
Given a set M of pairwise jointly measurable
measurement events, p satisfies Σ_{v∈M} p(v) ≤ 1.

This can also be expressed as:

A probabilistic model p ∈ G(Γ) is said to satisfy

statistical Specker’s principle if the sum of prob-

abilities it assigns to the vertices of every clique

in the orthogonality graph of Γ, O(Γ), does not

exceed 1, i.e., Σ_{v∈c} p(v) ≤ 1 for all cliques c in
O(Γ).
All probabilistic models that satisfy this constraint
define the set of probabilistic models
CE^1(Γ) (or E1) for any contextuality scenario
Γ regardless of whether Γ satisfies structural
Specker’s principle. Clearly, CE^1(Γ) ⊆ G(Γ).
Any probabilistic model p on Γ such that p ∈
CE^1(Γ) is said to satisfy statistical Specker’s principle
or, equivalently, consistent exclusivity [23].
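The structural reading is a property of Γ alone and can be tested without reference to any probabilities. The brute-force sketch below, with hypothetical function names and toy scenarios, checks whether every clique of O(Γ) is contained in some hyperedge.

```python
from itertools import combinations

def satisfies_structural_specker(vertices, hyperedges):
    """True if every clique of O(Γ) is a subset of some hyperedge of Γ."""
    edges = [frozenset(e) for e in hyperedges]

    def exclusive(u, v):
        return any(u in e and v in e for e in edges)

    vertices = sorted(vertices)
    for size in range(2, len(vertices) + 1):
        for c in combinations(vertices, size):
            is_clique = all(exclusive(u, v) for u, v in combinations(c, 2))
            if is_clique and not any(set(c) <= e for e in edges):
                return False
    return True

# Hypothetical scenario violating the principle: {a, b, c} is a clique of O(Γ)
# but no single hyperedge contains all three events.
triangle = [{"a", "b"}, {"b", "c"}, {"a", "c"}]
print(satisfies_structural_specker("abc", triangle))  # False

# Hypothetical scenario satisfying it: every clique sits inside a hyperedge.
two_edges = [{"a", "b", "c"}, {"b", "d", "e"}]
print(satisfies_structural_specker("abcde", two_edges))  # True
```

The enumeration over all vertex subsets is exponential and is meant only for small illustrative scenarios.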


Probabilistic models on any hypergraph Γ which

satisﬁes the (strong) structural Specker’s principle ob-

viously satisfy the (weak) statistical Specker’s princi-

ple. This holds simply on account of the structure of

such Γ: that is, for all Γ satisfying structural Specker’s
principle, we have CE^1(Γ) = G(Γ). To see this, note
that every clique c in O(Γ) is a subset of some hyperedge
e in Γ, hence for every clique c,
Σ_{v∈c} p(v) ≤ Σ_{v∈e} p(v) = 1 for
all p ∈ G(Γ), i.e., p ∈ CE^1(Γ).

On the other hand,

it remains an open question whether the converse is

true:

That is, given that CE^1(Γ) = G(Γ) for some Γ, is it

the case that Γ must then necessarily satisfy structural

Specker’s principle, namely, that every clique in O(Γ)

is a subset of some hyperedge in Γ?

A positive answer to this question would answer

Problem 7.2.3 of Ref. [23] asking for a characterization

of Γ for which CE^1(Γ) = G(Γ).

3.1.3 What does it mean for an operational theory to satisfy structural/statistical Specker’s principle?

We have so far deﬁned structural Specker’s principle

as a constraint on Γ and statistical Specker’s princi-

ple as a constraint on a probabilistic model on any Γ.

Any operational theory would typically allow many

possible Γ to be realized by its measurement events

as well as many possible probabilistic models to be re-

alized on any Γ representing its measurement events.

Note that when we say that a particular Γ is “re-

alizable” or “allowed” by an operational theory, we

mean that there exist measurement events in the op-

erational theory that satisfy the operational equiva-

lences required by Γ.

Further, given such a Γ, the

realizability of a probabilistic model on it by the oper-

ational theory means that there exists a source event

in the operational theory that assigns probabilities to

the measurement events in Γ according to the proba-

bilistic model. It will be useful for our discussion to

deﬁne what it means for an operational theory, say T,

to satisfy structural or statistical Specker’s principle.

But before we do that, let us formally specify what it

means for T to satisfy Specker’s principle:

T satisﬁes Specker’s principle: An operational

theory T is said to satisfy Specker’s principle if, for

any set of measurement settings in T that are pair-

wise jointly implementable, it follows that they are all

jointly implementable in T.

This partially answers the open Problem 7.2.3 of Ref. [23].

Realizability of a particular Γ in an operational theory de-

pends on the (in)compatibility relations that the operational

theory allows between its measurements (cf. Section 2.4). Re-

call that incompatibility of measurements is necessary for KS-

contextuality to be witnessed and the structure of Γ depends

on this incompatibility.

Recall from Section 2.4 the deﬁnition of joint imple-

mentability (or joint measurability) of some set of measurement

settings.

We denote by T(Γ) the set of probabilistic mod-

els achievable on Γ by an operational theory T, i.e.,

for any p ∈ T(Γ), we have that ∀v ∈ V (Γ) : p(v) =

p(v|S, s) for some source event [s|S] possible in the op-

erational theory T.

Since an operational theory can

only put further constraints on probabilistic models

in G(Γ), we obviously have: T(Γ) ⊆ G(Γ).

1. T satisﬁes statistical Specker’s principle:

We say an operational theory T satisfies statistical
Specker’s principle if T(Γ) ⊆ CE^1(Γ) ⊆ G(Γ)
for all Γ.

Since the satisfaction of statistical Specker’s prin-

ciple is a constraint on the statistical predictions

of T, there must be some fact about the struc-

ture of theory T that leads to this constraint.

This fact enforcing statistical Specker’s principle

could be some restriction arising from the struc-

ture of allowed measurement events and/or even

the structure of allowed preparations in the oper-

ational theory T. For instance, this is the case for

quantum theory when one only considers projec-

tive measurements implemented on an arbitrary

quantum state, i.e., Q(Γ) ⊆ CE^1(Γ) ⊆ G(Γ),

where Q(Γ) denotes the set of probabilistic mod-

els that can be obtained in this way. More gener-

ally, one could relax the no-restriction hypothe-

sis [3] in some particular way in T so that not all

probabilistic models in G(Γ) are allowed in T(Γ).

In the case of quantum theory, restricting atten-

tion to only projective measurements (as we just

pointed out) rather than the more general case

allowing arbitrary POVMs is one way of restrict-

ing the set of possible probabilistic models re-

alizable with quantum states and measurements

to a strict subset of G(Γ). Allowing arbitrary

POVMs would lead to a violation of statistical

Specker’s principle by probabilistic models aris-

ing from quantum theory.

Let us now deﬁne what it means for an oper-

ational theory T to satisfy structural Specker’s

principle.

2. T satisfies structural Specker’s principle:
An operational theory T is said to satisfy structural
Specker’s principle if for any set of measurement
events that are pairwise jointly measurable,
i.e., measurement events in each pair arise
as outcomes of some measurement in the theory,
it is the case that all the measurement events in
the set are jointly measurable, i.e., all the measurement
events in the set arise as outcomes of
a single measurement in the theory.

Note that if the operational theory does not admit measurement
events (represented by vertices) exhibiting the operational
equivalences represented by Γ (that is, T does not allow
Γ), then we have that T(Γ) is an empty set.

That is, instead of considering only a particular probabilistic
model on a particular Γ, we now consider the satisfaction
of statistical Specker’s principle by a whole set of probabilistic
models, namely, T(Γ), for all Γ.

See Appendices A (specifically A.1.2) and C for other consequences
of allowing arbitrary POVMs, in particular the trivial
‘classical’ ones.

We now show that a theory T that satisﬁes

Specker’s principle also satisﬁes structural Specker’s

principle.

Theorem 1. If an operational theory T satisﬁes

Specker’s principle, then it also satisﬁes structural

Specker’s principle.

Proof. The argument here relies on the fact that the

operational theory T is such that measurement set-

tings can be coarse-grained to yield new measurement

settings with fewer outcomes. Operationally, this just

corresponds to binning some subsets of outcomes to-

gether in a measurement procedure. The operational

theories we consider in this paper satisfy this prop-

erty, as outlined in Section 2.3 on coarse-graining.

The argument proceeds, for any Γ realizable in T,

by constructing a set of binary-outcome measurement

settings for any given set of pairwise jointly mea-

surable vertices in Γ. These measurement settings

are, by construction, pairwise jointly measurable, so

Specker’s principle applied to them implies that they

are all jointly measurable. This in turn means that the

pairwise jointly measurable vertices in the given set

are also all realizable as outcomes of a single measure-

ment setting. Hence, the theory T satisﬁes structural

Specker’s principle. We detail the argument below.

Consider a contextuality scenario Γ realizable in

T. To each vertex v ∈ V (Γ), we can associate a

measurement setting M_v with two possible outcomes
labelled {0, 1} such that [1|M_v] denotes the occurrence
of v and [0|M_v] denotes the non-occurrence of
v, i.e., p(v|S, s) = p(1|M_v, S, s) and 1 − p(v|S, s) =
p(0|M_v, S, s) for any probabilistic model on Γ induced
by some source event [s|S]. The measurement setting
M_v can be obtained in various (operationally equivalent)
ways from the hyperedges that v ∈ V(Γ) appears
in: for each hyperedge e ∈ E(Γ) such that v ∈ e,
we have that the binary-outcome measurement setting
consisting of the vertices {v, e\v} (where e\v
denotes a coarse-graining over all the measurement
outcomes of e except v) is operationally equivalent
to M_v.
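The binary coarse-graining used in this step can be sketched as follows, assuming a probabilistic model stored as a dictionary on the vertices. The function name and the numbers are hypothetical. The point illustrated is that the binary statistics obtained for a vertex do not depend on which hyperedge containing it is coarse-grained.

```python
from fractions import Fraction

def binary_coarse_graining(p, v, e):
    """Return (p(1|M_v), p(0|M_v)) obtained by binning all outcomes of e except v."""
    if v not in e:
        raise ValueError("v must belong to the hyperedge e")
    p_rest = sum(p[u] for u in e if u != v)  # probability of the binned outcome e\v
    return p[v], p_rest

# Hypothetical toy scenario: vertex "b" appears in both hyperedges.
E = [{"a", "b", "c"}, {"b", "d", "e"}]
p = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4),
     "d": Fraction(1, 2), "e": Fraction(1, 4)}

results = {binary_coarse_graining(p, "b", e) for e in E if "b" in e}
print(results)  # a single pair: both hyperedges give the same binary statistics
```

Normalization on each hyperedge forces the binned outcome to have probability 1 − p(v), which is why the two coarse-grainings agree.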

Now, for any pair of vertices {v, v′} that appear in a
common hyperedge of Γ, consider the two corresponding
measurement settings {M_v, M_{v′}} such that they
are jointly measurable and their outcomes are mutually
exclusive. The measurement events that can possibly
occur in their joint measurement, denoted M_{vv′},
are [10|M_{vv′}], [01|M_{vv′}] and [00|M_{vv′}]. The probability
of [11|M_{vv′}] is always zero, reflecting the fact that
v and v′ are mutually exclusive. Here, the coarse-graining
relations are: [1|M_v] ≡ [10|M_{vv′}], [1|M_{v′}] ≡
[01|M_{vv′}], [0|M_v] ≡ [00|M_{vv′}] + [01|M_{vv′}], [0|M_{v′}] ≡
[00|M_{vv′}] + [10|M_{vv′}].

The joint measurement M_{vv′} can be constructed
from any hyperedge that v and v′ appear in: for
any e ∈ E(Γ) such that {v, v′} ⊆ e, we have
that [10|M_{vv′}] is a measurement event corresponding
to v, [01|M_{vv′}] corresponds to v′, [00|M_{vv′}]
corresponds to e\{v, v′} (the coarse-graining of all
measurement outcomes in e except v and v′), and
[11|M_{vv′}] denotes the null event ∅ ⊆ e. This means
p(10|M_{vv′}, S, s) + p(01|M_{vv′}, S, s) + p(00|M_{vv′}, S, s) =
p(v|S, s) + p(v′|S, s) + p(e\{v, v′}|S, s) = 1 and
p(11|M_{vv′}, S, s) = 0 for any probabilistic model (induced
by some source event [s|S]) on Γ.

Consider now any set of vertices in Γ that is pairwise
jointly measurable, denoted V_{2JM} ⊆ V(Γ). We
need to show that any such set of vertices V_{2JM} is
jointly measurable, i.e., the theory T realizing Γ admits
a single measurement such that all the vertices
in V_{2JM} arise as outcomes of this measurement.

Now, the two-outcome measurement settings
{M_v | v ∈ V_{2JM}} we have defined are pairwise jointly
measurable and as such, following Specker’s principle,
they should all be jointly measurable in theory T. The
joint measurement corresponding to them can be defined as

M_{2JM} ≡ {[\vec{b}|M_{2JM}] | \vec{b} ∈ {0, 1}^{|V_{2JM}|}}, (39)

where each event [\vec{b}|M_{2JM}] in the joint measurement
M_{2JM} represents a particular set of outcomes for measurements
in the set {M_v | v ∈ V_{2JM}}.
Denoting V_{2JM} ≡ {v_1, v_2, . . . , v_{|V_{2JM}|}}, we have that

[(10 . . . 0)|M_{2JM}] ≡ [1|M_{v_1}],
[(01 . . . 0)|M_{2JM}] ≡ [1|M_{v_2}],
⋮
[(00 . . . 1)|M_{2JM}] ≡ [1|M_{v_{|V_{2JM}|}}],
[(00 . . . 0)|M_{2JM}] ≡ [0|M_{v_1}] + [0|M_{v_2}] + ··· + [0|M_{v_{|V_{2JM}|}}], (40)

where [0|M_{v_1}] + [0|M_{v_2}] + ··· + [0|M_{v_{|V_{2JM}|}}] denotes the
measurement event obtained by coarse-graining the
measurement events in {[0|M_v] | v ∈ V_{2JM}}. All the
other measurement events of M_{2JM} are null events
that never occur, i.e., they are assigned probability
zero by every source event. Thus, using Specker’s
principle applied to the binary-outcome measurement
settings defined for the vertices in V_{2JM}, we have that
the pairwise jointly measurable vertices in V_{2JM} are all
jointly measurable, appearing as outcomes of a single
measurement M_{2JM}.
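The statistics of the joint measurement built in this proof can be tabulated explicitly. The sketch below is an illustration under stated assumptions: the one-hot bit strings carry the vertex probabilities, every bit string with more than one 1 is a null event, and the all-zero string is assigned whatever probability normalization leaves over. The function name and the numbers are hypothetical.

```python
from fractions import Fraction
from itertools import product

def joint_measurement(p, v2jm):
    """Tabulate probabilities over bit strings b in {0,1}^n for the joint measurement."""
    v2jm = list(v2jm)
    dist = {}
    for b in product((0, 1), repeat=len(v2jm)):
        if sum(b) == 1:
            dist[b] = p[v2jm[b.index(1)]]            # one-hot string: event v_i occurs
        elif sum(b) == 0:
            dist[b] = 1 - sum(p[v] for v in v2jm)    # none of the events occurs
        else:
            dist[b] = 0                              # null event: exclusive events co-occur
    return dist

# Hypothetical model with a and b appearing in a common hyperedge {a, b, c}.
p = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
dist = joint_measurement(p, ["a", "b"])
print(dist[(1, 0)], dist[(0, 1)], dist[(0, 0)], dist[(1, 1)])  # 1/2 1/4 1/4 0

# Three pairwise exclusive events with p = 1/2 each: the leftover weight is negative,
# so no valid joint measurement of this form can reproduce such a model.
q = {v: Fraction(1, 2) for v in "abc"}
print(joint_measurement(q, ["a", "b", "c"])[(0, 0, 0)])  # -1/2
```

The second example shows why a joint measurement of this kind can only exist when the probabilities of the events in V_{2JM} sum to at most 1.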

Recall that every vertex v ∈ V(Γ) is an equivalence class
of measurement events [v|e] ≃ [v|e′] for all e, e′ such that v ∈ e
and v ∈ e′.

Having established Theorem 1, we now proceed to

show that a theory which satisﬁes structural Specker’s

principle also satisﬁes statistical Specker’s principle.

To do this, we consider a contextuality scenario Γ

which may not satisfy structural Specker’s principle
and from it construct a contextuality scenario Γ′
which does satisfy the principle. The construction

proceeds as follows:

1. Construct O(Γ).

2. Turn each clique in O(Γ) that is a hyperedge in Γ
to a hyperedge in a new hypergraph Γ′. That is,
Γ′ is such that V(Γ) ⊆ V(Γ′) and E(Γ) ⊆ E(Γ′).

3. Turn each maximal clique c in O(Γ) that is not a
hyperedge in Γ to a hyperedge in Γ′ and include
an additional vertex v_c in this hyperedge. Here,
a maximal clique in a graph is a clique that is
not a strict subset of another clique, i.e., there is
no vertex outside the clique that shares an edge
with each vertex in the clique.

We then have for the hyperedges of Γ′:

E(Γ′) = E(Γ) ∪ {c ∪ {v_c}}_{c∈C}, (41)

where C is the set of maximal cliques in O(Γ)
that are not hyperedges in Γ.

Note that as long as a theory T satisfies structural
Specker’s principle, converting maximal cliques
in O(Γ) that are not hyperedges in Γ to hyperedges
in Γ′ is a valid move within the theory since
the resulting hyperedge would indeed constitute
a valid measurement in the theory.

If C = ∅ (i.e., Γ satisfies structural Specker’s
principle), then we just have E(Γ′) = E(Γ).

4. The resulting contextuality scenario Γ′ is thus
given by: V(Γ′) = V(Γ) ∪ {v_c}_{c∈C} and E(Γ′) =
E(Γ) ∪ {c ∪ {v_c}}_{c∈C}.

If C = ∅ we just have V(Γ′) = V(Γ) and E(Γ′) =
E(Γ) so that Γ′ = Γ (i.e., the two hypergraphs
are isomorphic).
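The steps above can be sketched as a small brute-force routine, assuming vertices are labelled by strings. The function names, the label scheme for the new vertices v_c, and the example scenario are all hypothetical. The clique enumeration is exponential and is meant only for small scenarios.

```python
from itertools import combinations

def maximal_cliques(vertices, hyperedges):
    """Enumerate the maximal cliques of O(Γ) by brute force."""
    edges = [frozenset(e) for e in hyperedges]

    def exclusive(u, v):
        return any(u in e and v in e for e in edges)

    vertices = sorted(vertices)
    cliques = [frozenset(c)
               for size in range(1, len(vertices) + 1)
               for c in combinations(vertices, size)
               if all(exclusive(u, v) for u, v in combinations(c, 2))]
    return [c for c in cliques if not any(c < d for d in cliques)]

def extend_scenario(vertices, hyperedges):
    """Build (V', E') from (V, E): add c ∪ {v_c} for each maximal clique c not in E."""
    edges = {frozenset(e) for e in hyperedges}
    new_vertices, new_edges, added = set(vertices), set(edges), {}
    for c in maximal_cliques(vertices, hyperedges):
        if c not in edges:
            v_c = "v_" + "".join(sorted(c))  # hypothetical label for the new vertex
            added[c] = v_c
            new_vertices.add(v_c)
            new_edges.add(c | {v_c})
    return new_vertices, new_edges, added

# Hypothetical scenario: {a, b, c} is a maximal clique of O(Γ) but not a hyperedge.
V = {"a", "b", "c", "x", "y", "z"}
E = [{"a", "b", "x"}, {"b", "c", "y"}, {"a", "c", "z"}]
V2, E2, C = extend_scenario(V, E)
print(C)        # maps the clique {a, b, c} to the new vertex 'v_abc'
print(len(E2))  # 4
```

In this example the three original hyperedges are kept and a single new hyperedge {a, b, c, v_abc} is added, in line with Eq. (41).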

Our construction of Γ′ leads to the following properties:

• Γ′ satisfies structural Specker’s principle (by construction)
since every clique in O(Γ′) is a subset
of some hyperedge in Γ′. Hence, it’s also the case
that statistical Specker’s principle holds for probabilistic
models on Γ′ as CE^1(Γ′) = G(Γ′).

Note that the construction of Γ′ relied on the fact
that the theory we are considering satisfies structural
Specker’s principle. If the theory doesn’t
satisfy this principle, but one goes ahead with
the construction of Γ′, then the new hyperedges
in Γ′ may not constitute valid measurements in
the theory.

• Probabilistic models in G(Γ′) are in bijective correspondence
with probabilistic models in CE^1(Γ):
for any probabilistic model p_Γ ∈ CE^1(Γ), there
exists a unique probabilistic model p_{Γ′} ≡ f(p_Γ) ∈
G(Γ′), where the function f is given by p_{Γ′}(v) ≡
f(p_Γ)(v) = p_Γ(v) for all v ∈ V(Γ) and p_{Γ′}(v_c) ≡
f(p_Γ)(v_c) = 1 − Σ_{v∈c} p_Γ(v) for all c ∈ C.
Similarly, for any p_{Γ′} ∈ G(Γ′), there exists a unique
probabilistic model p_Γ ≡ g(p_{Γ′}) ∈ CE^1(Γ) given
by p_Γ(v) ≡ g(p_{Γ′})(v) = p_{Γ′}(v) for all v ∈ V(Γ),
i.e., we simply ignore the probabilities assigned
to the vertices v_c ∈ V(Γ′)\V(Γ) which do not
appear in Γ. Now note that the functions f and
g are inverses of each other: g(f(p_Γ)) = g(p_{Γ′}) =
p_Γ and f(g(p_{Γ′})) = f(p_Γ) = p_{Γ′}. Hence, there
is a bijective correspondence between G(Γ′) and
CE^1(Γ).

• Hence, the set of probabilistic models on Γ
that satisfy statistical Specker’s principle, i.e.,
CE^1(Γ), are in one-to-one correspondence with
the set of probabilistic models on Γ′ which (by
construction) satisfies structural Specker’s principle
so that CE^1(Γ′) = G(Γ′).
We therefore have that CE^1(Γ) = CE^1(Γ′)|_{V(Γ)},
where CE^1(Γ′)|_{V(Γ)} denotes the probabilistic
models induced on Γ by those on Γ′ (ignoring the
probabilities assigned to vertices in V(Γ′)\V(Γ)).
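The correspondence between models on Γ and on the extended scenario can be made concrete with a short sketch. It assumes the set of maximal cliques that are not hyperedges is supplied as a dictionary from each clique to its added vertex; the names f and g follow the text, while the labels and numbers are hypothetical.

```python
from fractions import Fraction

def f(p, extra):
    """Extend p on Γ to Γ': keep p(v) on V(Γ), set p'(v_c) = 1 - sum_{v in c} p(v)."""
    p_ext = dict(p)
    for c, v_c in extra.items():
        p_ext[v_c] = 1 - sum(p[v] for v in c)
    return p_ext

def g(p_ext, original_vertices):
    """Restrict a model on Γ' to V(Γ) by dropping the added vertices."""
    return {v: p_ext[v] for v in original_vertices}

# Hypothetical scenario with hyperedges {a,b,x}, {b,c,y}, {a,c,z}; the maximal
# clique {a, b, c} is not a hyperedge and receives the added vertex "v_abc".
extra = {frozenset("abc"): "v_abc"}
p = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 4),
     "x": Fraction(1, 2), "y": Fraction(1, 2), "z": Fraction(1, 2)}

p_ext = f(p, extra)
print(p_ext["v_abc"])         # 1/4
print(g(p_ext, p.keys()) == p)  # True: g undoes f
```

The added probability is non-negative exactly when the clique sum does not exceed 1, which is where consistent exclusivity of the original model enters.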

It is conceivable that a particular Γ may not ad-

mit probabilistic models from an operational theory

T, i.e., T(Γ) = ∅. On the other hand, if Γ admits a

representation in terms of measurement events admissible
in T, so that T(Γ) ≠ ∅, then two possibilities
arise: Γ satisfies structural Specker’s principle or it
doesn’t. If Γ satisfies structural Specker’s principle
then any probabilistic model in T(Γ) will satisfy statistical
Specker’s principle and we have Γ′ = Γ. If
Γ does not satisfy structural Specker’s principle, we
consider its relation with the contextuality scenario
Γ′ constructed from it that does satisfy structural
Specker’s principle. Such a Γ′ admits a representation
in a theory T satisfying structural Specker’s principle
(that is, T(Γ′) ≠ ∅) as long as Γ admits such
a representation (that is, T(Γ) ≠ ∅). Indeed, it’s
the satisfaction of structural Specker’s principle in T
that renders the construction of Γ′ from Γ physically

allowed in T.

Thus, in a theory T that satisﬁes structural

Specker’s principle, the following holds: for every

probabilistic model p_Γ ∈ T(Γ) (⊆ CE^1(Γ)), Γ′ admits
a corresponding probabilistic model p_{Γ′} ∈ T(Γ′)
satisfying p_{Γ′}(v) = p_Γ(v) for all v ∈ V(Γ) and
p_{Γ′}(v_c) = 1 − Σ_{v∈c} p_Γ(v) for all c ∈ C, where C is
the set of maximal cliques in O(Γ) such that none of
them is a hyperedge in Γ. Similarly, given p_{Γ′} ∈ T(Γ′)
(⊆ CE^1(Γ′)), p_Γ ∈ T(Γ) is uniquely fixed: it’s obtained
by just neglecting the probabilities assigned
by p_{Γ′} to the vertices in V(Γ′)\V(Γ).

Recall that {v_c}_{c∈C} = V(Γ′)\V(Γ).

We must therefore have T(Γ) = T(Γ′)|_{V(Γ)} for any
Γ, where T(Γ′)|_{V(Γ)} denotes the set of probabilistic
models induced on Γ by the set of probabilistic models
in T(Γ′) under the correspondence we have already

established above. We can now state and prove the

following theorem:

Theorem 2. If an operational theory T satisﬁes

structural Specker’s principle, then it also satisﬁes

statistical Specker’s principle.

Proof. For any Γ that does not admit a probabilistic

model in T, i.e., T(Γ) = ∅, statistical Specker’s principle
is trivially satisfied since T(Γ) = ∅ ⊆ CE^1(Γ) ⊆
G(Γ).

For any Γ that does admit a probabilistic model in
T, i.e., T(Γ) ≠ ∅, we can have one of two possibilities:
either it satisfies structural Specker’s principle,
in which case T(Γ) ⊆ CE^1(Γ) = G(Γ), or it doesn’t,
in which case we consider the Γ′ constructed from it
following the recipe we have already outlined so that
we have:

T(Γ′)|_{V(Γ)} ⊆ G(Γ′)|_{V(Γ)} = CE^1(Γ′)|_{V(Γ)} = CE^1(Γ).

Since T satisfies structural Specker’s principle, we
have T(Γ) = T(Γ′)|_{V(Γ)}, which immediately implies
that T(Γ) ⊆ CE^1(Γ). That is, the theory T satisfies
statistical Specker’s principle on Γ: T(Γ) ⊆ CE^1(Γ) ⊆
G(Γ).

Overall, we have the desired result: T satisfies
structural Specker’s principle ⇒ T(Γ) ⊆ CE^1(Γ) ⊆
G(Γ) for all Γ, i.e., T satisfies statistical Specker’s

principle.

Thus, one way of enforcing that a particular opera-

tional theory T satisﬁes statistical Specker’s principle

(that is, T(Γ) ⊆ CE^1(Γ) ⊆ G(Γ) for all Γ) is

to require that it satisﬁes structural Specker’s prin-

ciple, a constraint on the structure of measurement

events in T. This is, for example, what is achieved in

Ref. [30] by invoking a notion of “sharpness” for mea-

surement events in an operational theory such that

any set of sharp measurement events that are pairwise

jointly measurable are all jointly measurable. That

is, structural Specker’s principle is satisﬁed in a the-

ory with such sharp measurement events and, con-

sequently, statistical Specker’s principle, or what is

more conventionally called consistent exclusivity [23],

is also satisﬁed. But it’s conceivable that there may

be other ways to ensure that only a subset of CE^1(Γ)
probabilistic models are allowed in T(Γ) for any Γ.
What we wish to emphasize here is that it is by no
means obvious (or at least, it needs to be proven) that
the only way to restrict the set of probabilistic models
T(Γ) to a subset of CE^1(Γ) for any Γ is to require that

the theory T satisfy structural Specker’s principle.

Corollary 1. For any operational theory T, the fol-

lowing implications hold:

T satisﬁes Specker’s principle

⇒T satisﬁes structural Specker’s principle (42)

⇒T satisﬁes statistical Specker’s principle,

i.e., consistent exclusivity. (43)

Proof. This follows from combining Theorems 1 and 2.

Note that statistical Specker’s principle (or consistent
exclusivity) is so intrinsic to the CSW approach

[22] that they do not consider probabilistic models

that do not satisfy this principle.

This will become

important when we consider the fact that nonprojec-

tive measurements in quantum theory do not satisfy

Specker’s principle, structural or statistical (at the

level of measurement events), and thus also fail to

satisfy the stronger statement of Specker’s principle

for measurement settings (cf. Ref. [49]). Indeed, such

measurements admit contextuality scenarios Γ that

are not possible with projective measurements, such

as the one from three binary-outcome POVMs that

are pairwise jointly measurable but not triplewise so

[39–41], and the probabilistic models they give rise to

can only be accommodated in the most general set

of probabilistic models, G(Γ), since trivial POVMs

can realize any probabilistic model. Specker’s prin-

ciple, structural Specker’s principle, and statistical

Specker’s principle were all motivated by the fact that

projective measurements in quantum theory satisfy

them. In particular, consistent exclusivity (or sta-

tistical Specker’s principle) would be obeyed in any

theory where measurement events satisfy structural

Specker’s principle, and indeed, the more recent ap-

proach [29] is to restrict attention to “sharp” mea-

surements in such theories [30, 31], where the def-

inition of “sharp” ensures the property of pairwise

jointly measurable events being globally jointly mea-

surable. This property forms the motivational basis

Indeed, any putative theory yielding the set of almost quan-

tum correlations (which satisfy statistical Specker’s principle)

[50] cannot satisfy Specker’s principle — that pairwise joint im-

plementable measurement settings are all jointly implementable

— for any notion of sharp measurements [49]. Whether struc-

tural Specker’s principle, which is deﬁned at the level of mea-

surement events, can be upheld for an almost quantum theory

— so that it falls in the category of operational theories with

sharp measurements envisaged in Ref. [30] — remains an open

question.

As we have already noted, a noise-robust noncontextuality inequality of the type in Ref. [12] that is based on a logical proof of the KS theorem is not even obtainable if one restricted attention to probabilistic models satisfying $\mathcal{CE}^1$. The upper bound on that inequality comes from a probabilistic model that does not satisfy $\mathcal{CE}^1$.

Accepted in Quantum 2019-09-01, click title to verify. Published under CC-BY 4.0. 19

(and is suﬃcient) for statistical Specker’s principle to

hold (cf. Theorem 2). That is, this approach [29, 30]

regards statistical Specker’s principle as grounded in

(and physically justified by) structural Specker's principle. Theorem 2 is a precise statement of this intuition in the hypergraph formalism à la AFLS [23].

The work of Refs. [30, 31] can be understood as bridg-

ing the gap between structural Specker’s principle and

statistical Specker’s principle by formally deﬁning a

notion of sharp measurements in an operational the-

ory such that structural Specker’s principle holds for

these sharp measurements.

On the other hand, and this is the key point for

our purposes, if one wants to make no commitment

about the representation of measurements in the op-

erational theory (in particular, not requiring a notion

of “sharpness”), then Specker’s principle is not a nat-

ural constraint to impose on probabilistic models and,

indeed, one must deal with the full set of probabilistic

models G(Γ) on any contextuality scenario Γ rather

than restrict oneself to the set of probabilistic models $\mathcal{CE}^1(\Gamma)$. It is for this reason that we are translating the notions from CSW [22] to the notational conventions of AFLS [23], the latter being a more natural choice for our purposes, allowing the language needed to articulate the difference between $\mathcal{CE}^1(\Gamma)$ and $\mathcal{G}(\Gamma)$

rather than excluding the latter by ﬁat or, perhaps, by

an appeal to structural Specker’s principle holding for

sharp measurements in the landscape of operational

theories under consideration (cf. Theorem 2). It is for

all these reasons that the “exclusivity principle” à la

CSW [22] is not enough to make sense of Spekkens

contextuality applied to Kochen-Specker type scenar-

ios. The framework we propose in this paper ad-

dresses this gap between the notions Spekkens con-

textuality (which applies to arbitrary measurements)

requires in a hypergraph framework and those that

the CSW framework [22] (which applies to “sharp”

measurements) can provide in its graph-theoretic for-

mulation.

3.1.4 Remark on the classiﬁcation of probabilistic mod-

els: why we haven’t deﬁned “quantum models” as those

obtained from projective measurements

The reader may note that we haven't tried to define any notion of a “quantum model” so far, having only adopted the definitions of Ref. [23] for KS-noncontextual models ($\mathcal{C}(\Gamma)$), for models satisfying consistent exclusivity ($\mathcal{CE}^1(\Gamma)$), and for general probabilistic models ($\mathcal{G}(\Gamma)$). The reason for this is that we do not wish to restrict ourselves to projective measurements in defining a “quantum model”, unlike the traditional Kochen-Specker approaches [22, 23]. In Ref. [23], a quantum model is defined as a probabilistic model that can be realized in the following manner: assign projectors $\{\Pi_v\}_{v \in V(\Gamma)}$ (defined on any Hilbert space) to all the vertices of Γ such that $\sum_{v \in e} \Pi_v = I$ for all $e \in E(\Gamma)$, and we have $p(v) = \mathrm{Tr}(\rho \Pi_v)$, for some density operator ρ on the Hilbert space, I being the identity operator.

On the other hand, allowing arbitrary positive

operator-valued measures (POVMs) in a deﬁnition

of a quantum model (as we would rather prefer)

means that, in fact, quantum models on a hyper-

graph Γ are as general as the general probabilistic

models G(Γ), rendering such a deﬁnition redundant.

This can be seen by noting that for any probabilistic

model p ∈ G(Γ), one can associate positive opera-

tors to the vertices of Γ given by p(v)I such that for

any quantum state ρ on some Hilbert space, we have

p(v) = Tr(ρp(v)I), where I is the identity operator.
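The observation above can be checked on a toy case. The following is a minimal sketch, assuming a hypothetical two-hyperedge scenario with vertex labels `a`, `b`, `c` and an arbitrary qubit state `rho`; none of these names come from the paper. It assigns the trivial effect p(v)I to each vertex and confirms that each hyperedge is a valid POVM and that the Born rule returns p(v) whatever the state is.

```python
from fractions import Fraction as F

# Hypothetical toy scenario: two hyperedges sharing the vertex "b".
hyperedges = [("a", "b"), ("b", "c")]
# Any assignment normalised on every hyperedge is a general probabilistic model.
p = {"a": F(1, 3), "b": F(2, 3), "c": F(1, 3)}

def scaled_identity(x, dim=2):
    """Return the trivial effect x * I as a nested list."""
    return [[x if i == j else F(0) for j in range(dim)] for i in range(dim)]

def trace_product(rho, effect):
    """Compute Tr(rho @ effect) for square nested-list matrices."""
    n = len(rho)
    return sum(rho[i][k] * effect[k][i] for i in range(n) for k in range(n))

effects = {v: scaled_identity(pv) for v, pv in p.items()}

# An arbitrary qubit density matrix (symmetric, unit trace, positive).
rho = [[F(3, 4), F(1, 4)], [F(1, 4), F(1, 4)]]

# The effects in each hyperedge sum to the identity, so each is a valid POVM.
for e in hyperedges:
    total = [[sum(effects[v][i][j] for v in e) for j in range(2)] for i in range(2)]
    assert total == scaled_identity(F(1))

# The Born rule reproduces p(v) independently of rho.
born = {v: trace_product(rho, effects[v]) for v in p}
print(born == p)
```

Because Tr(ρ) = 1, the state drops out entirely, which is why such a definition of a quantum model would carry no constraint beyond normalisation on each hyperedge.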

Our focus in this paper is not on quantum the-

ory, in particular, even though the need to be able

to handle noisy measurements and preparations (par-

ticularly, trivial POVMs) in quantum theory can be

taken as a motivation for this work. Rather, our focus

is on delineating the boundary between operational

theories that admit noncontextual ontological mod-

els (for Kochen-Specker type experiments, suitably

augmented with multiple preparation procedures, as

outlined in this paper) and those that don’t by ob-

taining noise-robust noncontextuality inequalities. In

particular, we want these inequalities to indicate the

noise thresholds beyond which an experiment cannot

rule out the existence of a noncontextual ontological

model with respect to the quantities of interest. This

also means that making sense of quantum correlations

in this approach requires one to pay attention not only

to the measurements involved in an experiment but

also the preparations; indeed, this shift of focus from

measurements alone, to include multiple preparations

(or source settings), is a fundamental conceptual dif-

ference between our approach and that of traditional

Kochen-Specker contextuality frameworks [22, 23, 25].

3.1.5 Scope of this framework

Note that whenever we refer to the “CSW frame-

work”, we mean the framework of Ref. [22], which

often diﬀers from the framework of Ref. [21] in some

respects, e.g., the normalization of probabilities in a

given hyperedge, assumed in [22], but not in [21]. In

Ref. [21], the authors write:

Notice that in all of the above we never

require that any particular context should be

associated to a complete measurement: the

conditions only make sure that each context

is a subset of outcomes of a measurement

and that they are mutually exclusive. Thus,

unlike the original KS theorem, it is clear

that every context hypergraph Γ has always

a classical noncontextual model, besides pos-

sibly quantum and generalized models.

On the other hand, in Ref. [22], they write:

The fact that the sum of probabilities of outcomes of a test is 1 can be used to express these correlations as a positive linear combination of probabilities of events, $S = \sum_i w_i P(e_i)$, with $w_i > 0$.

Figure 4: The KS-uncolourable hypergraph from Ref. [51] that is not covered by our generalization of the CSW framework. We denote this hypergraph as $\Gamma_{18}$.

The latter presentation [22] is more in line with the

“original KS theorem” [19], as well as the presenta-

tion in Ref. [23]. Since normalization of probabili-

ties is thus presumed in Ref. [22], in keeping with the

deﬁnition of a probabilistic model we have presented

(following [23]), the graph invariants of CSW [22] re-

fer, speciﬁcally, to subgraphs G of those hypergraphs

Γ on which the set of KS-noncontextual probabilistic

models is non-empty. In particular, our generaliza-

tion of the CSW framework [22] in this paper says

nothing about noise-robust noncontextuality inequal-

ities from logical proofs of the Kochen-Specker the-

orem [19], which rely on hypergraphs Γ that admit

no KS-noncontextual probabilistic models, i.e., KS-

uncolourable hypergraphs. It also says nothing for

the hypergraphs Γ that do not satisfy the property $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$. An example of such a hypergraph, which is not covered by our generalization of the CSW framework on both counts, is the 18 ray hypergraph first presented in Ref. [51], denoted $\Gamma_{18}$ (see Fig. 4 and

Appendix D). Indeed, the study of noise-robust non-

contextuality inequalities from such KS-uncolourable

hypergraphs was initiated in Ref. [12], and a more ex-

haustive hypergraph-theoretic treatment of it is pre-

sented in Ref. [34]. In this paper, we will restrict

ourselves to KS-colourable hypergraphs, the study

of which was initiated in Ref. [16], and, of these,

only those KS-colourable hypergraphs Γ which satisfy $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$. Note that this is not a limitation of

our general approach, which is based on Ref. [16] and

applies to any KS-colourable hypergraph, but rather a

limitation we inherit from the CSW framework [22]

Ref. [22] takes Specker’s principle to be fundamental and

identiﬁes CE

(Γ

) as the most general set of probabilistic mod-

since we want to leverage their graph invariants in

obtaining our noise-robust noncontextuality inequali-

ties. The study of other KS-colourable hypergraphs,

in particular those which arise only with nonprojective

measurements in quantum theory [39–41] and are out-

side the scope of traditional frameworks [22, 23, 25],

will be taken up in future work.

To summarize, the measurement events hyper-

graphs Γ where the present framework (and the CSW

framework [22]) applies must satisfy two properties:

$\mathcal{C}(\Gamma) \neq \emptyset$ (that is, KS-colourability) and $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$.

In the next subsection, we deﬁne additional notions

necessary to obtain noise-robust noncontextuality in-

equalities that make use of graph invariants from the

CSW framework. These notions correspond to source

events that are an integral part of our framework.

3.2 Sources

Having introduced the (hyper)graph-theoretic ele-

ments that we need to talk about measurement

events, we are now in a position to introduce features

of source events that are relevant in the Spekkens

framework. This part of our framework has no prece-

dent in the literature on KS-noncontextuality, in par-

ticular the CSW framework [22]. We introduce these

source events in order to benchmark the measure-

ment events against them, i.e., for every measurement

event, we seek to identify in the operational theory

a corresponding source event that makes this mea-

surement event as likely as possible. This helps us

deal with cases where a measurement device may be

implementing very noisy measurements by explicitly

accounting for this noise in our noise-robust noncon-

textuality inequalities. Further, while we do not as-

sume outcome determinism (which is essential to KS-

noncontextuality), we will invoke preparation noncon-

textuality with respect to these source events in the

Spekkens framework [18]. As an example of what

we mean by “benchmarking” a measurement event

against a source event, consider the case of quantum

els, which is not the case for Γ

(for example). See Appendix

D for a detailed discussion of this point.

As we have shown, when the operational theory T un-

der consideration satisﬁes structural Specker’s principle, we

can always turn a hypergraph Γ that doesn't satisfy structural Specker's principle into a hypergraph $\Gamma'$ that satisfies it and for which, therefore, $\mathcal{CE}^1(\Gamma') = \mathcal{G}(\Gamma')$ holds. This can be seen as

justiﬁcation for restricting oneself to probabilistic models sat-

isfying consistent exclusivity in the CSW framework [22]: such

a restriction is not really a restriction if the theory satisﬁes

structural Specker’s principle. On the other hand, we restrict

ourselves to hypergraphs for which $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$ without assuming that T satisfies structural Specker's principle. The justification for this seemingly ad hoc restriction is simply that it

is necessary in order to meaningfully leverage the graph invari-

ants of CSW [22] – in particular, the fractional packing number

– in our noise-robust noncontextuality inequalities. This will

become clear when we obtain our noise-robust noncontextuality

inequalities.


theory, where any measurement event represented by

a projector occurs with probability 1 for any source

event that is represented by an eigenstate of this

projector; on the other hand, a positive operator

that isn’t projective cannot occur with a probabil-

ity greater than its largest eigenvalue (< 1) for any

source event. We now proceed to describe the neces-

sary hypergraph-theoretic ingredients we need to ac-

commodate source events in our framework.
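The benchmarking idea can be illustrated numerically. The sketch below assumes a qubit, real symmetric effects, and a hypothetical noisy effect `0.8 * projector + 0.2 * I/2`; these choices are illustrative and not taken from the paper. It scans real pure states and compares the best achievable probability with the largest eigenvalue of the effect.

```python
import math

def max_eigenvalue_2x2(a, b, d):
    """Largest eigenvalue of the real symmetric matrix [[a, b], [b, d]]."""
    return (a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + b ** 2)

def prob(theta, a, b, d):
    """<psi|E|psi> for the real pure state psi = (cos theta, sin theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return a * c * c + 2 * b * c * s + d * s * s

# Sharp measurement event: the projector onto |0>, stored as (a, b, d).
projector = (1.0, 0.0, 0.0)
# Hypothetical noisy event: 0.8 * projector + 0.2 * I/2 = diag(0.9, 0.1).
noisy = (0.9, 0.0, 0.1)

# Scan source states; real states suffice for a real symmetric effect.
thetas = [k * math.pi / 1000 for k in range(1000)]
best_sharp = max(prob(t, *projector) for t in thetas)
best_noisy = max(prob(t, *noisy) for t in thetas)

print(best_sharp, best_noisy, max_eigenvalue_2x2(*noisy))
```

In this example no source event makes the noisy event more likely than 0.9, whereas the sharp event can be made certain; that gap is what the source-measurement correlations are meant to quantify.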

As we have argued previously, we require the measurement events hypergraph Γ to be such that $\mathcal{C}(\Gamma) \neq \emptyset$ and $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$ to be able to obtain noise-robust noncontextuality inequalities that use graph

invariants from the CSW framework [22]. Hence, we

will restrict ourselves to experiments that realize the

operational equivalences represented by this class of

Γ. Now, in the CSW framework [22], every Bell-KS

expression picks out a particular subgraph G of the

orthogonality graph O(Γ) of the contextuality sce-

nario Γ of interest. This amounts to focussing on

a restricted set of probabilities (for the vertices of

G) rather than probabilities for all the measurement

events (represented by vertices of Γ) in the experi-

ment. Hence, the vertices of G denote the measure-

ment events of interest in a given Bell-KS expression

and we have the following:

• A general probabilistic model $p \in \mathcal{G}(\Gamma)$ will assign probabilities to vertices in G such that: $p(v) \geq 0$ for all $v \in V(G)$ and $p(v) + p(v') \leq 1$ for every edge $\{v, v'\} \in E(G)$.

• A probabilistic model $p \in \mathcal{CE}^1(\Gamma)$ will assign probabilities to vertices in G such that: $p(v) \geq 0$ for all $v \in V(G)$ and
$$\sum_{v \in c} p(v) \leq 1, \tag{44}$$
for every clique $c \subseteq V(G)$.

• A probabilistic model $p \in \mathcal{C}(\Gamma)$ will assign probabilities to vertices in G such that: $p(v) = \sum_k \Pr(k)\, p_k(v)$, where $\Pr(k) \geq 0$, $\sum_k \Pr(k) = 1$, and for each k, $p_k$ is a deterministic assignment $p_k(v) \in \{0, 1\}$ for all $v \in V(G)$, and $p_k(v) + p_k(v') \leq 1$ for every edge $\{v, v'\} \in E(G)$.

Since Γ is such that $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$, the condition $\sum_{v \in c} p(v) \leq 1$ for every clique $c \subseteq V(G)$ on the probabilities assigned to vertices in G is redundant. We now obtain a simplified hypergraph, $\Gamma_G$, from G as follows: convert all maximal cliques in G to hyperedges and add an extra (no-detection) vertex to each such hyperedge.

Physically, a “no-detection” vertex denotes the case when

none of the measurement events of interest (here, the events in

G) for a given measurement setting occur.

Figure 5: The hypergraph $\Gamma_G$ obtained from G by adding a no-detection vertex (represented by a hollow circle) to every maximal clique in G.

This $\Gamma_G$, for any G, will satisfy the property that $\mathcal{CE}^1(\Gamma_G) = \mathcal{G}(\Gamma_G)$ and any probabilistic model on Γ assigning probabilities to measurement events in G will correspond to a probabilistic model on $\Gamma_G$ which also assigns the same probabilities to measurement events in G. Formally: $V(\Gamma_G) \equiv V(G) \sqcup \{v_c \,|\, c \text{ is a maximal clique in } G\}$, and $E(\Gamma_G) \equiv \{c \sqcup \{v_c\} \,|\, c \text{ is a maximal clique in } G\}$, where $v_c$ is the extra no-detection vertex added to the hyperedge corresponding to maximal clique c in G.

We have the following probabilistic model on $\Gamma_G$ given a probabilistic model $p \in \mathcal{G}(\Gamma)$: the probabilities assigned to the vertices in $V(G) \subseteq V(\Gamma_G)$ are the same as specified by $p \in \mathcal{G}(\Gamma)$ and the probabilities assigned to the remaining vertices in $V(\Gamma_G) \setminus V(G)$ are given by $p(v_c) = 1 - \sum_{v \in c} p(v)$, for every maximal clique c in G. Consider, for example, the KCBS scenario [16, 22, 47]: the 20-vertex Γ representing measurement events from five 4-outcome joint measurements (Fig. 2), its 5 vertices G involved in the KCBS inequality (Fig. 3), and 10-vertex hypergraph $\Gamma_G$ constructed from G (Fig. 5).
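The construction of the simplified hypergraph can be sketched in a few lines. The code below is an illustration only: the function names and the `("nd", ...)` label for the no-detection vertex are hypothetical, and the brute-force clique search is suitable only for small graphs such as the 5-cycle of the KCBS scenario.

```python
from itertools import combinations

def maximal_cliques(vertices, edges):
    """Brute-force maximal cliques; fine for small graphs only."""
    adjacent = {frozenset(e) for e in edges}

    def is_clique(s):
        return all(frozenset(pair) in adjacent for pair in combinations(s, 2))

    cliques = [frozenset(s)
               for r in range(1, len(vertices) + 1)
               for s in combinations(vertices, r) if is_clique(s)]
    # Keep only cliques that are not strictly contained in another clique.
    return [c for c in cliques if not any(c < other for other in cliques)]

def build_simplified_hypergraph(vertices, edges):
    """Turn each maximal clique of G into a hyperedge with a no-detection vertex."""
    hyperedges = []
    for c in maximal_cliques(vertices, edges):
        no_detection = ("nd", tuple(sorted(c)))  # hypothetical label for v_c
        hyperedges.append(frozenset(c | {no_detection}))
    all_vertices = set().union(*hyperedges)
    return all_vertices, hyperedges

# KCBS graph G: the 5-cycle.
V = [0, 1, 2, 3, 4]
E = [(i, (i + 1) % 5) for i in V]
gamma_vertices, gamma_edges = build_simplified_hypergraph(V, E)
print(len(gamma_vertices), len(gamma_edges))
```

For the 5-cycle the maximal cliques are its five edges, so the result has ten vertices and five three-outcome hyperedges, matching the description of Fig. 5.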

Given $\Gamma_G$, constructed from G, we now require that the operational theory that realizes measurement events in $\Gamma_G$ also admits preparations that can be represented by a hypergraph $\Sigma_G$ of source events as follows: for every hyperedge $e \in E(\Gamma_G)$, corresponding to the choice of measurement setting $M_e$, we define a hyperedge $e \in E(\Sigma_G)$ denoting a corresponding choice of source setting $S_e$. And for every vertex $v \in e$ ($\in E(\Gamma_G)$), we define a vertex $v_e \in e$ ($\in E(\Sigma_G)$).

Hence, every measurement event

Recall from the discussion at the beginning of Section 3.2

that we seek to benchmark the measurement events against

those source events in the operational theory that (ideally)

make them as predictable as possible. The source setting

against which the predictability of a particular measurement


Figure 6: The source events hypergraph with the operational

equivalences between the source settings separately speciﬁed.

[v|e] in $\Gamma_G$ corresponds to a vertex $v_e$ of $\Sigma_G$, and the number of such vertices in $V(\Sigma_G)$ is $|V(\Gamma_G)||E(\Gamma_G)|$. This means that the operational equivalences between the measurement events that are implicit in $\Gamma_G$ (such as [v|e] is operationally equivalent to [v|e'], where $e, e' \in E(\Gamma_G)$ are distinct hyperedges that share the vertex $v \in V(\Gamma_G)$, which represents an equivalence class of measurement events) are not carried over to the source events, where none is presumed to be operationally equivalent to any other, hence $v_e \in V(\Sigma_G)$ is a different vertex from $v_{e'} \in V(\Sigma_G)$. Here $v_e$ ($v_{e'}$) represents a source event $[s_e|S_e]$ ($[s_{e'}|S_{e'}]$), rather than an equivalence class of source events.

Besides these $|V(\Gamma_G)||E(\Gamma_G)|$ vertices in $V(\Sigma_G)$ and the associated hyperedges $e \in E(\Sigma_G)$, we require that the operational theory admits an additional hyperedge $e_* \in E(\Sigma_G)$, representing a source setting $S_{e_*}$, containing two new vertices $v^0_{e_*}, v^1_{e_*} \in V(\Sigma_G)$. Here $v^0_{e_*}$ represents the source event $[s_{e_*} = 0|S_{e_*}]$ and $v^1_{e_*}$ represents the source event $[s_{e_*} = 1|S_{e_*}]$. Hence, we have $|V(\Sigma_G)| = |V(\Gamma_G)||E(\Gamma_G)| + 2$ and $|E(\Sigma_G)| = |E(\Gamma_G)| + 1$.

The operational equivalence we do require for $\Sigma_G$ (in any operational theory that admits source events represented by $\Sigma_G$) applies to the source settings: all source settings, each represented by coarse-graining the source events in a hyperedge $e \in E(\Sigma_G)$, are operationally equivalent, i.e., $[\top|S_e] \simeq [\top|S_{e'}]$ for all $e, e' \in E(\Sigma_G)$, i.e., $\forall [m|M]$: $\sum_{s_e} p(m, s_e|M, S_e) = \sum_{s_{e'}} p(m, s_{e'}|M, S_{e'})$, for all $e, e' \in E(\Sigma_G)$.

An example of such a source events hypergraph was

considered in Ref. [12], albeit without the additional

setting is tested – that is the predictability of each measurement event (e.g., $v \in e$ ($\in E(\Gamma_G)$)) for this measurement setting (e.g., $M_e$) is benchmarked against some source event (e.g., $v_e \in e$ ($\in E(\Sigma_G)$)) for the source setting (e.g., $S_e$) – is the “corresponding choice of source setting $S_e$”. In Section 5.2 we will see how these pairs of source and measurement settings are used to compute an operational quantity relevant for our noise-robust noncontextuality inequalities.

source labelled by $e_*$ here [16]. We illustrate it here

in Fig. 6 for the KCBS scenario.

4 A key hypergraph invariant: the

weighted max-predictability

We now deﬁne a hypergraph invariant that will be rel-

evant for our noise-robust noncontextuality inequali-

ties:

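$$\beta(\Gamma_G, q) \equiv \max_{p \in \mathcal{G}(\Gamma_G)|_{\rm ind}} \sum_{e \in E(\Gamma_G)} q_e\, \zeta(M_e, p), \tag{45}$$
where $q_e \geq 0$ for all $e \in E(\Gamma_G)$, $\sum_{e \in E(\Gamma_G)} q_e = 1$, and
$$\zeta(M_e, p) \equiv \max_{v \in e} p(v)$$
is the maximum probability assigned to a vertex in $e \in E(\Gamma_G)$ by an extremal indeterministic probabilistic model $p \in \mathcal{G}(\Gamma_G)|_{\rm ind}$.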

We call $\beta(\Gamma_G, q)$ the weighted max-predictability of the measurement settings (i.e., hyperedges) in $\Gamma_G$, where the hyperedges $e \in E(\Gamma_G)$ are weighted according to the probability distribution $q \equiv \{q_e\}_{e \in E(\Gamma_G)}$.
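To make the definition concrete, the weighted max-predictability can be computed by brute force for the KCBS hypergraph of Fig. 5. The sketch below is illustrative and rests on stated assumptions: it parametrises a probabilistic model by the five cycle probabilities p_i, fixes each no-detection probability as 1 - p_i - p_{i+1}, and enumerates extreme points by solving every choice of five tight constraints exactly. The helper names (`solve`, `zeta`, `beta`) are hypothetical.

```python
from fractions import Fraction as F
from itertools import combinations

N = 5
edges = [(i, (i + 1) % N) for i in range(N)]  # maximal cliques of the 5-cycle

# Constraints a.x <= b: positivity (-p_i <= 0) and edges (p_i + p_j <= 1).
rows = []
for i in range(N):
    rows.append(([F(-1) if k == i else F(0) for k in range(N)], F(0)))
for i, j in edges:
    rows.append(([F(1) if k in (i, j) else F(0) for k in range(N)], F(1)))

def solve(system):
    """Gauss-Jordan elimination over Fractions; None if the system is singular."""
    m = [list(a) + [b] for a, b in system]
    n = len(m)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                m[r] = [x - m[r][col] * y for x, y in zip(m[r], m[col])]
    return tuple(row[-1] for row in m)

# Extreme points: feasible solutions of N linearly independent tight constraints.
extremal = set()
for active in combinations(rows, N):
    x = solve(active)
    if x is not None and all(
            sum(a * xi for a, xi in zip(row, x)) <= b for row, b in rows):
        extremal.add(x)

def zeta(p, edge):
    """Largest probability in a hyperedge, including its no-detection vertex."""
    i, j = edge
    return max(p[i], p[j], 1 - p[i] - p[j])

def beta(q):
    """Weighted max-predictability over extremal indeterministic models."""
    indeterministic = [p for p in extremal if any(zeta(p, e) < 1 for e in edges)]
    return max(sum(qe * zeta(p, e) for qe, e in zip(q, edges))
               for p in indeterministic)

q_uniform = [F(1, N)] * N
print(len(extremal), beta(q_uniform))
```

Under these assumptions the enumeration finds the eleven deterministic assignments (the independent sets of the 5-cycle) plus a single indeterministic extreme point that assigns 1/2 to every vertex of G, so the uniform-weight value is 1/2.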

We now outline how this quantity is related to properties of an operational theory T admitting a measurement noncontextual ontological model. $\Gamma_G$ represents a particular configuration of operational equivalences that a set of measurement events in T may realize. The probabilistic models on $\Gamma_G$ that can be realized by T are, as earlier, denoted by $T(\Gamma_G)$. Since T admits a measurement noncontextual ontological model, its predictions for the specific case of $\Gamma_G$ can be reproduced by such a model. But since, in keeping with the CSW approach [22], we will look at witnesses of contextuality tailored to particular experiments ($\Gamma_G$ representing features of one such experiment), we do not need an ontological model for the full theory T to reproduce its predictions for a particular experiment. Indeed, to construct a measurement noncontextual ontological model for the set of probabilistic models $T(\Gamma_G)$, it suffices to assume (without loss of generality) that the extremal probabilistic models on $\Gamma_G$ – given by $\mathcal{G}(\Gamma_G)|_{\rm det} \sqcup \mathcal{G}(\Gamma_G)|_{\rm ind}$

An extremal indeterministic probabilistic model refers to those extremal $p \in \mathcal{G}(\Gamma_G)$ for which $\zeta(M_e, p) < 1$ for some $e \in E(\Gamma_G)$.

This will always be the case for any operational theory we

consider: the assumption of measurement noncontextuality on

its own can always be satisﬁed by a trivial ontological model

of the type we outlined in Section 2.5. Indeed, quantum the-

ory satisﬁes it, the Beltrametti-Bugajski model [52] that was

discussed in Ref. [18] being an example of a measurement non-

contextual ontological model of quantum theory. It is only

when this assumption is supplemented with something else –

outcome determinism in the case of KS-noncontextuality and

preparation noncontextuality in the case of generalized noncon-

textuality [18] – that it can produce a contradiction with the

predictions of an operational theory.


– are in bijective correspondence with the ontic states (Λ) of the physical system on which the measurements are carried out. This is because, firstly, any probabilistic model in $\mathcal{G}(\Gamma_G)$ can be expressed as a convex mixture of extremal probabilistic models in $\mathcal{G}(\Gamma_G)|_{\rm det} \sqcup \mathcal{G}(\Gamma_G)|_{\rm ind}$, and, secondly, associating each ontic state in the ontological model with an extremal probabilistic model in $\mathcal{G}(\Gamma_G)|_{\rm det} \sqcup \mathcal{G}(\Gamma_G)|_{\rm ind}$ means that any probabilistic model in $\mathcal{G}(\Gamma_G)$ corresponding to predictions of an operational theory (in particular, any $p \in T(\Gamma_G) \subseteq \mathcal{G}(\Gamma_G)$) can be obtained by an appropriate probability distribution over this set of ontic states. Denoting the set of ontic states corresponding to $\mathcal{G}(\Gamma_G)|_{\rm det}$ by $\Lambda_{\rm det}$ and the set of ontic states corresponding to $\mathcal{G}(\Gamma_G)|_{\rm ind}$ by $\Lambda_{\rm ind}$, we have that the measurement noncontextual ontological model given by $\Lambda \equiv \Lambda_{\rm det} \sqcup \Lambda_{\rm ind}$ reproduces the predictions $T(\Gamma_G)$ of any operational theory T that admits a measurement noncontextual ontological model: that is, for every $p \in \mathcal{G}(\Gamma_G)$ (and therefore also $p \in T(\Gamma_G)$),
$$p(v) = \sum_{\lambda \in \Lambda} \xi(v|\lambda)\mu(\lambda)$$
for all $v \in V(\Gamma_G)$, for some probability distribution $\mu : \Lambda \to [0, 1]$ such that $\sum_{\lambda \in \Lambda} \mu(\lambda) = 1$. We can also then rewrite $\beta(\Gamma_G, q)$ as
$$\beta(\Gamma_G, q) = \max_{\lambda \in \Lambda_{\rm ind}} \sum_{e \in E(\Gamma_G)} q_e\, \zeta(M_e, \lambda), \tag{46}$$
where $\zeta(M_e, \lambda) \equiv \max_{m_e} \xi(m_e|M_e, \lambda)$.

5 Noise-robust noncontextuality in-

equalities

We will now proceed to obtain our noise-robust non-

contextuality inequalities following the ideas outlined

in Ref. [16].

5.1 Key notions from CSW

We ﬁrst recall some key notions from the CSW frame-

work [22] before obtaining our inequalities.

Consider the positive linear combination of the

probabilities of measurement events,

$$R([s|S]) \equiv \sum_{v \in V(G)} w_v\, p(v|S, s), \tag{47}$$

Representing response functions for the ontic state, i.e., $p(v) = \xi(v|\lambda)$, $\forall v \in V(\Gamma_G)$.

As a corollary, note that as long as the polytope $\mathcal{G}(\Gamma_G)$ has a finite number of extreme points, we can take the ontic state space to consist of a finite number of ontic states (as we have done) without any loss of generality. The hypergraphs $\Gamma_G$ we study – representing the measurement events of interest in a contextuality experiment – have this property because of their finiteness.

where $w_v > 0$ for all $v \in V(G)$.

The fundamental result of CSW is that this quan-

tity is bounded for diﬀerent sets of correlations — KS-

noncontextual, those realizable by projective quan-

tum measurements, and those satisfying consistent

exclusivity — by graph-theoretic invariants as follows:

$$\forall [s|S]: \quad R([s|S]) \overset{\rm KS}{\leq} \alpha(G, w) \overset{\rm Q}{\leq} \theta(G, w) \overset{\mathcal{CE}^1}{\leq} \alpha^*(G, w), \tag{48}$$
where KS denotes operational theories that admit KS-noncontextual ontological models and thus realize probabilistic models on $\Gamma_G$ that fall in the set $\mathcal{C}(\Gamma_G)$, Q denotes quantum theory with projective measurements which assigns probabilistic models on $\Gamma_G$ denoted by $\mathcal{Q}(\Gamma_G)$, and $\mathcal{CE}^1$ denotes operational theories satisfying consistent exclusivity and thus realizing the set of probabilistic models $\mathcal{CE}^1(\Gamma_G)$ on $\Gamma_G$. The graph invariants of the weighted graph (G, w), namely, $\alpha(G, w)$, $\theta(G, w)$, and $\alpha^*(G, w)$, are defined as follows:

1. Independence number $\alpha(G, w)$:
$$\alpha(G, w) \equiv \max_{I} \sum_{v \in I} w_v, \tag{49}$$
where $I \subseteq V(G)$ is an independent set of vertices of G, i.e., a set of nonadjacent vertices of G, so that none of the vertices in this set shares an edge with any other vertex in the set.

2. Lovász theta number $\theta(G, w)$:
$$\theta(G, w) \equiv \max_{\{|u_v\rangle\}_{v \in V(G)},\, |\psi\rangle} \sum_{v \in V(G)} w_v\, |\langle \psi|u_v \rangle|^2, \tag{50}$$
where $\{|u_v\rangle\}_{v \in V(G)} = \{|u_v\rangle\}_{v \in V(\bar{G})}$ (each $|u_v\rangle$ a unit vector in $\mathbb{R}^d$) is an orthonormal representation (OR) of the complement of G, namely, $\bar{G}$, and the unit vector $|\psi\rangle \in \mathbb{R}^d$ is called a handle. Here $V(\bar{G}) \equiv V(G)$ and $E(\bar{G}) \equiv \{(v, v') \,|\, v, v' \in V(G), (v, v') \notin E(G)\}$, and we have in an orthonormal representation that $\langle u_{v''}|u_{v'''} \rangle = 0$ for all pairs of nonadjacent vertices, $(v'', v''')$, in $\bar{G}$, or equivalently, for all $(v'', v''') \in E(G)$.

3. Fractional packing number $\alpha^*(G, w)$:
$$\alpha^*(G, w) \equiv \max_{\{p_v\}_{v \in V(G)}} \sum_{v \in V(G)} w_v\, p_v, \tag{51}$$
where $\{p_v\}_{v \in V(G)}$ is such that $p_v \geq 0$ for all $v \in V(G)$ and $\sum_{v \in c} p_v \leq 1$ for all cliques c in G.
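As a small worked instance of these definitions, the sketch below evaluates the independence number and the fractional packing number for the 5-cycle with unit weights (the KCBS graph). The independence number is found by exhaustive search; the fractional packing number is pinned down by exhibiting a feasible packing and a matching fractional clique cover, which agree by weak linear-programming duality. The variable names are illustrative, and the approach is chosen for transparency rather than efficiency.

```python
from fractions import Fraction as F
from itertools import combinations

V = range(5)
E = [(i, (i + 1) % 5) for i in V]   # edges of the 5-cycle
w = {v: F(1) for v in V}            # unit weights, as in the KCBS expression

def is_independent(subset):
    """True if no edge of G has both endpoints in the subset."""
    return not any(a in subset and b in subset for a, b in E)

# Independence number: heaviest independent set, by exhaustive search.
alpha = max(sum(w[v] for v in s)
            for r in range(len(V) + 1)
            for s in combinations(V, r) if is_independent(s))

# Primal certificate: p_v = 1/2 satisfies every clique (here, edge) constraint.
p = {v: F(1, 2) for v in V}
assert all(p[a] + p[b] <= 1 for a, b in E)
primal_value = sum(w[v] * p[v] for v in V)

# Dual certificate: weight 1/2 on each edge covers every vertex with total >= w_v.
y = {e: F(1, 2) for e in E}
assert all(sum(y[e] for e in E if v in e) >= w[v] for v in V)
dual_value = sum(y.values())

# Weak duality gives primal_value <= alpha* <= dual_value.
alpha_star = primal_value if primal_value == dual_value else None
print(alpha, alpha_star)
```

For this graph the search gives an independence number of 2 and the certificates give a fractional packing number of 5/2; the Lovász theta number of the 5-cycle is known to be √5, which sits between the two, in line with the ordering of the bounds.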

Note that since we are always considering $\Gamma_G$ such that $\mathcal{CE}^1(\Gamma_G) = \mathcal{G}(\Gamma_G)$, we, in fact, have the bounds
$$\forall [s|S]: \quad R([s|S]) \overset{\rm KS}{\leq} \alpha(G, w) \overset{\rm Q}{\leq} \theta(G, w) \overset{\rm GPT}{\leq} \alpha^*(G, w), \tag{52}$$
where “GPT” denotes the full set of probabilistic models on $\Gamma_G$, i.e., $\mathcal{G}(\Gamma_G)$.

In terms of the notation we have already introduced, where $R([s|S]) \leq R_{\rm KS}$ was a Bell-KS inequality, we now have, from CSW [22], that $R_{\rm KS} = \alpha(G, w)$.

5.2 Key notion not from CSW:

source-measurement correlation, Corr

We need to deﬁne a new quantity not in the CSW

framework, namely,

$$\mathrm{Corr} \equiv \sum_{e \in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\, p(m_e, s_e|M_e, S_e), \tag{53}$$
where $\{q_e\}_{e \in E(\Gamma_G)}$ is a probability distribution, i.e., $q_e \geq 0$ for all $e \in E(\Gamma_G)$ and $\sum_{e \in E(\Gamma_G)} q_e = 1$, such that $\beta(\Gamma_G, q) < 1$ holds. In previous work [12, 16], we have taken q to be the uniform distribution $q_e = \frac{1}{|E(\Gamma_G)|}$, but the derivation of the noncontextuality inequalities is independent of that choice (as we'll see here). Also, note that we have chosen the following labelling convention for outcomes of source setting $S_e$ (namely, $s_e$) and measurement setting $M_e$ (namely, $m_e$): the source outcomes $s_e$ for source setting $S_e$ take values in the same set as measurement outcomes $m_e$ for measurement setting $M_e$, i.e., $V_{S_e} = V_{M_e}$ (recalling notation from Section 2). In particular, outcomes corresponding to the measurement event [v|e] (representing $[m_e|M_e]$) and its corresponding source event $v_e$ (representing $[s_e|S_e]$) are both denoted by the same label, so that $m_e = s_e$ for them. An example of this from Figs. 5 and 6 would be to, say, denote the outcomes of a particular $e \in E(\Gamma_G)$ (measurement setting $M_e$) by $m_e \in V_{M_e} \equiv \{0, 1, 2\}$ and corresponding outcomes of $e \in E(\Sigma_G)$ (source setting $S_e$) by $s_e \in V_{S_e} \equiv \{0, 1, 2\}$; so if [v|e] denotes $[m_e = 0|M_e]$ then $v_e$ will denote $[s_e = 0|S_e]$, etc.
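The quantity Corr is a weighted probability that the measurement outcome matches the source outcome. A minimal sketch of how it could be evaluated from tabulated joint statistics follows; the data are hypothetical (a uniform three-outcome source whose label is copied by the measurement except with noise probability `eps`), as are the setting labels `e1` to `e5` and the function names.

```python
def corr(q, joint):
    """Corr = sum_e q_e * sum_{m, s} delta(m, s) * p(m, s | M_e, S_e)."""
    return sum(q[e] * sum(pr for (m, s), pr in joint[e].items() if m == s)
               for e in q)

def noisy_joint(eps, outcomes=(0, 1, 2)):
    """Hypothetical statistics: uniform source outcome s; the measurement
    outcome m equals s except that with probability eps it is uniformly random."""
    n = len(outcomes)
    return {(m, s): ((1 - eps) * (m == s) + eps / n) / n
            for m in outcomes for s in outcomes}

settings = ["e1", "e2", "e3", "e4", "e5"]        # one label per hyperedge
q = {e: 1 / len(settings) for e in settings}     # uniform weights q_e
joint = {e: noisy_joint(0.3) for e in settings}  # same noise for every setting

value = corr(q, joint)
print(value)
```

With this noise model each setting contributes 1 - eps + eps/3, so Corr falls linearly below 1 as the noise grows.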

5.3 Obtaining the noise-robust noncontextual-

ity inequalities

5.3.1 Expressing operational quantities in ontological

terms

We begin with expressing the operational quantities of

interest in terms of a noncontextual ontological model.

In an ontological model, R([s|S]) is given by

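$$R([s|S]) = \sum_{\lambda \in \Lambda} \sum_{v \in V(G)} w_v\, p(v|\lambda)\mu(\lambda|S, s). \tag{54}$$
Defining $R(\lambda) \equiv \sum_{v \in V(G)} w_v\, p(v|\lambda)$, we have that
$$R([s|S]) = \sum_{\lambda \in \Lambda} R(\lambda)\mu(\lambda|S, s). \tag{55}$$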

Indeed, for the strongest possible constraint on Corr, one must pick q such that $\beta(\Gamma_G, q)$ is minimized.

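Similarly, Corr is given by
$$\begin{aligned} \mathrm{Corr} &= \sum_{\lambda \in \Lambda} \sum_{e \in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\, \xi(m_e|M_e, \lambda)\mu(\lambda, s_e|S_e) \\ &= \sum_{\lambda \in \Lambda} \sum_{e \in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\, \xi(m_e|M_e, \lambda)\mu(s_e|S_e, \lambda)\mu(\lambda|S_e). \end{aligned} \tag{56}$$
Here, we have used the fact that
$$\mu(\lambda, s_e|S_e) = \mu(s_e|S_e, \lambda)\mu(\lambda|S_e)$$
to express Corr in a way that treats sources and measurements similarly.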

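Using preparation noncontextuality (cf. Eq. (22)), we have that
$$\forall e, e' \in E(\Sigma_G): \quad [\top|S_e] \simeq [\top|S_{e'}] \;\Rightarrow\; \mu(\lambda|S_e) = \mu(\lambda|S_{e'}) \equiv \nu(\lambda), \quad \forall \lambda \in \Lambda. \tag{57}$$
Then we can rewrite Corr as
$$\mathrm{Corr} = \sum_{\lambda \in \Lambda} \sum_{e \in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\, \xi(m_e|M_e, \lambda)\mu(s_e|S_e, \lambda)\nu(\lambda). \tag{58}$$
Note that the only λ that contribute to Corr are those for which $\nu(\lambda) > 0$. Also, $\mu(s_e|S_e, \lambda)$ and $\mu(\lambda|S_e, s_e)$ satisfy the condition $\mu(s_e|S_e, \lambda)\nu(\lambda) = \mu(\lambda|S_e, s_e)p(s_e|S_e)$, so that $\mu(s_e|S_e, \lambda)$ is well-defined whenever $\nu(\lambda) > 0$.

Defining
$$\mathrm{Corr}(\lambda) \equiv \sum_{e \in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\, \xi(m_e|M_e, \lambda)\mu(s_e|S_e, \lambda), \tag{59}$$
we have that
$$\mathrm{Corr} = \sum_{\lambda \in \Lambda} \mathrm{Corr}(\lambda)\nu(\lambda). \tag{60}$$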

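Recalling that $\zeta(M_e, \lambda) = \max_{m_e} \xi(m_e|M_e, \lambda)$, note that $\mathrm{Corr}(\lambda)$ is upper bounded as follows (for any $\lambda \in \Lambda$):
$$\begin{aligned} \mathrm{Corr}(\lambda) &\equiv \sum_{e \in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\, \xi(m_e|M_e, \lambda)\mu(s_e|S_e, \lambda) \\ &\leq \sum_{e \in E(\Gamma_G)} q_e\, \zeta(M_e, \lambda) \sum_{s_e} \mu(s_e|S_e, \lambda) \\ &= \sum_{e \in E(\Gamma_G)} q_e\, \zeta(M_e, \lambda). \end{aligned} \tag{61}$$
If $\lambda \in \Lambda_{\rm det}$, then this upper bound is trivial, i.e., $\mathrm{Corr}(\lambda) \leq 1$, since every measurement has deterministic response functions. On the other hand, for all $\lambda \in \Lambda_{\rm ind}$, we have (from Eq. (46))
$$\mathrm{Corr}(\lambda) \leq \beta(\Gamma_G, q). \tag{62}$$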


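Similarly, for $\lambda \in \Lambda_{\rm det}$ we have $R(\lambda) \leq \alpha(G, w)$, while for $\lambda \in \Lambda_{\rm ind}$ we have $R(\lambda) \leq \alpha^*(G, w)$.

Using the fact that
$$\nu(\lambda) = \mu(\lambda|S) = \sum_{s} \mu(\lambda|S, s)p(s|S),$$
for any $S \equiv S_e$, $e \in E(\Sigma_G)$, we have
$$\mathrm{Corr} = \sum_{s} \Big( \sum_{\lambda \in \Lambda} \mathrm{Corr}(\lambda)\mu(\lambda|S, s) \Big) p(s|S) = \sum_{s} \mathrm{Corr}_s\, p(s|S), \tag{63}$$
where we have defined $\mathrm{Corr}_s \equiv \sum_{\lambda \in \Lambda} \mathrm{Corr}(\lambda)\mu(\lambda|S, s)$.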

5.3.2 Derivation of the noncontextual tradeoﬀ for any

graph G

We are now in a position to express our general noise-robust noncontextuality inequality as a tradeoff between three operational quantities: Corr, $R([s_{e_*} = 0|S_{e_*}])$, and $p(s_{e_*} = 0|S_{e_*})$.

First, note that KS-contextuality is witnessed when for some choice of [s|S], here given by $[s_{e_*} = 0|S_{e_*}]$, we have
$$R([s_{e_*} = 0|S_{e_*}]) > \alpha(G, w).$$
This means that for some set of ontic states in the support of $[s_{e_*} = 0|S_{e_*}]$, i.e.,
$$\lambda \in \mathrm{Supp}\{\mu(\cdot|S_{e_*}, s_{e_*} = 0)\} \equiv \{\lambda \in \Lambda : \mu(\lambda|S_{e_*}, s_{e_*} = 0) > 0\}, \tag{64}$$

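we have $R(\lambda) > \alpha(G, w)$. For such a set of ontic states one must then have $\mathrm{Corr}(\lambda) < 1$ (because these $\lambda \in \Lambda_{\rm ind}$ and we have Eq. (62)), which in turn implies that $\mathrm{Corr}_{s_{e_*} = 0} < 1$. On the other hand, for $s_{e_*} = 1$, we have no constraints: $\mathrm{Corr}_{s_{e_*} = 1} \leq 1$. Thus,
$$\begin{aligned} \mathrm{Corr} &= \mathrm{Corr}_{s_{e_*} = 0}\, p(s_{e_*} = 0|S_{e_*}) + \mathrm{Corr}_{s_{e_*} = 1}\, p(s_{e_*} = 1|S_{e_*}) \\ &\leq p_0\, \mathrm{Corr}_{s_{e_*} = 0} + 1 - p_0, \end{aligned} \tag{65}$$
where $p_0 \equiv p(s_{e_*} = 0|S_{e_*})$.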

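Defining $\mu_{\rm det} \equiv \sum_{\lambda \in \Lambda_{\rm det}} \mu(\lambda|S_{e_*}, s_{e_*} = 0)$ and $\mu_{\rm ind} \equiv \sum_{\lambda \in \Lambda_{\rm ind}} \mu(\lambda|S_{e_*}, s_{e_*} = 0)$, we now have
$$\mu_{\rm det} + \mu_{\rm ind} = 1, \tag{66}$$
$$\mathrm{Corr}_{s_{e_*} = 0} \leq \mu_{\rm det} + \beta(\Gamma_G, q)\mu_{\rm ind}, \tag{67}$$
$$R \leq \alpha(G, w)\mu_{\rm det} + \alpha^*(G, w)\mu_{\rm ind}. \tag{68}$$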

Note that assuming $\mu_{\rm det} = 1$ would reduce these constraints to a standard Bell-KS inequality, $R \leq \alpha(G, w)$. However, since we are not assuming this, simply eliminating $\mu_{\rm det}$ and $\mu_{\rm ind}$ from these constraints leads us to
$$\mathrm{Corr}_{s_{e_*} = 0} \leq 1 - (1 - \beta(\Gamma_G, q)) \frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)}, \tag{69}$$
where the upper bound is nontrivial if and only if $\beta(\Gamma_G, q) < 1$ and $R - \alpha(G, w) > 0$.

If we are given that β(Γ

, q) < 1, then we have a

trivial upper bound on Corr

∗

for the remaining

cases: the upper bound is 1 for R = α(G, w) and

greater than 1 for R < α(G, w).

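Thus, our noise-robust noncontextuality inequality now reads:

$$\mathrm{Corr} \le 1 - p_0\,(1 - \beta(\Gamma_G, q))\, \frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)}, \tag{70}$$

which can be rewritten as

$$R \le \alpha(G, w) + \frac{\alpha^*(G, w) - \alpha(G, w)}{p_0}\, \frac{1 - \mathrm{Corr}}{1 - \beta(\Gamma_G, q)}. \tag{71}$$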
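As a numerical companion to Eqs. (70) and (71), the following Python sketch evaluates both bounds. The function and argument names (`corr_upper_bound`, `r_upper_bound`, `p0`, `alpha_star`) are illustrative choices rather than notation from the paper, and the sample values are those of the KCBS scenario of Section 5.4 ($\alpha = 2$, $\alpha^* = 5/2$, $\beta = 1/2$, $p_0 = 1/3$).

```python
def corr_upper_bound(R, p0, beta, alpha, alpha_star):
    """Right-hand side of Eq. (70): noncontextual upper bound on Corr."""
    return 1.0 - p0 * (1.0 - beta) * (R - alpha) / (alpha_star - alpha)


def r_upper_bound(corr, p0, beta, alpha, alpha_star):
    """Right-hand side of Eq. (71): noncontextual upper bound on R.

    Requires p0 > 0 and beta < 1, otherwise the bound is undefined.
    """
    return alpha + (alpha_star - alpha) / p0 * (1.0 - corr) / (1.0 - beta)


# KCBS-like sample values (illustrative).
alpha, alpha_star, beta, p0 = 2.0, 2.5, 0.5, 1.0 / 3.0

# For a hypothetical observed R = 2.2, Corr may be at most this large
# in a noncontextual model:
corr_bound = corr_upper_bound(2.2, p0, beta, alpha, alpha_star)

# Feeding that Corr back into Eq. (71) recovers R = 2.2, since the two
# inequalities are rearrangements of one another.
r_bound = r_upper_bound(corr_bound, p0, beta, alpha, alpha_star)
print(corr_bound, r_bound)
```

The round trip between the two functions is a simple consistency check that Eq. (71) is an algebraic rearrangement of Eq. (70).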

Note that Eq. (70) expresses the constraint from noncontextuality as an upper bound on the source-measurement correlations Corr, reminiscent of the noise-robust noncontextuality inequality first derived in Ref. [12] (and later treated in hypergraph-theoretic terms in Ref. [34]), except that here the upper bound on Corr depends not only on the hypergraph invariant $\beta(\Gamma_G, q)$ but also on two of the graph invariants from the CSW framework [22], namely, $\alpha(G, w)$ and $\alpha^*(G, w)$, as well as on the operational quantity R, which is the figure-of-merit for KS-contextuality ($R > \alpha(G, w)$ witnesses KS-contextuality) in the CSW framework. Eq. (70) indicates that the source-measurement correlations would necessarily fail to be perfect (i.e., Corr < 1) in an operational theory admitting a noncontextual ontological model if and only if $R > \alpha(G, w)$ and $\beta(\Gamma_G, q) < 1$ (given $p_0 > 0$). Contextuality would be witnessed when the source-measurement correlations are stronger than the constraint from Eq. (70). For $R \le \alpha(G, w)$, in particular, there is no constraint from noncontextuality on Corr.

On the other hand, rewriting the constraint from noncontextuality as Eq. (71), one is reminded of the CSW framework [22], where R is taken to be the quantity that is upper bounded by KS-noncontextuality. Here, instead, we have that R is upper bounded by a term that includes the source-measurement correlations Corr that can be achieved for the measurements and thus penalizes measurements that cannot be made highly predictable with respect to some preparations, i.e., Corr < 1 makes it harder to violate the upper bound on R. When the upper bound reaches $\alpha^*(G, w)$, it becomes trivial and R is no longer constrained by noncontextuality on account of noise in the measurements. Indeed, trivial POVMs (cf. Appendices A.1.2 and C) never violate such a noncontextuality inequality because of the penalty incurred via Corr, as we later show in Section 6.3.

[Footnote to Eq. (69): To see this explicitly, just use Eq. (66) to make the substitution $\mu_{\rm ind} = 1 - \mu_{\rm det}$ in Eqs. (67) and (68), then eliminate $\mu_{\rm det}$ from Eq. (67) by using the upper bound on it from Eq. (68).]

5.3.3 When is the noncontextual tradeoff violated?

The inequality of Eq. (71) can be rewritten as the following tradeoff between Corr, $p_0$, and R:

$$\mathrm{Corr} + p_0\,(1 - \beta(\Gamma_G, q))\, \frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)} \le 1. \tag{72}$$

Writing the constraint from noncontextuality in the form of Eq. (72) (in contrast to Eqs. (70) and (71)) makes it more even-handed in its treatment of the two operational quantities R (which is key in the CSW framework [22]) and Corr (which is key in noise-robust noncontextuality inequalities inspired by logical proofs of the KS theorem [12, 34]) and emphasizes that noise-robust noncontextuality inequalities inspired by statistical proofs of the KS theorem [16] are tradeoffs between R (which is about the strength of correlations between measurements) and Corr (which is about the predictability of measurements) that must be satisfied by any operational theory admitting a noncontextual ontological model. Roughly speaking, a high degree of predictability for measurements (e.g., Corr = 1) cannot coexist with very strong correlations between the measurements (e.g., $R = \alpha^*(G, w)$) when the operational theory admits a noncontextual ontological model.

For a nontrivial constraint, and hence the possibility of witnessing contextuality via violation of this inequality (Eq. (72)), the upper bound on Corr (the right-hand side of Eq. (70)) should be strictly less than 1, and the upper bound on R (the right-hand side of Eq. (71)) should be strictly less than $\alpha^*(G, w)$ (the algebraic upper bound on R), that is,

$$p_0 > 0 \ \text{and} \ \beta(\Gamma_G, q) < 1, \qquad R > \alpha(G, w), \qquad \mathrm{Corr} > 1 - p_0\,(1 - \beta(\Gamma_G, q)). \tag{73}$$

These are the minimal benchmarks necessary (besides the requirement of tomographic completeness of a finite set of procedures and the possibility of inferring secondary procedures with exact operational equivalences using convexity of the operational theory [13]) to witness contextuality in a Kochen-Specker type experiment adapted to our framework following Spekkens [18].

Suppose one achieves, by some means, a value of $R = \theta(G, w)$, the upper bound on the quantum value with projective measurements. When would this value be evidence of contextuality? For this to be the case, we must have:

$$\mathrm{Corr} > 1 - p_0\,(1 - \beta(\Gamma_G, q))\, \frac{\theta(G, w) - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)}. \tag{74}$$

Now, for the ideal quantum realization where measurement events are projectors, and the corresponding source events are eigenstates, it is always the case that Corr = 1, hence contextuality is witnessed. However, it is possible to witness contextuality even if Corr < 1, as long as it exceeds the lower bound specified above. In a sense, for quantum theory, this allows for a quantitative accounting of the effect of nonprojectiveness in the measurements (or mixedness in preparations) on the possibility of witnessing contextuality, a feature that is absent in traditional Kochen-Specker approaches [21–23, 25]. Indeed, as long as one achieves any value of $R > \alpha(G, w)$, it is possible to witness contextuality for a sufficiently high value of Corr (see Eq. (70)).
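The benchmark of Eq. (74) can be made concrete with a small Python sketch. The helper names (`tradeoff_lhs`, `min_corr_for_violation`) are hypothetical, and the numbers plugged in are the KCBS values of Section 5.4 ($\alpha = 2$, $\alpha^* = 5/2$, $\beta = 1/2$, $p_0 = 1/3$, $\theta = \sqrt{5}$); they are used here only as an illustration.

```python
import math


def tradeoff_lhs(corr, R, p0, beta, alpha, alpha_star):
    """Left-hand side of the tradeoff, Eq. (72)."""
    return corr + p0 * (1.0 - beta) * (R - alpha) / (alpha_star - alpha)


def min_corr_for_violation(R, p0, beta, alpha, alpha_star):
    """Value that Corr must strictly exceed to violate Eq. (72), cf. Eq. (74)."""
    return 1.0 - p0 * (1.0 - beta) * (R - alpha) / (alpha_star - alpha)


# Illustrative KCBS numbers.
alpha, alpha_star, beta, p0 = 2.0, 2.5, 0.5, 1.0 / 3.0
theta = math.sqrt(5)

threshold = min_corr_for_violation(theta, p0, beta, alpha, alpha_star)
print(f"Corr must exceed {threshold:.4f} when R = sqrt(5)")

# Perfect predictability violates the tradeoff; Corr = 0.9 does not.
print(tradeoff_lhs(1.0, theta, p0, beta, alpha, alpha_star) > 1.0)
print(tradeoff_lhs(0.9, theta, p0, beta, alpha, alpha_star) > 1.0)
```

Note that this threshold holds R fixed at $\sqrt{5}$; in a noisy realization R typically decreases together with Corr, so the Corr actually required can be higher.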

5.4 Example: KCBS scenario

We will now illustrate our hypergraph framework by

applying it to the KCBS scenario to make diﬀerences

with respect to the CSW graph-theoretic framework

[22] explicit.

The graph G for the KCBS scenario is given in Fig. 3, the measurement events hypergraph $\Gamma_G$ is given in Fig. 5, and the source events hypergraph $\Sigma_G$ is given in Fig. 6. We then have

$$R([s|S]) = \sum_{v \in V(G)} p(v|S, s), \tag{75}$$

where the (vertex) weights $w_v = 1$ for all $v \in V(G)$, i.e., it is an unweighted graph and we will use $\alpha(G)$ and $\alpha^*(G)$ to denote its independence number and the fractional packing number, respectively. These are given by

$$\alpha(G) = 2 \quad \text{and} \quad \alpha^*(G) = 5/2. \tag{76}$$
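The two graph invariants in Eq. (76) can be checked directly for the 5-cycle. The sketch below is only an illustration: it brute-forces the independence number, and for the fractional packing number it assumes the usual definition (maximize $\sum_v x_v$ subject to $x_v \ge 0$ and $\sum_{v \in c} x_v \le 1$ on every clique $c$, which for the 5-cycle means every edge). It does not solve the linear program; it verifies a feasible point attaining 5/2 and the matching upper bound obtained by summing the edge constraints.

```python
from itertools import combinations

# KCBS exclusivity graph: the 5-cycle on vertices 0..4.
n = 5
edges = [(i, (i + 1) % n) for i in range(n)]


def is_independent(subset):
    """True if no two vertices of the subset are joined by an edge."""
    chosen = set(subset)
    return not any(u in chosen and v in chosen for u, v in edges)


# Independence number: size of the largest independent set (brute force).
alpha = max(
    len(s)
    for k in range(n + 1)
    for s in combinations(range(n), k)
    if is_independent(s)
)

# Fractional packing: the uniform assignment x_v = 1/2 is feasible ...
x = [0.5] * n
feasible = all(x[u] + x[v] <= 1 for u, v in edges)
lower_bound = sum(x)

# ... and summing the five edge constraints counts each vertex twice,
# so 2 * sum(x) <= 5 for every feasible x.
upper_bound = len(edges) / 2

print(alpha, feasible, lower_bound, upper_bound)
```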

The source-measurement correlation term is given by

$$\mathrm{Corr} = \sum_{e \in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\, p(m_e, s_e|M_e, S_e) \tag{77}$$

for any choice of probability distribution $q \equiv \{q_e\}_{e \in E(\Gamma_G)}$. For simplicity, we will just take this probability distribution to be uniform, i.e., $q_e = 1/5$ for all $e \in E(\Gamma_G)$. Note that the only extremal probabilistic model on $\Gamma_G$ corresponding to an indeterministic assignment (in $\Lambda_{\rm ind}$) assigns $\xi(v|\lambda) = 1/2$ for all $v \in V(G)$. This means

$$\beta(\Gamma_G, q) = \frac{1}{2} \quad \forall q. \tag{78}$$
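The value in Eq. (78) can be reproduced with a few lines of Python. This sketch rests on two assumptions that are not spelled out in this section: that the weighted max-predictability of Eq. (45) is $\max_{\lambda \in \Lambda_{\rm ind}} \sum_e q_e \max_{m_e} \xi(m_e|M_e, \lambda)$, and that each KCBS hyperedge $e_i$ contains the two shared vertices $v_i$, $v_{i+1}$ plus one unshared vertex (labelled `u{i}` here, a hypothetical name). Since there is only one indeterministic extremal model, the maximum over $\Lambda_{\rm ind}$ is just its value.

```python
from fractions import Fraction

n = 5
half = Fraction(1, 2)

# Hyperedge e_i = {v_i, u_i, v_{i+1}}; the u_i labels are hypothetical.
hyperedges = [(f"v{i}", f"u{i}", f"v{(i + 1) % n}") for i in range(n)]

# Indeterministic extremal model: 1/2 on every shared vertex, 0 elsewhere.
model = {f"v{i}": half for i in range(n)}
model.update({f"u{i}": Fraction(0) for i in range(n)})


def is_probabilistic_model(p):
    """Probabilities must sum to 1 on every hyperedge."""
    return all(sum(p[v] for v in e) == 1 for e in hyperedges)


def weighted_max_predictability(p, q):
    """sum_e q_e * max_{v in e} p(v) for a single probabilistic model p."""
    return sum(q_e * max(p[v] for v in e) for q_e, e in zip(q, hyperedges))


uniform_q = [Fraction(1, n)] * n
skewed_q = [Fraction(k, 15) for k in (1, 2, 3, 4, 5)]  # sums to 1

print(is_probabilistic_model(model))
print(weighted_max_predictability(model, uniform_q))
print(weighted_max_predictability(model, skewed_q))
```

Because the largest probability on every hyperedge is 1/2, the weighted sum equals 1/2 regardless of the distribution q, which is the content of "for all q" in Eq. (78).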


Figure 7: Geometric conﬁguration of the vectors appearing

in the KCBS construction [47].

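The noncontextuality inequality of Eq. (71),

$$R \le \alpha(G, w) + \frac{\alpha^*(G, w) - \alpha(G, w)}{p_0}\, \frac{1 - \mathrm{Corr}}{1 - \beta(\Gamma_G, q)}, \tag{79}$$

then becomes (in the KCBS scenario)

$$R \le 2 + \frac{1/2}{p_0}\, \frac{1 - \mathrm{Corr}}{1/2}, \tag{80}$$

that is,

$$R \le 2 + \frac{1 - \mathrm{Corr}}{p_0}. \tag{81}$$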

Recall that the KCBS inequality [22, 47] reads R ≤ 2

and it would be a valid noncontextuality inequality

in our framework if and only if one can ﬁnd mea-

surements and preparations such that Corr = 1.

In the standard KCBS construction [47] that violates the inequality $R \le 2$, we have the five vertices in G (say $v_i$, $i \in \{1, 2, 3, 4, 5\}$, labelled cyclically) associated with five projectors $\Pi_i = |l_i\rangle\langle l_i|$, $i \in \{1, 2, 3, 4, 5\}$, on a qutrit Hilbert space, given by the vectors $|l_i\rangle = (\sin\theta \cos\phi_i, \sin\theta \sin\phi_i, \cos\theta)$, with $\phi_i = \frac{4\pi i}{5}$ and $\cos\theta = \frac{1}{\sqrt[4]{5}}$. The special source event $[s_* = 0|S_*]$ is associated with the quantum state $|\psi\rangle = (0, 0, 1)$, so that

$$R = \sum_{i=1}^{5} |\langle l_i|\psi\rangle|^2 = \sqrt{5} > 2. \tag{82}$$

See Fig. 7 for a depiction of the geometric conﬁgura-

tion of these vectors.
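The geometry of the KCBS vectors is easy to check numerically. The following Python sketch (standard library only; variable names are illustrative) builds the five vectors with $\phi_i = 4\pi i/5$ and $\cos\theta = 5^{-1/4}$, confirms that cyclically adjacent vectors are orthogonal, and evaluates R for $|\psi\rangle = (0, 0, 1)$.

```python
import math

cos_t = 5 ** -0.25                   # cos(theta) = 1 / 5^(1/4)
sin_t = math.sqrt(1.0 - cos_t ** 2)


def kcbs_vector(i):
    """Vector |l_i> = (sin t cos phi_i, sin t sin phi_i, cos t), phi_i = 4*pi*i/5."""
    phi = 4.0 * math.pi * i / 5.0
    return (sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


vectors = [kcbs_vector(i) for i in range(1, 6)]
psi = (0.0, 0.0, 1.0)

# Cyclically adjacent vectors must be orthogonal (exclusive events).
adjacent_overlaps = [dot(vectors[k], vectors[(k + 1) % 5]) for k in range(5)]

# R = sum_i |<l_i|psi>|^2, cf. Eq. (82).
R = sum(dot(v, psi) ** 2 for v in vectors)

print(max(abs(o) for o in adjacent_overlaps))  # numerically zero
print(R, math.sqrt(5))
```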

To turn this KCBS construction into an argument against noncontextuality in our approach, we need additional ingredients beyond the graph G. Firstly, for both the measurement events hypergraph $\Gamma_G$ and the source events hypergraph $\Sigma_G$, we denote the hyperedges by $e_i$, $i \in \{1, 2, 3, 4, 5\}$. In $\Gamma_G$, the measurement events for the setting $M_{e_i}$ are given by

$$\{[m_{e_i} = 0|M_{e_i}] = |l_i\rangle\langle l_i|,\ [m_{e_i} = 1|M_{e_i}] = I - |l_i\rangle\langle l_i| - |l_{i+1}\rangle\langle l_{i+1}|,\ [m_{e_i} = 2|M_{e_i}] = |l_{i+1}\rangle\langle l_{i+1}|\},$$

where for i = 5, i + 1 = 1 (addition modulo 5). Similarly, in $\Sigma_G$, the source events corresponding to source setting $S_{e_i}$ are given by

$$\{[s_{e_i} = 0|S_{e_i}] = |l_i\rangle\langle l_i|,\ [s_{e_i} = 2|S_{e_i}] = |l_{i+1}\rangle\langle l_{i+1}|,\ \text{and}\ [s_{e_i} = 1|S_{e_i}] = I - |l_i\rangle\langle l_i| - |l_{i+1}\rangle\langle l_{i+1}|\},$$

where $p(s_{e_i} = b|S_{e_i}) = \frac{1}{3}$ for all $b \in \{0, 1, 2\}$. The special source setting $S_*$ consists of source events $\{[s_* = 0|S_*] = |\psi\rangle\langle\psi|,\ [s_* = 1|S_*] = \frac{I - |\psi\rangle\langle\psi|}{2}\}$, where $p(s_* = 0|S_*) = \frac{1}{3}$ and $p(s_* = 1|S_*) = \frac{2}{3}$. We thus have the operational equivalences we need between the source settings:

$$[\top|S_e] \simeq [\top|S_{e'}] = \frac{I}{3}, \quad \forall e, e' \in E(\Sigma_G). \tag{83}$$

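This choice of representation for $\Gamma_G$ and $\Sigma_G$ yields $p_0 = \frac{1}{3}$, Corr = 1, and $R([s_* = 0|S_*]) = \sqrt{5}$, so that the inequality

$$R \le 2 + \frac{1 - \mathrm{Corr}}{p_0} = 2 + 3\,(1 - \mathrm{Corr}) \tag{84}$$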

is violated. However, note that this is an idealization

(under which Corr = 1) and, typically, the source

events and measurement events will not be perfectly

correlated (Corr < 1) and the operational equiva-

lences between the source settings need not corre-

spond to the maximally mixed state. All that is re-

quired for a test of noncontextuality using this in-

equality is that the operational equivalences hold for

some choice of preparations and measurements which

need not be the same as that in the ideal KCBS con-

struction.

To illustrate what happens when Corr < 1, we con-

sider the eﬀect of a depolarizing channel on the states

and measurements in the ideal KCBS construction.

The channel is given by

$$\mathcal{D}_r(\cdot) = r\, I(\cdot)I + (1 - r)\, \frac{I}{3}\, \mathrm{Tr}(\cdot), \quad r \in [0, 1]. \tag{85}$$

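The action of this channel (with parameter $r_1 \in [0, 1]$, say) on the pure states $\{\{|l_i\rangle\langle l_i|\}_{i=1}^{5}, |\psi\rangle\langle\psi|\}$ yields the noisy states given by

$$\mathcal{D}_{r_1}(|l_i\rangle\langle l_i|) = r_1 |l_i\rangle\langle l_i| + (1 - r_1)\frac{I}{3}, \quad \forall i \in [5], \tag{86}$$

$$\mathcal{D}_{r_1}(|\psi\rangle\langle\psi|) = r_1 |\psi\rangle\langle\psi| + (1 - r_1)\frac{I}{3}, \tag{87}$$

and the action of its adjoint (with parameter $r_2 \in [0, 1]$, say) on the ideal projectors, $\{|l_i\rangle\langle l_i|\}_{i=1}^{5}$, involved in the measurements correspondingly yields the POVM elements given by

$$\mathcal{D}^{\dagger}_{r_2}(|l_i\rangle\langle l_i|) = r_2 |l_i\rangle\langle l_i| + (1 - r_2)\frac{I}{3}, \quad \forall i \in [5]. \tag{88}$$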

Hence, we are imagining a situation where the preparation procedures are affected by depolarizing noise with parameter $r_1$ and measurement procedures are affected by depolarizing noise with parameter $r_2$, similar to the situation considered previously in Section II of the Supplemental material of Ref. [12]. The operational equivalences required for our argument from preparation and measurement noncontextuality are satisfied by these noisy preparations and measurements. That is, in $\Gamma_G$, we can represent the measurement events for the setting $M_{e_i}$ (where $i \in [5]$ and $i + 1 = 1$ for $i = 5$, i.e., addition modulo 5) by

$$[m_{e_i} = 0|M_{e_i}] = \mathcal{D}^{\dagger}_{r_2}(|l_i\rangle\langle l_i|), \tag{89}$$

$$[m_{e_i} = 1|M_{e_i}] = \mathcal{D}^{\dagger}_{r_2}(I - |l_i\rangle\langle l_i| - |l_{i+1}\rangle\langle l_{i+1}|), \tag{90}$$

$$[m_{e_i} = 2|M_{e_i}] = \mathcal{D}^{\dagger}_{r_2}(|l_{i+1}\rangle\langle l_{i+1}|). \tag{91}$$

It is easy to verify that these form elements of a valid POVM denoted by the measurement setting $M_{e_i}$ and that the operational equivalences between the measurement events (represented by $\Gamma_G$) are indeed respected. On the other hand, in $\Sigma_G$, the source events corresponding to source setting $S_{e_i}$ can be represented by

$$[s_{e_i} = 0|S_{e_i}] = \mathcal{D}_{r_1}(|l_i\rangle\langle l_i|), \tag{92}$$

$$[s_{e_i} = 1|S_{e_i}] = \mathcal{D}_{r_1}(I - |l_i\rangle\langle l_i| - |l_{i+1}\rangle\langle l_{i+1}|), \tag{93}$$

$$[s_{e_i} = 2|S_{e_i}] = \mathcal{D}_{r_1}(|l_{i+1}\rangle\langle l_{i+1}|), \tag{94}$$

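where $p(s_{e_i} = b|S_{e_i}) = \frac{1}{3}$ for all $b \in \{0, 1, 2\}$, while the source events for source setting $S_*$ can be represented by

$$[s_* = 0|S_*] = \mathcal{D}_{r_1}(|\psi\rangle\langle\psi|), \tag{95}$$

$$[s_* = 1|S_*] = \mathcal{D}_{r_1}\!\left(\frac{I - |\psi\rangle\langle\psi|}{2}\right), \tag{96}$$

where $p(s_* = 0|S_*) = \frac{1}{3}$ and $p(s_* = 1|S_*) = \frac{2}{3}$. These satisfy the operational equivalences

$$[\top|S_e] \simeq [\top|S_{e'}] = \frac{I}{3}, \quad \forall e, e' \in E(\Sigma_G). \tag{97}$$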

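We then have

$$\mathrm{Corr} = \sum_{e \in E(\Gamma_G)} \frac{1}{5} \sum_{b \in \{0,1,2\}} \frac{1}{3}\, p(m_e = b|M_e, S_e, s_e = b). \tag{98}$$

Note that for any qutrit pure state $|\phi\rangle$ and its corresponding projector $|\phi\rangle\langle\phi|$, affected by depolarizing noise with parameters $r_1$ and $r_2$, respectively, we have

$$\mathrm{Tr}\big(\mathcal{D}_{r_1}(|\phi\rangle\langle\phi|)\, \mathcal{D}^{\dagger}_{r_2}(|\phi\rangle\langle\phi|)\big) = r_1 r_2 + \frac{1 - r_1 r_2}{3}. \tag{99}$$

Now, each term in the summation defining Corr, namely, $p(m_e = b|M_e, S_e, s_e = b)$, is obtained from a calculation of the type in Eq. (99). Hence, we have for each such term,

$$p(m_e = b|M_e, S_e, s_e = b) = r_1 r_2 + \frac{1 - r_1 r_2}{3}, \tag{100}$$

so that

$$\mathrm{Corr} = r_1 r_2 + \frac{1 - r_1 r_2}{3}. \tag{101}$$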
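The trace identity behind Eqs. (99) to (101) can be verified with explicit 3x3 matrices. The Python sketch below uses only the standard library; the helper names and the test vectors `phi` and `chi` are arbitrary illustrative choices. It assumes the depolarizing map of Eq. (85) with the maximally mixed qutrit state $I/3$, and uses the fact (visible in Eq. (88)) that the adjoint acts on projectors in the same way as the channel itself.

```python
def projector(v):
    """Return the 3x3 matrix |v><v| for a real unit vector v."""
    return [[a * b for b in v] for a in v]


def depolarize(rho, r):
    """Depolarizing map of Eq. (85): r * rho + (1 - r) * Tr(rho) * I/3."""
    tr = sum(rho[k][k] for k in range(3))
    return [
        [r * rho[i][j] + ((1 - r) * tr / 3 if i == j else 0.0) for j in range(3)]
        for i in range(3)
    ]


def trace_of_product(a, b):
    """Tr(a b) for two 3x3 matrices."""
    return sum(a[i][j] * b[j][i] for i in range(3) for j in range(3))


def closed_form(r1, r2, overlap):
    """r1*r2*|<phi|chi>|^2 + (1 - r1*r2)/3; reduces to Eq. (99) when overlap = 1."""
    return r1 * r2 * overlap + (1 - r1 * r2) / 3


r1, r2 = 0.95, 0.90
phi = (1 / 3, 2 / 3, 2 / 3)    # unit vector
chi = (2 / 3, -2 / 3, 1 / 3)   # unit vector orthogonal to phi

same = trace_of_product(depolarize(projector(phi), r1), depolarize(projector(phi), r2))
orth = trace_of_product(depolarize(projector(phi), r1), depolarize(projector(chi), r2))

print(same, closed_form(r1, r2, 1.0))
print(orth, closed_form(r1, r2, 0.0))
```

The matched case reproduces $r_1 r_2 + (1 - r_1 r_2)/3 = (1 + 2 r_1 r_2)/3$, while the orthogonal case shows the residual probability $(1 - r_1 r_2)/3$ introduced by the noise.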

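In the noiseless regime, i.e., $r_1 = r_2 = 1$, this reduces to the ideal KCBS scenario. On the other hand, we have

$$\begin{aligned}
R([s_* = 0|S_*]) &= \sum_{v \in V(G)} p(v|S_*, s_* = 0) \\
&= \sum_{i=1}^{5} \mathrm{Tr}\big(\mathcal{D}^{\dagger}_{r_2}(|l_i\rangle\langle l_i|)\, \mathcal{D}_{r_1}(|\psi\rangle\langle\psi|)\big) \\
&= \sum_{i=1}^{5} \Big( r_1 r_2 |\langle l_i|\psi\rangle|^2 + \frac{r_1 (1 - r_2)}{3} + \frac{r_2 (1 - r_1)}{3} + \frac{(1 - r_1)(1 - r_2)}{3} \Big) \\
&= r_1 r_2 \sum_{i=1}^{5} |\langle l_i|\psi\rangle|^2 + \frac{5}{3}(1 - r_1 r_2).
\end{aligned} \tag{102}$$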

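Recall that violation of the noncontextuality inequality requires that

$$R > 2 + \frac{1 - \mathrm{Corr}}{p_0} = 2 + 3\,(1 - \mathrm{Corr}). \tag{103}$$

That is,

$$r_1 r_2 \sum_{i=1}^{5} |\langle l_i|\psi\rangle|^2 + \frac{5}{3}(1 - r_1 r_2) > 2 + 3\left(1 - \left(r_1 r_2 + \frac{1 - r_1 r_2}{3}\right)\right). \tag{104}$$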


Given that $\sum_{i=1}^{5} |\langle l_i|\psi\rangle|^2 = \sqrt{5}$, this becomes

$$\sqrt{5}\, r_1 r_2 + \frac{5}{3}(1 - r_1 r_2) > 2 + 2\,(1 - r_1 r_2). \tag{105}$$

Rewriting this, we obtain

$$r_1 r_2 > 1 - \frac{\sqrt{5} - 2}{\sqrt{5} + \frac{1}{3}} \approx 0.908, \tag{106}$$

that is, the noncontextuality inequality can be violated only when the depolarizing noise is below a certain threshold given by $r_1 r_2 > 0.908$. In terms of Corr, this requires Corr > 0.939. The noiseless case $r_1 = r_2 = 1$ takes us back to the Corr = 1 regime that we previously discussed.
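The threshold can be reproduced numerically. The short Python sketch below (names such as `r_noisy` and `x_min` are illustrative) writes $x = r_1 r_2$, uses $R = \sqrt{5}\,x + \tfrac{5}{3}(1 - x)$ and $\mathrm{Corr} = (1 + 2x)/3$ for the depolarized KCBS construction, and solves the violation condition $R > 2 + 3(1 - \mathrm{Corr})$ for $x$.

```python
import math

SQRT5 = math.sqrt(5)


def r_noisy(x):
    """R for the depolarized KCBS construction, with x = r1 * r2."""
    return SQRT5 * x + 5.0 * (1.0 - x) / 3.0


def corr_noisy(x):
    """Corr for the depolarized KCBS construction, with x = r1 * r2."""
    return (1.0 + 2.0 * x) / 3.0


def violated(x):
    """True when R > 2 + 3 (1 - Corr), i.e. the inequality is violated."""
    return r_noisy(x) > 2.0 + 3.0 * (1.0 - corr_noisy(x))


# Solving sqrt(5) x + (5/3)(1 - x) = 2 + 2 (1 - x) for x gives the threshold:
x_min = 1.0 - (SQRT5 - 2.0) / (SQRT5 + 1.0 / 3.0)
corr_min = corr_noisy(x_min)

print(f"r1*r2 must exceed {x_min:.3f}; Corr must exceed {corr_min:.3f}")
print(violated(0.91), violated(0.90))
```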

6 Discussion

6.1 Measurement-measurement correlations vs. source-measurement correlations

Note that the usual Kochen-Specker experiment, as

conceptualized in Refs. [21–23, 25], for example, in-

volves only the quantity R([s|S]), representing corre-

lations between various measurement events when all

the measurements are implemented on a system pre-

pared according to the same preparation procedure,

denoted by the source event [s|S]. Thus, R represents

measurement-measurement correlations on a system

prepared according to a ﬁxed choice of preparation

procedure.

On the other hand, the experiment we have concep-

tualized in this paper involves, besides the quantity

R, a quantity Corr representing source-measurement

correlations, characterizing the quality of the mea-

surements in terms of their response to corresponding

preparations.

Our noncontextuality inequalities represent a trade-off relation that must hold between R and Corr in an operational theory that admits a noncontextual ontological model. Here we note that the first example of such a tradeoff relation, albeit only for the case of operational quantum theory with unsharp measurements, appeared in Ref. [39] as the Liang-Spekkens-Wiseman (LSW) inequality [40], which has been shown to be experimentally violated in Ref. [53]. And, indeed, the developments reported in Ref. [16] and the present paper have their origins in the idea of such a trade-off relation that first appeared in Ref. [39].

[Footnote on the experiment of Ref. [53]: This experiment, however, is not in a position to make claims about contextuality without presuming the operational theory is quantum theory simply because the LSW inequality presumes operational quantum theory. The noncontextuality inequalities in this paper do not require the operational theory to be quantum theory and can therefore be experimentally tested using techniques from Refs. [13, 37, 54].]

6.2 Can our noise-robust noncontextuality inequalities be saturated by a noncontextual ontological model?

A natural question concerns the tightness of these noncontextuality inequalities, i.e., can Eq. (72) be saturated by a noncontextual ontological model? This requires one to specify a noncontextual ontological model reproducing the operational equivalences between the measurement events and between the source settings, such that

$$\mathrm{Corr} + p_0\,(1 - \beta(\Gamma_G, q))\, \frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)} = 1. \tag{107}$$

The assumption of measurement noncontextuality is already implicit in our characterization of the response functions $\xi(m_e|M_e, \lambda)$, and for this reason it is, indeed, trivial to satisfy measurement noncontextuality while saturating these noncontextuality inequalities. Measurement noncontextuality, alone, in fact even allows a violation of the inequality (when no preparation noncontextuality is imposed), the extreme case being $R = \alpha^*(G, w)$ and $1 \ge \mathrm{Corr} > 1 - p_0\,(1 - \beta(\Gamma_G, q))$. It is the assumption of preparation noncontextuality that is nontrivial to satisfy and we do not know if there exists a general construction of a noncontextual ontological model saturating our noncontextuality inequalities. We outline the general situation below.

6.2.1 The special case of facet-defining Bell-KS inequalities: Corr = 1

If outcome determinism is presumed (as in traditional Bell-KS type treatments), then we know that there exists a necessary and sufficient set of Bell-KS inequalities (each corresponding to a particular choice of $R([s|S])$) that are satisfied by any operational theory admitting a KS-noncontextual ontological model. In particular, each such (facet) Bell-KS inequality can be saturated by KS-noncontextual ontological models that yield probabilities (from $\mathcal{G}(\Gamma_G)$) corresponding to the facet-defining Bell-KS inequality, i.e., which satisfy $R([s|S]) = \alpha(G, w)$ for such a Bell-KS inequality. Indeed, our noise-robust noncontextuality inequalities corresponding to these choices of $R([s|S])$ (i.e., facet-defining Bell-KS inequalities of the Bell-KS polytope, which is given by the convex hull of points in $\mathcal{G}(\Gamma_G)|_{\rm det}$) can always be saturated when Corr = 1, because in that case outcome determinism is justified by preparation noncontextuality (cf. Ref. [16]) and our inequalities are identical to the Bell-KS inequalities (saturated by $R = \alpha(G, w)$).

6.2.2 The general case: Corr < 1

Since we do not want to assume outcome determinism, nor necessarily the idealization of Corr = 1, what is at stake here is the assumption of preparation noncontextuality. This assumption must be satisfied while saturating the noise-robust noncontextuality inequality in order for a measurement noncontextual ontological model to be universally noncontextual. Constructing such a noncontextual ontological model amounts to specifying the distributions $\mu(s_e|S_e, \lambda)$ and $\nu(\lambda)$ such that

$$\forall \lambda \in \Lambda : \mu(\lambda|S_e) = \nu(\lambda), \quad \forall e \in E(\Sigma_G), \tag{108}$$

i.e., preparation noncontextuality holds, and we have (rewriting the saturation condition from Eq. (107))

$$(\alpha^*(G, w) - \alpha(G, w))\,\mathrm{Corr} + p_0\,(1 - \beta(\Gamma_G, q))\,R = (\alpha^*(G, w) - \alpha(G, w)) + p_0\,\alpha(G, w)\,(1 - \beta(\Gamma_G, q)), \tag{109}$$

where

$$\mathrm{Corr} = \sum_{s_*} p(s_*|S_*)\,\mathrm{Corr}_{s_*}, \tag{110}$$

$$\mathrm{Corr}_{s_*} \equiv \sum_{\lambda \in \Lambda} \mathrm{Corr}(\lambda)\,\mu(\lambda|S_*, s_*), \tag{111}$$

$$\mathrm{Corr}(\lambda) \equiv \sum_{e \in E(\Gamma_G)} q_e \sum_{m_e, s_e} \delta_{m_e, s_e}\, \xi(m_e|M_e, \lambda)\,\mu(s_e|S_e, \lambda), \tag{112}$$

and

$$R = \sum_{\lambda \in \Lambda} R(\lambda)\,\mu(\lambda|S_*, s_* = 0). \tag{113}$$

Unfortunately, we do not have a general construction

that can show this to be possible for any noise-robust

noncontextuality inequality obtained according to the

approach we have outlined. We therefore leave it as

an open question whether such an inequality can (al-

ways?) be saturated by a noncontextual ontological

model.

6.3 Can trivial POVMs ever violate these noncontextuality inequalities?

No.

Recall that a trivial POVM is defined as an assignment of positive operators $p(v)I$ to the vertices of $\Gamma_G$, where I is the identity operator on some Hilbert space and $p : V(\Gamma_G) \to [0, 1]$, such that $\sum_{v \in e} p(v) = 1$ for all $e \in E(\Gamma_G)$, is a probabilistic model on $\Gamma_G$.

6.3.1 The case $p \in \mathcal{C}(\Gamma_G)$

Consider trivial POVMs corresponding to any KS-noncontextual probabilistic model, i.e., $p \in \mathcal{C}(\Gamma_G)$ is a convex mixture of deterministic vertices, $\mathcal{G}(\Gamma_G)|_{\rm det}$, or equivalently, of ontic states in $\Lambda_{\rm det}$. In other words, $\mathcal{C}(\Gamma_G) \equiv \mathrm{ConvHull}(\mathcal{G}(\Gamma_G)|_{\rm det})$, the convex hull of points in $\mathcal{G}(\Gamma_G)|_{\rm det}$. In this case Corr can be at most 1. This means that the upper bound on R from our noncontextuality inequality, Eq. (71), will be greater than or equal to $\alpha(G, w)$, whereas we know that for a KS-noncontextual probabilistic model, $R \le \alpha(G, w)$. Hence, there is no violation of our noncontextuality inequality for such trivial POVMs.

6.3.2 The case $p \in \mathrm{ConvHull}(\mathcal{G}(\Gamma_G)|_{\rm ind})$

Now consider trivial POVMs that correspond to the indeterministic vertices, $\mathcal{G}(\Gamma_G)|_{\rm ind}$ (correspondingly, $\Lambda_{\rm ind}$), or their convex mixtures. We know that for these trivial POVMs, $\mathrm{Corr} \le \beta(\Gamma_G, q)$. For any $R \le \alpha^*(G, w)$ that is achieved by these trivial POVMs, our noncontextuality inequality reads

$$\mathrm{Corr} \le 1 - p_0\,(1 - \beta(\Gamma_G, q))\, \frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)}. \tag{114}$$

A sufficient condition for this inequality to be satisfied is that

$$\beta(\Gamma_G, q) \le 1 - p_0\,(1 - \beta(\Gamma_G, q))\, \frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)}, \tag{115}$$

which reduces, for $R > \alpha(G, w)$, to

$$p_0 \le \frac{\alpha^*(G, w) - \alpha(G, w)}{R - \alpha(G, w)}, \tag{116}$$

where the upper bound is greater than or equal to 1, since $\alpha(G, w) < R \le \alpha^*(G, w)$. This is trivially satisfied since $p_0 \le 1$.

For $R < \alpha(G, w)$, the sufficient condition of Eq. (115) is again trivially satisfied since it reduces to

$$p_0 \ge -\frac{\alpha^*(G, w) - \alpha(G, w)}{\alpha(G, w) - R}, \tag{117}$$

and we must anyway have $p_0 \ge 0$.

For $R = \alpha(G, w)$, the sufficient condition reduces to $\beta(\Gamma_G, q) \le 1$, which is again trivially satisfied since $\beta(\Gamma_G, q) < 1$ by definition.

6.3.3 The general case $p \in \mathcal{G}(\Gamma_G)$

In general, a probabilistic model achieved by trivial POVMs can be in the convex hull of both deterministic ($\Lambda_{\rm det}$) and indeterministic ($\Lambda_{\rm ind}$) ontic states, with the total weight on deterministic ontic states denoted by $\Pr(\Lambda_{\rm det})$ and that on indeterministic ontic states by $\Pr(\Lambda_{\rm ind})$, so that $\Pr(\Lambda_{\rm det}) + \Pr(\Lambda_{\rm ind}) = 1$. We then have

$$\begin{aligned}
\mathrm{Corr} &\le \Pr(\Lambda_{\rm det}) + \Pr(\Lambda_{\rm ind})\,\beta(\Gamma_G, q), \\
R &\le \Pr(\Lambda_{\rm det})\,\alpha(G, w) + \Pr(\Lambda_{\rm ind})\,\alpha^*(G, w).
\end{aligned} \tag{118}$$


A sufficient condition for satisfaction of the noncontextuality inequality is then

$$1 - \Pr(\Lambda_{\rm ind})\,(1 - \beta(\Gamma_G, q)) \le 1 - p_0\,(1 - \beta(\Gamma_G, q))\, \frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)}, \tag{119}$$

which becomes

$$p_0 \le \frac{\alpha^*(G, w) - \alpha(G, w)}{R - \alpha(G, w)}\, \Pr(\Lambda_{\rm ind}) \tag{120}$$

when $R > \alpha(G, w)$. Noting that

$$R \le \alpha(G, w) + \Pr(\Lambda_{\rm ind})\,(\alpha^*(G, w) - \alpha(G, w)),$$

we have

$$\Pr(\Lambda_{\rm ind}) \ge \frac{R - \alpha(G, w)}{\alpha^*(G, w) - \alpha(G, w)}, \tag{121}$$

so that the right-hand side of Eq. (120) is at least 1 and the sufficient condition for satisfaction of the noncontextuality inequality becomes $p_0 \le 1$, which is trivially satisfied.

When $R = \alpha(G, w)$, the sufficient condition becomes $\beta(\Gamma_G, q) \le 1$, which is again trivially satisfied. Finally, when $R < \alpha(G, w)$, the sufficient condition becomes

$$p_0 \ge -\frac{\alpha^*(G, w) - \alpha(G, w)}{\alpha(G, w) - R}\, \Pr(\Lambda_{\rm ind}), \tag{122}$$

which is again trivially satisfied since $p_0 \ge 0$.
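The argument of this subsection can be spot-checked by random sampling. The Python sketch below is an illustration under stated assumptions: it draws arbitrary values for $\alpha < \alpha^*$, $\beta < 1$, $p_0$, and $\Pr(\Lambda_{\rm ind})$ (the sampling ranges are arbitrary), sets Corr and R to the largest values allowed by Eq. (118), and evaluates the left-hand side of the tradeoff, Eq. (72). Since that left-hand side is increasing in both Corr and R, checking the maximal values is sufficient.

```python
import random


def tradeoff_lhs(corr, R, p0, beta, alpha, alpha_star):
    """Left-hand side of Eq. (72)."""
    return corr + p0 * (1.0 - beta) * (R - alpha) / (alpha_star - alpha)


rng = random.Random(0)  # fixed seed so the run is reproducible
worst = 0.0
for _ in range(10000):
    alpha = rng.uniform(1.0, 3.0)
    alpha_star = alpha + rng.uniform(0.1, 2.0)
    beta = rng.uniform(0.0, 0.99)
    p0 = rng.uniform(0.0, 1.0)
    pr_ind = rng.uniform(0.0, 1.0)

    # Largest Corr and R compatible with Eq. (118) for a trivial POVM.
    corr_max = (1.0 - pr_ind) + pr_ind * beta
    r_max = (1.0 - pr_ind) * alpha + pr_ind * alpha_star

    worst = max(worst, tradeoff_lhs(corr_max, r_max, p0, beta, alpha, alpha_star))

# Analytically the left-hand side equals 1 - Pr(ind) (1 - beta) (1 - p0) <= 1.
print(worst)
```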

Hence trivial POVMs cannot yield a violation of

our noncontextuality inequalities. This is the sense in

which trivial POVMs cannot lead to nonclassicality in

our approach, unlike the case of traditional Kochen-

Specker approaches [21–23, 25] applied to the case

of POVMs [73]. To violate our noncontextuality in-

equalities, the POVMs must necessarily have some

nontrivial projective component (that is not the iden-

tity operator or zero) but they need not be projec-

tors. Further, we do not rely on restricting the notion

of joint measurability [44] (cf. Section 2.4) to com-

mutativity for POVMs. Taking joint measurability

to be just commutativity is the approach adopted in,

for example, Ref. [25]. We refer to Appendix A and

Appendix C for more discussion on these issues, in

particular Appendix C for the role of commutativity

vs. joint measurability.

7 Conclusions

We have obtained a hypergraph framework for ob-

taining noise-robust noncontextuality inequalities cor-

responding to KS-colourable scenarios, suitably aug-

mented with preparation procedures in the spirit of

Spekkens contextuality [18]. The inequalities take the form of a noncontextual tradeoff between the three operational quantities Corr, R, and $p_0$, cf. Eq. (72).

This framework leverages the graph invariants from

the graph-theoretic framework of CSW for doing this,

in addition to a new hypergraph invariant (Eq. (45))

that we call the weighted max-predictability. Our ap-

proach is general enough to be applicable to any situ-

ation involving noisy preparations and measurements

that arises from a KS-colourable contextuality sce-

nario.

We conclude with a list of open questions raised in

this paper and other directions for future research:

1. Characterizing structural Specker's principle from probabilistic models on a hypergraph Γ:

Given that $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$ for some Γ, is it the case that Γ must then necessarily satisfy structural Specker's principle, namely, that every clique in O(Γ) is a subset of some hyperedge in Γ? Or is it the case that there exists a hypergraph Γ′ for which $\mathcal{CE}^1(\Gamma') = \mathcal{G}(\Gamma')$ but structural Specker's principle fails?

More generally, is there any characterization of a hypergraph satisfying structural Specker's principle entirely in terms of the probabilistic models on it?

As already pointed out earlier, this open question relates to the open Problem 7.2.3 of Ref. [23] of characterizing Γ for which $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$. It is known that Γ representing bipartite Bell scenarios [55] satisfy the property $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$ and we have provided a generic recipe for converting any Γ that does not satisfy structural Specker's principle to a Γ′ that does satisfy it so that $\mathcal{CE}^1(\Gamma') = \mathcal{G}(\Gamma')$. The question is if there are any other Γ that also satisfy $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$.

2. Almost quantum theory: We know that an almost

quantum theory cannot satisfy Specker’s princi-

ple [49] but it satisﬁes statistical Specker’s princi-

ple (or consistent exclusivity). An open question

that remains is:

Can an almost quantum theory satisfy structural

Specker’s principle?

If not, this would render the satisfaction of con-

sistent exclusivity by an almost quantum theory

unexplained by a natural structural feature of

measurements in the theory, namely, the satisfac-

tion of structural Specker’s principle, i.e., almost

quantum theory would not fall in the category of

operational theories envisaged in Ref. [30].

3. Conditions for saturating the noise-robust non-

contextuality inequalities:

As mentioned in Section 6.2, it is an open ques-

tion whether the noise-robust noncontextuality

inequalities of Eq. (72) based on our generaliza-

tion of the CSW framework [22] can be saturated

by a noncontextual ontological model.


More generally, the status of these noise-robust noncontextuality inequalities vis-à-vis the algorithmic approach of Ref. [17] for finding necessary

and suﬃcient conditions for noncontextuality in

a general prepare-and-measure scenario remains

to be explored. One would suspect that the al-

gorithmic approach of Ref. [17] when adapted

to the kind of situation considered in this paper

would yield nontrivial noncontextuality inequali-

ties that aren’t merely generalizations of the ones

obtained in the CSW framework [22]. It would be

interesting to investigate the full structure of this

set of inequalities and compare it with the facet-

deﬁning Bell-KS inequalities of the CSW frame-

work.

4. Properties of the weighted max-predictability, $\beta(\Gamma_G, q)$:

Since the crucial new hypergraph-theoretic ingredient in our inequalities is the weighted max-predictability, it would be interesting to understand properties of this hypergraph invariant on both counts: as a new mathematical object in its own right, one we haven't been able to find a reference to in the hypergraph theory literature, as well as an important parameter of a hypergraph relevant for noise-robustness of a noise-robust noncontextuality inequality. Indeed, as we point out in Footnote 34, identifying a distribution q (in the definition of Corr, Eq. (53)) that minimizes $\beta(\Gamma_G, q)$ for a given $\Gamma_G$ would lead to better noise-robustness in the inequalities of Eqs. (70) or (71).

5. Noise-robust applications of quantum protocols

based on KS-contextuality:

A general research direction is to construct noise-robust versions of applications that have previously been suggested for KS-contextuality. Our approach provides a recipe for doing this for any Bell-KS inequality appearing in such applications. Besides serving as a witness for strong nonclassicality [56] (i.e., Spekkens contextuality), noise-robust versions of these applications can help benchmark the experiments in terms of the noise that can be tolerated while still witnessing nonclassicality. Examples of such applications include those from Refs. [58–63].

[Footnote on strong nonclassicality: As opposed to weak nonclassicality that can arise in epistemically restricted classical theories [57]. See also the talk at Ref. [56], 41:43 minutes, for a short discussion.]

Acknowledgments

I would like to thank Andreas Winter for his comments on an earlier version of some of these ideas, Tobias Fritz for the ping-pong and the sing-song in which we often talked about hypergraphs, Rob Spekkens for the often argumentative, but always productive, conversations over lunch, and participants at the Contextuality conference (CCIOSA) at Perimeter Institute, during July 24 - 28, 2017, for very stimulating discussions that fed into the narrative of this paper. I would also like to thank David Schmid, Ana Belén Sainz, Elie Wolfe, and Tomáš Gonda for helping me better articulate the difference between structural vs. statistical readings of Specker's principle, and Eric Cavalcanti for comments on the manuscript. Theorem 1 owes its origin to a discussion with Tomáš Gonda. I would also like to thank anonymous referees for suggestions that immensely improved the presentation of these results. Research at Perimeter Institute is supported by the Government of Canada through the Department of Innovation, Science and Economic Development Canada, and by the Province of Ontario through the Ministry of Research, Innovation and Science.

A Status of KS-contextuality as an experimentally testable notion of nonclassicality for POVMs in quantum theory

The purpose of this section is to emphasize how the

progression from KS-contextuality to Spekkens con-

textuality for KS-type contextuality experiments is a

natural one rather than an ad hoc move from one

framework to another. That is, Spekkens contex-

tuality is not just another notion of nonclassical-

ity that is incomparable with KS-contextuality, but

is indeed intimately connected in its motivations to

the limitations of KS-contextuality [18]. In partic-

ular, we will focus on the role of KS-contextuality

with respect to POVMs and why allowing arbitrary

POVMs poses a diﬃculty for KS-contextuality as

a notion of nonclassicality that is experimentally

testable, i.e., a notion that applies to noisy measure-

ments (POVMs) typically implemented in a labora-

tory experiment.

While one may be tempted to

reject this premise for assessing the suitability of KS-

contextuality as a notion of nonclassicality – claim-

ing instead that KS-contextuality was never meant

for POVMs and applies only to “puriﬁed” experi-

ments (namely, ones with only PVMs and pure states)

– the reasons for doing so are rooted in the litera-

ture on KS-contextuality where POVMs have indeed

been considered and (at least) two kinds of conclu-

sions drawn: one, that there exists a Kochen-Specker

contradiction for POVMs, even on a qubit, so KS-

contextuality for POVMs is interesting [64] and two,

that allowing arbitrary POVMs in assessing nonclassicality would make the research program of identifying device-independent principles for quantum correlations in KS-contextuality experiments ill-defined, so quantum correlations allowing arbitrary POVMs are “pathological” [73]. We will look at these arguments in turn and use the latter, in particular, to segue into our motivations for the framework proposed in this paper.

(Footnote: And how a rather compelling way to arrive at a notion that is experimentally testable is Spekkens contextuality.)

A.1 Limitations of KS-contextuality vis-à-vis POVMs

A.1.1 KS-contextuality for POVMs in the literature

The ﬁrst paper that applied KS-contextuality to the

case of POVMs was by Cabello [64] where a KS-

uncolourability argument for POVMs on a single

qubit was proposed. This was motivated by the

Gleason-type derivation of the Born rule starting with

the structure of POVMs due to Busch [66] and Caves

et al. [67], analogous to the case of the Kochen-

Specker theorem [19] which can be seen as motivated

by Gleason’s theorem [68]. Insofar as there exists a

Gleason-type theorem for POVMs [66, 67], one could

motivate KS-contextuality as a reasonable notion of

nonclassicality for POVMs, as was presumably the

case in Ref. [64]. The role of this notion of nonclas-

sicality is then just to argue – using a ﬁnite set of

POVM elements – that no KS-noncontextual assign-

ment of outcomes is possible for certain ﬁnite sets

of POVMs in quantum theory. Should we, however,

assume that it is reasonable to demand determinis-

tic assignment of outcomes to POVM elements in an

ontological model, just as we do for PVM elements?

The argument of Ref. [64] was later criticized on var-

ious counts [18, 28, 65] and we refer the reader to

Ref. [28] for criticisms pertinent to this paper, namely,

that outcome determinism for all unsharp measure-

ments (ODUM in Ref. [28]) in quantum theory is un-

tenable.

Other works in the literature where KS-

contextuality for POVMs has been explored include

Refs. [69–72].

(Footnote: Ref. [28] is also a good resource for a detailed analysis of arguments concerning dilations of POVMs, which we will not get into here. Besides, it also provides a principled recipe for assigning response functions to POVMs.)

Besides, doubts about the experimental testability of the KS theorem were raised in the late '90s in a series of papers by Meyer, Clifton, and Kent [74–76]. A review can be found in Ref. [77]. These doubts were premised on the idea that the set of KS-colourable projectors (or PVMs) on any given Hilbert space is dense in the set of all projectors (or PVMs) on that Hilbert space. That is, for any given set of PVMs yielding a KS contradiction, it is always possible to find PVMs which are arbitrarily “close” to the PVMs required for a KS contradiction (for any finite precision) but which do not themselves lead to a KS contradiction. The property of denseness of KS-colourable sets of measurements in the set of all measurements in fact extends to even the most general case when the measurements are POVMs on any Hilbert space. So, even a KS contradiction for POVMs (such as the one in Ref. [64]) falls prey to the Meyer-Clifton-Kent argument [77]. As Ref. [77] notes:

Dealing with projective measurements is

arguably not enough. One quite popular

view of quantum theory holds that a cor-

rect version of the measurement rules would

take POV measurements as fundamental,

with projective measurements either as spe-

cial cases or as idealisations which are never

precisely realised in practice. In order to de-

ﬁne an NCHV theory catering for this line of

thought, Kent constructed a KS-colourable

dense set of positive operators in a complex

Hilbert space of arbitrary dimension, with

the feature that it gives rise to a dense set of

POV decompositions of the identity (Kent,

1999). Clifton and Kent constructed a dense

set of positive operators in complex Hilbert

space of arbitrary dimension with the special

feature that no positive operator in the set

belongs to more than one decomposition of

the identity (Clifton & Kent, 2000). Again,

the resulting set of POV decompositions is

dense, and the special feature ensures that

one can average over hidden states to recover

quantum predictions.

Hence, in any ﬁnite precision experiment it would

be impossible to test the Kochen-Specker theorem,

i.e., such an experimental test would require an in-

ﬁnitely precise measurement and measurements in a

real-world laboratory are never inﬁnitely precise. Al-

though there was a lively debate along these lines

(see the references in [77]), the resolutions that were

proposed all involved modifying the notion of KS-

noncontextuality by adding auxiliary assumptions

that seek to exclude the Meyer-Clifton-Kent type ar-

guments. A recent attempt in this direction can be

found in Ref. [78] where a notion of “ontological faith-

fulness” is proposed. As such, it was already recog-

nized – for reasons independent of Spekkens contex-

tuality [18] – that the notion of KS-noncontextuality

needs to be revised if one is to make it experimen-

tally testable.

What Spekkens brought to the fore [18], besides generalizing the notion of contextuality to all experimental procedures rather than measurements alone, was the idea that an experimental test of noncontextuality should not rely on inequalities that presume outcome determinism, just as a test of local causality does not require the assumption of outcome determinism. Indeed, the assumption of outcome determinism for sharp measurements in quantum theory is derived in the Spekkens framework from the assumption of preparation noncontextuality rather than being assumed independently.

(Footnote: Of course, this takes nothing away from the importance of the Kochen-Specker theorem [19] as a no-go theorem concerning the logical structure of quantum theory and the constraints it places on the ontological models possible for the theory.)

We will now consider the more modern approach

to KS-contextuality along the lines of the frameworks

in Refs. [22, 23, 25] to segue into our framework for

Spekkens contextuality which we develop in this pa-

per.

A.1.2 Classifying probabilistic models: restriction of quantum models to PVMs

Research on KS-contextuality took a diﬀerent turn

with the advent of the graph-theoretic framework of

Cabello, Severini and Winter in 2010 [21] (revised

slightly in 2014 [22]), the sheaf-theoretic framework of

Abramsky and Brandenburger in 2011 [25], and the

hypergraph based formalism of Acín, Fritz, Leverrier,

and Sainz in 2012 [23]. The unifying theme of these

contributions was that they took the key mathemat-

ical idea underlying KS-noncontextuality and Bell-

locality — namely, that both are instances of the clas-

sical marginal problem [26, 32, 33] — and built frame-

works that sought to distinguish between classical the-

ories (namely, those admitting KS-noncontextual on-

tological models), quantum theory, and post-quantum

general probabilistic theories by classifying their em-

pirical predictions relative to a Kochen-Specker ex-

periment into these categories. All these frameworks,

motivated by the device-independence paradigm, es-

chewed the erstwhile restriction of the notion of KS-

noncontextuality to quantum theory and sought to

make their analysis theory-independent, relying only

on empirical predictions relative to a KS experiment

to classify theories. They separated the assumption

of KS-noncontextuality from the operational theory –

namely, quantum theory – to which it was originally

meant to apply, allowing arbitrary operational theo-

ries in their analysis. However, there was a key dis-

tinction between Bell scenarios and KS-contextuality

scenarios that was lost in this formal uniﬁcation:

namely, that while the deﬁnition of a quantum proba-

bilistic model in a Bell scenario need not be restricted

to (local) PVMs (and arbitrary local POVMs can be

allowed without changing the set of quantum models), the same is not true of a KS-contextuality scenario. Indeed, as Henson and Sainz note in their work [73] (proposing a principle bounding the KS-contextuality possible in quantum theory, namely, “Macroscopic Noncontextuality”), reflecting on the question of allowing arbitrary POVMs in the definition of a quantum probabilistic model:

...if we allow general POVMs rather than projective measurements then no principle that places a non-trivial restriction on correlations will be respected. Thus, this kind of “quantum model” is clearly pathological.

One way to motivate the present work is as a re-

sponse to the pathology that Henson and Sainz allude

to: that trivial POVMs can realize any probabilistic

model, hence allowing arbitrary POVMs makes the

problem of ﬁnding principles to identify quantum cor-

relations in KS-contextuality scenarios trivial, i.e., all

probabilistic models are quantum and there is nothing

to be learnt about post-quantum probabilistic mod-

els. This is because any set of probabilities satisfying

the “no-disturbance” or “no-signalling” condition (of

which the E1 correlations of CSW [22] are a subset,

in general) can be achieved by (trivial) POVMs by

simply multiplying an identity operator with every

probability in such an assignment of probabilities.

By the lights of KS-noncontextuality as one’s notion

of classicality, then, trivial POVMs saturating the

general probabilistic bound on the correlations would

seem to be maximally nonclassical (i.e., maximally

KS-contextual). To avoid such “pathological” quan-

tum models, they restrict the deﬁnition of a quantum

model to allow only projective measurements. Indeed,

with recent work on a sensible notion of “sharp” mea-

surement in a general probabilistic theory [30, 31],

an appeal to the “fundamental sharpness” of all mea-

surements (see, e.g., [29]) is made to restrict attention

to sharp measurements in both quantum theory and

general probabilistic theories.
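The observation that multiplying the identity operator by each probability yields a POVM whose statistics are the same on every state can be checked numerically. The following is a minimal sketch in Python with NumPy; the helper names (`trivial_povm`, `random_density_matrix`, `born_probabilities`) and the sample distribution `target` are illustrative assumptions, not part of the formalism of this paper.

```python
import numpy as np

def trivial_povm(probs, dim):
    """Return the POVM elements p_k * I for a probability vector `probs`."""
    probs = np.asarray(probs, dtype=float)
    assert np.all(probs >= 0) and np.isclose(probs.sum(), 1.0)
    return [p * np.eye(dim, dtype=complex) for p in probs]

def random_density_matrix(dim, rng):
    """Sample a random density matrix (positive semidefinite, unit trace)."""
    x = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = x @ x.conj().T
    return rho / np.trace(rho).real

def born_probabilities(rho, povm):
    """Born rule: p_k = Tr(rho E_k) for each POVM element E_k."""
    return np.array([np.trace(rho @ element).real for element in povm])

rng = np.random.default_rng(0)
target = [0.5, 0.3, 0.2]          # any probability distribution (hypothetical example)
povm = trivial_povm(target, dim=3)

# The statistics are the target distribution for every state sampled.
stats = [born_probabilities(random_density_matrix(3, rng), povm) for _ in range(5)]
for s in stats:
    print(np.round(s, 6))
```

Because Tr(ρ · p I) = p for every normalized state ρ, the printed rows coincide with `target` regardless of the preparation, which is the sense in which such a measurement cannot distinguish any pair of states.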

On the other hand, the approach in this paper is different. In particular, we want our approach to capture the intuition that trivial POVMs are “classical” (and not pathological), so we must go beyond KS-noncontextuality. A simple operational sense in which trivial POVMs are “classical” is that they reveal nothing about the quantum state on which they are measured, being incapable of distinguishing any pair of states whatsoever. The correlations (denoted by R([s|S])) usually examined in a KS-contextuality experiment do not allow such experiments to witness the “triviality” of trivial POVMs, i.e., the fact that they correspond to a fixed probability distribution that doesn't vary even as the choice of preparation is varied. Moreover, since all nonprojective measurements are excluded by fiat in traditional Kochen-Specker type approaches [22, 23] for reasons alluded to by Henson and Sainz [73], one loses out on the potential to explore the possibilities that nontrivial and nonprojective measurements offer with respect to contextuality.

(Footnote: Trivial POVMs are, therefore, trivial resolutions of the identity, where every POVM element is proportional to identity, i.e., $\{a_k I\}_k$, such that $a_k \in [0, 1]$ and $\sum_k a_k = 1$.)

(Footnote: Indeed, any trivial POVM can be realized in the following operational manner: take the quantum system prepared in some state, throw it in the garbage, and then sample from the classical probability distribution corresponding to the trivial POVM.)

Our approach, therefore, is to allow arbi-

trary POVMs when considering probabilistic models

arising from quantum theory (and not restricting to

any notion of “sharp measurements” in general prob-

abilistic theories) but examine more quantities than

are examined in traditional approaches, i.e., besides

the quantity R typical in a KS-contextuality scenario,

we invoke the quantity Corr to account for noise in the

measurements.

If one restricts attention to operational theories

that can always achieve Corr = 1 for any KS-

contextuality scenario, then the usual classiﬁcation

of probabilistic models following Refs. [22, 23] holds

(Eq. (72)). What is of interest in our framework, how-

ever, is the tradeoﬀ between R and Corr: how large

can both R and Corr be in an operational theory?

(See Eq. (72).)

A.2 Robustness of Bell nonlocality vis-à-vis POVMs

Note that whenever we refer to “Bell-KS” functionals

or inequalities for Kochen-Specker type experiments,

we are not thinking of experiments that are Bell ex-

periments [4, 5, 7, 9–11], which have spacelike sepa-

ration between multiple parties, each performing lo-

cal measurements on a shared multipartite prepara-

tion. For the case of Bell experiments, trivial local

POVMs assigned to each party in a Bell experiment

do not lead to Bell violations for a simple reason: the

trivial POVMs for each party are all compatible with

each other, thereby admitting a joint probability dis-

tribution over their outcomes for each party; taking a

product of these local joint probability distributions

(one for each party) results in a joint distribution over

all measurements of all parties, hence satisfying Bell

inequalities. The fact that the POVMs are trivial

ensures that the Bell inequalities are satisﬁed regard-

less of the choice of shared quantum state. On the

other hand, forgetting the constraint of local POVMs,

there always exist global trivial POVMs that can vi-

olate Bell inequalities: e.g., just take the Popescu-

Rohrlich (PR) box distribution [43], and multiply an

identity operator (on the joint Hilbert space of Al-

ice and Bob) with each probability in the PR-box;

this results in four trivial POVMs, deﬁned over the

joint Hilbert space, that together violate the CHSH

inequality maximally. But, of course, this violation

is uninteresting because it doesn’t obey the locality

constraint on the measurements in a Bell experiment.

This is mathematically reﬂected in the fact that the

PR-box distribution cannot be written as a convex

mixture of product distributions, one for each party,

hence the corresponding trivial POVM cannot be understood in terms of trivial local POVMs. Hence, it is the locality of the trivial POVMs in a Bell experiment that prevents them from violating a Bell inequality and renders them non-pathological, unlike in the case of KS-contextuality. The fact that they are “trivial” in the sense of being unable to distinguish two quantum states plays a role in the sense that, regardless of the shared quantum state, these POVMs yield fixed distributions over the measurement outcomes, thus always allowing the construction of a fixed (that is, independent of the quantum state) global joint probability distribution over all measurements in a Bell scenario. Since there are no such locality constraints on the form of the POVM elements in a Kochen-Specker experiment, they can easily violate any KS-noncontextuality inequality. For example, the two-party CHSH experiment, considered as a Kochen-Specker experiment with four observables in a 4-cycle where adjacent pairs are jointly measurable, allows for trivial POVMs (like the PR-box trivial POVM above) violating the CHSH-type Bell-KS inequality in this scenario maximally. By the lights of KS-noncontextuality, this violation would indicate the maximum possible KS-contextuality with respect to this CHSH-type inequality.

(Footnote: All trivial POVMs are nonprojective, but not all nonprojective POVMs are trivial. Indeed, see Refs. [39–42] for examples of generalized contextuality [18] with nonprojective measurements, albeit assuming operational quantum theory.)
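The PR-box construction described above (multiply the identity on the joint Hilbert space by each PR-box probability) can be sketched numerically as follows. This is an illustrative Python/NumPy sketch under stated assumptions: the function names, the choice of a 4-dimensional joint space, and the normalization of the CHSH expression as an average over the four settings (for which the noncontextual bound is 3/4) are choices made here for illustration.

```python
import itertools
import numpy as np

def pr_box(a, b, x, y):
    """PR-box distribution: p(a,b|x,y) = 1/2 if a XOR b == x AND y, else 0."""
    return 0.5 if (a ^ b) == (x & y) else 0.0

def pr_trivial_povm(x, y, dim):
    """Joint POVM for settings (x, y): each element is p(a,b|x,y) times the identity."""
    return {(a, b): pr_box(a, b, x, y) * np.eye(dim)
            for a, b in itertools.product((0, 1), repeat=2)}

def chsh_value(rho):
    """(1/4) * sum over settings of the probability that a XOR b == x AND y."""
    dim = rho.shape[0]
    total = 0.0
    for x, y in itertools.product((0, 1), repeat=2):
        povm = pr_trivial_povm(x, y, dim)
        for (a, b), element in povm.items():
            if (a ^ b) == (x & y):
                total += np.trace(rho @ element).real
    return total / 4

dim = 4  # joint Hilbert space of two qubits (illustrative choice)
states = {
    "maximally mixed": np.eye(dim) / dim,
    "product |00>": np.diag([1.0, 0.0, 0.0, 0.0]),
}
values = {name: chsh_value(rho) for name, rho in states.items()}
for name, value in values.items():
    print(f"{name}: CHSH value = {value:.3f} (noncontextual bound 0.75)")
```

The value is the same for both states because every POVM element is proportional to the identity; nothing about the preparation enters the statistics.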

For all these reasons, our discussion of KS-

noncontextuality as a notion of classicality — in an

experiment with no locality constraints on the mea-

surements — does not extend to the case of Bell-

locality (or local causality) as a notion of classicality

in a Bell experiment, where the experiment must re-

spect locality constraints on the measurements for a

Bell inequality violation to be meaningful.

The unification of Bell nonlocality and KS-contextuality à la Refs. [22, 23, 25] forces a certain

dichotomy in these approaches: while in Bell scenar-

ios, one need not restrict to any notion of a “sharp”

measurement in the deﬁnition of probabilistic models

(and thus claim “theory independence”), in Kochen-

Specker scenarios, one must make some statement

about the nature of the measurements (concerning

their presumed sharpness [29], or that their joint mea-

surability [42, 44] is restricted to commutativity [25]),

rendering any putative “theory independence” claim

(on a level at par with Bell nonlocality) unfounded.

See Appendix C for more discussion.

(Footnote: See Ref. [27] for how this lack of locality of measurements in a Kochen-Specker type experiment translates, at the ontological level, to the unreasonableness of assuming factorizability in the ontological model; this factorizability (or the stronger condition of outcome determinism) is invoked to justify the resulting derivation of Bell-KS inequalities as constraints from a classical marginal problem.)

B Ontological models without respecting coarse-graining relations

Here we will construct explicit examples where the

coarse-graining relations are not respected in an on-

tological model, in contrast to the requirement on the

representation of coarse-grainings that we invoked in

Section II.C of the main text. The goal is to empha-

size that the requirement of Section II.C is necessary

not only for the treatment of Spekkens contextuality

but also for Kochen-Specker contextuality. Below, we

ﬁrst demonstrate how a “KS-noncontextual” model

can be constructed for any scenario that proves the

KS theorem by using the example of the KCBS setup

[47]. We then proceed to demonstrate how a “prepa-

ration and measurement noncontextual” model can be

constructed in a similar way when considering generalized noncontextuality [18].

B.1 How to construct a “KS-noncontextual” ontological model of the KCBS experiment [47] without coarse-graining relations

Here we have that $\mathbb{M}$ contains at least the following measurement settings: $\{M_i\}_{i=1}^{5}$, each with three possible outcomes, $m_i \in \{0, 1, 2\}$. The measurement events for each measurement setting $M_i$ can be coarse-grained in two different ways, defining new measurement settings $M_i^{(0)}$ (with outcomes $m_i^{(0)} \in \{0, \bar{0}\}$) and $M_i^{(2)}$ (with outcomes $m_i^{(2)} \in \{2, \bar{2}\}$), where the coarse-graining relations are given by

$$[0|M_i^{(0)}] \equiv [0|M_i], \tag{123}$$
$$[\bar{0}|M_i^{(0)}] \equiv [1|M_i] + [2|M_i], \tag{124}$$
$$[2|M_i^{(2)}] \equiv [2|M_i], \tag{125}$$
$$[\bar{2}|M_i^{(2)}] \equiv [0|M_i] + [1|M_i]. \tag{126}$$

In the operational theory, these coarse-graining relations are respected, i.e., for all $[s|S]$, $s \in V_S$, $S \in \mathbb{S}$,

$$p(0, s|M_i^{(0)}, S) \equiv p(0, s|M_i, S), \tag{127}$$
$$p(\bar{0}, s|M_i^{(0)}, S) \equiv p(1, s|M_i, S) + p(2, s|M_i, S), \tag{128}$$
$$p(2, s|M_i^{(2)}, S) \equiv p(2, s|M_i, S), \tag{129}$$
$$p(\bar{2}, s|M_i^{(2)}, S) \equiv p(0, s|M_i, S) + p(1, s|M_i, S). \tag{130}$$

However, we do not require that these relations be respected in an ontological model. Now, the KCBS argument requires the following operational equivalences,

$$[2|M_i^{(2)}] \simeq [0|M_{i+1}^{(0)}], \tag{131}$$

for all $i \in \{1, 2, 3, 4, 5\}$, where addition is modulo 5, so that $i + 1 = 1$ for $i = 5$. A KS-noncontextual ontological model for this experiment requires that

$$\xi(2|M_i^{(2)}, \lambda) = \xi(0|M_{i+1}^{(0)}, \lambda) \in \{0, 1\}, \quad \forall \lambda \in \Lambda. \tag{132}$$

Constructing such a model requires one to specify response functions for the measurements $\{M_i^{(0)}, M_i^{(2)}\}_{i=1}^{5}$. However, since there are no constraints from coarse-graining relations on these response functions, there is no obstruction to the construction of a “KS-noncontextual model” of this type for any set of operational statistics. In particular, since we do not require that $\forall \lambda \in \Lambda : \xi(0|M_{i+1}^{(0)}, \lambda) \equiv \xi(0|M_{i+1}, \lambda)$, nor that $\forall \lambda \in \Lambda : \xi(2|M_i^{(2)}, \lambda) \equiv \xi(2|M_i, \lambda)$, we can assign arbitrary response functions to $\{M_i^{(0)}, M_i^{(2)}\}_{i=1}^{5}$, subject only to the condition from KS-noncontextuality that $\forall \lambda \in \Lambda : \xi(2|M_i^{(2)}, \lambda) = \xi(0|M_{i+1}^{(0)}, \lambda) \in \{0, 1\}$.

Note that, because coarse-graining relations are not respected, this does not imply that $\forall \lambda \in \Lambda : \xi(2|M_i, \lambda) = \xi(0|M_{i+1}, \lambda) \in \{0, 1\}$, which is the usual constraint we would have presumed from KS-noncontextuality when coarse-graining relations are respected in the ontological model. In the absence of any such constraints on the response functions for $\{M_i\}_{i=1}^{5}$, one can always reproduce their operational statistics, in particular the operational equivalences of the type $[2|M_i] \simeq [0|M_{i+1}]$, which follow from Eqs. (123), (125), and (131).
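One way to see why no obstruction remains is to write down such a model explicitly. The sketch below is an illustrative construction, not the paper's: it assumes an ontic state is a tuple of five bits, one per pair of operationally equivalent coarse-grained events, distributed as a product of the operational marginals. The input probabilities (each equal to 1/√5, the quantum-optimal KCBS values) and all function names are assumptions made for this example.

```python
import itertools
import math

n = 5
# Operational probabilities of the five coarse-grained events; 1/sqrt(5) each is
# the quantum-optimal KCBS assignment, used here as sample input statistics.
p = [1 / math.sqrt(5)] * n

# Ontic states: lam = (e_0, ..., e_4) with e_i in {0, 1}.
ontic_states = list(itertools.product((0, 1), repeat=n))

def mu(lam):
    """Product distribution over ontic states with marginals p[i]."""
    weight = 1.0
    for e, p_i in zip(lam, p):
        weight *= p_i if e else 1 - p_i
    return weight

def xi_coarse(i, lam):
    """Deterministic value shared by the i-th pair of equivalent coarse-grained events."""
    return lam[i]

# The model reproduces every operational marginal ...
marginals = [sum(mu(lam) * xi_coarse(i, lam) for lam in ontic_states) for i in range(n)]
kcbs_sum = sum(marginals)

# ... while giving positive weight to ontic states that assign 1 to two adjacent
# events, which respecting the coarse-graining relations would forbid.
adjacent_weight = sum(
    mu(lam) for lam in ontic_states
    if any(lam[i] and lam[(i + 1) % n] for i in range(n))
)

print("marginals:", [round(m, 4) for m in marginals])
print("sum of event probabilities:", round(kcbs_sum, 4), "(KCBS bound is 2)")
print("weight on 'adjacent both 1' ontic states:", round(adjacent_weight, 4))
```

The response functions are deterministic and context-independent, yet the sum of event probabilities exceeds the KCBS bound of 2. That is possible only because nothing ties the coarse-grained events back to the mutually exclusive outcomes of a common three-outcome measurement.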

B.2 How to construct a “preparation and measurement noncontextual” ontological model without coarse-graining relations

Just as for measurements in the case of KS-noncontextuality, abandoning the coarse-graining relations for preparations in the case of generalized noncontextuality [18] makes possible the existence of a “preparation and measurement noncontextual” ontological model for any set of operational statistics. For the kinds of proofs of contextuality relevant to this article, the relevant notion of coarse-graining is that of complete coarse-graining: that is, consider two source settings $S$ and $S'$ with (respective) source events $\{[s|S]\}_{s \in V_S}$ and $\{[s'|S']\}_{s' \in V_{S'}}$, that can be completely coarse-grained to yield the operational equivalence $[\top|S_\top] \simeq [\top|S'_\top]$, cf. Eq. (18). In the operational description, where we assume the coarse-graining relation is respected, this is represented by

$$\forall [m|M],\ m \in V_M,\ M \in \mathbb{M} : \quad \sum_{s \in V_S} p(m, s|M, S) = \sum_{s' \in V_{S'}} p(m, s'|M, S'). \tag{133}$$

In the ontological description, however, we do not impose the coarse-graining relations $\mu(\lambda, \top|S_\top) \equiv \sum_{s \in V_S} \mu(\lambda, s|S)$ and $\mu(\lambda, \top|S'_\top) \equiv \sum_{s' \in V_{S'}} \mu(\lambda, s'|S')$, which makes it trivial to write down probability distributions $\mu(\lambda, \top|S_\top)$ and $\mu(\lambda, \top|S'_\top)$ such that $\mu(\lambda, \top|S_\top) = \mu(\lambda, \top|S'_\top)$ (as required by preparation noncontextuality applied to $[\top|S_\top] \simeq [\top|S'_\top]$) but where we do not require that $\sum_{s \in V_S} \mu(\lambda, s|S) = \sum_{s' \in V_{S'}} \mu(\lambda, s'|S')$ (which is not required by preparation noncontextuality). Note how the refusal to respect the coarse-graining relations, i.e., to identify $\mu(\lambda, \top|S_\top)$ with $\sum_{s \in V_S} \mu(\lambda, s|S)$ and $\mu(\lambda, \top|S'_\top)$ with $\sum_{s' \in V_{S'}} \mu(\lambda, s'|S')$, lifts the constraint from preparation noncontextuality that would have been in place if the coarse-graining relations were respected. The same refusal for the case of measurements lifts any constraints (just as in the case of KS-noncontextuality above) from measurement noncontextuality on the ontological model. It thus becomes trivial to construct a “preparation and measurement noncontextual” ontological model without coarse-graining relations.

(Footnote to Appendix B.1: This “KS-noncontextual” ontological model will thus reproduce operational equivalences of the type in Eq. (131).)

C Trivial POVMs

C.1 Bell-CHSH scenario

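We have the Hilbert space $\mathcal{H}_A \otimes \mathcal{H}_B$ for Alice ($\mathcal{H}_A$) and Bob ($\mathcal{H}_B$). Consider four binary-outcome POVMs, $\{A^{(0)}, A^{(1)}, B^{(0)}, B^{(1)}\}$, where

$$A^{(0)} \equiv \{A_0^{(0)}, A_1^{(0)}\}, \quad A^{(1)} \equiv \{A_0^{(1)}, A_1^{(1)}\}, \quad B^{(0)} \equiv \{B_0^{(0)}, B_1^{(0)}\}, \quad B^{(1)} \equiv \{B_0^{(1)}, B_1^{(1)}\}, \tag{134}$$

$0 \le A_a^{(0)}, A_a^{(1)} \le I_A$, $0 \le B_b^{(0)}, B_b^{(1)} \le I_B$, $A_0^{(0)} + A_1^{(0)} = A_0^{(1)} + A_1^{(1)} = I_A$, and $B_0^{(0)} + B_1^{(0)} = B_0^{(1)} + B_1^{(1)} = I_B$. The quantum probability, given a shared quantum state $\rho_{AB}$ defined on $\mathcal{H}_A \otimes \mathcal{H}_B$, is given by

$$p(a, b|x, y) = \mathrm{Tr}\left(\rho_{AB}\, A_a^{(x)} \otimes B_b^{(y)}\right), \tag{135}$$

for $a, b, x, y \in \{0, 1\}$. Here $A_a^{(x)} \otimes I_B$ is jointly measurable with $I_A \otimes B_b^{(y)}$, just because of the commutativity of their respective POVM elements. The joint observable being measured is $A^{(x)} \otimes B^{(y)}$. Now, consider the case when all the POVM elements are trivial, i.e., $A_a^{(x)} = q_a^{(x)} I_A$ and $B_b^{(y)} = r_b^{(y)} I_B$, for some $q_a^{(x)}, r_b^{(y)} \in [0, 1]$ for all $a, b, x, y \in \{0, 1\}$. We then have

$$p(a, b|x, y) = q_a^{(x)} r_b^{(y)}, \quad \forall a, b, x, y \in \{0, 1\}. \tag{136}$$

A global joint probability distribution which reproduces the above as marginals is simply given by their product:

$$p(a^{(0)}, a^{(1)}, b^{(0)}, b^{(1)}) \equiv q_{a^{(0)}}^{(0)}\, q_{a^{(1)}}^{(1)}\, r_{b^{(0)}}^{(0)}\, r_{b^{(1)}}^{(1)}. \tag{137}$$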

Hence, trivial POVMs never violate any Bell-CHSH

inequality for this scenario.
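The argument above can be checked on a concrete instance. The following Python sketch, under assumptions made here for illustration (randomly drawn local distributions `q` and `r`, hypothetical function names, and a CHSH expression normalized as an average over the four settings so that the local bound is 3/4), builds the product global distribution and verifies that it reproduces the pairwise statistics as marginals.

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
# q[x][a] and r[y][b]: fixed local outcome distributions of the trivial POVMs.
q = {x: rng.dirichlet([1, 1]) for x in (0, 1)}
r = {y: rng.dirichlet([1, 1]) for y in (0, 1)}

def p_pair(a, b, x, y):
    """Pairwise statistics of trivial local POVMs: product of local distributions."""
    return q[x][a] * r[y][b]

def p_global(a0, a1, b0, b1):
    """Global joint distribution over the outcomes of all four measurements."""
    return q[0][a0] * q[1][a1] * r[0][b0] * r[1][b1]

def marginal(a, b, x, y):
    """Marginal of the global distribution on the pair of settings (x, y)."""
    total = 0.0
    for a0, a1, b0, b1 in itertools.product((0, 1), repeat=4):
        if (a0, a1)[x] == a and (b0, b1)[y] == b:
            total += p_global(a0, a1, b0, b1)
    return total

# Normalized CHSH expression: average probability that a XOR b == x AND y.
chsh = sum(
    p_pair(a, b, x, y)
    for a, b, x, y in itertools.product((0, 1), repeat=4)
    if (a ^ b) == (x & y)
) / 4

print("CHSH value:", round(chsh, 4), "(local bound 0.75)")
```

Since the global distribution exists for every choice of `q` and `r`, changing the random seed changes the CHSH value but never pushes it above the local bound.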

C.2 CHSH-type contextuality scenario: 4-cycle

We now consider the Bell-CHSH scenario without the

constraint of spacelike separation. What the lack of

spacelike separation means from the quantum per-

spective is that one no longer needs to model this

spacelike separation by requiring a tensor product

structure, or (more generally) by requiring the com-

mutativity of the observables that are jointly mea-

sured [25, 79, 80]. That is, there is no physical justi-

ﬁcation for imposing the tensor product structure or

the commutativity of jointly measured observables.

(Footnote: On the other hand, what this lack of spacelike separation means from the perspective of an ontological model is that one no longer has a justification for assuming factorizability [25] and, consequently, the generalization of Fine's theorem [26] fails to prove that there is no loss of generality in assuming outcome determinism in discussions of KS-contextuality (unlike the case of Bell scenarios, where factorizability is justified by spacelike separation); there is a definite loss of generality, in that measurement noncontextual and outcome-indeterministic ontological models that are non-factorizable are not empirically equivalent to measurement noncontextual and outcome-deterministic (or KS-noncontextual) ontological models. See Ref. [27] for a discussion of this aspect.)

Thus, we have the Hilbert space $\mathcal{H}$ and we consider four binary-outcome POVMs, $\{A^{(0)}, A^{(1)}, B^{(0)}, B^{(1)}\}$ on $\mathcal{H}$, where

$$A^{(0)} \equiv \{A_0^{(0)}, A_1^{(0)}\}, \quad A^{(1)} \equiv \{A_0^{(1)}, A_1^{(1)}\}, \quad B^{(0)} \equiv \{B_0^{(0)}, B_1^{(0)}\}, \quad B^{(1)} \equiv \{B_0^{(1)}, B_1^{(1)}\}, \tag{138}$$

$0 \le A_a^{(0)}, A_a^{(1)}, B_b^{(0)}, B_b^{(1)} \le I$, $A_0^{(0)} + A_1^{(0)} = A_0^{(1)} + A_1^{(1)} = B_0^{(0)} + B_1^{(0)} = B_0^{(1)} + B_1^{(1)} = I$. Further, the following sets of POVMs are jointly measurable: $\{A^{(0)}, B^{(0)}\}$, $\{A^{(0)}, B^{(1)}\}$, $\{A^{(1)}, B^{(0)}\}$, $\{A^{(1)}, B^{(1)}\}$. The most general joint observable for a pair of compatible POVMs $\{A^{(x)}, B^{(y)}\}$ is given by a POVM $G^{(xy)} \equiv \{G_{00}^{(xy)}, G_{01}^{(xy)}, G_{10}^{(xy)}, G_{11}^{(xy)}\}$ (that isn't necessarily unique [42]) such that: $G_{00}^{(xy)} + G_{01}^{(xy)} = A_0^{(x)}$, $G_{10}^{(xy)} + G_{11}^{(xy)} = A_1^{(x)}$, $G_{00}^{(xy)} + G_{10}^{(xy)} = B_0^{(y)}$, $G_{01}^{(xy)} + G_{11}^{(xy)} = B_1^{(y)}$. In particular, if (and only if) the POVMs $A^{(x)}$ and $B^{(y)}$ commute, we can construct the joint POVM as a product: $G_{ab}^{(xy)} = A_a^{(x)} B_b^{(y)}$ for all $a, b, x, y \in \{0, 1\}$. In the absence of such commutativity, the joint POVM cannot be written as a product.

The quantum probability, given a quantum state $\rho$ on $\mathcal{H}$, is given by

$$p(a, b|x, y) = \mathrm{Tr}\left(\rho\, G_{ab}^{(xy)}\right), \tag{139}$$

for $a, b, x, y \in \{0, 1\}$. Note that this probability depends on the joint measurement $G^{(xy)}$ implementing $A^{(x)}$ and $B^{(y)}$ together, and that, in general, there may be multiple choices of $G^{(xy)}$ possible. This is easy to see since there is one undetermined positive operator in the joint measurement that is not fixed by $A^{(x)}$ or $B^{(y)}$, i.e., we can write the POVM elements of $G^{(xy)}$ as: $G_{01}^{(xy)} = A_0^{(x)} - G_{00}^{(xy)}$, $G_{10}^{(xy)} = B_0^{(y)} - G_{00}^{(xy)}$, $G_{11}^{(xy)} = I - A_0^{(x)} - B_0^{(y)} + G_{00}^{(xy)}$, where $G_{00}^{(xy)}$ is a positive semidefinite operator satisfying $A_0^{(x)} + B_0^{(y)} - I \le G_{00}^{(xy)} \le A_0^{(x)}, B_0^{(y)}$. Here $G_{00}^{(xy)}$ represents the freedom in the choice of how the joint measurement might be implemented within quantum theory. This freedom reflects the fact that since the jointly measured observables are no longer spacelike separated, it is possible to introduce correlations between them that are stronger than what is allowed in the corresponding Bell scenario in quantum theory. The strength of these correlations is only limited by the constraints on $G_{00}^{(xy)}$ imposed by the marginal observables $A^{(x)}$ and $B^{(y)}$. This is in contrast to the case where $A^{(x)}$ and $B^{(y)}$ are spacelike separated observables and the only choice of joint POVM consistent with spacelike separation is fixed by $G_{ab}^{(xy)} = A_a^{(x)} B_b^{(y)}$, i.e., the strength of correlations between $A^{(x)}$ and $B^{(y)}$ is fixed entirely by them and there is no freedom in choosing $G^{(xy)}$.
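The one-operator freedom in the joint measurement can be made concrete with a short numerical sketch. Here the free element is labelled `G00` and the remaining three elements are fixed by it together with the marginal elements `A0` and `B0`; these index labels, the function names, and the choice of maximally uninformative marginals `A0 = B0 = I/2` are assumptions made for illustration.

```python
import numpy as np

def joint_povm(A0, B0, G00):
    """Joint POVM elements G[(a, b)] determined by marginals A0, B0 and the free element G00."""
    identity = np.eye(A0.shape[0])
    return {
        (0, 0): G00,
        (0, 1): A0 - G00,
        (1, 0): B0 - G00,
        (1, 1): identity - A0 - B0 + G00,
    }

def is_povm(G, tol=1e-9):
    """Check positivity of every element and completeness (elements sum to the identity)."""
    dim = next(iter(G.values())).shape[0]
    positive = all(np.linalg.eigvalsh(element).min() >= -tol for element in G.values())
    complete = np.allclose(sum(G.values()), np.eye(dim))
    return positive and complete

dim = 2
identity = np.eye(dim)
A0 = 0.5 * identity            # trivial marginal element for the first measurement
B0 = 0.5 * identity            # trivial marginal element for the second measurement
rho = np.diag([1.0, 0.0])      # any state gives the same statistics here

# Sweep the free parameter g in G00 = g * I over its allowed range [0, 1/2].
correlation = {}
for g in (0.0, 0.25, 0.5):
    G = joint_povm(A0, B0, g * identity)
    assert is_povm(G)
    # Probability that the two outcomes agree: Tr(rho (G00 + G11)).
    correlation[g] = np.trace(rho @ (G[(0, 0)] + G[(1, 1)])).real

print(correlation)
```

With these marginals, `g = 0.25` is the product joint POVM (uncorrelated outcomes), while the endpoints `g = 0` and `g = 0.5` give perfectly anticorrelated and perfectly correlated outcomes, all with the same marginals. Values of `g` outside the allowed range make some element non-positive, so they do not define a POVM.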

Thus, we have that $A^{(x)}$ is jointly measurable with $B^{(y)}$ and $G^{(xy)}$ denotes a joint POVM of $A^{(x)}$ and $B^{(y)}$. Now, consider the case when all the POVM elements are trivial, i.e., $A_a^{(x)} = q_a^{(x)} I$ and $B_b^{(y)} = r_b^{(y)} I$, for some $q_a^{(x)}, r_b^{(y)} \in [0, 1]$ for all $a, b, x, y \in \{0, 1\}$.

In particular, consider the case where $q_a^{(x)} = r_b^{(y)} = \frac{1}{2}$ for all $a, b, x, y \in \{0, 1\}$. A possible joint POVM for these trivial POVMs is then the product POVM:

$$G_{ab}^{(xy)} = A_a^{(x)} B_b^{(y)}. \tag{140}$$

If one restricted joint measurability of $A^{(x)}$ and $B^{(y)}$ to just commutativity (a sufficient but not necessary condition for joint measurability [44]), we would take the above choice of the product POVM as a “natural” one. Being a product of trivial POVMs, this choice will never lead to a violation of the CHSH-type inequality for this scenario. Indeed, the structure of a Bell scenario, which requires the decomposition of the Hilbert space as $\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$ (tensor product paradigm) or, more generally, imposes the commutativity requirement $[A_a^{(x)}, B_b^{(y)}] = 0$ (commutativity paradigm), is such that the only possible choice of joint measurement that can be implemented by spacelike separated parties is the one that corresponds to the product POVM, given by operators $G_{ab}^{(xy)} = A_a^{(x)} B_b^{(y)}$.

(Footnote: Particularly in the absence of spacelike separation. It is the need to model spacelike separation in a quantum Bell experiment that makes commutativity a necessary (and sufficient) condition for joint measurability of spacelike separated observables in a Bell scenario.)

However, this is not the only allowed joint measurement for these trivial POVMs, particularly when there is no locality constraint on the measurements from spacelike separation. An extreme choice of joint POVM is the following:

$$G_{ab}^{PR(xy)} \equiv \frac{1}{2}\, \delta_{a \oplus b,\, xy}\, I, \tag{141}$$

which leads to the probability distribution $p(a, b|x, y) = \frac{1}{2}\, \delta_{a \oplus b,\, xy}$ for any choice of quantum state. Hence, this joint POVM $G^{PR(xy)}$ always yields statistics corresponding to the PR-box, maximally violating the CHSH-type inequality for this scenario, namely,

$$\frac{1}{4} \sum_{a,b,x,y} \delta_{a \oplus b,\, xy}\, p(a, b|x, y) \le \frac{3}{4}. \tag{142}$$

Physically, it's possible to implement this (without requiring any quantum resources) by providing a box that always produces these correlations between measurement settings denoted by $(xy) \in \{0, 1\}^2$, regardless of the input state. Such a black-box would maximally violate the CHSH-type inequality (viewed as a Bell-KS inequality witnessing KS-contextuality), but that shouldn't be surprising in the absence of spacelike separation. Also, the trivial PR-box joint POVM $G^{PR(xy)}$ is a perfectly valid way to implement the joint measurement of trivial POVMs $A^{(x)}$ and $B^{(y)}$ within the standard paradigm of operational quantum theory.

(Footnote: To incorporate such a constraint, spacelike separation needs to be modelled via either the tensor product paradigm or the commutativity paradigm. Both these ways of modelling spacelike separation lead to the same set of quantum correlations for any finite-dimensional Hilbert space $\mathcal{H}$ [79]. The question of whether the two paradigms lead to the same set of correlations in the case of infinite dimensional Hilbert spaces is the subject of Tsirelson's problem [79, 80]. Most studies of Bell-nonlocality are primarily concerned with finite dimensional Hilbert spaces; should one encounter infinite dimensional Hilbert spaces, the commutativity paradigm is the proper way to model spacelike separation.)

(Footnote: Note that the point of this demonstration is to show how, in the absence of spacelike separation justifying commutativity or a promise that the measurements are sharp, arbitrary correlations are achievable in quantum theory if unsharp measurements are allowed. All trivial POVMs are unsharp, but the converse is not true. That is, one can consider nontrivial POVMs that don't violate the CHSH-type inequality maximally, but which violate it (arbitrarily) more than is allowed by sharp measurements in quantum theory. One could construct them, for example, by just taking a convex combination of the PR-box trivial POVM with some sharp (and thus product) joint POVM.)

To summarize, we note the following:

• Within the traditional framework of KS-noncontextuality, if one wants to go beyond projective measurements to arbitrary POVMs in a contextuality scenario, then one must (in order to avoid the pathology of trivial POVMs violating the Bell-KS inequalities maximally) restrict by fiat the notion of joint measurability to merely commutativity. This is, for example, the attitude adopted in Ref. [25].

• However, if one is going beyond projective mea-

surements, we know that commutativity is only

a suﬃcient condition for joint measurability, not

a necessary one [44].

• This brings us to our observation that the tra-

ditional notion of KS-noncontextuality is patho-

logical once the most general situation in quan-

tum theory is considered: arbitrary POVMs with

the general notion of joint measurability (see,

e.g., Ref. [44] for this notion and its relation to

commutativity). In particular, in the absence of

spacelike separation, there is no physical justiﬁ-

cation to restrict the notion of joint measurability

to merely commutativity.

• A similar consideration applies at the level of

a KS-noncontextual ontological model: there,

factorizability is not justiﬁed in the absence of

spacelike separation. So, on those grounds alone,

one should go beyond KS-noncontextuality as

one’s notion of classicality; particularly, if one

wants a notion of classicality that does not pre-

sume outcome determinism, just as local causality

doesn’t presume it. This was argued in Ref. [27]:

imagine an adversarial setting where because of

the absence of spacelike separation in a KS-

contextuality experiment, two measurement set-

tings on the same system can exhibit correla-

tions that are independent of those induced by

the system on which the measurements are be-

ing implemented, thus allowing them to exhibit

stronger correlations than are possible in a KS-

noncontextual model. We use trivial POVMs

only to drive home that this can be done arbi-

trarily well (achieving PR-box type correlations,

in fact) if there is no constraint on the strength

of correlations the measurement settings can ex-

hibit. The way such constraints on the corre-

lations between the measurement settings show

up in our analysis within the Spekkens frame-

work is in terms of the quantity Corr: if Corr

is really high, the measurements in a noncon-

textual ontological model cannot be arbitrarily

strongly correlated, i.e., R cannot be arbitrarily

high (cf. Eq. (72)).
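The PR-box joint POVM discussed in this appendix can be checked with a few lines of exact arithmetic. The sketch below is only illustrative and rests on stated assumptions: it takes the joint POVM elements to be (1/2) δ_{a⊕b,xy} times the identity, so that each element is represented by its scalar coefficient and every normalized state returns that coefficient as the probability; it also assumes the CHSH-type expression carries a 1/4 prefactor with bound 3/4. The function names (`pr_coeff`, `chsh_value`, `deterministic`, `mixture`) are hypothetical helpers introduced here, and the deterministic product strategy in the mixture is a simple stand-in rather than a construction taken from the text.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

def pr_coeff(a, b, x, y):
    """Coefficient c with G^{PR(xy)}_{ab} = c * identity (assumed form of the PR-box POVM)."""
    return HALF if (a ^ b) == (x & y) else Fraction(0)

def chsh_value(p):
    """(1/4) * sum of p(a,b|x,y) over all a, b, x, y satisfying a XOR b = x AND y."""
    return Fraction(1, 4) * sum(
        p(a, b, x, y)
        for a, b, x, y in product((0, 1), repeat=4)
        if (a ^ b) == (x & y)
    )

# 1. For every setting pair the coefficients form a valid POVM whose
#    marginals are the trivial POVMs {I/2, I/2}.
for x, y in product((0, 1), repeat=2):
    coeffs = {(a, b): pr_coeff(a, b, x, y) for a, b in product((0, 1), repeat=2)}
    assert all(c >= 0 for c in coeffs.values())            # positivity
    assert sum(coeffs.values()) == 1                       # sums to the identity
    assert all(sum(coeffs[(a, b)] for b in (0, 1)) == HALF for a in (0, 1))
    assert all(sum(coeffs[(a, b)] for a in (0, 1)) == HALF for b in (0, 1))

# 2. Tr(rho * c * I) = c for any normalized state, so p(a,b|x,y) = pr_coeff.
pr_value = chsh_value(pr_coeff)

# 3. Deterministic assignments a = f[x], b = g[y] give the classical maximum.
def deterministic(f, g):
    return lambda a, b, x, y: Fraction(int(a == f[x] and b == g[y]))

classical_max = max(
    chsh_value(deterministic(f, g))
    for f in product((0, 1), repeat=2)
    for g in product((0, 1), repeat=2)
)

# 4. A convex mixture interpolates linearly between the two values.
def mixture(lam, f=(0, 0), g=(0, 0)):
    det = deterministic(f, g)
    return lambda a, b, x, y: lam * pr_coeff(a, b, x, y) + (1 - lam) * det(a, b, x, y)

mixed_value = chsh_value(mixture(Fraction(1, 10)))

print(pr_value, classical_max, mixed_value)
```

Under these assumptions the PR-box coefficients reach the algebraic maximum of the expression, while any nonzero mixing weight already exceeds the deterministic maximum; how large the weight must be to exceed what sharp quantum measurements allow is not computed here.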

D The KS-uncolourable hypergraph

It is instructive to consider the KS-uncolourable hypergraph $\Gamma_{18}$, originally appearing in Ref. [51], and studied in the light of Spekkens contextuality in Ref. [12]. This hypergraph fails both criteria for the hypergraphs $\Gamma$ considered in this paper, namely, $\mathcal{C}(\Gamma) \neq \varnothing$ (KS-colourability) and $\mathcal{CE}^1(\Gamma) = \mathcal{G}(\Gamma)$. For probabilistic models on $\Gamma_{18}$, the following hold: $\mathcal{C}(\Gamma_{18}) = \varnothing \subsetneq \mathcal{CE}^1(\Gamma_{18}) \subsetneq \mathcal{G}(\Gamma_{18})$. This was considered in Ref. [12], where $\mathcal{CE}^1(\Gamma_{18})$ excludes the extremal probabilistic model in $\mathcal{G}(\Gamma_{18})$ that corresponds to the upper bound on the noise-robust noncontextuality inequality of Ref. [12]. As argued in Ref. [12], this noise-robust noncontextuality inequality is the appropriate operational generalization (to possibly noisy measurements) of the Kochen-Specker contradiction first demonstrated in Ref. [51]; this generalization cannot be accommodated in our generalization of the CSW framework [22].

Figure 8: The hypergraph $\Gamma'_{18}$ and its subhypergraphs appearing in the three Bell-KS expressions of Eq. (143). The probabilistic model $p$ considered in Eq. (143) is a probabilistic model on $\Gamma'_{18}$, and not on the subhypergraphs. We have illustrated the subhypergraphs separately only for clarity regarding the subsets of vertices to which the Bell-KS expressions refer: the probabilities assigned to these vertices are obtained from probabilistic models on $\Gamma'_{18}$.

If one extends the KS-uncolourable $\Gamma_{18}$ to a KS-colourable hypergraph $\Gamma'_{18}$ with 9 “no-detection” events, one for each hyperedge, then we have $\mathcal{C}(\Gamma'_{18}) \neq \varnothing$, but it’s still the case that $\mathcal{C}(\Gamma'_{18}) \subsetneq \mathcal{CE}^1(\Gamma'_{18}) \subsetneq \mathcal{G}(\Gamma'_{18})$ for this hypergraph. Hence, $\Gamma'_{18}$ cannot be understood in our generalization of the CSW framework either.

Footnote: This follows from noting that extremal probabilistic models on $\Gamma_{18}$ are still extremal probabilistic models on $\Gamma'_{18}$: ones where the no-detection events are assigned zero probabilities. See Theorem 2.5.3 of Ref. [23].

Footnote: Note that adding these no-detection events is equivalent to allowing subnormalized probabilities (i.e., sum of probabilities assigned to measurement events in a hyperedge can be less than 1) on $\Gamma_{18}$. Hence, even allowing for subnormalization on $\Gamma_{18}$, which means that one is looking at probabilistic models on the hypergraph $\Gamma'_{18}$, does not eliminate the gap between $\mathcal{CE}^1$ probabilistic models and general probabilistic models, so that any upper bound on a Bell-KS expression given by probabilistic models in $\mathcal{CE}^1(\Gamma'_{18})$ is not always the same as the general probabilistic upper bound from probabilistic models in $\mathcal{G}(\Gamma'_{18})$. The CSW framework only considers the upper bound given by $\mathcal{CE}^1(\Gamma'_{18})$ probabilistic models.

Figure 9: Going from the orthogonality graph, $G$, of $\Gamma_{18}$ to the hypergraph $\Gamma_G$ (on the right) to which our noise-robust noncontextuality inequality pertains.

Indeed, if one “blindly” writes down a CSW classical bound for some Bell-KS expression defined on $O(\Gamma_{18})$, then such a bound is equivalently a bound for the same Bell-KS expression defined on $\Gamma'_{18}$ (where normalization is restored). Further, the E1 bound on $O(\Gamma_{18})$ is a $\mathcal{CE}^1$ bound on $\Gamma'_{18}$. The GPT bound happens to agree with the $\mathcal{CE}^1$ bound for a particular Bell-KS expression (sum of all probabilities) but differs for some other Bell-KS expressions defined on this hypergraph. Consider, for example, the following three expressions (see Fig. 8):

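$$\mathrm{Expr}_1 \equiv \sum_{v \in V(\Gamma_{[\ldots]})} p(v),$$

$$\mathrm{Expr}_2 \equiv \sum_{v \in V(\Gamma_{[\ldots]})} p(v),$$

$$\mathrm{Expr}_3 \equiv \sum_{v \in V(\Gamma_{[\ldots]})} p(v) + \sum_{v \in V(\Gamma_{[\ldots]})} p(v). \qquad (143)$$

We have:

$$\mathrm{Expr}_1 \overset{\mathcal{C}(\Gamma'_{18})}{\leq} 8 \overset{\mathcal{CE}^1(\Gamma'_{18})}{<} 9 \overset{\mathcal{G}(\Gamma'_{18})}{=} 9,$$

$$\mathrm{Expr}_2 \overset{\mathcal{C}(\Gamma'_{18})}{\leq} 1 \overset{\mathcal{CE}^1(\Gamma'_{18})}{=} 1 \overset{\mathcal{G}(\Gamma'_{18})}{[\ldots]},$$

$$\mathrm{Expr}_3 \overset{\mathcal{C}(\Gamma'_{18})}{\leq} 9 \overset{\mathcal{CE}^1(\Gamma'_{18})}{<} 10 \overset{\mathcal{G}(\Gamma'_{18})}{<} 10.5. \qquad (144)$$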

Thus, $\mathrm{Expr}_3$ is a Bell-KS expression that discriminates between probabilistic models at all three levels of the hierarchy. Indeed, the upper bound on $\mathrm{Expr}_3$ for $\mathcal{CE}^1(\Gamma'_{18})$ models can be saturated by projective quantum realizations of the hypergraph, in particular the standard realization with 18 rays, with the zero operator for the no-detection events [51]. The fact that there exists such a Bell-KS expression as $\mathrm{Expr}_3$ means that the $\mathcal{CE}^1$ upper bounds from the CSW approach can be violated by a general probabilistic model, i.e., the upper bounds for $\mathcal{CE}^1$ models and general probabilistic models don’t agree, and we cannot take the graph-theoretic upper bounds of CSW for granted in our noise-robust noncontextuality inequalities. Indeed, the general probabilistic upper bound for any Bell-KS expression defined on a contextuality scenario is a hypergraph invariant (in the sense that it is a property that is shared by all hypergraphs isomorphic to each other) that may or may not be expressible as a graph invariant à la CSW.

What, then, do the bounds given by graph invariants of CSW for $O(\Gamma_{18})$ mean in our generalization of the CSW framework? Following our approach, outlined in Sec. III.B, we can go from $G = O(\Gamma_{18})$ to the hypergraph $\Gamma_G = \Gamma_{O(\Gamma_{18})}$ (see Fig. 9) for which we have (by construction) $\mathcal{C}(\Gamma_{O(\Gamma_{18})}) \neq \varnothing$ (so that the underlying hypergraph is no longer KS-uncolourable) and $\mathcal{CE}^1(\Gamma_{O(\Gamma_{18})}) = \mathcal{G}(\Gamma_{O(\Gamma_{18})})$ (so that, for any Bell-KS expression, the upper bound given by the fractional packing number $\alpha^*(G, w)$ in the CSW framework agrees with the general probabilistic upper bound). Since this construction proceeds by converting all maximal cliques in $\Gamma_{18}$ to hyperedges in $\Gamma_{O(\Gamma_{18})}$ and adding a new vertex to each such hyperedge, it achieves both purposes: firstly, adding a (no-detection) vertex to every maximal clique that is a hyperedge in $\Gamma_{18}$ ensures the KS-colourability of $\Gamma_{O(\Gamma_{18})}$, i.e., $\mathcal{C}(\Gamma_{O(\Gamma_{18})}) \neq \varnothing$, and secondly, adding a vertex to every maximal clique that is not a hyperedge in $\Gamma_{18}$ ensures that $\mathcal{CE}^1(\Gamma_{O(\Gamma_{18})}) = \mathcal{G}(\Gamma_{O(\Gamma_{18})})$. Once these two properties are satisfied, the graph invariants of CSW [22] become applicable to any Bell-KS expression defined for any set of vertices in the subhypergraph $\Gamma'_{18}$ of $\Gamma_{O(\Gamma_{18})}$.
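The role of the clique-to-hyperedge construction can be illustrated on a much smaller example than the 18-ray hypergraph. The sketch below is a toy under stated assumptions, not the hypergraph of Ref. [51]: it uses a hypothetical three-hyperedge hypergraph in which the vertices v1, v2, v3 are pairwise exclusive without sharing a single hyperedge, takes the consistent-exclusivity condition to mean that probabilities on every clique of the orthogonality graph sum to at most 1, and builds the new hypergraph by turning every maximal clique into a hyperedge with one added vertex. All function and vertex names are introduced here for illustration only.

```python
from fractions import Fraction
from itertools import combinations, product

HALF = Fraction(1, 2)

def is_model(p, hyperedges):
    """General probabilistic model: nonnegative, every hyperedge sums to 1."""
    return all(q >= 0 for q in p.values()) and all(
        sum(p[v] for v in e) == 1 for e in hyperedges
    )

def vertices_of(hyperedges):
    return sorted({v for e in hyperedges for v in e})

def orthogonality_edges(hyperedges):
    """Pairs of vertices that share at least one hyperedge."""
    return {frozenset(pr) for e in hyperedges for pr in combinations(sorted(e), 2)}

def maximal_cliques(hyperedges):
    """Brute-force maximal cliques of the orthogonality graph (tiny inputs only)."""
    verts, edges = vertices_of(hyperedges), orthogonality_edges(hyperedges)
    cliques = [
        frozenset(c)
        for r in range(1, len(verts) + 1)
        for c in combinations(verts, r)
        if all(frozenset(pr) in edges for pr in combinations(c, 2))
    ]
    return [c for c in cliques if not any(c < d for d in cliques)]

def satisfies_ce1(p, hyperedges):
    """Assumed CE^1 condition: every clique has total probability at most 1."""
    return all(sum(p[v] for v in c) <= 1 for c in maximal_cliques(hyperedges))

def ks_colourings(hyperedges):
    """All deterministic (0/1) probabilistic models."""
    verts = vertices_of(hyperedges)
    for bits in product((0, 1), repeat=len(verts)):
        p = dict(zip(verts, map(Fraction, bits)))
        if is_model(p, hyperedges):
            yield p

def clique_hypergraph(hyperedges):
    """Turn each maximal clique into a hyperedge with one new vertex added."""
    return [c | {f"new{i}"} for i, c in enumerate(maximal_cliques(hyperedges))]

def extend(p, gamma_g):
    """Give each added vertex the probability missing from its hyperedge."""
    q = dict(p)
    for e in gamma_g:
        (new,) = [v for v in e if v not in p]
        q[new] = 1 - sum(p[v] for v in e if v in p)
    return q

# Toy hypergraph (hypothetical): v1, v2, v3 are pairwise exclusive.
base = [frozenset(e) for e in (("v1", "v2", "u12"), ("v2", "v3", "u23"), ("v1", "v3", "u13"))]
half = {"v1": HALF, "v2": HALF, "v3": HALF,
        "u12": Fraction(0), "u23": Fraction(0), "u13": Fraction(0)}

in_G = is_model(half, base)            # a valid general probabilistic model
in_CE1 = satisfies_ce1(half, base)     # but it violates the clique bound
best_C = max(p["v1"] + p["v2"] + p["v3"] for p in ks_colourings(base))

gamma_g = clique_hypergraph(base)
cliques_are_hyperedges = set(maximal_cliques(gamma_g)) == set(gamma_g)
half_survives = is_model(extend(half, gamma_g), gamma_g)
colourings_survive = all(is_model(extend(p, gamma_g), gamma_g) for p in ks_colourings(base))

print(in_G, in_CE1, best_C, cliques_are_hyperedges, half_survives, colourings_survive)
```

In this toy case the model assigning 1/2 to each of v1, v2, v3 is normalized on the original hyperedges yet cannot be extended to the clique hypergraph, because the added vertex of the clique {v1, v2, v3} would need a negative probability. Once every maximal clique is a hyperedge, normalization alone enforces the clique bound, which is the sense in which the two sets of models coincide after the construction.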

Our noise-robust noncontextuality inequality then applies to the KS-colourable hypergraph $\Gamma_{O(\Gamma_{18})}$ where the graph invariants of CSW make sense, rather than the KS-uncolourable hypergraph $\Gamma_{18}$. On the other hand, an appropriate noise-robust noncontextuality inequality for the KS-uncolourable hypergraph is, then, the one reported in Ref. [12].

Footnote: The approach for KS-uncolourable hypergraphs will be further developed in hypergraph-theoretic terms in forthcoming work [34].

References

[1] L. Hardy, “Quantum Theory From Five Reasonable Axioms”, arXiv:quant-ph/0101012 (2001).

[2] L. Masanes and M. P. Mueller, “A derivation of quantum theory from physical requirements”, New J. Phys. 13, 063001 (2011).

[3] G. Chiribella, G. M. D’Ariano, and P. Perinotti, “Probabilistic theories with purification”, Phys. Rev. A 81, 062348 (2010).

[4] J. S. Bell, “On the Einstein-Podolsky-Rosen paradox”, Physics 1, 195 (1964). Reprinted in Ref. [6], Chapter 2.

[5] J. S. Bell, “On the problem of hidden variables

in quantum mechanics”, Rev. Mod. Phys. 38, 447

(1966). Reprinted in Ref. [6], Chapter 1.

[6] J. S. Bell, “Speakable and Unspeakable in Quan-

tum Mechanics”, 2nd Edition, Cambridge Univer-

sity Press, 2004.

[7] J. F. Clauser, M. A. Horne, A. Shimony, and

R. A. Holt, “Proposed Experiment to Test Local

Hidden-Variable Theories”, Phys. Rev. Lett. 23,

880 (1969).

[8] N. Brunner, D. Cavalcanti, S. Pironio, V. Scarani,

and S. Wehner, “Bell nonlocality”, Rev. Mod.

Phys. 86, 419 (2014).

[9] B. Hensen et al., “Loophole-free Bell inequality vi-

olation using electron spins separated by 1.3 kilo-

metres”, Nature 526, 682 - 686 (2015).

[10] Lynden K. Shalm et al., “Strong Loophole-Free

Test of Local Realism”, Phys. Rev. Lett. 115,

250402 (2015).

[11] M. Giustina et al., “Signiﬁcant-Loophole-Free

Test of Bell’s Theorem with Entangled Photons”,

Phys. Rev. Lett. 115, 250401 (2015).

[12] R. Kunjwal and R. W. Spekkens, “From the

Kochen-Specker Theorem to Noncontextuality In-

equalities without Assuming Determinism”, Phys.

Rev. Lett. 115, 110403 (2015).

[13] M. D. Mazurek, M. F. Pusey, R. Kunjwal, K.

J. Resch, R. W. Spekkens, “An experimental test

of noncontextuality without unphysical idealiza-

tions”, Nat. Commun. 7, 11780 (2016).

[14] A. Krishna, R. W. Spekkens, and E. Wolfe, “De-

riving robust noncontextuality inequalities from

algebraic proofs of the Kochen-Specker theorem:

the Peres-Mermin square”, New J. Phys 19,

123031 (2017).

[15] D. Schmid and R. W. Spekkens, “Contextual Ad-

vantage for State Discrimination”, Phys. Rev. X

8, 011015 (2018).

[16] R. Kunjwal and R. W. Spekkens, “From sta-

tistical proofs of the Kochen-Specker theorem

to noise-robust noncontextuality inequalities”,

Phys. Rev. A 97, 052110 (2018).

[17] D. Schmid, R. W. Spekkens, and E. Wolfe,

“All the noncontextuality inequalities for arbi-

trary prepare-and-measure experiments with re-

spect to any ﬁxed set of operational equivalences”,

Phys. Rev. A 97, 062103 (2018).

[18] R. W. Spekkens, “Contextuality for prepara-

tions, transformations, and unsharp measure-

ments”, Phys. Rev. A 71, 052108 (2005).

[19] S. Kochen and E. P. Specker, “The Problem of Hidden Variables in Quantum Mechanics”, J. Math. Mech. 17, 59 (1967). Also available at JSTOR.

[20] N. Harrigan and R. W. Spekkens, “Einstein, Incompleteness, and the Epistemic View of Quantum States”, Found. Phys. 40, 125 (2010).

[21] A. Cabello, S. Severini, and A. Winter, “(Non-

)Contextuality of Physical Theories as an Axiom”,

arXiv:1010.2163 [quant-ph] (2010).

[22] A. Cabello, S. Severini, and A. Winter, “Graph-

Theoretic Approach to Quantum Correlations”,

Phys. Rev. Lett. 112, 040401 (2014).

[23] A. Acín, T. Fritz, A. Leverrier, and A. B. Sainz, “A Combinatorial Approach to Nonlocality and Contextuality”, Comm. Math. Phys. 334(2), 533-628 (2015).

[24] J. Barrett, “Information processing in general-

ized probabilistic theories”, Phys. Rev. A 75,

032304 (2007).

[25] S. Abramsky and A. Brandenburger, “The sheaf-

theoretic structure of non-locality and contextual-

ity”, New J. Phys. 13, 113036 (2011).

[26] A. Fine, “Hidden Variables, Joint Probability,

and the Bell Inequalities”, Phys. Rev. Lett. 48,

291 (1982).

[27] R. Kunjwal, “Fine’s theorem, noncontextuality,

and correlations in Specker’s scenario”, Phys. Rev.

A 91, 022108 (2015).

[28] R. W. Spekkens, “The Status of Determinism

in Proofs of the Impossibility of a Noncontextual

Model of Quantum Theory”, Found. Phys. 44,

1125-1155 (2014).

[29] A. Cabello, “What do we learn about quantum

theory from Kochen-Specker quantum contextual-

ity?”, PIRSA:17070034 (2017).

[30] G. Chiribella and X. Yuan, “Measurement sharp-

ness cuts nonlocality and contextuality in ev-

ery physical theory”, arXiv:1404.3348 [quant-ph]

(2014).

[31] G. Chiribella and X. Yuan, “Bridging the gap

between general probabilistic theories and the

device-independent framework for nonlocality and

contextuality”, Information and Computation,

250, 15-49 (2016).

[32] R. Chaves and T. Fritz, “Entropic approach to

local realism and noncontextuality”, Phys. Rev. A

85, 032113 (2012).

[33] Tobias Fritz and Rafael Chaves, “Entropic In-

equalities and Marginal Problems”, IEEE Trans.

on Information Theory, vol. 59, pages 803 - 817

(2013).

[34] R. Kunjwal, “Hypergraph framework for irre-

ducible noncontextuality inequalities from log-

ical proofs of the Kochen-Specker theorem”,

arXiv:1805.02083 [quant-ph] (2018).

[35] A. Cabello, “Specker’s fundamental principle of

quantum mechanics”, arXiv:1212.1756 [quant-ph]

(2012).

[36] R. W. Spekkens, “Noncontextuality: how we

should deﬁne it, why it is natural, and what to

do about its failure”, PIRSA:17070035 (2017).

[37] M. D. Mazurek, M. F. Pusey, K. J. Resch, and R. W. Spekkens, “Experimentally bounding deviations from quantum theory in the landscape of generalized probabilistic theories”, arXiv:1710.05948 [quant-ph] (2017).

[38] M. F. Pusey, L. del Rio, and B. Meyer, “Contex-

tuality without access to a tomographically com-

plete set”, arXiv:1904.08699 (2019).

[39] Y. C. Liang, R. W. Spekkens, H. M. Wiseman,

“Specker’s parable of the overprotective seer: A

road to contextuality, nonlocality and complemen-

tarity”, Phys. Rep. 506, 1 (2011).

[40] R. Kunjwal and S. Ghosh, “Minimal state-

dependent proof of measurement contextuality for

a qubit”, Phys. Rev. A 89, 042118 (2014).

[41] R. Kunjwal, C. Heunen, and T. Fritz, “Quantum

realization of arbitrary joint measurability struc-

tures”, Phys. Rev. A 89, 052126 (2014).

[42] R. Kunjwal, “A note on the joint measurability of POVMs and its implications for contextuality”, arXiv:1403.0470 [quant-ph] (2014).

[43] S. Popescu and D. Rohrlich, “Quantum nonlocal-

ity as an axiom”, Found. Phys. 24, 379-385 (1994).

[44] T. Heinosaari, D. Reitzner, and P. Stano, “Notes

on Joint Measurability of Quantum Observables”,

Found. Phys. 38, 1133-1147 (2008).

[45] R. Kunjwal, “How to go from the KS theorem to

experimentally testable noncontextuality inequal-

ities”, PIRSA:17070059 (2017).

[46] Konrad Engel, “Sperner theory: Encyclopedia of

Mathematics and its Applications”, Vol. 65, Cam-

bridge University Press, Cambridge (1997).

[47] A. A. Klyachko, M. A. Can, S. Binicioğlu, and A. S. Shumovsky, “Simple Test for Hidden Variables in Spin-1 Systems”, Phys. Rev. Lett. 101, 020403 (2008).

[48] C. Held, “The Kochen-Specker Theorem”, The

Stanford Encyclopedia of Philosophy (Spring 2018

Edition), Edward N. Zalta (ed.).

[49] T. Gonda, R. Kunjwal, D. Schmid, E. Wolfe, and

A. B. Sainz, “Almost Quantum Correlations are

Inconsistent with Specker’s Principle”, Quantum

2, 87 (2018).

[50] M. Navascués, Y. Guryanova, M. J. Hoban, and A. Acín, “Almost quantum correlations”, Nat. Commun. 6, 6288 (2015).

[51] A. Cabello, J. Estebaranz, and G. Garcia-Alcaine, “Bell-Kochen-Specker theorem: A proof with 18 vectors”, Phys. Lett. A 212, 183 (1996).

[52] E. G. Beltrametti and S. Bugajski, “A classical

extension of quantum mechanics”, J. Phys. A 28,

3329 (1995).

[53] X. Zhan, E. G. Cavalcanti, J. Li, Z. Bian,

Y. Zhang, H. M. Wiseman, and P. Xue, “Ex-

perimental generalized contextuality with single-

photon qubits”, Optica 4, 966-971 (2017).

[54] R. Kunjwal, “Contextuality beyond the Kochen-

Specker theorem”, arXiv:1612.07250 [quant-ph]

(2016).

[55] T. Fritz, A. B. Sainz, R. Augusiak, J. B. Brask, R. Chaves, A. Leverrier, and A. Acín, “Local orthogonality: a multipartite principle for correlations”, Nat. Commun. 4, 2263 (2013).

[56] R. W. Spekkens, “Nonclassicality as the failure of

noncontextuality”, PIRSA:15050081 (2015) (see

the slide at 41:43 minutes).

[57] R. W. Spekkens, “Quasi-Quantization: Classi-

cal Statistical Theories with an Epistemic Re-

striction”, In: Chiribella G., Spekkens R. (eds)

Quantum Theory: Informational Foundations and

Foils. Fundamental Theories of Physics, vol 181.

Springer, Dordrecht.

[58] T. Vidick and S. Wehner, “Does Ignorance of the

Whole Imply Ignorance of the Parts? Large Vio-

lations of Noncontextuality in Quantum Theory”,

Phys. Rev. Lett. 107, 030402 (2011).

[59] R. Raussendorf, “Contextuality in measurement-

based quantum computation”, Phys. Rev. A 88,

022322 (2013).

[60] M. Howard, J. Wallman, V. Veitch, and J. Emer-

son, “Contextuality supplies the ‘magic’ for quan-

tum computation”, Nature 510, 351 (2014).

[61] N. Delfosse, P. A. Guerin, J. Bian, and

R. Raussendorf, “Wigner Function Negativity

and Contextuality in Quantum Computation on

Rebits”, Phys. Rev. X 5, 021003 (2015).

[62] J. Bermejo-Vega, N. Delfosse, D. E. Browne,

C. Okay, R. Raussendorf, “Contextuality as a re-

source for qubit quantum computation”, Phys.

Rev. Lett. 119, 120505 (2017).

[63] J. Singh, K. Bharti, and Arvind, “Quantum

key distribution protocol based on contextuality

monogamy”, Phys. Rev. A 95, 062333 (2017).

[64] A. Cabello, “Kochen-Specker Theorem for a Sin-

gle Qubit using Positive Operator-Valued Mea-

sures”, Phys. Rev. Lett. 90, 190401 (2003).

[65] A. Grudka and P. Kurzyński, “Is There Contextuality for a Single Qubit?”, Phys. Rev. Lett. 100, 160401 (2008).

[66] P. Busch, “Quantum States and Generalized Ob-

servables: A Simple Proof of Gleason’s Theorem”,

Phys. Rev. Lett. 91, 120403 (2003).

[67] C. M. Caves, C. A. Fuchs, K. Manne, and

J. M. Renes, “Gleason-Type Derivations of the

Quantum Probability Rule for Generalized Mea-

surements”, Found. Phys. 34, 193 (2004).

[68] A. M. Gleason, “Measures on the closed sub-

spaces of a Hilbert space”, J. Math. Mech. 6, 885

(1957). Also available at JSTOR.

[69] P. K. Aravind, “The generalized Kochen-Specker

theorem”, Phys. Rev. A 68, 052104 (2003).

[70] A. A. Methot, “Minimal Bell-Kochen-Specker

proofs with POVMs on qubits”, Int. J. Quantum

Inf. 5, 353 (2007).

[71] Q. Zhang, H. Li, T. Yang, J. Yin, J. Du,

J. W. Pan, “Experimental Test of the Kochen-

Specker Theorem for Single Qubits using Pos-

itive Operator-Valued Measures”, arXiv:quant-

ph/0412049 (2004).


[72] L. Mancinska, G. Scarpa, and S. Severini,

“New Separations in Zero-Error Channel Capac-

ity Through Projective Kochen Specker Sets and

Quantum Coloring”, IEEE Transactions on Infor-

mation Theory 59, 4025 (2013).

[73] J. Henson and A. B. Sainz, “Macroscopic non-

contextuality as a principle for almost-quantum

correlations”, Phys. Rev. A 91, 042114 (2015).

[74] D. A. Meyer, “Finite Precision Measurement

Nulliﬁes the Kochen-Specker Theorem”, Phys.

Rev. Lett. 83, 3751 (1999).

[75] A. Kent, “Noncontextual Hidden Variables and

Physical Measurements”, Phys. Rev. Lett. 83,

3755 (1999).

[76] R. Clifton and A. Kent, “Simulating quantum

mechanics by non-contextual hidden variables”,

Proc. R. Soc. Lond. A: Vol. 456, 2101-2114 (2000).

[77] J. Barrett and A. Kent, “Non-contextuality,

ﬁnite precision measurement and the

Kochen-Specker theorem”, Stud. Hist. Phi-

los. Mod. Phys. 35, 151 (2004).

[78] A. Winter, “What does an experimental test

of quantum contextuality prove or disprove?”, J.

Phys. A: Math. Theor. 47, 424031 (2014).

[79] V. B. Scholz and R. F. Werner, “Tsirelson’s

Problem”, arXiv:0812.4305 [math-ph] (2008).

[80] T. Fritz, “Tsirelson’s problem and Kirchberg’s

conjecture”, Rev. Math. Phys. 24 (5), 1250012

(2012).
