Guaranteed recovery of quantum processes from few
measurements
M. Kliesch
1,2
, R. Kueng
3,4,5
, J. Eisert
4,6,7
, and D. Gross
3,8
1
Institute for Theoretical Physics, Heinrich Heine University Düsseldorf, Germany
2
Institute of Theoretical Physics and Astrophysics, University of Gdańsk, Poland
3
Institute for Theoretical Physics, University of Cologne, Germany
4
Dahlem Center for Complex Quantum Systems, Freie Universität Berlin, Germany
5
Institute for Quantum Information and Matter, California Institute of Technology, USA
6
Department of Mathematics and Computer Science, Freie Universität Berlin, Germany
7
Helmholtz-Zentrum Berlin für Materialien und Energie, Germany
8
Centre for Engineered Quantum Systems, School of Physics, The University of Sydney, Australia
Quantum process tomography is the task of reconstructing unknown quan-
tum channels from measured data. In this work, we introduce compressed
sensing-based methods that facilitate the reconstruction of quantum channels
of low Kraus rank. Our main contribution is the analysis of a natural measure-
ment model for this task: We assume that data is obtained by sending pure
states into the channel and measuring expectation values on the output. Nei-
ther ancillary systems nor coherent operations across multiple channel uses are
required. Most previous results on compressed process reconstruction reduce
the problem to quantum state tomography on the channel’s Choi matrix. While
this ansatz yields recovery guarantees from an essentially minimal number of
measurements, physical implementations of such schemes would typically in-
volve ancillary systems. A priori, it is unclear whether a measurement model
tailored directly to quantum process tomography might require more measure-
ments. We establish that this is not the case.
Technically, we prove recovery guarantees for three different reconstruction
algorithms. The reconstructions are based on a trace, diamond, and `
2
-norm
minimization, respectively. Our recovery guarantees are uniform in the sense
that with one random choice of measurement settings all quantum channels can
be recovered equally well. Moreover, stability against arbitrary measurement
noise and robustness against violations of the low-rank assumption is guaran-
teed. Numerical studies demonstrate the feasibility of the approach.
Contents
1 Introduction 3
1.1 Motivation of our measurement model . . . . . . . . . . . . . . . . . . . . . 3
1.2 Our contribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
M. Kliesch: science@mkliesch.eu, www.mkliesch.eu
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 1
arXiv:1701.03135v4 [quant-ph] 6 Aug 2019
1.4 Experimental considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.5 Notation and terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.5.1 Maps on operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.5.2 Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.5.3 Spherical and unitary designs . . . . . . . . . . . . . . . . . . . . . . 8
1.5.4 Measurement terminology . . . . . . . . . . . . . . . . . . . . . . . . 9
2 Results 9
2.1 Measurement model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Reconstructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.3 Noiseless case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3.1 Recovery guarantees I . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3.2 Numerical demonstration . . . . . . . . . . . . . . . . . . . . . . . . 12
2.4 Stability and robustness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4.1 Recovery guarantees II . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4.2 Numerical example: Reconstructing the Toffoli gate . . . . . . . . . 17
2.5 Pauli measurements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.6 Sample complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.7 Applications to fault tolerant quantum computation . . . . . . . . . . . . . 21
3 Analytical details and proofs 22
3.1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.1.1 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
3.1.2 Normalization and centralization of the observables . . . . . . . . . . 24
3.1.3 Minimum conic singular value . . . . . . . . . . . . . . . . . . . . . . 24
3.1.4 Null space property . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
3.1.5 Tools from representation theory . . . . . . . . . . . . . . . . . . . . 26
3.2 Our bound on the minimum conic singular value . . . . . . . . . . . . . . . 27
3.2.1 Upper bound on the mean empirical width W
m
. . . . . . . . . . . . 29
3.2.2 Lower bound on the marginal tail function Q
ξ
. . . . . . . . . . . . . 30
3.3 General tensor network bound . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.4 Proofs of the reconstruction theorems . . . . . . . . . . . . . . . . . . . . . 36
3.4.1 Constrained trace norm minimization . . . . . . . . . . . . . . . . . 36
3.4.2 CPT-fit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.4.3 Constrained trace and diamond norm minimization . . . . . . . . . . 39
3.5 Reconstruction from approximate 4-generic measurements . . . . . . . . . . 40
3.6 Bosonic and fermionic linear optical circuits . . . . . . . . . . . . . . . . . . 43
4 Conclusion and outlook 44
4.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
4.2 Outlook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
5 Acknowledgements 46
Appendices 46
A Semidefinite programs for trace and diamond norm reconstruction . . . . . . 46
B A non-commutative Khintchine inequality . . . . . . . . . . . . . . . . . . . 47
C Linear representation of the symmetric group S
4
. . . . . . . . . . . . . . . 47
D Proof of the tensor network bound Proposition 18 . . . . . . . . . . . . . . . 49
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 2
1 Introduction
Recent years have seen significant advances in the precise control of quantum systems.
Complex quantum states of systems with an increasing number of degrees of freedom can
be prepared and manipulated with high accuracy. In this development, it is important to
have tools at hand that allow for a complete characterization of state or process that are
actually being realized in a given experimental setup. The task of reconstructing quantum
states or process [1, 2] from experimental data is variously called quantum state or quantum
process tomography, estimation, or recovery.
The precise characterization of processes is highly relevant, e.g., in the quest for scal-
able quantum computers. The stringent requirements of the error correction threshold
and the adverse scaling of the error correction overhead in terms of the noise make
it necessary that implementations of quantum gates match their specification extremely
closely. We note that full quantum process tomography is distinct from certification pro-
tocols or coarser characterization schemes like randomized benchmarking (which reports
only a single number: a certain error rate). While this makes the latter type of protocols
much cheaper to implement, only process tomography allows one to understand in precisely
which way a quantum gate deviates from its specification.
The task of quantum process tomography important as it is comes at a high price:
This is an unfavourable scaling of the necessary effort with the system size. To learn an
unknown unstructured process acting on an n-dimensional quantum systems, m n
4
ex-
pectation values are required. However, common processes exhibit additional structure.
Most importantly, quantum gates correspond to unitary processes, which are quantum
channels with Kraus rank equal to one. Compressed sensing techniques allow one to esti-
mate channels with (approximately) low Kraus rank from significantly fewer measurements
than naively required. It is imperative to make use of this structure.
1.1 Motivation of our measurement model
The most general quantum process tomography setting allows for “coherent measurements”
of an unknown channel T . Here, m identical copies of T are available simultaneously. At
the same time, only part of the input state may be sent through the copies of T , while
another part potentially entangled with the former is left unchanged. This in turn allows
for performing measurements on T
m
id
n
m
where id
n
m
denotes the identity map on
a n
m
-dimensional additional ancillary Hilbert space that may be available rather than
“sequential” measurements on single copies of T . This includes the possibility of choosing
global, entangled input states ρ
global
, and likewise performing correlated measurements on
the respective output states (T
m
id
n
m
)(ρ
global
).
While potentially powerful, coherent measurements are arguably impractical. This is
in particular true when attempting to diagnose errors in implementations of a quantum
gates, as these are exactly the building blocks of circuits that would be used to prepare
the entangled input state in the first place.
Avoiding this drawback, we adopt an experimentally feasible “sequential” measurement
model. What is more, we resort to the situation where the local input states ρ
loc
are
transmitted via individual copies of the unknown channel T in their entirety. This setting
is often referred to as “ancilla-free” or “direct” process tomography. Data acquisition is
then concluded by performing measurements on the individual output states T (ρ
loc
).
We employ the following natural measurement setup: the unknown quantum channel
T is applied to pure input states |ψ
i
ihψ
i
|. Subsequently, the expectation value of an
observable A
i
on the output is measured. Those expectation values are estimated by means
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 3
of a suitably frequent repetition of the prescription choosing the same measurement setting.
Repeating this procedure m times for different input states |ψ
i
ihψ
i
| and observables A
i
results in a measurement vector of the form
y
i
Tr
A
i
T (|ψ
i
ihψ
i
|)
, 1 i m. (1)
For our theoretical recovery guarantees, the input states and observables are assumed to be
sampled independently from certain ensembles see below for details. We restrict attention
to pure input states, as they are expected to yield the most information about T , and also
because the preparation of pure input states is easily possible in most experimental setups.
We expect it to be straight-forward to generalize this work to take mixed input states into
account.
1.2 Our contribution
We investigate several channel reconstruction protocols that exploit the measured channel
to have an (approximately) low Kraus rank. Inspired by compressed sensing, these pro-
tocols require considerably fewer measurements of the form (1) than traditional process
tomography schemes and contain computationally tractable reconstruction algorithms. We
provide rigorous recovery guarantees featuring very desirable properties (uniform, stable,
and robust). This analytical work is complemented by a comprehensive numerical analysis
of the reconstruction protocols. In contrast to previous compressed sensing results that
are applicable to process tomography, our reconstruction protocols specifically address the
natural process tomography setup described above. Our focus is on a particularly simple
reconstruction:
i) A least squares fit of the observed data subject to an additional positivity constraint.
We call this approach CPT-fit. A distinct advantage of the CPT-fit is that the algo-
rithm does not depend on any parameters. In particular, no estimate of the true rank
or the noise strength is required. This stands in contrast to most compressed sensing-
based based reconstruction schemes. Also, the CPT-fit appears to be particularly
robust against violations of the assumption of low Kraus rank.
In order to link and compare the CPT-fit to more established low-rank matrix reconstruc-
tion methods we also investigate other reconstruction protocols. Their analytical analysis
is also the basis for the recovery guarantees of the CPT-fit.
ii) The second reconstruction method resembles a typical low-rank matrix reconstruction
protocol: minimize the trace norm subject to convex constraints that take into account
acquired data of the form (1).
iii) The third recovery algorithm closely resembles the previous one, but contains trace
preservation as an additional (linear) constraint.
iv) Otherwise similar to algorithm ii), this reconstruction protocol replaces the constrained
trace norm minimization by a diamond norm minimization. The constraints remain
unchanged. The diamond norm is a well-motivated distance measure for quantum
channels. A priori, this does not imply in any way that the diamond norm is also
a good choice as a regularizing function for channel estimation. However, Ref. [3]
provides strong analytical and numerical arguments for why one should expect the
diamond norm to outperform the trace norm as a regularizer in this setting. The
numerical studies conducted in this paper lend further credence to this claim.
v) Similar to algorithm iv), but with an additional trace preservation constraint.
We provide rigorous performance guarantees for the approaches i), iii) and v). The
guarantees are valid for measurement setups where the pure input states |ψ
i
i and the
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 4
eigenbases of the observables A
i
are drawn at random from certain ensembles. Technically,
we require the state vectors and the eigenbases to approximately form 4-designs in a
precise sense. Corresponding ensembles of unitaries can be generated efficiently using
random quantum circuits [47] or fluctuating Hamiltonians [8]. Haar-random vectors have
this property, but recent results on the fourth moments of the Clifford group [9, 10] suggest
that there are more structured and explicit ensembles with similar properties.
Even if a particular practical setup fails to show the (approximate) 4-design property,
our results are still a relevant proof of concept. Indeed, in contrast to previous compressed
sensing approaches, our reconstruction guarantees show that the specific mathematical
structure present in process tomography does not imply that a larger number of measure-
ment settings is required.
What is more, we provide numerical simulations showing that the recovery algorithms
work well for a number of measurement models not having the 4-design property. This
includes the paradigmatic case of Pauli-type measurements.
We demonstrate our channel reconstruction procedures on generic quantum channels
of varying Kraus rank, as well as the Toffoli gate. The latter is a three-qubit unitary
quantum gate that is highly relevant in quantum processing. Moreover, it has been exper-
imentally implemented in various architectures [1114]. Finally, we also comment that our
reconstruction protocols can also be applied in the setting of bosonic and fermionic linear
optical circuits.
1.3 Related work
Recent years have seen considerable advance of compressed sensing tools for the tomo-
graphic reconstruction of low-rank quantum states [1519]. To some extent, these findings
have also been adapted to cover process tomography. Such compressed sensing approaches
allow to reconstruct processes of low Kraus rank from much fewer expectation values than
a naive dimension count would suggest.
Our trace norm minimization (algorithms ii) and iii)) builds on a by now well-established
method for low-rank matrix reconstruction [15]. With the diamond norm minimization (al-
gorithms iv) and v)) we continue a line of research that has been started in Ref. [3] and
support it with rigorous performance guarantees.
The CPT-fit has been suggested in Ref. [17] and numerically compared to full tomog-
raphy and compressed sensing in Ref. [20]. These works provide numerical studies that
demonstrate the capability of such a reconstruction procedure. We expand on these ideas
by conducting additional numerical experiments. But, more importantly, we also prove
rigorous recovery guarantees for the CPT-fit. Rigorous recovery guarantees for a similar
conic fitting reconstructions in (i) a low-rank matrix recovery setup and a certain class of
measurements [21] and (ii) a sparse vector recovery setup with certain binary measurements
[22] have been proven recently.
Ref. [15] presents a process tomography protocol that is based on low-rank matrix
reconstruction with random Pauli measurements [2325]. These low rank recovery guaran-
tees can be applied to the Choi-Jamiołkowski representation of a quantum channel, since
the rank of this matrix representation equals the Kraus rank of the original channel. On
first sight, such an approach requires the use of an ancilla in order to implement the Choi-
Jamiołkowski representation physically in a concrete application [26]. However, Ref. [15]
also provides a more direct implementation of their protocol that does not require any
ancillas. Valid for multi-qubit processes this trick exploits the tensor-product structure
of (multi-qubit) Pauli operators. It allows for effectively measuring the Pauli expectation
value of a Choi matrix by performing several natural channel measurements of the form
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 5
(1) and evaluating particular linear combinations thereof in a classical post-processing
step. The demerit of this approach is that the number of individual channel measure-
ments required to evaluate a single Pauli expectation value scales with the dimension of
the underlying Hilbert space. The measurement model studied here does not require such
a coarse-graining: every natural measurement (1) itself already corresponds to a valid
measurement instance. This considerably reduces the number of different measurement
settings that are required in order to acquire sufficient data.
Randomized benchmarking has also been adapted to allow for process tomography
[18, 27, 28]. Ref. [27] adopted randomized benchmarking techniques to obtain a process
tomography protocol that is robust towards state preparation and measurement errors
(SPAM). This is a distinct advantage over other protocols that do not share this additional
feature. However, the trade of between an increase of the number of channel uses and the
gained robustness towards SPAM errors remains unclear.
Finally, Ref. [16] also considered process tomography via compressed sensing tech-
niques. However, this method is somewhat different from the other approaches presented
here: Instead of assuming low Kraus rank, they consider processes that are element-wise
sparse with respect to a known basis. Prior knowledge of this sparsifying basis is a neces-
sary prerequisite for this approach. In turn, techniques from traditional compressed sensing
[29, 30] are applied (rather than low-rank matrix reconstruction protocols) to reconstruct
such processes from a number of measurements that is proportional to the sparsity and
depends only logarithmically on the ambient system sizes. While requiring only very few
measurement settings, the main disadvantage of this approach are stronger model assump-
tions (sparsity and knowledge of the sparsifying basis) and measurements that may be
challenging to implement in practice.
1.4 Experimental considerations
In several physically important platforms, process tomography has been experimentally
realized [26, 31, 32], both in the direct and hence ancilla-free [31, 32] and ancilla-based [26]
reading of the task. These works make use of different preparations and measurements. It
should be clear, however, that in several physical architectures, random measurements of
the type discussed here can readily experimentally be implemented.
Specifically, Haar random unitary maps have been realized in a 16-dimensional Hilbert
space associated with the 6S
1/2
ground state of
133
Cs atoms [33]. This has been achieved
by suitably exploiting a time-dependent Hamiltonian evolution giving rise to Haar random
unitaries, making use of methods of quantum control. Such random unitaries have been
put to use in a tomographic protocol, in an approach of performing quantum tomography
based on Haar random unitaries that builds upon earlier theoretical ideas laid out in Ref.
[34]. The same idea of generating Haar random unitaries using suitable time-dependent
Hamiltonians should readily apply to systems of trapped ions [35], where a suitable type of
control can be achieved with present technology. Indeed, in a different context, methods of
optimal control have readily been applied to optimize quantum gates for trapped ions [36].
Random unitaries, albeit not Haar distributed, that allow for quantum state tomogra-
phy of quantum many-body systems have been theoretically considered [37], making use
of operations only that can be considered basically feasible in experiments with cold atoms
in optical lattices [38], such as optical super-lattices and time-of-flight measurements.
In integrated linear optical architectures [39, 40], Haar-random circuits are readily
conceivable [41]. It is this type of setting in which the methods presented here are most
applicable, even though they apply to any physical system in which unitary 4-designs
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 6
M
V
V
W
W
7→
J
M
V
V
W
W
7→
Tr
W
M
V
V
Figure 1: The Choi-Jamiołkowski isomorphism and partial trace in terms of tensor network diagrams
(explained in Figure 8).
Left: Order-4 tensor M L(L(V) L(W)) as a map from L(V)
=
V V
to L(W)
=
W W
.
Middle: Its Choi-matrix J(M) as an operator on W
V
=
W V.
Right: Partial trace Tr
1
[J(M)] of the Choi matrix J(M). This operator corresponds to the functional
ρ 7→ Tr[M(ρ)].
are feasible. This includes settings in which random circuits [6, 42] or randomly time-
fluctuating dynamics [8] can be used to generate approximate unitary designs.
1.5 Notation and terminology
In this section we introduce some basic preliminaries needed to understand our main results.
For any integer n Z
+
we use the notation [n]
:
= {1, 2, . . . , n}. The space of linear
operators on a vector space V is denoted by L(V).
If V is a Hilbert space, we often denote vectors by |ψ i V and its adjoints (conjugate
transposes) by hψ |. So the projector onto a normalized |ψ i is |ψ ihψ |. Moreover, the real
vector space of self-adjoint operators is denoted by Herm(V).
1.5.1 Maps on operators
We denote the space of linear maps taking operators to operators by L(C
n
)
:
= L(L(C
n
)).
The Choi-Jamiołkowski isomorphism [43, 44] J : L(C
n
) L(C
n
C
n
) takes such maps to
operators on a tensor product space and is given by (see also Figure 1 for a tensor network
description)
J(M)
:
=
dim(V)
X
i,j=1
M(|iihj |) |iihj | ,
(2)
where {|ii}
dim(V)
i=1
is an (arbitrary) orthonormal basis of V. Maps M L(C
n
) can have
different properties: M is called hermiticity preserving if it satisfies M(A
) = M(A)
for
all operators A L(C
n
). This is equivalent to demanding that J(M) is Hermitian, i.e.,
J(M) Herm(L(C
n
)). M is called trace preserving if Tr[M(A)] = Tr[A] for all A L(C
n
)
or, equivalently, if Tr
1
[J(M)] = 1, where Tr
1
: L(C
n
C
n
) L(C
n
) denotes the partial
trace over the fist tensor factor. The affine subspace of hermiticity and trace preserving
maps is denoted by HT(C
n
) L(C
n
). The identity map id
k
L(C
k
) is a particularly
simple element of HT(C
n
). Moreover, M L(C
n
) is completely positive if for all k 1 one
has that (M id
k
)(A) is positive semidefinite for every positive semidefinite operator A
L(C
n
C
k
). We will use the shorthand notation A 0 to indicate positive-semidefiniteness
of A. In fact, a map M is completely positive if and only if (M id
n
)(A) 0 (k = n).
This in turn is equivalent to J(M) 0.
The convex subset of completely positive and trace preserving (CPT) maps is denoted
by CPT(C
n
) L(C
n
). Importantly, these maps take density matrices to density matrices,
even when applied to subsystems. Therefore, they are also called quantum channels and
are a very general description of quantum processes, e.g. dynamics.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 7
1.5.2 Norms
For q [1, [, the `
q
-norm of a vector v C
n
is
kvk
`
q
:
=
n
X
j=1
|v
j
|
q
1/q
(3)
and the `
-norm is kvk
`
:
= max
j[n]
|v
j
|. The Schatten q-norm kAk
q
of an operator A
corresponds to the `
q
-norm of A’s singular values arranged in an n-dimensional vector.
Here, we will specifically encounter the following Schatten norms,
kAk
1
= Tr
AA
(trace norm), (4)
kAk
2
=
q
Tr[AA
] (Frobenius norm), (5)
kAk
= max
|ψ i
kA |ψ ik
`
2
k|ψ ik
`
2
(spectral norm). (6)
Note that rank-one projectors |ψ ihψ | are unit normalized with respect to any Schatten-p
norm.
The spectral norm is also an example of an induced norm. This concept can be gener-
alized to norms on maps M L(C
n
). In particular,
kMk
11
:
= max
A
kM(A)k
1
kAk
1
. (7)
is the induced trace norm. The diamond norm is a stabilized version of this norm:
kMk
:
= kM id
n
k
11
. (8)
This is a particularly meaningful and widely appreciated distance measure for quantum
processes [45]. Perhaps surprisingly, it can be calculated efficiently [4648] as a semidefinite
program. Regarding the reconstruction of structured maps, the diamond norm can serve
as a convex surrogate for low Kraus rank [3]. This is the case for channel reconstruction
and two of the five reconstruction protocols presented in this work build on this idea.
1.5.3 Spherical and unitary designs
The probability measures on the unitary group, respectively, the unit complex sphere that is
invariant under multiplication with any unitary is called Haar measure and is the natural
uniform measure on these sets. A complex projective t-design [4951] is a probability
distribution µ on the unit sphere that reproduces the moments of the Haar measure up to
order t in both |ψi and hψ|:
E
|ψ i∼µ
(|ψ ihψ |)
t
= E
|ψ i∼Haar
(|ψ ihψ |)
t
. (9)
Similarly, a unitary design [52, 53] is a probability distribution ν on the unitary group
satisfying
E
Uν
U
t
XU
†⊗t
= E
UHaar
U
t
XU
†⊗t
(10)
for all operators X L(C
n
)
t
. Typically, designs are considered to be uniform distributions
over finite sets.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 8
T
|ψ i
hψ |
U
U
A
0
Figure 2: The tensorial structure of one measurement setting: T is mapped to Tr
UA
0
U
T (|ψ ihψ |)
.
1.5.4 Measurement terminology
In the compressed sensing literature, a single measurement is usually given by an in-
ner product, which is a Hilbert-Schmidt inner product Tr[AT(|ψ ihψ |)] in our case. In
quantum physics terminology, (|ψ i, A) corresponds to a measurement setting, whereas a
corresponding measurement would be the von-Neumann measurement of the observable
A in the state T (|ψ ihψ |) giving rise to an expectation value Tr[AT ( |ψ ihψ |)] in the limit
of infinitely many repetitions. In order to also account for finite sample errors we always
allow for additive noise on the obtained expectation values Tr[AT ( |ψ ihψ |)].
2 Results
In this section, we explicitly specify our measurement model of natural measurements,
explain how we computationally reconstruct the quantum channels from the measurements,
state our recovery guarantees, discuss their stability and robustness properties, discuss our
numerics on Pauli-type measurements, and derive a sample complexity upper bound from
our recovery guarantees.
2.1 Measurement model
We consider the task of reconstructing a quantum channel T CPT(C
n
) from measure-
ment data of the form (1): The unknown channel receives pure states |ψ
i
ihψ
i
| as input
and subsequently expectation values of observables A
i
are measured for the output state;
see Figure 2 for a pictorial description. This procedure is repeated m times with differ-
ent measurement settings (input states and observables). Hence, the entire measurement
process leads to a measurement vector
y = A(T ) + e R
m
, (11)
with single expectation values
A(T )
i
= Tr[A
i
T (|ψ
i
ihψ
i
|)] + e
i
. (12)
The vector e R
m
denotes additive noise present in the measurement process. In contrast
to previous approaches [15, 25], no prior assumptions on the nature of this noise corruption
are required.
Throughout this work, we consider instances of random measurements {A
i
, |ψ
i
i}
m
i=1
that are independent instances of a measurement ensemble (A, |ψ i) that meets the follow-
ing requirements:
Definition 1 (4-generic measurement ensemble). We call a measurement ensemble (A, |ψ i)
with observable A and state |ψ i 4-generic if it fulfils the following criteria:
i) The distribution of |ψ i in C
n
is a spherical 4-design (9), i.e., it reproduces the first
four moments of the unitarily invariant (Haar) measure on the complex unit sphere.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 9
ii) A = UA
0
U
, where A
0
Herm(C
n
) is fixed and U in U(n) is chosen from a unitary
4-design (10), i.e., reproduces the first four moments of the Haar measure.
The measurement ensemble is called normalized 4-generic if the observables are trace-
less and normalized in spectral norm, i.e. Tr[A
0
] = 0 and kA
0
k
= 1.
Corresponding expectation values Tr[AT (|ψ ihψ |)] of a quantum channel T CPT(C
n
)
are referred to as 4-generic measurement and normalized 4-generic measurement, respec-
tively.
This definition encapsulates a variety of process measurement ensembles. In particular,
the generic measurement ensemble (|ψ i and U are Haar random states and unitaries, re-
spectively) meets the requirements by definition. However, our recovery guarantees do not
require such a strong notion of randomness in the measurement design: 4-designs are suffi-
cient. In fact, it is sufficient when the 4-designs conditions are only fulfilled approximately,
which we discuss and show in Section 3.5.
We also expect that the actual channel recovery works similarly well if |ψ i and U are
chosen from other distributions, provided that they cover the complex unit sphere and
U(n) “evenly” enough (see Section 4 for a further discussion). We confirm this expectation
with numerical simulations with Pauli-type measurements, see Section 2.5.
2.2 Reconstructions
In this section, we lay out how a quantum channel T can be reconstructed from data of
the form (11) for measurement settings given by A. We put an emphasis on describing the
fitting method T
`
2
referred to as CPT-fit. To complement this approach, we also investigate
the reconstruction methods T
η
and T
c
η
as versions of low-rank matrix reconstruction as
well as T
η
and T
c
η
as reconstruction methods based on the diamond norm.
To start with the former, we minimize the square loss under the model constrained
that T is a quantum process to obtain
T
`
2
:
= arg min{kA(T ) yk
`
2
: T CPT(C
n
)}. (13)
This minimization is essentially a fit under a CPT-constraint and we will call it CPT-fit.
It has been suggested and numerically investigated in Ref. [17]. This approach makes use
of the measurement data y alone.
More common low-rank matrix reconstruction methods in compressed sensing use the
trace norm as a so-called regularizer to favour low-rank solutions [24, 5456]. These recon-
structions do not only make use of the data y, but require an a-priori bound η > 0 on the
noise e,
kek
`
2
η . (14)
Applied to the Choi matrix J(T ) of a quantum channel T , the usual trace norm regular-
ization leads to the reconstructions
T
η
:
= arg min{kJ(T )k
1
: T L(C
n
), kA(T ) yk
`
2
η}, (15)
T
c
η
:
= arg min{kJ(T )k
1
: T HT(C
n
), kA(T ) yk
`
2
η}. (16)
These two approaches are very similar. However, the second one contains an additional
constraint that enforces hermicity and trace preservation.
The diamond norm is well-known in quantum information theory as a measure of
distinguishability of quantum channels by expectation values [57, Chapter 11], [58] and
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 10
# measurement settings m
50 100 150
# error
5 1e-05 / 100 trials
0
20
40
60
80
100
n = 4, Kraus rank r = 2
|| . ||
1
|| . ||
1
, TP-constr.
|| . ||
}
|| . ||
}
, TP-constr.
CPT fit
(a) Success rate of the reconstruction over the number of
measurements m for Kraus rank r = rank(J (T
0
)) = 2.
The rate is the number of trails out of 100 with small
errors kJ (T
rec
T
0
)k
2
10
5
, for reconstructed channels
T
rec
.
# measurement settings m
50 100 150
Average CPU time / sek.
0
20
40
60
80
100
120
140
n = 4, Kraus rank r = 2
(b) Average CPU time required for the reconstruction
in (a) over the number of measurements m.
Figure 3: Retrieval of random quantum channels T
0
CPT(C
4
) for vanishing noise e = 0. The
measurement settings A are drawn i.i.d. in each trial. The parameter η in the reconstructions (15),
(16), (18) and (18) is chosen to be ten times machine precision.
can practically be calculated and minimized using a semidefinite program [48]. For non-
obvious reasons, it can also be used as a reguralizer for the reconstruction of certain maps
on operators [3], leading to the reconstructions
T
η
:
= arg min{kT k
: T L(C
n
), kA(T ) yk
`
2
η}, (17)
T
c
η
:
= arg min{kT k
: T HT(C
n
), kA(T ) yk
`
2
η}, (18)
where η 0 is again an anticipated bound on the measurement noise.
Superior performance over simple inversion methods is expected for channels with small
Kraus rank with unitary channels being the most extreme case. Perhaps surprisingly at
first sight, the CPT-fit performs equally well as the other reconstruction methods based
on the trace (15), (16) or diamond norm (17), (18) minimization. However, a quantum
channel T has a Choi matrix J(T ) with constant trace norm kJ(T )k
1
= n. Hence, one can
omit the trace norm minimization in the the program (16) when the additional constraint
J(T ) 0 is enforced, i.e. T CPT(C
n
). Minimizing the square loss kA(T) yk
`
2
instead
then yields the CPT-fit (13).
We mention once more that existing recovery guarantees for low-rank matrix recon-
struction [24, 25, 55, 59] do not apply to the measurement model considered here. Our
measurement setting (12) carries a certain product structure (see Figure 2) that is more
restrictive than the structure covered in prior works.
2.3 Noiseless case
For sake of clarity, we first discuss the noiseless case (e = 0) and will discuss stability and
robustness in Section 2.4.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 11
2.3.1 Recovery guarantees I
We have recovery guarantees for the minimizations (16), (18), and (13), where trace preser-
vation features as a constraint.
Theorem 2 (Uniform recovery guarantees (noiseless case)). Fix r n
2
and suppose that
A : CPT(C
n
) R
m
contains
m C r n
2
ln(n) , (19)
4-generic measurements which are chosen independently from the measurement ensemble
introduced in Definition 1. Then, with probability at least 1e
λm
, all T
0
CPT(C
n
) with
Kraus rank at most r may be reconstructed exactly from noiseless measurements y = A(T
0
).
This exact reconstruction is achieved by each of the minimizations (16), (18), or (13).
Here, C and λ are universal constants (which can, in principle, be extracted explicitly from
the proof).
This statement follows from more general results, namely Theorems 19, 20, and 24
below. Stability and robustness are discussed with Theorems 3 and 4.
Remarks :
i) The number of real parameters required to specify a general hermiticity preserving
map T
0
L(C
n
) is n
4
. If said map is also trace preserving, then n
4
n
2
many
parameters are required. The same holds for T
0
CPT(C
n
). If the Kraus rank
of T
0
is rank(J(T
0
)) = r then T
0
is compressible in the sense that an order of rn
2
parameters suffices to describe T
0
. Therefore, up to the factor ln(n), the scaling (19)
of the number of measurements is optimal.
ii) The scaling (19) also contains a constant C that we have not bounded explicitly.
Numerically, we find for n = 4, 8 that m 5 r n
2
generic (Haar-random) measurements
are sufficient for recovery, see Sections 2.3.2 and 2.4.2.
Such an approach is very common in compressed sensing: Recovery guarantees, often
with unknown constants and logarithmic factors in the scaling of the number of mea-
surements, ensure the functioning of the reconstruction procedure also in the limit of
large dimensions. The precise number of measurements and the expected errors in
a given setting are often determined numerically. Here, also more special measure-
ment settings can practically be considered. In Section 2.5, we numerically investigate
Pauli-type measurements, which are not covered by our recovery guarantees.
iii) The measurements are required to be given by exact 4-designs. This condition can be
relaxed so that it is sufficient if the 4-design conditions are fulfilled only approximately.
We prove that the recovery guarantees still hold for certain /n
4
-approximate 4-designs
in Theorem 26. One can use quantum pseudorandomness generation with random
quantum circuits [57, 42] or fluctuating Hamiltonians [8] to generate -approximate
designs. In both cases, becomes exponentially small in the generation time (circuit
length/runtime), i.e., /n
4
can be made small efficiently. In this case, the measurement
map needs to be obtained from the gate sequences or the randomly fluctuating classical
parameters of the Hamiltonian, respectively.
2.3.2 Numerical demonstration
Our variational reconstructions can be recast as standard convex optimization problems
(see also Appendix A). These can be solved computationally efficiently
1
and also practically
1
with computation time scaling polynomially in the dimension
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 12
using standard software such as CVX [60, 61]. For the reconstructions (13), (16), and (18)
that have trace preservation as a constraint we have proven that a number of
m m
0
:
= C r n
2
ln(n) (20)
measurement settings is sufficient to recover quantum channels of Kraus rank r. We expect
similar reconstruction properties for the unconstrained trace and diamond norm reconstruc-
tions (15) and (17), although our proofs do not directly carry over to that case. Indeed,
low-rank matrices can often be recovered via trace norm minimization from a number of
measurements with such an essentially optimal scaling [19, 21, 24, 25, 55].
Different reconstruction procedures with the optimal scaling (20) often have a different
constant C. For instance, Ref. [62] shows that a constant of size C ' 6 provably suffices for
Gaussian measurement matrices (even without the ln(n) factor). On the other hand, the
results presented in Refs. [24, 25] are valid for more structured Pauli measurements and
require a much larger constant. Numerical studies typically highlight a similar behaviour.
Exploiting additional prior information such as the fact that quantum channels pre-
serve both hermiticity and traces in the algorithmic reconstructions (16), (18) can only
lead to an improvement. In fact, these constraints also facilitate the mathematical analysis.
Besides being able to prove recovery guarantees, we also observe the benefit of such addi-
tional constraints numerically: Figure 3a shows the recovery rates of different approaches.
This recovery rate is determined as follows. We say that a channel T
0
is successfully re-
constructed if the reconstructed channel T
rec
is close to the original one: kJ(T
rec
T
0
)k
2
10
5
. We have chosen 10
5
as threshold because we have observed the reconstruction er-
rors to be typically well separated from this value. Varying it changes the curves in the
plot only slightly. The precision of the convex optimization software CVX [60, 61] in terms
of the machine precision eps = 2
52
2.2 · 10
16
is
eps 1.4 · 10
8
. This somewhat
limits the choice of thresholds. In order to obtain the rate shown in the plots, we run
100 independent instances and obtain the rate as the percentage of trials with a successful
reconstruction.
Our numerical studies are based on generic measurement ensembles (Haar random
input states, Haar random unitaries) and we have chosen A
0
to be a diagonal n ×n matrix
whose spectrum evenly covers the interval [1, 1].
Interestingly, the unconstrained diamond norm minimization (17) performs almost as
good as its constrained counterpart (18). The heuristic reason for this behaviour is that
the diamond norm “favours” maps that satisfy a certain trace preservation condition [3, 63],
which is fulfilled for CPT maps. Moreover, the so-called descent cone at CPT-maps of the
diamond norm is contained in an intersection of descent cones of nuclear norms [3]. If this
containment is strict, it also leads to an improved recovery recovery guarantee.
Arguably the simplest reconstruction procedure is the fit under the CPT constraint
(13). Here, we observe that this protocol achieves a rate that is very similar to the other
constrained procedures (16), (18). Moreover, the CPT-fit (13) clearly has the fastest
computation time in our simple implementation in CVX [60, 61] with Mosek 7.1 as a
solver, see Figure 3b.
The required number of measurement settings (20) depends linearly on the Kraus rank
r of the channels to be reconstructed. This dependence is confirmed for small r in our
numerics for Hilbert space dimension n = 4, see Figure 4. For dimension n = 4 the Kraus
rank is r n
2
= 16 and for r = 16, m = dim
R
(HT(C
n
)) = n
4
n
2
= 252 of measurement
values are required for the reconstruction. Here, dim
R
(HT(C
n
)) denotes the real dimension
of the affine space HT(C
n
). This observation explains the non-linear behaviour for larger r.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 13
CPT -t: n = 4
Kraus rank r
2 4 6 8 10 12 14 16
# measurement settings
m
50
100
150
200
250
# error
5 1e-05 / 100 trials
0
20
40
60
80
100
Figure 4: Reconstruction of random quantum channels T
0
CPT(C
4
) with different Kraus ranks
r = rank(J(T
0
)) from the CPT-fit (13). The white region corresponds to 100% observed recovery and
the black one to 0%. Parameters and measurements are as in Figure 3.
2.4 Stability and robustness
We consider two types of errors: i) the measurement errors e already indicated in the
measurement model (12) and ii) model mismatches capturing violations of the assumption
rank(J(T
0
)) r on the Kraus rank.
For this purpose, we need a technical definition of optimal low-rank approximations of
quantum channels. We fix a maximum rank r. For any operator X Herm(C
d
) we define
X
|r
:
= arg min
kX Y k
1
: Y Herm(C
d
), rank(Y ) r
, (21)
X
¬r
:
= X X
|r
. (22)
The low-rank approximation X
|r
and, hence, also X
¬r
has a simple formula: we use the
eigenvalue decomposition X =
P
d
i=1
x
i
|x
i
ihx
i
| with |x
1
| |x
2
| ··· |x
d
|. Then
X
|r
=
r
X
i=1
x
i
|x
i
ihx
i
| . (23)
This construction is inherited by linear maps on operators. Specifically, for T
CPT(C
n
) we set
J(T
|r
)
:
= J(T )
|r
and J(T
¬r
)
:
= J(T )
¬r
. (24)
The error term T
¬r
is called the model mismatch. In turn, we relax our model assumption
of low Kraus rank and allow for small model mismatches.
A reconstruction is called stable if it tolerates measurement errors and it is called robust
if it tolerates model mismatches.
2.4.1 Recovery guarantees II
We prove all reconstructions from Theorem 2 to be stable against measurement noise.
Moreover, we prove the trace norm minimization (16) and the CPT-fit (13) also to be
robust against model mismatches.
Theorem 3 (Stability of the diamond norm reconstruction (18)). Consider normalized 4-
generic measurements from Definition 1. Then, under the hypotheses of Theorem 2, recon-
struction via the constrained diamond norm minimization (18) is stable towards additive
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 14
noise in the measurements y = A(T
0
) +e: for any η kek
`
2
, the associated reconstruction
error obeys
J(T
0
T
c
η
)
2
˜c
n
2
m kA
0
k
2
η . (25)
The constants C, λ, and ˜c only depend on each other.
A more general version of this theorem which allows for quantifying the noise strength
by any `
q
-norm is also true, see Corollary 24 below.
Theorem 4 (Stability and robustness, trace norm minimization (16) and CPT-fit (13)).
Under the hypotheses of Theorem 2, reconstruction via the CPT-fit (13) is both stable
towards additive noise corruption and robust with respect to relaxing the model assumption
of Kraus rank r:
J(T
0
T
`
2
)
2
4 kJ(T
0
¬r
)k
1
+ ˜c
n
2
m
A
0
2
kek
`
2
. (26)
This performance guarantee is also valid for the constrained trace norm minimization (16)
if one replaces kek
`
2
by the optimization parameter η, provided that η kek
`
2
Once more, the constants C, λ, and ˜c again only depend on each other.
The model mismatch is quantified by the trace norm of the Choi matrix. This is
a relatively simple error measure that upper bounds the operationally relevant diamond
norm.
Theorem 4 summarizes two results Theorems 19 and 20 below that are slightly more
general: they allow for quantifying the reconstruction error (26) and by any Schatten-p
norm with p [1, 2]. Moreover, the noise strength may be characterized by any `
q
-norm:
η kek
`
q
.
Naturally, one would expect the statements of the recovery guarantees from Theorem 4
to be invariant under i) a simultaneous rescaling of the noise strength η and the measured
observables and ii) under adding a multiple of the identity operator to the observables and
shifting the measurement data accordingly. The following corollary makes that expectation
precise.
Corollary 5 (Normalization of the measurements). Similar statements as in Theorems 3
and 4 hold also in the case of 4-generic measurements, provided that η is replaced by
η/ kA
0
k
and kA
0
k
2
is replaced by the 2-norm of the traceless part of A
0
.
Remarks :
i) Theorem 4 guarantees approximate reconstructions for any quantum channel T
0
CPT(C
n
) without requiring an a priori rank constraint. The deviation T
0
¬r
from a
linear map of low Kraus-type rank, i.e., the model mismatch, enters the error bound
linearly. Note that we measure the model mismatch w.r.t. all rank-r linear maps,
rather than w.r.t. only the rank-r quantum channels. This gives a more favorable
error bound. Such a robustness is desirable for practical applications where the model
assumption of low Kraus rank is typically only approximately true.
ii) For the constrained norm minimizations (18) and (16), a prior error threshold is re-
quired. This additional model selection task is important in actual applications and
can be a non-trivial task; see c.f., Ref. [64]. The CPT-fit (13) has the distinct advan-
tage that the reconstruction can be done without such a prior noise estimation.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 15
iii) Theorems 19 and 20 quantify the reconstruction error in terms of Schatten p-norms.
One may obtain diamond norm estimates from these results via the well-known in-
equalities between the diamond norm and the Schatten norms (c.f. e.g. Ref. [63]). A
direct derivation of diamond norm bounds does not seem to be achievable with the
proof methods we employ, which rely heavily on properties of the various Schatten
norms.
iv) We emphasize that the dimensional scaling of the error bound in Eq (26) is a con-
sequence of the normalization we adopted. To make this precise, it is useful to view
these statements as results about stably reconstructing (particular) Choi matrices
J(T ) L(C
n
C
n
)
=
L(C
n
2
). Rewriting the single expectation values (12) as
y
i
= Tr
h
A
i
(|ψ
i
ihψ
i
|)
T
J(T )
i
+ e
i
(27)
reveals that all individual measurement matrices M
i
= A
i
(|ψ
i
ihψ
i
|)
T
L(C
n
2
)
have constant Frobenius norm
kM
i
k
2
=
A
i
(|ψ
i
ihψ
i
|)
T
2
= kA
0
k
2
. (28)
Choosing a particular normalization influences the signal-to-noise ratio (SNR) in (27)
and, in turn, the stability assertions. In order to avoid these ambiguities, it is useful
to rewrite Eq. (26) for vanishing model mismatch as
J(T
0
T
c
η
)
2
˜c
m d
P
m
i=1
kM
i
k
2
kek
`
2
(29)
with d
:
= dim
C
n
2
. Note that the sum scales like m. Up to our knowledge, all
stable low-rank matrix reconstruction guarantees are essentially
2
of this form! This in
particular includes Gaussian measurement ensembles [56, Theorem 2.3], random Pauli
matrix measurements [25, Proposition 2.3] and outer products of standard Gaussian
vectors [19, Theorem 2].
v) Regarding sample complexity, an order of rn
5
ln(n)/
2
independent channel evalua-
tions (“samples”) are required to achieve a reconstruction error of at most in Frobe-
nius norm; see Section 2.6 below. We expect this scaling to be close to optimal up
to ln(n)-factors (at least for channels with very low Kraus rank r). Evidence for
this is provided by the discussion above: the stability guarantees in Theorem 3 and
Theorem 4 essentially match the best existing results on stable low-rank matrix recon-
struction. Among these are two that are applicable to the related problem of quantum
state tomography: random Pauli measurements [25] and outer products of standard
Gaussian vectors [19].
The sample complexity of the former approach has been determined in Ref. [15].
Moreover, it has been shown to be close to optimal in the sense that it reproduces
a fundamental lower bound valid for any tomographic procedure based on Pauli
measurements up to a single ln(n)-factor.
Fundamental lower bounds on the sample complexity achievable by any tomographic
procedure have been derived in Ref. [65]. Said work also determines the sample com-
plexity associated with measuring outer products of standard Gaussian vectors. Inter-
estingly, this sample complexity matches the fundamental lower bound up to a factor
of r ln(n), where r denotes the rank of the density operator in question. Thus, state
2
up to log-factors
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 16
tomography via low-rank matrix recovery from outer products of Gaussian vectors is
close to optimal, at least for states that have very low rank.
We expect that similar results are true for quantum process tomography via low-
rank matrix reconstruction of Choi matrices. However, while conceptually similar, the
results regarding sample complexities of state tomography procedures [15, 65] are not
directly applicable to process tomography. We intend to address such an extension in
future work.
2.4.2 Numerical example: Reconstructing the Toffoli gate
The Toffoli gate is a three qubit gate that has drawn a lot of attention from theorists as
well as experimentalists. It is universal for classical computation and, together with the
Hadamard gate, also for quantum computation [66]. Moreover, it has played an important
role in the theory of gate sets [67] and can help to significantly reduce the number of gates
in quantum algorithms, such as in quantum error correction. It also has been implemented
experimentally in nuclear magnetic resonance [11], linear optics [12], trapped ions [13], and
in superconducting circuits [14]. Moreover, the reconstruction using the CPT-fit (13) has
already been compared to full tomography [20]. Hence, the Toffoli gate is a good candidate
to benchmark our process tomography schemes.
We demonstrate uniform recovery, i.e., we draw a fixed number of measurement settings
at random and keep them fixed throughout the simulation in Figure 5. Then, we always
use the first m of them for reconstructions with m settings.
Since reconstruction via the unconstrained trace norm minimization (15) is numerically
inferior to the other reconstructions (Figure 3) we will not investigate it in all simulations.
There are two types of error sources: (i) imperfect measurements give rise to measure-
ment noise e R
m
in our model (12) and (ii) the implemented channel T
0
CPT(C
n
)
could have violated the model assumption of a low Kraus rank. Here, we confirm our
analytic stability result towards both error sources numerically.
Our theorems put minimal assumptions on the potential noise corruption e R
m
.
In particular, it does not need to follow a specific statistical model. For our numerical
analysis, however, we draw the measurement noise e i.i.d. from a zero-mean Gaussian
distribution and scale it so that the noise strength kek
`
2
has the desired value (Figure 5,
left). A Gaussian error model frequently occurs in practice: When estimating expectation
values from observed frequencies such an error model arises naturally in the limit of many
measurements per setting. In practice, the parameter η needs to be estimated for the
reconstructions (15), (16), (18) and (18). Here, one could take the smallest η so that the
reconstructions succeeds. The CPT-fit (13) has the advantage of not requiring such an
estimation.
We reconstruct T
Toff
from m = 320 noisy measurements with different values of kek
`
2
without model mismatch (λ = 0), see Figure 5(left). For the trace and diamond norm
reconstructions (16), (17), and (18) we set the error parameter η = kek
`
2
+ 10 eps, where
eps = 2
52
2 ·10
16
is the machine precision. As predicted by Theorems 3 and 4, the re-
construction error kJ(T
rec
T
0
)k
2
scales linearly in the noise strength kek
`
2
, see Figure 5b.
Here, the CPT-fit (13) has the smallest reconstruction error. If one further increases the
number of measurements, then the reconstruction error kJ(T
rec
T
0
)k
2
decreases further,
as guaranteed by Theorems 3 and 4.
In order to demonstrate robustness of our reconstructions, we set T
0
to be a convex
combination of the Toffoli gate T
Toff
and a completely depolarizing channel T
dep
(ρ) =
n
1
Tr(ρ)1,
T
0
= (1 λ)T
Toff
+ λ T
dep
, (30)
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 17
where λ [0, 1]. The depolarizing channel corresponds to a physically relevant error
model that maximally violates our model assumption of low Kraus rank; see, e.g., Ref. [68,
Chapter 8.3.4].
We test the reconstruction of T
0
for different values of λ [0, 1]. For the sake of clarity,
we completely suppress additive noise (e = 0) and set the error threshold η = 10 eps. The
results are presented in Figure 5(right) and demonstrate the robustness guaranteed by
Theorem 4. The reconstruction error kJ(T
rec
T
0
)k
2
depends roughly linearly λ. Here,
the diamond norm minimizations (17) and (18), perform worse than the constrained trace
norm minimization (16) and the CPT-fit (13).
In the case of measurement noise, the reconstruction error kJ(T
rec
T
0
)k
2
approaches
the optimal value kek
`
2
in the limit of large m, see Figure 5a(left). In the case of a model
mismatch, the reconstruction error decreases below the mismatch parameter λ and vanishes
if m approaches the full dimension of CPT(C
n
), see Figure 5a(right).
We find it worthwhile to share one interesting numerical observation on the uncon-
strained diamond norm reconstruction (17). For this purpose we assume that T
0
is a
quantum channel, i.e. T
0
CPT(C
n
), so that kT
0
k
= 1. Then the optimal value kT
rec
k
of the reconstruction seems to decrease with the reconstruction error kJ(T
rec
T
0
)k
2
, see
Figure 5c. Hence, this reconstruction procedure does not only yield good approximations
to the measured channel T
0
, but its optimal value also provides some indication of the
reconstruction error’s size.
One can exploit this observation by using both the CPT-fit (13) and the unconstrained
diamond norm reconstruction (17). The CPT-fit is the fastest reconstruction procedure and
yields the smallest error. Complementing this, the diamond norm minimization provides
an indication of what the error might be. In the CPT-fit, the reconstructed map T
rec
is
always a quantum channel, i.e., T
rec
CPT(C
n
). In contrast, the solution of the diamond
norm minimization can, in principle, be any map in L(C
n
). The latter also holds true
for the unconstrained trace norm reconstruction (15), but we could not observe a similar
feature of its minimum value.
The error bounds from Theorem 4 suggest that observables with larger Frobenius
norm have a better noise suppression, as kA
0
k
2
appears in the denominator in the er-
ror bound (26). We also tested this behaviour numerically in order to demonstrate that it
is not just a proof artifact, but an actual feature. We choose A
0
to have r
A
many non-zero
eigenvalues, which we evenly distribute in the interval [1, 1]. These A
0
have a Frobenius
norm in the interval [1, 2]. Figure 6 shows the average reconstruction error of the Toffoli
gate for non-uniform measurements in dependence of kA
0
k
2
and noise strength kek
`
2
= 0.1.
This numerical analysis demonstrates that the reconstruction error can indeed be reduced
with increasing kA
0
k
2
.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 18
# measurement settings m
300 400 500 600 700 800 900 1000
|| T
rec
- T
0
||
F
0
0.05
0.1
0.15
0.2
0.25
0.3
To,oli gate, kek
`
2
= 0.05
|| . ||
1
, TP-constr.
|| . ||
}
|| . ||
}
, TP-constr.
CPT fit
# measurement settings m
300 400 500 600 700 800 900 1000
|| T
rec
- T
0
||
F
0
0.1
0.2
0.3
0.4
0.5
0.6
To,oli gate, 6 = 0.05
|| . ||
1
, TP-constr.
|| . ||
}
|| . ||
}
, TP-constr.
CPT fit
(a) The reconstruction error over the number of measurement settings m for fixed measurement noise kek
`
2
= 0.05
(left) and fixed model mismatch λ = 0.05 (right), respectively, in a uniform recovery setting.
Measurement noise || e ||
F
0 0.05 0.1 0.15 0.2
|| T
rec
- T
0
||
F
0
0.2
0.4
0.6
0.8
1
To,oli gate, m=320
|| . ||
1
, TP-constr.
|| . ||
}
|| . ||
}
, TP-constr.
CPT fit
Model mismatch 6
0 0.05 0.1 0.15 0.2 0.25 0.3
|| J(T
rec
- T
0
) ||
1
0
0.5
1
1.5
2
To,oli gate, m=320
|| . ||
1
, TP-constr.
|| . ||
}
|| . ||
}
, TP-constr.
CPT fit
(b) The reconstruction error over the measurement noise kek
`
2
and model mismatch λ, respectively, for m = 320
fixed measurement settings.
|| T
}
2
- T
0
||
F
0 0.2 0.4 0.6 0.8 1 1.2 1.4
|| T
}
2
||
}
0.96
0.97
0.98
0.99
1
1.01
To,oli gate, m=320
|| T
}
2
- T
0
||
F
0 0.5 1 1.5 2 2.5 3 3.5 4
|| T
}
2
||
}
0.9
0.95
1
To,oli gate, m=320
(c) The optimal value
T
η
from the minimization (17) over the reconstruction error
J (T
η
T
0
)
2
achieved by
the unconstrained diamond norm minimization (17) used in (b).
Figure 5: Uniform recovery of the three qubit Toffoli gate T
0
CPT(C
8
) in imperfect settings. In the per-
fect setting m = 320 measurement settings are sufficient for reconstruction w.h.p. while the total dimension is
dim(CPT(C
8
)) = 4032.
Left: Reconstruction of T
0
= T
Toff
from y = A(T
0
) + e and A with measurement noise e R
m
being drawn
uniformly from a scaled sphere and without model mismatch (λ = 0). The parameter η is chosen to be 10 times
machine precision plus the chosen noise strength kek
`
2
.
Right: Reconstruction of T
0
from y = A(T ) and A with T
0
= (1 λ) T
Toff
+ λ T
dep
, where the model mismatch
is caused by the completely depolarizing channel T
dep
.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 19
|| A ||
F
1 1.2 1.4 1.6 1.8 2
|| T
rec
- T
0
||
F
0.2
0.3
0.4
0.5
0.6
0.7
Average over 100 reconstructions via CPT -t
Figure 6: The plot shows the average reconstruction error
J(T
`
2
T
0
)
1
for observables with different
rank. The observables’ non-zero eigenvalues cover evenly the interval [1, 1], giving rise to a Frobenius
norm increasing monotonically with the rank. Plotted are CPT-fit (13) reconstructions of the Toffoli
gate from m = 320 i.i.d. measurements. The noise is scaled to kek
`
2
= 0.1.
The plot highlights advantageous stability properties of non-degenerate and thus high rank observ-
ables.
2.5 Pauli measurements
Our recovery guarantees hold for 4-generic measurements. However, these measurements
can be challenging to implement in many experimental situations. So, how do our recov-
ery schemes perform for more more restricted measurement ensembles? In this section
we numerically investigate two practically relevant measurement scenario of Pauli-type
measurements.
For quantum state tomography [15], Pauli measurements are proven to satisfy the
so-called restricted isometry property for rank-r matrices in L(C
d
) for a number of mea-
surement settings scaling as r d log(d)
6
[25].
Motivated by this strong statement, we numerically investigate process measurements
that are inspired by a Pauli setting. We denote the set of Pauli strings by P U(2
L
), i.e.,
the set of operators P = σ
(1)
σ
(2)
··· σ
(L)
with Pauli matrices σ
(i)
{1, σ
x
, σ
y
, σ
z
}
for i [L]. We write P P for a Pauli string that is drawn uniformly at random from P.
Then we choose the measurements as
y
j
:
= Tr[P
j
T
0
(|ψ
j
ihψ
j
|)] , (31)
where observables P
j
and input states |ψ
j
i are i.i.d. selected as follows. Each P
j
P is a
uniformly drawn Pauli string and each state vector |ψ i
j
is a tensor product of uniformly
i.i.d. drawn eigenvectors of random Pauli operators {σ
x
, σ
y
, σ
z
} (hence, an eigenvector of
the corresponding Pauli string).
Numerically, we observe that for random unitary quantum channels our reconstructions
perform very similar for these Pauli measurements and the generic (Haar-random) mea-
surements (not shown in the plots). However, for non-generic channels we observe that the
two types of measurements lead to different reconstruction behaviours, see Figure 7: For
the reconstruction of the Toffoli gate, more Pauli-type-measurements than generic mea-
surements are required in the case of unconstrained trace norm regularization (15). In
contrast, fewer Pauli-measurements are required for the other regularizations.
We have also observed that reconstructions from Pauli-measurements have similar sta-
bility properties as the ones in the generic case (not shown in the plots).
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 20
# measurement settings m
0 50 100 150 200 250 300 350 400
|| T
rec
- T
0
||
F
0
2
4
6
8
10
Random unitary T
0
, Pauli expectation values
|| . ||
1
|| . ||
1
, TP-constr.
|| . ||
}
|| . ||
}
, TP-constr.
CPT fit
# measurement settings m
0 50 100 150 200 250 300 350 400
|| T
rec
- T
0
||
F
0
2
4
6
8
10
To,oli gate, Pauli expectation values
|| . ||
1
|| . ||
1
, TP-constr.
|| . ||
}
|| . ||
}
, TP-constr.
CPT fit
Figure 7: Pauli-type measurements: Comparison of the reconstruction errors of generic (Haar-random)
unitary quantum channels (left) and the Toffoli gate (right).
The identity quantum channel T
0
= id is an extreme case in the sense that it is sparse
in any basis and commutes with all transformations. For generic measurements we have
observed that it has the same reconstruction behaviour as Haar-random unitary T channels.
In comparison, in case of Pauli measurements, the unconstrained trace norm reconstruction
works worse and the other reconstructions better (not shown in the plots).
2.6 Sample complexity
The expectation values in our measurements need to be estimated from finite samples
leading to a reconstruction error e. Assume we want to estimate the measured channel
T
0
CPT(C
n
) in Frobenius norm up to an error kek
`
2
. Then the sample complexity is
the scaling of the optimal number of measurements in the ideal setting, i.e., without model
mismatch or measurement errors.
We use the Landau symbols O and to denote the usual asymptotic upper and lower
bounds, respectively. The Landau symbol
˜
O denotes the same scaling as O up to log-
factors.
The error of an empirical mean of Tr[A
i
T
0
(|ψ
i
ihψ
i
|)] form ` samples scales as O(1/
`)
w.h.p. (due to, e.g., Chebyshev’s inequality). This gives rise to an error vector e bounded
as
kek
`
2
O(
q
m/`) . (32)
Hence, the total estimation error bounded as
n
2
kek
`
2
m
A
0
2
(33)
is small w.h.p. for ` `
0
with some `
0
being bounded as `
0
O
n
4
kA
0
k
2
2
2
. For
A
0
2
Ω(
n) this yields a total sample complexity scaling as
˜
O(rn
2
) O(n
3
/
2
) =
˜
O(rn
5
/
2
).
2.7 Applications to fault tolerant quantum computation
The techniques presented here also have applications for fault tolerant quantum compu-
tation. Threshold theorems [6971] give a theoretical guarantee that quantum computers
can be built in principle if the noise strength is below some threshold value. The strength
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 21
of the errors needs to be quantified in diamond norm distance which is not directly accessi-
ble. Instead one typically evaluates average error rates using direct fidelity estimation [72],
or randomized benchmarking, see, e.g., Ref. [73]. However, these two error measures can
differ by orders of magnitude [74]. This, in particular, is the case for coherent error sources,
such as unitary over/under-rotations [75], where the diamond distance is proportional to
the square root of the average error rate. Thus, achieving fault-tolerance thresholds in
the presence of unitary noise requires an exceedingly high error control (around 10
8
for
typical threshold levels of a few times 10
4
). In contrast, other noise sources typically
imply a much more favourable (linear) relation between both error measures.
From a practical perspective, these results are encouraging: Unlike incoherent noise,
coherent noise effects can typically be corrected. A necessary subroutine for achieving
this goal is to accurately estimate these error channels. Our results considerably simplify
this task, in particular for unitary errors which have Kraus rank one. They provide esti-
mation techniques that require considerably fewer measurements than traditional process
tomography protocols.
3 Analytical details and proofs
Our analytical results build on by now well-established mathematical proof techniques for
low-rank matrix reconstruction [19, 21, 62]. The main technical contribution of this work
is to extend these techniques to natural measurements whose structure deviates from less
structured measurement matrices common in low-rank matrix reconstruction.
Starting point of our analysis is a geometric approach to low-rank matrix recovery
presented in Ref. [62]. It relates the reconstruction error from a constrained trace norm
minimization to a certain quantity associated with the measurement map A: the mini-
mum conic singular value; see Definition 8 below. We bound this quantity by invoking
Mendelson’s small ball method [76, 77] a strong probabilistic tool that depends on certain
concentration properties of the measurement ensemble. For 4-generic measurements these
are derived using representation theory of the unitary group and general bounds on ten-
sor network contractions besides probabilistic bounds commonly used on low-rank matrix
reconstruction.
This geometric analysis results in a reconstruction guarantee for the constrained trace
norm minimization (16) that is stable towards additive noise corruption. In turn, the
geometric arguments provided in Ref. [3] suggest a strengthening of the obtained error
bounds, if one replaces the trace norm by the diamond norm. Theorem 3 or, more
generally: Theorem 23 and Corollary 24 are consequences of such an approach.
Our second main result Theorem 4 assures stability towards noise corruption as
well as robustness with respect to violating the model assumption of low Kraus rank.
This further improvement is achieved via establishing a robust version of the null space
property for 4-generic measurements. We refer to Section 3.1.4 for a brief introduction.
The technical prerequisites for such an approach are largely identical to the ones associated
with the more geometric framework outlined above. Mendelson’s small ball method, in
particular, is again applicable. Hence, relatively little additional effort is required for this
approach which has the added benefit of ensuring robustness towards model mismatches.
The remainder of this section is organized as follows: After some preliminaries, we prove
a bound on the minimum conic singular value for the case of our measurement setting in
Section 3.2. The proof used a general bound on tensor networks, which we state and prove
in Section 3.3. In Section 3.4, we state and prove general versions of our main theorems.
Finally, in Section 3.6, we show that our results also hold for a quantum linear optical
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 22
setting.
3.1 Preliminaries
Before we come the to proofs we introduce some helpful notation, discuss the minimum
conic singular value and a useful bound to it, introduce the null space property and a
subsequent recovery guarantee, and explain the use of representation theory for bounding
certain moments.
3.1.1 Notation
A vectorization of a tensor is a vector containing all its elements. The Frobenius norm
kT k
F
of a tensor T is defined to be the `
2
-norm of some vectorization of T . Importantly,
for an operator A L(C
n
) it holds that kAk
F
= kAk
2
.
The permutation group on k elements is denoted by S
k
and the unitary group by
U(n) L(C
n
). Its representation on (C
n
)
k
is denoted by R, so that for σ S
k
the
representation R(σ) U(n
k
) acts on (C
n
)
k
by permuting the k tensor copies according
to σ.
Affine spaces : In order to deal with the constraint that the reconstructed maps are
trace preserving we need the affine version of the usual reconstruction problem [62] in
compressed sensing.
It is helpful to extend the basic algebraic operations to sets: E.g., we denote the
Minkowski sum of subsets S and R of some vector space by S + R
:
= {s+r : s S, r R}.
Similarly, we define R and s + R for some s S.
An affine space V over the field R is a subset of a vector space over R such that for all
λ R and x, y V one has (1 λ)x + λy V. Then V
0
:
= V V is a vector space and
V = x + V
0
for any x V. The linear span lin(V) of V is another vector space containing
both V and V
0
.
A map A : V W between affine spaces V and W is called affine if for all λ R and
x, y V one has A((1 λ)x + λy) = (1 λ)Ax + λAy.
Given an affine map A : V R
m
(such as the measurement map defined in (11)) one
can extend it linearly to lin(V). This extension will also be denoted by A.
Maps and operators : Any map M L(C
n
) and operators A, ρ L(C
n
) satisfy the
identity
Tr[A M(ρ)] = Tr[A ρ
T
J(M)] . (34)
We will use this identity to write expectation values of the type Tr[AT (ρ)] in terms of the
Choi matrix J(T ).
By M
we denote the Hilbert-Schmidt adjoint of M L(C
n
) and by M
?
the map which
obeys M(X)
= M
?
(X
) for all X L(C
n
). These two involutions satisfy M
?
= M
?
.
Moreover, we use the notation
M
k,l
:
= M
k
M
?l
. (35)
We note in passing that CPT maps T CPT(C
n
) have diamond norm kT k
= 1 and
Choi matrices are normalized as kJ(T )k
1
= n.
It will also be helpful to define the set containing all Choi matrices
J HT(C
n
) = {X Herm(C
n
C
n
) : Tr
1
(X) = n 1
n
}, (36)
where Tr
1
denotes the partial trace over the first tensor factor.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 23
3.1.2 Normalization and centralization of the observables
We assume the fixed observable A
0
to be normalized (kA
0
k
= 1) and centered (traceless):
Tr[A
0
] = 0. Such restrictions somewhat simplify the technical analysis; the following
observations highlight that little generalization is lost by imposing them:
Observation 6 (Uncentered observables). Fix A
0
Herm(C
n
) and let
˜
A
0
:
= A
0
Tr[A
0
]1/n (37)
be the traceless part of A
0
. Then, the associated 4-generic measurement maps A and
˜
A introduced in Definition 1 do not necessarily coincide. However, the algorithmic re-
constructions T
rec
and
˜
T
rec
of any trace-preserving map T nonetheless coincide for all
reconstruction procedures (15), (16), (17), (18), or (13).
Proof. Since any T CPT(C
n
) is trace preserving, we have that
˜
A(T )
j
= Tr[
˜
A
j
T ( |ψ
j
ihψ
j
|)]
= Tr[A
j
T ( |ψ
j
ihψ
j
|)] Tr[A
j
]/n (38)
= A(T )
j
Tr[A
j
]/n.
Together with the definition of y and ˜y this implies
A(T ) y =
˜
A(T ) ˜y . (39)
But this means that the corresponding minimizations are the same.
The following observation says that the observable’s spectral norm suppresses the noise.
Observation 7 (Unnormalized observables). Let A be a 4-generic measurement and de-
fine
ˆ
A := kA
0
k
1
A. Then the algorithmic reconstructions associated with measurements
y = A(T
0
) + e and ˆy =
ˆ
A(T
0
) +
e
kA
0
k
(40)
coincide for all five reconstruction procedures: (15), (16), (17), (18) and (13).
3.1.3 Minimum conic singular value
The usual minimum conic singular value of a map A can be written variationally as the
right hand side (RHS) of Eq. (41) with K being the full space and q = 2. In order for
A to be invertible, the minimum conic singular value needs to be positive. Moreover, for
fixed spectral norm of A, the larger the minimum conic singular of A the more stable A
can be inverted. An extension of these basic concepts can be extended to realm of convex
analysis, which motivates the following Definition.
Definition 8 (`
q
-minimum conic singular value). Consider an affine space V where lin(V)
is equipped with a norm k·k. Let A : V R
m
be an affine linear map and K V
0
be a
cone. The minimum singular value of A w.r.t. K, measured in `
q
-norm with q 1, is
λ
min
(A; K; `
q
)
:
= inf
uK
kAuk
`
q
kuk
, (41)
where A has been extended to lin(V).
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 24
Typically, one chooses q = 2 in order to define the minimum conic singular value [62].
Here, we opt for a more general definition that allows for adjusting stability guarantees
towards noise to different types of noise models.
The conic singular value of our measurement map A (see Eq. (12)) w.r.t. a cone K
L(C
n
) can be written as
λ
min
(A; K; `
q
) = inf
ME
m
X
i=1
Tr[A
i
M(|ψ
i
ihψ
i
|)]
q
!
1
q
(42)
with E = {M K : kMk
F
= 1}.
We use a method established by Mendelson [76, 77] in order to bound the minimum
conic singular value. As suggested in Ref. [21, Remark 5.1], we use a generalization of
Tropp’s version [62, Proposition 5.1].
Lemma 9 (Bound on λ
min
). Let E HT(C
n
) be a cone of maps and (|ψ
i
i, A
i
)
i[m]
be
an i.i.d. measurement settings. Define the marginal tail function
Q
ξ
:
= inf
ME
P
Tr[A
i
M(|ψ
i
ihψ
i
|)]
ξ
(43)
(the same for all i) and mean empirical width
W
m
(E)
:
= E sup
ME
1
m
m
X
i=1
i
Tr[A
i
M(|ψ
i
ihψ
i
|)] , (44)
where
i
{−1, 1} are independent uniformly random signs. Then, for any ξ > 0, λ > 0,
and q 1
inf
ME
m
X
i=1
|Tr[A
i
M(|ψ
i
ihψ
i
|)]|
q
!
1
q
m
1
q
1
2
ξ
mQ
2ξ
2W
m
ξλ
(45)
with probability at least 1 e
λ
2
/2
.
Proof. Following Ref. [21, Remark 5.1], we point out that the proof of Ref. [62, Proposi-
tion 5.1] actually implies the stronger statement
1
m
inf
ME
m
X
i=1
|Tr[A
i
M(|ψ
i
ihψ
i
|)]| ξ
mQ
2ξ
2W
m
ξλ . (46)
Using that kvk
`
1
m
1/q
kvk
`
q
for any v C
m
results in (45) for any q 1.
3.1.4 Null space property
If an operator X is of rank r one has kXk
1
r kXk
2
. Here, we rely on the idea to
take a similar inequality to define the notion of effectively rank-r elements. This notion
of effective low rank is often enough for low-rank matrix reconstruction. Additionally, one
can take into account violations of X being of low rank. This idea is formalized with the
null space property (NSP). We use a version of Kabanava et al. [21] adjusted to our setting
of subspaces, which uses the rank-r approximation X
|r
to X from Eq. (24).
Definition 10 (Robust NSP for subspaces). Fix a subspace V
0
Herm(C
d
), r Z
+
, and
q 1. We say that a linear map A : Herm(C
d
) R
m
satisfies the (V
0
, r, `
q
)-NSP with
parameters µ ]0, 1[ and τ > 0 if for all X V
0
X
|r
2
µ
r
kX
¬r
k
1
+ τ kA(X)k
`
q
. (47)
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 25
We will use the following result from Ref. [21] in our setting. It yields recovery guar-
antees based on the NSP.
Theorem 11. Let V Herm(C
d
) be an affine space and set V
0
:
= VV and fix p [1, 2].
If a linear map A : Herm(C
d
) R
m
satisfies a (V
0
, r, `
q
)-NSP with parameters µ ]0, 1[
and τ > 0 then
kX Y k
p
(1 + µ)
2
(1 µ)r
11/p
kY k
1
kXk
1
+ 2 kX
¬r
k
1
+ τ r
1/p1/2
3 + µ
1 µ
kA(X Y )k
`
q
.
(48)
for all X, Y V.
This is a straightforward adaptation of the proof of [21, Theorem 3.2]: there the NSP is
applied twice, each time to
(X Y )
|r
2
. By assumption (X Y ) V
0
and the subspace
NSP handles this case.
3.1.5 Tools from representation theory
In this section we simplify the k-th moments of the random variables UAU
and |ψ ihψ |,
where U and |ψ i are drawn independently from the Haar measures. These derivations will
be used when we bound the moments of Tr[U A U
M(|ψ ihψ |)] for maps M L(C
n
).
In order to compute expectation values over the unitary group of the type
E
:
= E
UHaar
[(UAU
)
k
] (49)
we will use some basic facts from representation theory. Specifically, we use the decompo-
sition from Schur Weyl duality
(C
n
)
k
=
M
λ
π
k
λ
ρ
n
λ
(50)
into irreducible representations (irreps) ρ
n
λ
and π
k
λ
of the unitary group U(n) and the
symmetric group S
k
, respectively. The irreps are labelled by Young diagrams λ with k
boxes and at most n rows. Since [U
k
, E] = 0 for all U U(d), a famous theorem due to
Schur (see, e.g., [78, Theorem 4.2.10]) implies that E can be written as E =
L
λ
1 Y
λ
,
where Y
λ
ρ
n
λ
. But we also have that [σ, E] = 0 for all σ S
k
and Schur’s Lemma implies
that
E
UHaar
[(UAU
)
k
] =
X
λ
a
λ
P
λ
, (51)
where each a
λ
C and each P
λ
is the projection onto π
k
λ
ρ
n
λ
. The coefficients can be
calculated as the expansion coefficients
a
λ
=
Tr[A
k
P
λ
]
Tr[P
λ
]
. (52)
This argument also yields that
F
:
= E
|ψ i∼Haar
[|ψ ihψ |
k
] =
X
λ
b
λ
P
λ
. (53)
The coefficients are b
λ
Tr[P
λ
F ] and we have |ψ ihψ |
k
P
Sym
k
= |ψ ihψ |
k
and P
λ
P
λ
0
= 0
for λ 6= λ
0
. Together with Tr[F ] = 1 we obtain that
E
|ψ i∼Haar
[|ψ ihψ |
k
] =
1
Tr[P
Sym
k
]
P
Sym
k
, (54)
where P
Sym
k
is the projector onto the fully symmetric subspace in (C
n
)
k
.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 26
First moments : Setting k = 1 and using that Tr[A
2
] = kAk
2
2
and |ψ ihψ |
2
= |ψ ihψ |
yields that
E
UHaar
[UAU
] =
Tr[A]
n
1. (55)
and
E
|ψ i∼Haar
[(|ψ ihψ |)
2
] =
1
n
1 . (56)
Second moments : The flip operator F L(C
n
C
n
) is given by the linear extension of
F |ψ i |φ i
:
= |φi |ψ i . (57)
The projector onto the symmetric subspace of C
n
C
n
can be written as P
Sym
2
=
1
2
(1 + F).
The dimension of the fully symmetric subspace in C
n
C
n
is
Tr[P
Sym
2
] = Tr[1]/2 + Tr[F]/2 = n
2
/2 + n/2
= n(n + 1)/2 .
(58)
Together we obtain
E
h
(|ψihψ|)
2
i
=
2
n(n + 1)
P
Sym
2
=
1
n(n + 1)
(1 + F) . (59)
In order to derive the second moment of UAU
we also need the projector onto the
anti-symmetric subspace P
2
=
1
2
(1 F). From Eq. (51) and Tr[1] = n
2
and Tr[F] = n
we obtain
E =
2 Tr
P
Sym
2
A
n(n + 1)
P
Sym
2
+
2 Tr
P
2
A
)
n(n 1)
P
2
. (60)
Evaluating the remaining traces in a basis or using tensor network diagrams yields
E
UAU
2
=
Tr(A)
2
+ kAk
2
2
n(n + 1)
P
Sym
2
+
Tr(A)
2
kAk
2
2
n(n 1)
P
2
. (61)
3.2 Our bound on the minimum conic singular value
The following theorem is the main technical result of this work.
Theorem 12 (Minimum conic singular value bound). Let (A
i
, |ψ
i
i)
i[m]
be a normalized
4-generic measurement ensemble (Definition 1). Denote by A : HT(C
n
) R
m
the linear
map given by the components
A(M)
i
:
= Tr
A
i
M(|ψ
i
ihψ
i
|)
. (62)
Moreover, for c
µ
> 0 denote by
K
:
=
n
M HT(C
n
)
0
: kJ(M)k
1
c
µ
r kJ(M)k
2
o
, (63)
the cone of trace-annihilating maps of “effective Kraus-rank” at most r. Then, for the
constant c from Lemma 14, for any c
λ
]0, c[, q 1 and any
m > m
0
:
= 125 e ·
c
µ
c c
λ
2
r ln(n)n(n + 1) (64)
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 27
the `
q
-minimum conic singular value of A is lower bounded as
inf
MK
kA(M)k
`
q
kMk
F
c c
λ
5
kAk
2
m
1
q
1
q
m
0
m
n(n + 1)
(65)
with probability at least 1 e
c
2
λ
m/5
over A.
Proof of Theorem 12. We choose E in Lemma 9 to be E
ν
from Lemma 13. Inserting the
bounds from Lemmas 14 and 13 into Eq. (45) implies
inf
ME
ν
m
1
2
1
q
kA(M)k
`
q
c
m ξ
1
(2ξ)
2
(n 1) n (n + 1)
2
2 kAk
2
2
ν
2
!
2
2
c
µ
p
6 e ln(n)r
n
ν kAk
2
ξλ
(66)
with probability at least 1 e
λ
2
/2
over A. Choosing ν = ν
0
:
= n
p
n(n + 1)/ kAk
2
and
using that (n + 1)(n 1)/n
2
1 yields
inf
ME
ν
0
m
1
2
1
q
kA(M)k
`
q
c
m ξ
1 2ξ
2
2
c
µ
q
6 e ln(n)n(n + 1) r ξλ . (67)
Next, choosing ξ = 1/
10 yields a maximum value greater than 1/5 for ξ
1 2ξ
2
2
. This
choice yields the bound
inf
ME
ν
0
m
1
2
1
q
kA(M)k
`
q
c
5
m c
µ
q
6 e ln(n)n(n + 1) r
λ
10
. (68)
Furthermore, we set λ =
c
λ
10
5
m with some constant 0 < c
λ
< c to obtain
inf
ME
ν
0
m
1
2
1
q
kA(M)k
`
q
c c
λ
5
m c
µ
q
6 e ln(n)n(n + 1) r . (69)
So, we choose m > m
0
with
m
0
:
=
5
6 e c
µ
c c
λ
q
ln(n)n(n + 1) r (70)
in order guarantee that the infimum yields a positive value. This choice leads to
inf
ME
ν
0
kA(M)k
`
q
c c
λ
5
m
1
q
1
r
m
0
m
. (71)
As the infimum over E
ν
is homogeneous in ν (i.e., “proportional” to all ν 0), we
obtain for the cone E =
S
ν0
E
ν
generated by E
ν
0
that
inf
ME
kA(M)k
`
q
kMk
F
= inf
ME
ν
0
kA(M)k
`
q
0
= inf
ME
ν
0
kA(M)k
`
q
kAk
2
n(n + 1)
c c
λ
5
kAk
2
m
1
q
1
q
m
0
m
n(n + 1)
,
(72)
where we have remembered the choice of ν
0
form the beginning of the proof. This bound
holds with probability at least 1 e
λ
2
/2
= 1 e
c
2
λ
m/5
over A for any c
λ
]0, c[.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 28
3.2.1 Upper bound on the mean empirical width W
m
We prove an upper bound on W
m
defined in Eq. (44) for E being a slice of the cone of
effectively rank-r maps.
Lemma 13 (Bound on W
m
). For r Z
+
, ν > 0, and c
µ
> 0 let
K
:
=
n
M HT(C
n
)
0
: kJ(M)k
1
c
µ
r kJ(M)k
2
o
(73)
and
E
ν
:
= {M K : kMk
F
= ν}. (74)
For m 2n
2
ln(n) let (A
i
, |ψ
i
i)
i[m]
be a normalized 4-generic measurement ensemble
(Definition 1). Then the mean empirical width (44) is bounded as
W
m
(E
ν
)
c
µ
p
6 e ln(n) r
n
ν kAk
2
. (75)
Proof. By definition (44), W
m
reads as
W
m
(E
ν
) = E sup
ME
ν
1
m
m
X
i=1
i
Tr[U
i
A U
i
M(|ψ
i
ihψ
i
|)]. (76)
Since W
m
(E
ν
) = νW
m
(E) with E
:
= E
1
, it is enough to prove the lemma for ν = 1.
With
H
i
:
= (U
i
A U
i
) (|ψ
i
ihψ
i
|
T
) L(C
n
C
n
) (77)
and using the identity (34), the “expectation value” can also be written as
Tr[U
i
A U
i
M(|ψ
i
ihψ
i
|)] = Tr[H
i
J(M)] . (78)
We define
H
:
=
1
m
m
X
i=1
i
H
i
(79)
to arrive at the compact form
W
m
(E) = E sup
ME
Tr[HJ(M)] . (80)
The application of Hölder’s inequality yields
W
m
(E) E sup
ME
kHk
kJ(M)k
1
. (81)
Using the definition (73) of K and that kJ(M)k
2
= kMk
F
= ν = 1 we obtain
W
m
(E) c
µ
r E kHk
. (82)
In order to bound E kHk
we proceed similarly as in the proof of Ref. [19, Propo-
sition 13]. By applying a non-commutative Khintchine inequality (see Theorem 29 in
Appendix B) to (79) we obtain
E
i
[kHk
]
s
2 ln(2n
2
)
m
E
m
X
i=1
H
2
i
!
1/2
. (83)
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 29
Thanks to Jensen’s inequality, E
X
E
q
kXk
q
E kXk
and, hence,
E
i
[kHk
]
s
2 ln(2n
2
)
m
E
m
X
i=1
H
2
i
!
1/2
. (84)
The Matrix Chernoff Bound [79, Theorem 5.1.1] implies that
E
m
X
i=1
H
2
i
e
θ
1
θ
m
E
H
2
1
+ sup
H
1
kH
1
k
ln(n
2
)
θ
, (85)
where we have already used that {H
i
} are i.i.d. operators. With the averages (55) and (56)
we find that H
1
(defined in Eq. (77)) satisfies
E
H
2
1
=
Tr[A
2
]
n
1
1
n
1
=
kAk
2
2
n
2
. (86)
Moreover, H
1
always satisfies
kH
1
k
= kAk
|ψ ihψ |
T
= 1 , (87)
as A and |ψ i are both normalized. Hence,
E
m
X
i=1
H
2
i
e
θ
1
θ
m
kAk
2
2
n
2
+
ln(n
2
)
θ
. (88)
Choosing θ = 1 and proceeding from Eq. (84) we obtain
E[kHk
]
s
2 ln(2n
2
)
m
(e 1) m
kAk
2
2
n
2
+ 2 ln(n)
!
1/2
. (89)
With kAk
2
kAk
= 1 and the assumptions m 2n
2
ln(n) and ln
2n
3
2
ln(n) we
obtain
E[kHk
]
q
4 ln
2 n
kAk
2
n
(e 1) +
2n
2
ln(n)
m
!
1/2
p
6 e ln(n)
n
kAk
2
.
(90)
Finally, with Eq. (82),
W
m
(E)
c
µ
p
6 e ln(n) r
n
kAk
2
, (91)
which proves Lemma 13.
3.2.2 Lower bound on the marginal tail function Q
ξ
In this section, we prove a lower bound on the marginal tail function (43).
Lemma 14 (Lower tail bound). Let (A
i
, |ψ
i
i)
i[m]
be a normalized 4-generic measurement
ensemble (Definition 1). For some trace annihilating M TP(C
n
)
0
define the random
variable
S
:
= Tr[UA U
M(|ψ ihψ |)] . (92)
Then S satisfies
P
|S| ξ
c
1
ξ
2
(n 1) n (n + 1)
2
2 kAk
2
2
kM(1)k
2
2
+ kMk
2
F
2
(93)
for all ξ > 0, where c is an absolute constant.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 30
We will prove this lemma using the Paley-Zygmund inequality. This inequality states
that for every non-negative random variable Z 0 and parameter θ [0, 1]
P
Z > θ E[Z]
(1 θ)
2
E[Z]
2
E
Z
2
. (94)
We will choose Z = |S|
2
so that we can use a lower bound on the second moment and an
upper bound on the fourth moment of S.
Proof of Lemma 14. As typically done [62], we use the Paley-Zygmund inequality (94) to
establish the lower tail bound,
P
|S| ξ
= P
|S|
2
ξ
2
1
ξ
2
E[|S|
2
]
!
2
E[|S|
2
]
2
E[|S|
4
]
.
(95)
Inserting the fourth moment bound (104) and the second moment (97) into (95) we obtain
for another absolute constant c > 0 that
P
|S| ξ
c
1
ξ
2
E[|S|
2
]
!
2
(n 1)
2
n
2
(n + 1)
2
(n + 2)(n + 3)
(n 1)
2
n
2
(n + 1)
4
c
1
ξ
2
(n 1) n (n + 1)
2
2 kAk
2
2
kM(1)k
2
2
+ kMk
2
F
2
,
(96)
where (97) has been used again in the second step. This bound proves Lemma 14.
Actually, we can fully calculate the second moment without any assumptions on Tr[A].
Lemma 15 (Second moment). Let (A
i
, |ψ
i
i)
i[m]
be a 4-generic measurement ensemble
(Definition 1) and M HT(C
n
)
0
be a trace annihilating map.
Then S
:
= Tr[UA U
M(|ψ ihψ |)] has the second moment
E
|S|
2
=
2 kAk
2
2
n 2 Tr[A]
2
(n 1) n
2
(n + 1)
2
kM(1)k
2
2
+ kMk
2
F
. (97)
Proof. First, we will derive some identities for certain traces containing 1, F, and M. As
M is trace annihilating, i.e., M
(1) = 0, we obtain
Tr[1M
1,1
(1)] = Tr[M
1,1
(1)] = 0 (98)
and
Tr[1M
1,1
(F)] = Tr[M
1,1
(1)F] = 0 . (99)
Moreover, using “the swap-trick” Tr[F(A B)] = Tr[AB] we obtain
Tr[FM
1,1
(1)] = Tr
F
M(1) M
?
(1)

= Tr[M(1)M
?
(1)]
= Tr[M(1)M(1)
]
= kM(1)k
2
2
.
(100)
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 31
|ψ iVector C
n
3 |ψ i =
A
Operator L(C
n
) 3 A =
=
A
AM
M L(C
n
) applied to A: M(A) =
Flip operator L(C
n
C
n
) 3 F =
M
M
?
Product of maps M M
=
Figure 8: Tensor network diagrams: tensors are denoted by boxes with one line for each index. Con-
traction of two indices corresponds to connection of the corresponding lines. Examples: A vector |ψ i,
vectorization of an operator A, M L(C
n
) applied to that vectorization, the non-vectorized version
of the flip operator F, and M
1,1
= M M
?
.
The last of these identities is
Tr[FM
1,1
(F)] = Tr[F(M M
)(F)]
=
M
M
?
= Tr[J(M )J(M
?
)]
= Tr[J(M )J(M)
]
= kJ(M)k
2
2
= kMk
2
F
,
(101)
see Figure 8 for an explanation of the tensor network diagram.
Next, using the expressions for the second moments of |ψ ihψ | and UAU
, (59) and
(61), we obtain
E[|S|
2
] = Tr

UA U
M(|ψ ihψ |)
UA U
M
?
(|ψ ihψ |)

= Tr
h
E
(UAU
)
2
M
1,1
E
|ψ ihψ |
2
i
=
1
2n
2
(n + 1)
X
±
Tr(A)
2
± Tr(A
2
)
n ± 1
Tr
h
(1 ± F)M
1,1
(1 + F)
i
=
1
2n
2
(n + 1)
X
±
±Tr(A)
2
+ Tr(A
2
)
n ± 1
Tr
h
FM
1,1
(1 + F)
i
.
(102)
We finish the proof by using the traces containing M from the beginning and that Tr[A
2
] =
kAk
2
2
,
E[|S|
2
] =
1
n
2
(n + 1)
Tr[A]
2
+ Tr[A
2
]
n + 1
+
Tr[A]
2
+ Tr[A
2
]
n 1

kM(1)k
2
2
+ kMk
2
F
=
2 kAk
2
2
n 2 Tr[A]
2
(n 1) n
2
(n + 1)
2
kM(1)k
2
2
+ kMk
2
F
.
(103)
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 32
Lemma 16 (Upper bound on the fourth moment). The random variable S from Lemma 14
has a fourth moment bounded as
E[|S|
4
] c
3
kAk
4
2
kM(1)k
2
2
+ kMk
2
F
2
(n 1)
2
n
2
(n + 1)
2
(n + 2)(n + 3)
, (104)
where c
3
is an absolute constant.
The proof of this lemma uses facts about the symmetric group S
4
all of which are stated
in Appendix C.
Proof. The fourth moment is
E[|S|
4
] = Tr

UAU
M(|ψ ihψ |)
2
UAU
M
?
(|ψ ihψ |)
2
= Tr
h
E
(UA U
)
4
M
2,2
E
|ψ ihψ |
4
i
.
(105)
According to Eq. (54) the average over |ψ i yields
E
|ψ ihψ |
4
=
1
d
1
(n)
P
Sym
4
, (106)
where we have used that Tr(P
Sym
4
) = d
1
(n) with d
1
(n) = n (n + 1)(n + 2)(n + 3)/24 from
Eq. (172) being the dimension Tr(P
Sym
4
) corresponding to the trivial representation Sym
4
.
According to Eq. (51) we have
E
(UA U
)
4
=
5
X
i=1
a
i
P
i
. (107)
By taking the trace, we obtain the corresponding expansion coefficients
a
i
=
Tr[A
4
P
i
]
d
i
(n)
(108)
where d
i
(n) = Tr[P
i
] is also given in (172) and each P
i
is the representation of the central
minimal projection p
i
of S
4
(see (169)) on (C
n
)
4
. Inserting everything into (105), we
obtain
E[|S|
4
] =
5
X
i=1
a
i
Tr
h
P
i
M
2,2
(P
1
)
i
d
1
(n)
. (109)
For a permutation σ S
4
with representation R(σ) L((C
n
)
4
) the Hilbert-Schmidt
overlap Tr[A
4
R(σ)] only depends on the conjugacy class of σ. We denote the conju-
gacy class containing permutations composed of each j
i
disjoint cycles of sizes {k
i
}
i[l]
by k
j
1
1
k
j
2
2
. . . k
j
l
l
, e.g., 2
1
S
4
are the transpositions. One can conclude, e.g., from tensor
network diagrams that
Tr[1 A
4
] = Tr[A]
4
= 0 ,
Tr[2
2
A
4
] = Tr[A
2
]
2
,
Tr[2
1
A
4
] = Tr[A
2
] Tr[A]
2
= 0 ,
Tr[4
1
A
4
] = Tr[A
4
] ,
Tr[3
1
A
4
] = Tr[A
3
] Tr[A] = 0 .
(110)
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 33
With the sizes of the conjugacy classes (166), the minimal projections (169), and the
dimensional factors (172)
a
1
=
3 Tr[A
2
]
2
+ 6 Tr[A
4
]
n (n + 1)(n + 2)(n + 3)
,
a
2
=
3 Tr[A
2
]
2
6 Tr[A
4
]
(n 3)(n 2)(n 1) n
,
a
3
=
6 Tr[A
2
]
2
(n 1) n
2
(n + 1)
,
a
4
=
3 Tr[A
2
]
2
6 Tr[A
4
]
(n 1) n (n + 1)(n + 2)
,
a
5
=
3 Tr[A
2
]
2
+ 6 Tr[A
4
]
(n 2)(n 1) n (n + 1)
,
(111)
for n 4. For n = 3 we have P
2
= 0 and, hence a
2
= 0. For n = 2 we have P
2
= 0 and
P
5
= 0 and, hence, a
2
= a
5
= 0. In both cases, the remaining a
i
are as stated above.
From the submultiplicativity of the Schatten 2-norm follows that
A
j
2
kAk
j
2
for
j Z
+
. Also using the bound
1
n j
j + 1
n
for integers n > j 0 (112)
we obtain
|a
i
|
c
1
kAk
4
2
(n 1)
2
n (n + 1)
(113)
for all i [5], where c
1
is an absolute constant.
In order to bound Tr[P
i
M
2,2
(P
1
)] in Eq. (109) we observe that each projection P
i
is a linear combination of permutation matrices {R(σ)}
σS
4
, see (169), where only the
permutation matrices R(σ) depend on n. Hence, it is enough to bound Tr[R(σ)M
2,2
(R(τ))]
for all permutations σ, τ S
4
.
Combining (104) from Lemma 17 below, (113), and (109) yields
E[|S|
4
] c
3
kAk
4
2
kM(1)k
2
2
+ kMk
2
F
2
(n 1)
2
n
2
(n + 1)
2
(n + 2)(n + 3)
(114)
for some absolute constant c
3
.
Lemma 17. For any M HT(C
n
)
0
and σ, τ S
4
Tr[R(σ)M
2,2
(R(τ))]
c
2
kM(1)k
2
2
+ kMk
2
F
2
(115)
for some absolute constant c
2
.
Proof. We will conclude from Proposition 18 that
Tr[R(σ)M
2,2
(R(τ))]
c
0
2
max
kM(1)k
4
2
, kMk
4
F
o
(116)
for some absolute constant c
0
2
, which implies the lemma.
We consider HT(C
n
)
0
L(L(X) L(Y)) as a subspace, where X = C
n
= Y are
labels for the input and output Hilbert space. The permutation operator R(τ) permutes
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 34
and contracts the indices of M
2,2
in Tr[R(σ)M
2,2
(R(τ))] that correspond to X. Similarly,
the permutation operator R(σ) permutes the indices of M
2,2
in Tr[R(σ)M
2,2
(R(τ))] that
correspond to Y, which are contracted subsequently by Tr. So, no X-index is contracted
with a Y-index. Hence, Tr[R(σ)M
2,2
(R(τ))] is a contraction without self-contractions
of the tensors {M, M
?
, M(1), M
?
(1), Tr M, Tr M
?
}. The tensors Tr M
?
and Tr M
vanish due to maps in HT(C
n
)
0
being trace annihilating. Moreover, kM k
F
= kM
?
k
F
and kM(1)k
2
= kM
?
(1)k
2
. Proposition 18 tells us that arbitrary closed tensor networks
without self-contractions are bounded by the product of the Frobenius norms of the single
tensors and implies the bound (116).
3.3 General tensor network bound
In this section, we prove a simple bound on a fully contracted tensor network, which we
used to prove Lemma 17. A tensor is a vector in C
n
1
C
n
2
··· C
n
K
, where X Y
denotes the tensor product of vector spaces X and Y. It is often helpful to identify a tensor
with its representation t in terms of a product basis of the canonical bases of C
n
i
. Then t
is given as an array of numbers, i.e., t C
n
1
×n
2
×···×n
K
.
A tensor network is a set of tensors together with a contraction corresponding to pairs
of indices where both indices have the same dimension n
i
. Instances of such tensor networks
are, e.g., the workhorse of powerful simulation techniques for strongly correlated quantum
systems [80]. We present a general version of such tensor networks and then prove our
bound. For this purpose, we use a notation that is completely disjoint from the other
sections. E.g., C will be a contraction instead of a constant.
A pointer to an index of a tensor in C
n
1
C
n
2
··· C
n
K
is just a number k [K]
used to identify the k-th index. A contraction is a linear map
C : C
n
1
C
n
2
··· C
n
K
C
n
i
1
C
n
i
2
··· C
n
I
given by a set of pairs of pointers P = ({k
l
, k
0
l
})
l[L]
with even K I = 2L so that each
number k [K] occurs at most once in at most one of the pairs. In particular, k
l
6= k
0
l
for
all l [L]. Moreover, the consistency condition n
k
l
= n
k
0
l
is required to hold for all l [L]
(in many relevant cases the dimensions n
k
assume only a very few different values). Let
P = {k
l
, k
0
l
}
l[L]
be the set of pointers occurring in the pairs P and I
:
= [K] \P the other
pointers. Then the contraction C(t) of a tensor t C
n
1
C
n
2
··· C
n
K
is given by the
components
C(t)
α
0
i
1
0
i
2
,...,α
0
i
I
:
=
X
α
k
[n
k
], k[K]
t
α
1
2
,...,α
K
Y
l∈P
δ
α
k
l
k
0
l
Y
m∈I
δ
α
m
0
m
, (117)
i.e., all index pairs {k
l
, k
0
l
} are summed over and the remaining indices are the indices of
C(t).
A tensor network is a set of tensors T = (t
j
)
jJ
with t
j
C
n
j
1
×n
j
2
×···×n
j
K
j
together with
a contraction C of their tensor product t
1
t
2
··· t
J
, where we use the convention
(t
1
t
2
··· t
J
)
i
1
,i
2
,...,i
K
= t
1
i
1
,i
2
,...,i
K
1
t
2
i
K
1
+1
,...,i
K
1
+K
2
. . . t
J
i
KK
J
+1
,...,i
K
(118)
with K
:
=
P
J
j=1
K
j
. For short, we write C(T )
:
= C(t
1
t
2
··· t
J
). We say that a
tensor network (T, C) with tensors T = (t
j
)
jJ
has a self-contraction if there is a tensor
t
j
such that both pointers of one of the pairs {k
l
, k
0
l
} defining C point to indices of t
j
. We
call a tensor network (T, C) closed if C(T ) is a number, i.e., if I is empty.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 35
The relevance of tensor networks comes from the fact that contractions of certain tensor
networks (such as matrix product states) can be calculated efficiently, while only to store
general tensors requires exponentially many parameters in the total number of indices. In
general, estimating the outcome of a contraction is a #P-hard problem [81]. However,
there is a simple and natural upper bound, which might already be known. At least for
sake of a self-contained presentation we provide a proof below.
Proposition 18 (Bound on tensor network contractions). Let (T, C) be a tensor network
with J 2 tensors T = (t
j
)
j[J]
and contraction C without self-contractions. Then
kC(T )k
F
J
Y
j=1
t
j
F
. (119)
If a tensor network has a self-contraction this statement can fail to hold. Such an
example can be easily constructed by taking the tensors so that the trace of an identity
matrix occurs.
Here we briefly sktech the proof of the tensor network bound and provide a full proof
in Appendix D.
Proof idea of Proposition 18. Reshaping t
j
suitably into matrices
˜
t
j
, the tensor network
C(T ) can be rewritten as a matrix product of
˜
t
j
1 sandwiched between vectorizations
t
1
and
t
J
E
of t
1
and t
J
,
C(T ) =
D
t
1
˜
t
2
1 . . .
˜
t
J1
1
t
J
E
. (120)
Hence,
|C(T )|
D
t
1
`
2
˜
t
2
1
. . .
˜
t
J1
1
t
J
E
`
2
=
t
1
F
˜
t
2
. . .
˜
t
J1
t
J
F
,
(121)
where we have used that kA Bk
= kAk
kBk
, k1k
= 1, and that the Frobenius
norm is the `
2
-norm of a vectorization. Then the bound
˜
t
j
˜
t
j
2
=
t
j
F
between
the Schatten norms finishes the proof.
3.4 Proofs of the reconstruction theorems
In this section we prove generalized versions of the theorems presented in Section 2 provid-
ing recovery guarantees for the constrained trace and diamond norm regularization (16)
and (17) and the CPT-fit (13).
3.4.1 Constrained trace norm minimization
In this section we prove a stable and robust reconstruction guarantee for:
T
c
η,q
= arg min{kJ(T )k
1
: T HT(C
n
), kA(T ) yk
`
q
η} (122)
with q 1. The reconstruction procedure T
c
η
introduced in (16) is a special case, namely
q = 2, of this class of minimization problems.
Theorem 19 (Stable and robust reconstruction from trace norm minimization). Let A :
CPT(C
n
) R
m
given by A(T )
j
= Tr[A
j
T ( |ψ
j
ihψ
j
|)], where (A
i
, |ψ
i
i)
i[m]
is a 4-generic
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 36
measurement ensemble (Definition 1) and the observables’ traceless part is
˜
A
0
:
= A
0
Tr[A
0
]1/n 6= 0.
Then there are constants C and λ such that the following holds. Fix q 1, some Kraus
rank r, a number of measurement settings
m C r ln(n)n
2
(123)
and let p [1, 2]. Then, with probability at least 1 e
λm
, for all T
0
CPT(C
n
) the
solution T
c
η,q
of the minimization (122) with y = A(T
0
) + e approximates T
0
with an error
J(T
0
T
c
η
)
p
4
r
11/p
kJ(T
0
¬r
)k
1
+ ˜c
n
2
r
1/p1/2
m
1
q
˜
A
2
η
kA
0
k
(124)
provided that kek
`
q
η. The constants C, λ, and ˜c only depend on each other.
Proof. Thanks to Observations 6 and 7 it is enough to prove the theorem of the case where
A
0
is traceless (i.e., A
0
=
˜
A
0
) and normalized to kA
0
k
= 1.
We prove the theorem by establishing the subspace NSP (47) for Choi matrices. The
main technical ingredient to prove the NSP is the bound on the minimum conic singular
value from Theorem 12. Then Theorem 11 will give the desired result.
We prove the subspace NSP with the subspace being J HT(C
n
)
0
, i.e., the Choi matrices
of hermiticity preserving and trace annihilating maps. It is helpful to consider two cases.
First, operators X J HT(C
n
)
0
satisfying
X
|r
2
µ
r
X
¬r
1
are effectively of high
rank and satisfy the NSP by default. Thus, it suffices to consider the case where
X
|r
2
µ
r
X
¬r
1
. Every such matrix satisfies
kXk
1
X
¬r
1
+
X
|r
1
1 + µ
µ
r
X
|r
2
1 + µ
µ
r kXk
2
.
(125)
Hence, M = J
1
(X) is contained in the cone K from Eq. (63) with c
µ
=
1+µ
µ
. For any
q 1, Theorem 12 yields the bound
inf
MK
kA(M)k
`
q
kMk
F
1
τ
(126)
with probability over A at least 1 e
λm
, where
1
˜
C kAk
2
m
1
q
n
2
(127)
with
˜
C being a constant only depending on λ and µ, and m m
0
with m
0
given in
Eq. (64) (with the dependence m
0
c
2
µ
). This establishes the (HT(C
n
)
0
, r, `
q
)-NSP for
our measurement map A with parameters µ and τ given as above. Hence, Theorem 11
yields
kJ(T
0
T
c
η,q
)k
p
(1 + µ)
2
(1 µ)r
11/p
J(T
c
η,q
)
1
kJ(T
0
)k
1
+ 2 kJ(T
0
¬r
)k
1
+ τr
1/p1/2
3 + µ
1 µ
A(T
0
T
c
η,q
)
`
q
(128)
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 37
for any µ ]0, 1[ . The minimization in (122) assures
J(T
c
η,q
)
1
kJ(T
0
)k
1
by construc-
tion, because J(T
0
) is a feasible point of this optimization problem. Moreover,
A(T
0
T
c
η,q
)
`
q
kA(T
0
) yk
`
q
+
y A(T
c
η,q
)
`
q
2η (129)
since A(T
0
) y = e with kek
`
q
η and
y A(T
c
η
)
`
q
η due to a constraint in the
minimization (16). Choosing µ =
5 2 leads to (1 + µ)
2
/(1 µ) = 2. Simplifying the
bound (128) with these observations completes the proof.
3.4.2 CPT-fit
In this section, we prove a stable and robust reconstruction guarantee for the following
generalization of the CPT-fit (13):
T
`
q
:
= arg min{kA(T ) yk
`
q
: T CPT(C
n
)}, (130)
where q 1 arbitrary. Similar to before, the CPT-fit discussed in the introductory section
arises from fixing q = 2.
Theorem 20 (Stable and robust reconstruction from CPT-fit). Consider a 4-generic
measurement setup as in Theorem 19 and the same constants C and λ. Fix q 1, some
Kraus rank r, a number of measurement settings
m C r ln(n)n
2
(131)
and p [1, 2]. Then, with probability at least 1 e
λm
, for all T
0
CPT(C
n
) the solution
T
`
q
of (130) with y = A(T
0
) + e approximates T
0
with an error
J(T
0
T
`
q
)
p
4
r
11/p
kJ(T
0
¬r
)k
1
+ ˜c
n
2
r
1/p1/2
m
1
q
˜
A
0
2
kek
`
q
kA
0
k
. (132)
The constants C, λ, and ˜c again only depend on each other.
Before coming to the actual proof we provide some intuition why this theorem is almost
a corollary of Theorem 19. Adding the positivity constraint J(T ) 0 to the trace norm
minimization (122) from Theorem 19 can only improve the recovery of CPT maps. But as
kJ(T )k
1
= n for all T CPT(C
n
), the constrained minimization in (122) degenerates into
a feasibility problem. Hence, to achieve a further improvement in the reconstruction, it is
natural to minimize the residual kA(T ) yk
`
q
instead. This is exactly what the CPT-fit
does.
Proof of Theorem 20. This proof is analogous to the proof of Theorem 19 with two excep-
tions: First,
J(T
`
q
)
1
= kJ(T
0
)k
1
= n, because both are constrained to be Choi matrices
of CPT maps. Consequently, their difference vanishes and (128) becomes
J(T
`
q
)
p
(1 + µ)
2
(1 µ)r
11/p
· 2 kJ(T
0
¬r
)k
1
+ τr
1/p1/2
3 + µ
1 µ
A(T
0
T
c
η,q
)
`
q
. (133)
Secondly, the minimization (130) assures
A(T
0
T
`
q
η
)
`
q
A(T
`
q
η
) y
`
q
+ ky A(T
0
)k
`
q
2 kek
`
q
,
(134)
because T
0
is a feasible point satisfying ky A(T
0
)k
`
q
= kek
`
q
and T
`
q
is the minimizer.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 38
3.4.3 Constrained trace and diamond norm minimization
In Ref. [3] is shown that certain recovery guarantees for trace norm regularization (such as
(15) and (16)) automatically imply recovery guarantees for the analogous diamond norm
regularization (such as (17) and (18)). This holds when the recovery guarantees can be
proven as, e.g., in Ref. [62] via the descent cone of the reguralizer. In this section, we follow
this strategy in order to obtain a recovery guarantee for diamond norm regularization in
our quantum process tomography setting.
An error bound from the descent cone : For the proof of Theorem 19 we use an
error bound relying on the so-called descent cone of the function that is minimized in
the reconstruction. The descent cone of a function is the cone of directions in which the
function decreases [62]:
Definition 21 (Descent cone). Let V be an affine space and f : V R be a proper convex
function. The descent cone of f at a point x V is
D (f, x)
:
= cone {u V
0
: f(x + u) f(x)}. (135)
The following error bound is the basis for many recovery guarantees from bounds to the
conic singular value from Definition 8. It has been proven in Ref. [82] (where the descent
cone is given as a tangent cone) and has later been restated by Tropp [62] for optimizations
over a vector space. One can easily see from its proof that it also holds if one optimizes
over an affine space and chooses arbitrary `
q
-norms (q 1) to measure the size of the
minimum conic singular value.
Proposition 22 (Error bound for convex recovery, Tropp’s version [62, Proposition 2.6]).
Let x
0
V be a signal, A V R
m
be an affine linear measurement map, y = A(x
0
) + e
a vector of m measurements with additive error e R
m
. Fix q 1 and let x
f
η,q
be the
solution of the optimization
x
f
η,q
= arg min{f (x) | x V : kA(x) yk
`
q
η} (136)
for a convex function f : V R. If kek
`
q
η then
x
f
η
x
0
F
2η
λ
min
(A; D(f, x
0
), `
q
)
. (137)
where λ
min
(A; D(f, x
0
), `
q
) has been defined in (41).
Proof. In the proof of Ref. [62, Proposition 2.6] one uses Definition 8 of the conic singular
value only for element in VV. Generalizing the proof from errors measured in Euclidean
norm (q = 2) to any `
q
-norm is also straightforward, provided that the definition of the
minimum conic singular value is properly adjusted.
Our recovery guarantee for the diamond norm : First we prove a weaker version
of Theorem 19 with a different argument. In turn this reconstruction guarantee for mini-
mization (122) will imply a recovery guarantee for the following generalization of diamond
norm reconstruction:
T
c
η,q
= arg min{kT k
: T HT(C
n
), kA(T ) yk
`
q
η}. (138)
Note that the minimization (18) discussed in the main text is a special case of (138), where
q = 2.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 39
Theorem 23 (Stable reconstruction from trace norm minimization). Consider a mea-
surement setting as in Theorem 19 with possibly different constants C and λ. Fix some
Kraus rank r and
m C r ln(n)n
2
, (139)
as well as q 1. Then, with probability at least 1 e
λm
, for all T
0
HT(C
n
) with Kraus
rank rank(J(T
0
)) r the solution T
c
η,q
of the minimization (122) with y = A(T
0
) + e
approximates T
0
with an error
J(T
0
T
c
η
)
2
˜c
n
2
m
1
q
˜
A
0
2
η
kA
0
k
(140)
provided that kek
`
q
η. The constants C, λ, and ˜c again only on each other.
Proof. Thanks to Observations 6 and 7 it is enough to prove the theorem of the case
where A
0
is traceless (i.e., A
0
=
˜
A
0
) and normalized to kA
0
k
= 1. This proof relies
on Proposition 22 where the reconstruction error is bounded in terms of the `
q
-minimum
conic singular value of A w.r.t. the descent cone of the trace norm at T
0
.
From Ref. [19, Lemma 10] it follows that any T
0
HT(C
n
) with rank(J(T
0
)) r and
any M D(kJ( ·)k
1
, T
0
) the Hölder-type inequality
kJ(M)k
1
2
r kJ(M)k
2
(141)
is satisfied. Hence, we choose c
µ
in Theorem 12 to be c
µ
= 2. Then the theorem implies
that with probability at least 1 e
λm
the minimum conic singular value λ
min
of A w.r.t.
the descent cone D(kJ( ·)k
1
, T
0
) is bounded as
λ
min
(A; D(kJ( ·)k
1
, T
0
); q)
˜
C
kAk
2
m
1
q
n
2
q 1, (142)
where
˜
C is a constant only depending on λ. Proposition 22 finishes the proof.
Now, the results from Ref. [3], specifically [3, Implication 9] tell us that the diamond
norm minimization performs at least as well as the trace norm minimization.
Corollary 24 (Stable reconstruction from diamond norm minimization [3]). Choosing
the diamond norm minimization (18) instead of the trace norm minimization (16) in
Theorem 23 yields a smaller smaller reconstruction error,
J(T
0
T
c
η
)
2
J(T
0
T
c
η
)
2
. (143)
3.5 Reconstruction from approximate 4-generic measurements
As stated, our recovery guarantees hold if the input states are drawn from a complex 4-
design and the output states are measured with observables that have unitary 4-designs
as eigenbases. Here, we show that for the recovery guarantees to hold /n
4
-approximate
4-designs are enough, in the sense that the only changes the constants in the recovery
guarantee but not directly the reconstruction error. This significantly lesses the burden
for experimental realizations. Similar generalization have been proven for low-rank matrix
recovery [19] and we will extend those arguments here. -approximate designs can be
implemented, e.g., by local random quantum circuits [4, 6, 7] or fluctuating Hamiltonians
[8], where the can be reduced exponentially quickly.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 40
For some probability distribution µ on the sphere in C
n
we denote the corresponding
average of |ψ ihψ |
k
by
ψ
(k)
µ
:
= E
|ψ i∼µ
h
|ψ ihψ |
k
i
(144)
and set ψ
(k)
:
= ψ
(k)
Haar
to be the uniform average. We call µ and -approximate spherical
k-design if
ψ
(k)
µ
ψ
(k)
ψ
(k)
=
n+k1
k
. (145)
Such an -approximate k-design is also an -approximate k
0
-design for any k
0
k (see, e.g.,
[19, Lemma 16]).
Analogously, given a probability measure ν on U (n) we define the k-th moment (su-
per)operator by
G
(k)
ν
(X)
:
= E
Uν
h
U
k
XU
†⊗k
i
(146)
and set G
(k)
:
= G
(k)
Haar
. We define ν to be an -approximate unitary k-design if for all
traceless product operators X
G
(k)
ν
(X) G
(k)
(X)
G
(k)
(X)
.
(147)
This bound means that the traceless part of observables is not changed too much. By the
embedding X 7→ X 1
n
it is easy to see that an -approximate unitary k-design is also
an -approximate unitary k
0
-design for all k
0
k.
There are also other definitions of approximate designs, which are partially discussed,
e.g., in Low’s PhD thesis [83, Section 2.2]. However, for our purposes these definitions are
the most natural ones.
We note that for a traceless product operator X
G
(k)
(X)
c(k)
n
k
kXk
2
, (148)
which follows from Eq. (51), expanding P
λ
in terms of permutations σ, viewing Tr[σX]
as tensor network, and using Proposition 18. Hence, any -approximate unitary k-design
from one of the other definitions [83, Section 2.2] is also an
0
-approximate unitary k-design
according to our definition. As usual, there is some dimension factor overhead when going
from one definition to another.
Definition 25 (-approximate 4-generic measurement ensemble). We call a measurement
ensemble (A, |ψ i) with observable A acting on C
n
and state |ψ i in C
n
-approximate
4-generic if it fulfills the following criteria:
i) The distribution of |ψ i in C
n
is an -approximate spherical 4-design.
ii) A = UA
0
U
, where A
0
Herm(C
n
) is fixed and U in U(n) is an -approximate
unitary 4-design.
The measurement ensemble is called normalized -approximate 4-generic if the observables
are traceless and normalized in spectral norm, i.e. Tr[A
0
] = 0 and kA
0
k
= 1.
Theorem 26 (λ
min
for /n
4
-approximate 4-generic measurements). Theorem 12 also
holds for -approximate 4-generic measurements with a smaller constant c > 0 whenever
c
4
n
4
, where c
4
> 0 is an absolute constant.
As a direct consequence, all reconstruction guarantees proven in this work also hold for
/n
4
-approximate 4-generic measurements with different absolute constants.
The theorem is a consequence of the two following lemmas.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 41
Lemma 27 (Q
ξ
for /n
4
-approximate 4-generic measurements). Lemma 14 also holds for
-approximate 4-generic measurements with a smaller constant c > 0 whenever
c
4
n
4
,
where c
4
> 0 is an absolute constant.
Proof. Let µ and ν be an -approximate spherical and unitary k-design, respectively. We
will use that for any operators G, B L(C
d
) and superoperator M L(C
d
)
|Tr[GM(B)]| kGk
2
kMk
kBk
2
d kGk
kMk
kBk
. (149)
Moreover, we denote M
2k
:
= M
k,k
= M
k
M
∗⊗k
. Then, for any traceless and normalized
observable A Herm(C
n
) and l = 2k,
Tr
h
G
(l)
ν
(A
l
)M
l
(ψ
(l)
µ
)
i
Tr
h
G
(l)
(A
l
)M
l
(ψ
(l)
)
i
Tr
h
G
(l)
ν
(A
l
) G
(l)
(A
l
)
M
l
ψ
(l)
µ
ψ
(l)
i
+
Tr
h
G
(l)
(A
l
)M
l
ψ
(l)
µ
ψ
(l)
i
+
Tr
h
G
(l)
ν
(A
l
) G
(l)
(A
l
)
M
l
(ψ
(l)
)
i
n
l
G
(l)
ν
(A
l
) G
(l)
(A
l
)
kM
l
k
ψ
(l)
µ
ψ
(l)
+ n
l
G
(l)
(A
l
)
kM
l
k
ψ
(l)
µ
ψ
(l)
+ n
l
G
(l)
ν
(A
l
) G
(l)
(A
l
)
kM
l
k
ψ
(l)
c(l)
n
l
kAk
l
2
kMk
l
,
(150)
where c(l) is a constant only depending on l. We note that kM k
2
kM(1)k
2
2
+ kMk
2
F
and choose a small enough absolute constant c
4
> 0 and
c
4
n
4
(151)
so that the (l = 4)-th moment (104) and the second moment (97) (with traceless A) change
only by a constant. This proves the lemma.
Lemma 28 (W
m
for -approximate 4-generic measurements). Lemma 13 still holds for
-approximate 4-generic measurements with kAk
2
replaced by (1 + ) kAk
2
.
Proof. The only thing that needs slightly to be changed in the proof of Lemma 13 is the
bound (86) on
E[H
2
1
]
. Using the definitions above with k = 1, we have
ψ
(1)
µ
ψ
(1)
µ
ψ
(1)
+
ψ
(1)
(1 + )
ψ
(1)
(152)
and, similarly,
G
(1)
ν
(X)
(1 + )
G
(1)
(X)
. (153)
Hence,
E
µ,ν
[H
2
1
]
(1 + )
2
E[H
2
1
]
. (154)
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 42
3.6 Bosonic and fermionic linear optical circuits
It is worth mentioning that the above results largely carry over to another important task
which is the tomography of bosonic and fermionic circuits. For bosonic systems, this refers
to the tomography of linear optical circuits, which play an important role in quantum
information processing, with such circuits becoming available for a large number of modes
with the advent of integrated optical circuits [39, 40]. For fermionic systems, this applies to
what is called fermionic linear optics [84]. Interestingly, the results laid out above readily
apply to both situations, once the objects of interest are appropriately identified.
For the bosonic setting, consider n modes associated with bosonic annihilation opera-
tors (b
1
, . . . , b
n
) and Hilbert space H
B
n
in second quantization. The correlation matrix of
a quantum state ρ S(H
B
n
) is defined as C Herm(C
n
) with C
j,k
= Tr(b
j
b
k
ρ). Mode
transformations that preserve the boson number are reflected by maps (b
1
, . . . , b
n
) 7→
(c
1
, . . . , c
n
), where
c
j
=
n
X
k=1
U
j,k
b
k
(155)
with U U(n). Such maps are usually referred to as passive transformations, or linear
optical transformations in the quantum optical context. Under such mode transformations,
correlation matrices transform as
C 7→ U
CU . (156)
Initial correlation matrices reflecting the quantum state can hence be taken as
C = U
|1ih1|U , (157)
reflecting the situation that mode labelled 1 is first prepared in a Gaussian state satisfying
Tr(b
1
b
1
ρ) = 1, while the others are kept in the vacuum Tr(b
j
b
j
ρ) = 0 for j = 2, . . . , n.
This is then conjugated by a Haar random unitary U U(n), giving formally rise to an
identical transformation as considered above, with |ψ
i
ihψ
i
| |ψ ihψ | are i.i.d. realizations
of |ψ ihψ | = U
|1ih1|U , with U U(n) drawn from the Haar measure. Such preparations
are optically readily feasible with present technology, as Gaussian states are particularly
accessible with common sources.
Random measurements can again be seen as i.i.d. realizations A
i
UAU
with U
U(n) being Haar random. Here A does not take the role of the observable itself, but reflect
natural homodyne measurements on the level of correlation matrices. Their expectation
values are obtained as Tr(AC) for correlation matrices C. So again, while the type of
measurement is different and the objects involved take an altered physical role, the map
realized is formally identical with
y = A(T ) + e R
m
(158)
with single expectation values
A(T )
i
= Tr[A
i
T ( |ψ
i
ihψ
i
|)] + e
i
, (159)
for T (|ψ
i
ihψ
i
|) := V
|ψ
i
ihψ
i
|V , with V U(n) reflecting the unknown linear process. In
this way, process tomography of the kind discussed here is applicable to the bosonic setting.
This seems particularly important with the advent of monolithic bosonic integrated optical
devices [40], as they are, e.g., employed in boson samplers.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 43
Fermionic linear circuits associated with fermionic annihilation operators (f
1
, . . . , f
n
)
have the same structure (on that level). Again, correlation matrices C Herm(C
n
) trans-
form as
C 7→ U
CU (160)
for U U(n), and the same preparations are feasible. Here |1ih1| reflects a fermionic
Gaussian state in which the first mode contains exactly a single fermion, while the other
n1 modes contain no fermion. The same type of measurement is therefore again possible.
4 Conclusion and outlook
We first conclude and then give an outlook towards potential future work.
4.1 Conclusion
We have proven that quantum processes can be reconstructed from an essentially opti-
mal number of expectation values without the requirement of ancillary quantum systems.
Moreover, by an extensive numerical analysis we have (i) demonstrated the practical feasi-
bility of our approach and (ii) that the reconstruction procedures also work for Pauli-type
measurement settings. The number of necessary expectation values scales as r n
2
ln(n),
where r is the anticipated Kraus rank of the channel. The reconstructions are stable against
measurement noise and robust against violations of the measured quantum channel having
the anticipated Kraus rank. In particular, no strict assumptions on the noise level or the
Kraus rank are required for a simple fitting procedure (CPT-fit) to be guaranteed to give
a good approximation of the measured quantum channel. In several physically feasible
and realistic setting, the prescriptions laid out give direct and concrete advice on how to
optimally perform quantum process tomography.
4.2 Outlook
In this outlook, we present a short outline of several aspects that seem to be interesting
starting points for future research.
Mixed input states : The first potential generalization concerns the use of pure quantum
states in the reconstruction procedure. Numerically, we have observed that our reconstruc-
tions work almost equally well when the input states to the channels are mixed. Finding
a recovery guarantee following this observation would be a step towards more practical
measurements.
The restricted isometry property (RIP) and perspectives for thresholding
methods : The RIP can be adapted to our setting. A measurement map A is said
to fulfill RIP for Kraus rank r if
(1 δ) kT k
F
kA(T )k
`
2
(1 + δ) kT k
F
(161)
for all quantum channels T CPT(C
n
) with Kraus rank at most r. The lower RIP bound
is as in our case typically enough to obtain recovery guarantees for optimization
procedures. But their computational cost is often not optimal. For instance, iterative
hard [85, 86] and soft [87] thresholding algorithms are faster in many instances. But here,
recovery guarantees are typically more difficult to prove and also required the upper RIP
bound. More recently, a non-covex algorithm with a global convergence guarantee has been
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 44
proposed for quantum state tomography [88]. Analyzing such algorithms in the process
tomography setting is an interesting endeavour for future research.
Random Clifford unitaries : The work [9] shows that random Clifford gates are very
close to being unitary 4-designs, and that they provide a precise characterization in terms
of irreducible representations of the Clifford group. Such tools might be helpful to fur-
ther relax the requirements on the measurement settings. Here, it would be interesting to
see if these new insights can be used in order to prove our recovery guarantees with the
input states of the channels being random stabilizer states and the bases of the observ-
ables random Clifford unitaries. This might be particular useful in order to achieve fault
tolerance.
Diamond norm : We have observed numerically (Figure 5) that the minimum value (the
diamond norm of the reconstructed channel) of the unconstrained diamond norm recon-
struction (17), [3] is one if the reconstruction error is small and decreases with increasing
reconstruction error. It would be interesting to also understand this observation analyti-
cally. In this way one can obtain some confidence about reconstructed quantum channels.
Of course, also other types of reconstruction certificates would be of great interest.
Frequencies and dependent measurements : In a typical experiment, one does not
only learn the expectation value of an observable A, but rather acquires statistics about the
POVM associated with its spectral decomposition. Indeed, it is straightforward to apply
our reconstruction procedures to observed frequencies of POVM measurements. However,
in this work we disregard this finer-grained information. We have made this decision to
avoid technical complications: The various POVM elements associated with any given
setting are clearly not independent. However our proof techniques work best for inde-
pendently drawn measurements and our natural measurements lead to independence in a
natural way.
There are now related theoretical recovery guarantees that can handle some form of depen-
dency e.g. Refs. [59, 8993]. Applying such techniques to process tomography remains
an interesting open problem. As a first step towards dependent measurements, it would be
interesting to numerically compare sample complexities in the cases of our natural mea-
surement setup and the corresponding frequency measurements, with a fair accounting for
statistical noise.
Sample complexity : For quantum state tomography, fundamental lower bounds on the
sample complexity have been established in Refs. [65] and [15]. They are valid for arbitrary
POVM measurements [65] and measuring Pauli observables [15], respectively. Moreover,
these works also determine the sample complexity associated with different compressed-
sensing based state tomography techniques. A comparison with the associated fundamental
lower bounds shows a close to optimal scaling (at least for low-rank states).
In contrast to state tomography, very little is known about the sample complexity
associated with process tomography. A straightforward adaptation of the results [65] and
[15] is hindered by: (i) Typical process tomography measurements such as the 4-generic
measurements considered here have neither a Pauli structure, nor can they be interpreted
as state POVMs in a strict sense [94]. This makes the task of determining the exact sample
complexity associated with a concrete tomographic procedure more difficult. (ii) Due to
the trace preservation condition, quantum channels are more restricted than quantum
states. Suitable packing nets a key ingredient in the derivation of the fundamental lower
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 45
bounds in Refs. [15, 65] must fulfill these additional requirements which makes their
construction considerably more challenging. Despite these obstacles, we believe that such
a generalization is timely and well-motivated.
5 Acknowledgements
We thank I. Roth and M. Horodecki for helpful discussions. MK has been funded by the
National Science Centre, Poland (Polonez 2015/19/P/ST2/03001) within the European
Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-
Curie grant agreement No 665778. RK and DG have received funding from the Deutsche
Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excel-
lence Strategy - Cluster of Excellence Matter and Light for Quantum Computing (ML4Q)
EXC 2004/1 - 390534769 and Grant ZUK 81, the ARO under contract W911NF-14-1-0098
(Quantum Characterization, Verification, and Validation), and Universities Australia and
DAAD’s Joint Research Co-operation Scheme (using funds provided by the German Fed-
eral Ministry of Education and Research). RK, DG, and JE have benefitted from the DFG
through SPP1798 CoSIP. Moreover, RK and JE have received funding from the Templeton
Foundation, the ERC (TAQ), the DFG (EI 519/9-1, EI 519/7-1, CRC 173, DAEDALUS),
and the MATH+ excellence cluster. This work has also received funding from the Eu-
ropean Union’s Horizon 2020 research and innovation programme under grant agreement
No. 817482 (PASQuanS).
Appendices
In this appendix, we provide some auxiliary statements in order to keep this work largely
self-contained. In Appendix A, we provide semidefinite programming formulations of the
reconstruction procedures (15), (16), (17), and (18). In Appendix B, we state a non-
commutative Khintchine inequality used in the proof of our bound on the conic minimum
singular value, more precisely in the proof of Lemma 13 with the bound on the mean
empirical width. Then, finally, we provide some facts about the symmetric group S
4
in
Appendix C. These facts are also used in the derivation of the bound on the conic minimum
singular value in order to bound the fourth moment of our measurements (Lemma 16).
A Semidefinite programs for trace and diamond norm reconstruction
Our reconstructions can be implemented as semidefinite programs (SDPs) [3], which can
practically be solved using standard software such as CVX [60, 61]. Let us consider the
reconstruction of a quantum channel mapping operators in L(X) to operators in L(Y).
The minimization (15) can be rewritten as the following SDP,
T
η
= arg min
T,X,Y
1
2
Tr[X] + Tr[Y ]
,
subject to
X J(T )
J(T )
Y
!
0 ,
X, Y Pos(Y X) ,
kA(T ) yk
F
η .
(162)
The reconstruction (16) is obtained by adding the constraint T
(1
Y
) = 1
X
into this SDP.
By only changing the objective function with the spectral norm of partial traces Tr
Y
over the output space Y of T : L(X) L(Y), we obtain the minimization (17) as the
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 46
following SDP [3, 48],
T
η
= arg min
T,X,Y
1
2
kTr
Y
[X]k
+ kTr
Y
[Y ]k
,
subject to
X J(T )
J(T )
Y
!
0 ,
X, Y Pos(Y X) ,
kA(T ) yk
F
η .
(163)
Again, the constrained minimization (18) is obtained by adding the constraint T
(1
Y
) =
1
X
into the SDP.
B A non-commutative Khintchine inequality
Theorem 29 ([95, Remark 5.27.2] with constants from [96, Exercise 8.6(d)] ). Let B
1
, . . . , B
m
be self adjoint N ×N matrices and
1
, . . . ,
m
{−1, 1} uniformly and independently drawn
signs (called a Rademacher sequence). Then
E
m
X
j=1
j
B
j
q
2 ln(2N)
n
X
j=1
B
2
j
1/2
. (164)
C Linear representation of the symmetric group S
4
In order to bound the fourth moment of the measurement map some facts about the rep-
resentation of the permutation group S
4
are helpful. In this section, they are summarized
and partially derived. Facts that we just state can, e.g., be found in the Wikis [97] and [98].
By k
j
1
1
k
j
2
2
. . . k
j
l
l
we denote the conjugacy class containing permutations composed of
each j
i
disjoint cycles of sizes {k
i
}
i[l]
, e.g., 2
2
S
4
are products of disjoint transpositions.
Corresponding to each conjugacy class there is one irrep. They are given by the Young
Frames (see e.g. [98])
F
1
:
= (trivial rep.)
F
2
:
= (sign rep.).
F
3
:
= (degree two irreducible rep.)
F
4
:
= (standard rep.)
F
5
:
=
(product of standard rep.
and sign rep.)
(165)
We will denote the character corresponding to F
i
by χ
i
: S
4
Z. The characters are
constant on the conjugacy classes. The sizes of the conjugacy classes and the characters
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 47
of S
4
are (see e.g. [97])
Cycle type 1
4
2
2
2
1
4
1
3
1
# elements 1 3 6 6 8
χ
1
1 1 1 1 1
χ
2
1 1 1 1 1
χ
3
2 2 0 0 1
χ
4
3 1 1 1 0
χ
5
3 1 1 1 0
(166)
The dimension of the representation F
i
(in the group algebra) will be denoted by d
i
(also
called degree [98]). These dimensions are given by d
i
= χ
i
([1]).
Expanding a character χ in the group algebra yields
χ =
1
|S
4
|
X
σS
4
χ(σ)hσ, ·i
=
1
|S
4
|
X
C
χ(C)hΣC, ·i,
(167)
where h·, ·i is the inner product of the group algebra,
P
C
denotes the sum over all
conjugacy classes C in S
4
, and ΣC
:
=
P
σC
σ the sum over the conjugacy class C. The
central minimal projections
p
i
=
d
i
|S
4
|
χ
i
(168)
(see, e.g., Ref. [99, Theorem III.7.2]) are hence given by
p
1
=
1
24
(id + Σ[2, 2] + Σ[2] + Σ[4] + Σ[3]) ,
p
2
=
1
24
(id + Σ[2, 2] Σ[2] Σ[4] + Σ[3]) ,
p
3
=
1
12
(2 id + 2 Σ[2, 2] Σ[3]) ,
p
4
=
1
8
(3 id Σ[2, 2] + Σ[2] Σ[4]) ,
p
5
=
1
8
(3 id Σ[2, 2] Σ[2] + Σ[4]) ,
(169)
where we have made the identification hσ, ·i
=
σ.
Now we consider the linear representation R : S
4
(C
n
)
4
, which is given by permut-
ing the for tensor factors, i.e., the representation of σ S
4
is a unitary R(σ) on (C
n
)
4
given by
R(σ) =
n
X
i
1
,i
2
,i
3
,i
4
=1
i
σ(1)
, i
σ(2)
, i
σ(3)
, i
σ(4)
E
hi
1
, i
2
, i
3
, i
4
| . (170)
Often we write σ instead of R(σ). This representation naturally extends to a representation
of the group algebra.
As
P
5
i=1
p
i
is a decomposition of the identity on the group algebra, we obtain the
decomposition
1
(C
n
)
4
=
5
X
i=1
P
i
(171)
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 48
with P
i
= R(p
i
). The dimension of the representation F
i
is Tr[R(p
i
)] and is given by the
dimension d
i
(n) of the Schur functor corresponding to F
i
when applied to a vector space
of dimension n [98] times the degree χ
i
(id) of the irreducible representation, where
d
1
(n) = n (n + 1)(n + 2)(n + 3)/24 ,
d
2
(n) = (n 3)(n 2)(n 1) n/24 ,
d
3
(n) = (n 1) n
2
(n + 1)/12 ,
d
4
(n) = (n 1) n (n + 1)(n + 2)/8 ,
d
5
(n) = (n 2)(n 1) n (n + 1)/8 .
(172)
D Proof of the tensor network bound Proposition 18
We start with some preliminaries that are helpful for the proof. A contraction C of a tensors
with K indices is defined by pairs of pointers {(k
l
, k
0
l
)}
l[L]
. For any L [L] [K], we
define the partial contraction C
L
of a tensor t to be the one given by the subset of pointer
pairs {(k
l
, k
0
l
)}
l∈L
. For L
1
[K] and L
2
[K] with L
1
L
2
= we can naturally apply
C
L
2
to C
L
1
(t) and it holds that C
L
2
(C
L
1
(t)) = C
L
1
(C
L
2
(t)) and C(t) = C
[K]\L
(C
L
(t)).
We will use the following facts about matrices A C
n
1
×n
2
and B C
n
2
×n
3
with
dimensions n
1
, n
2
, n
3
1,
kABk
kAk
kBk
, (173)
kAk
kAk
2
= kAk
2
. (174)
For n
1
= n
3
= 1 one has kAk
= kAk
2
and similarly for B. We will also use the identity
kA Bk
= kAk
kBk
, (175)
which holds for arbitrary matrices A C
n
1
×n
2
and B C
n
3
×n
4
.
Any bipartition of the indices of a tensor t yields a class of unitarily equivalent matrices.
More specifically, a matricization of a tensor t C
n
1
×n
2
×···×n
K
is a matrix A of which the
matrix elements are given by
A
(i
σ(1)
,i
σ(2)
,...,i
σ(K
0
)
),(i
σ(K
0
+1)
,...,i
σ(K)
)
= t
i
1
,i
2
,...,i
K
(176)
for some permutation σ S
K
. Two such matricizations given by τ, σ S
K
and the
same K
0
[K] are unitarily equivalent if {σ(i)}
i[K
0
]
= {τ(i)}
i[K
0
]
, i.e., if σ and τ yield
the same bipartition of the pointers [K]. For a tensor t C
n
1
×n
2
×···×n
K
and a disjoint
bipartition L
˙
R = [K] of the pointer set [K] we denote by t
L,R
some matricization of t
that comes from this bipartition. For any such matricization t
L,R
holds that
kt
L,R
k
2
= ktk
F
. (177)
A vectorization of a tensor is a matricization that yields a vector, i.e., a matrix with one
column or row.
Proof of Proposition 18. It is enough to prove the proposition for the case where the tensor
network is closed, i.e., where C(T ) C. This is so because we can write kC(T )k
2
F
always
as a closed tensor network, where all previous tensors plus their complex conjugates occur.
A tensor and its conjugate have the same norm, which shows the reduction argument.
We denote the tensors of the tensor network by t
j
C
n
j
1
×n
j
2
×···×n
j
K
j
where j [J] and
the pointer pairs defining the contraction C by ({k
l
, k
0
l
})
l[K]
with K =
P
j[J]
K
j
.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 49
Let us consider first the case J = 2. As there are no self-contractions, we can relabel
the pointer pairs ({k
l
, k
0
l
})
l[K]
so that the k
l
belong to t
1
and the k
0
l
to t
2
. Hence, C(T ) is
an inner product C(T ) =
t
1
t
2
of matricizations of t
1
and t
2
into vectors
t
1
and
t
2
.
Therefore, the Cauchy-Schwarz inequality and the invariance of the Frobenius norm (177)
prove the proposition for J = 2.
Now we consider J 3. If there are contractions {k
l
, k
0
l
} where one of the indices
belongs to t
1
and one to t
J
, for each such l we introduce an auxiliary tensor s
l
as identity
matrix with components s
l
m
k
l
,m
k
0
l
:
= δ
m
k
l
,m
k
0
l
that is contracted with t
1
and t
2
and replaces
their contraction {k
l
, k
0
l
}. This corresponds to tensoring t
2
t
3
···t
J1
with an n
k
l
×n
k
0
l
identity matrix. Denote by s
l
1
, s
l
2
, . . . , s
l
M
the tensors that are introduced in this way.
We obtain
˜
C from C by, the this modification, so that
˜
C contracts the modified tensor
network
˜
T = (t
1
,
˜
T
0
, t
J
) with
˜
T
0
= S T
0
S
:
=
s
l
1
, s
l
2
, . . . , s
l
M
T
0
:
=
t
2
, t
3
, . . . , t
J1
(178)
where
˜
C is obtained from C by adding the contractions {k
l
, k
0
l
}
l=L+1,...,K+M
with s
l
1
, s
l
2
, . . . , s
l
M
to the previous ones. With this construction we have C(T ) =
˜
C(
˜
T ).
Next, we rename the pointers such that the k
l
of the first K
1
pairs ({k
l
, k
0
l
})
l[K
1
]
belong to t
1
. As there are no contractions between t
1
and t
J
, we rename the pointers so
that the k
0
l
of the last K
J
pairs ({k
l
, k
0
l
})
l[K]\[KK
J
]
belong to t
J
. The remaining pointers
pairs are
˜
K
0
:
= K
0
M,
K
0
:
= {K
1
+ 1, K
1
+ 2, . . . , K K
J
},
M
:
= {K + 1, K + 2, L + M},
(179)
which contain no further contractions between t
1
and t
J
. Now we have achieved that the
pointer pairs connected to t
1
are K
1
:
= [K
1
] and the ones of t
J
are K
J
:
= [K] \[K K
J
],
so that they are clearly disjoint. Hence, we can write
C(T ) =
˜
C
K
1
t
1
˜
C
K
J
˜
C
˜
K
0
(
˜
T
0
) t
J
. (180)
In fact
˜
C
K
J
˜
C
˜
K
0
(
˜
T
0
) t
J
7→
˜
C(T ) is the action of the functional given by
˜
C
K
1
t
1
·
,
which, in turn, is given by a vectorization
t
1
of t
1
. Hence, we can obtain
|C(T )|
t
1
F
˜
C
K
J
˜
C
˜
K
0
(
˜
T
0
) t
J
F
. (181)
Similarly, t
J
7→
˜
C
K
J
˜
C
˜
K
0
(
˜
T
0
) t
J
is a linear mapping, where we view
˜
C
˜
K
0
(
˜
T
0
) as a
linear map contracting the indices K
J
and having non-contracted indices K
1
. This yields
a vectorization
t
J
E
of t
J
an a map is represented by a matricization
˜
C
˜
K
0
(
˜
T
0
)
K
1
,K
J
of
˜
C
˜
K
0
(
˜
T
0
) so that
˜
C
K
J
˜
C
˜
K
0
(
˜
T
0
) t
J
F
=
˜
C
˜
K
0
(
˜
T
0
)
K
1
,K
J
t
J
E
˜
C
˜
K
0
(
˜
T
0
)
K
1
,K
J
t
J
F
(182)
Using Eq. (178) we arrive at
˜
C
˜
K
0
(
˜
T
0
)
K
1
,K
J
=
s
l
1
s
l
2
··· s
l
M
C
K
0
(T
0
)
K
1
,K
J
. (183)
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 50
By construction, each auxiliary tensor s
l
is an identity matrix with one index in K
1
and
one in K
J
and the corresponding matricization has unit spectral norm. Using Eq. (175),
we obtain
˜
C
˜
K
0
(
˜
T
0
)
K
1
,K
J
=
C
K
0
(T
0
)
K
0
1
,K
0
J
, (184)
where K
0
1
K
1
and K
0
J
K are those pointer pairs of K
1
and K
2
have no pointer to any
s
l
(there are no contractions between S and T
0
).
Iterating Lemma 30 and using the bound (174) and Eq. (177) we obtain that
C
K
0
(T
0
)
K
0
1
,K
0
J
t
2
F
t
3
F
. . .
t
J1
F
. (185)
This completes the proof.
We use the notation for matricizations introduced right before Proposition 18 to state
the following.
Lemma 30. Let t
1
C
n
1
×n
2
×···×n
K
1
and t
2
C
n
K
1
+1
×n
K
2
+2
×···×n
K
1
+K
2
be tensors with
index pointers K
1
:
= [K
1
] and K
2
:
= [K
1
+ K
2
] \ [K
1
], respectively. Further, let C
M
be a
partial contraction over indices M = {(k
l
, k
0
l
)}
l[L]
with M
1
:
= {k
l
}
l[L]
K
1
pointing to
indices of t
1
and M
2
:
= {k
0
l
}
l[L]
K
2
pointing to indices of t
2
. Let C
M
(T )
L,R
be a matri-
cization of C
M
(s, t) with row indices L and column indices R where L
˙
M
1
˙
M
2
˙
R =
K
1
˙
K
2
. Then
kC
M
(s, t)
L,R
k
t
1
L
1
,R
1
M
1
t
2
M
2
L
2
,R
2
, (186)
where L
1
= K
1
L, R
1
= K
1
R, L
2
= K
2
L, and R
2
= K
2
R are the row/column
indices of t
1
and t
2
, respectively.
Proof. We write the matricized partially contracted tensor network as a matrix product,
C
M
(s, t)
L,R
=
t
1
L
1
,R
1
M
1
id
L
2
id
R
1
t
2
M
2
L
2
,R
2
, (187)
where id
L
2
denotes the identity matrix with row indices given by L
2
and matching column
indices and similarly for id
R
1
; e.g., id
{3,5}
has the matrix components (id
{3,5}
)
(i
3
,i
5
),(i
0
3
,i
0
5
)
=
δ
i
3
,i
0
3
δ
i
5
,i
0
5
for i
j
, i
0
j
[n
j
]. Using Eqs. (173) and (175) finishes the proof.
References
[1] I. L. Chuang and M. A. Nielsen, Prescription for experimental determination of the
dynamics of a quantum black box, J. Mod. Opt. 44, 2455 (1997), quant-ph/9610001.
[2] M. Mohseni, A. T. Rezakhani, and D. A. Lidar, Quantum-process tomography:
Resource analysis of different strategies, Phys. Rev. A 77, 032322 (2008), quant-
ph/0702131.
[3] M. Kliesch, R. Kueng, J. Eisert, and D. Gross, Improving compressed sensing with
the diamond norm, IEEE Trans. Inf. Th. 62, 7445 (2016), arXiv:1511.01513.
[4] F. G. S. L. Brandao, A. W. Harrow, and M. Horodecki, Local random quantum
circuits are approximate polynomial-designs, Commun. Math. Phys. 346, 397 (2016),
arXiv:1208.0692.
[5] F. G. S. L. Brandão, A. W. Harrow, and M. Horodecki, Efficient quantum pseudo-
randomness, Phys. Rev. Lett. 116, 170502 (2016), arXiv:1605.00713.
[6] Y. Nakata, C. Hirche, M. Koashi, and A. Winter, Efficient quantum pseudorandom-
ness with nearly time-independent hamiltonian dynamics, Phys. Rev. X 7, 021006
(2017), arXiv:1609.07021.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 51
[7] A. Harrow and S. Mehraban, Approximate unitary t-designs by short random quantum
circuits using nearest-neighbor and long-range gates, arXiv:1809.06957.
[8] E. Onorati, O. Buerschaper, M. Kliesch, W. Brown, A. H. Werner, and J. Eisert,
Mixing properties of stochastic quantum Hamiltonians, Comm. Math. Phys. 355, 905
(2017), arXiv:1606.01914.
[9] H. Zhu, R. Kueng, M. Grassl, and D. Gross, The Clifford group fails gracefully to be
a unitary 4-design, arXiv:1609.08172.
[10] J. Helsen, J. J. Wallman, and S. Wehner, Representations of the multi-qubit clifford
group, J. Math. Phys. 59 (2018), 10.1063/1.4997688, arXiv:1609.08188.
[11] D. G. Cory, M. D. Price, W. Maas, E. Knill, R. Laflamme, W. H. Zurek, T. F. Havel,
and S. S. Somaroo, Experimental quantum error correction, Phys. Rev. Lett. 81, 2152
(1998), quant-ph/9802018.
[12] B. P. Lanyon, M. Barbieri, M. P. Almeida, T. Jennewein, T. C. Ralph, K. J. Resch,
G. J. Pryde, J. L. O’Brien, A. Gilchrist, and A. G. White, Simplifying quantum logic
using higher-dimensional Hilbert spaces, Nature Phys. 5, 134 (2009).
[13] T. Monz, K. Kim, W. Hänsel, M. Riebe, A. S. Villar, P. Schindler, M. Chwalla,
M. Hennrich, and R. Blatt, Realization of the quantum Toffoli gate with trapped ions,
Phys. Rev. Lett. 102, 040501 (2009), arXiv:0804.0082.
[14] A. Fedorov, L. Steffen, M. Baur, M. P. da Silva, and A. Wallraff, Implementation of
a Toffoli gate with superconducting circuits, Nature 481, 170 (2012), arXiv:1108.3966.
[15] S. T. Flammia, D. Gross, Y.-K. Liu, and J. Eisert, Quantum tomography via com-
pressed sensing: error bounds, sample complexity and efficient estimators, New J.
Phys. 14, 095022 (2012), arXiv:1205.2300.
[16] A. Shabani, R. L. Kosut, M. Mohseni, H. Rabitz, M. A. Broome, M. P. Almeida,
A. Fedrizzi, and A. G. White, Efficient measurement of quantum dynamics via com-
pressive sensing, Phys. Rev. Lett. 106, 100401 (2011), arXiv:0910.5498.
[17] C. H. Baldwin, A. Kalev, and I. H. Deutsch, Quantum process tomography of unitary
and near-unitary maps, Phys. Rev. A 90, 012110 (2014), arXiv:1404.2877.
[18] S. Kimmel and Y. K. Liu, Phase retrieval using unitary 2-designs, in International
Conference on Sampling Theory and Applications (SampTA) (2017) pp. 345–349,
arXiv:1510.08887.
[19] R. Kueng, H. Rauhut, and U. Terstiege, Low rank matrix recovery from rank one
measurements, Appl. Comp. Harm. Anal. 42, 88 (2017), arXiv:1410.6913.
[20] A. V. Rodionov, A. Veitia, R. Barends, J. Kelly, D. Sank, J. Wenner, J. M. Martinis,
R. L. Kosut, and A. N. Korotkov, Compressed sensing quantum process tomography
for superconducting quantum gates, Phys. Rev. B 90, 144504 (2014), arXiv:1407.0761.
[21] M. Kabanava, R. Kueng, H. Rauhut, and U. Terstiege, Stable low-rank matrix recovery
via null space properties, Information and Inference: A Journal of the IMA 5, 405
(2016), arXiv:1507.07184.
[22] R. Kueng and P. Jung, Robust nonnegative sparse recovery and the nullspace property
of 0/1 measurements, IEEE Trans. Inf. Theory 64, 689 (2018), arXiv:1603.07997.
[23] D. Gross, Y.-K. Liu, S. T. Flammia, S. Becker, and J. Eisert, Quantum state tomog-
raphy via compressed sensing, Phys. Rev. Lett. 105, 150401 (2010), arXiv:0909.3304.
[24] D. Gross, Recovering low-rank matrices from few coefficients in any basis, IEEE Trans.
Inf. Th. 57, 1548 (2011), arXiv:0910.1879.
[25] Y.-K. Liu, in Adv. Neural Inf. Process. Syst. 24, edited by J. Shawe-Taylor, R. S.
Zemel, P. L. Bartlett, F. Pereira, and K. Q. Weinberger (Curran Associates, Inc.,
2011) pp. 1638–1646, arXiv:1103.2816.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 52
[26] J. B. Altepeter, D. Branning, E. Jeffrey, T. C. Wei, P. G. Kwiat, R. T. Thew, J. L.
O’Brien, M. A. Nielsen, and A. G. White, Ancilla-assisted quantum process tomogra-
phy, Phys. Rev. Lett. 90, 193601 (2003), quant-ph/0303038.
[27] S. Kimmel, M. P. da Silva, C. A. Ryan, B. R. Johnson, and T. Ohki, Robust extraction
of tomographic information via randomized benchmarking, Phys. Rev. X 4, 011050
(2014), arXiv:1306.2348.
[28] I. Roth, R. Kueng, S. Kimmel, Y.-K. Liu, D. Gross, J. Eisert, and M. Kliesch, Re-
covering quantum gates from few average gate fidelities, Phys. Rev. Lett. 121, 170502
(2018), arXiv:1803.00572.
[29] E. J. Candes, J. Romberg, and T. Tao, Robust uncertainty principles: exact signal
reconstruction from highly incomplete frequency information, IEEE Trans. Inform.
Theor. 52, 489 (2006).
[30] D. L. Donoho, Compressed sensing, IEEE Trans. Inf. Th. 52, 1289 (2006).
[31] T. Yamamoto, M. Neeley, E. Lucero, R. C. Bialczak, J. Kelly, M. Lenander,
M. Mariantoni, A. D. O’Connell, D. Sank, H. Wang, M. Weides, J. Wenner, Y. Yin,
A. N. Cleland, and J. M. Martinis, Quantum process tomography of two-qubit
controlled-Z and controlled-NOT gates using superconducting phase qubits, Phys. Rev.
B 82, 184515 (2010), arXiv:1006.5084.
[32] D. Kim, Z. Shi, C. B. Simmons, D. R. Ward, J. R. Prance, T. S. Koh, J. K. Gamble,
D. E. Savage, M. G. Lagally, M. Friesen, S. N. Coppersmith, and M. A. Eriksson,
Quantum control and process tomography of a semiconductor quantum dot hybrid qubit,
Nature 511, 70 (2014), arXiv:1401.4416.
[33] B. E. Anderson, H. Sosa-Martinez, C. A. Riofrio, I. H. Deutsch, and P. S. Jessen,
Accurate and robust unitary transformation of a high-dimensional quantum system,
Phys. Rev. Lett. 114, 240401 (2015), arXiv:1410.3891.
[34] S. T. Merkel, C. A. Riofrío, S. T. Flammia, and I. H. Deutsch, Random unitary maps
for quantum state reconstruction, Phys. Rev. A 81, 032126 (2010), arXiv:0912.2101.
[35] R. Blatt and C. F. Roos, Quantum simulations with trapped ions, Nat. Phys. 8, 277
(2012).
[36] N. Timoney, V. Elman, S. Glaser, C. Weiss, M. Johanning, W. Neuhauser, and
C. Wunderlich, Error-resistant single-qubit gates with trapped ions, Phys. Rev. A 77,
052334 (2008), quant-ph/0612106.
[37] M. Ohliger, V. Nesme, and J. Eisert, Efficient and feasible state tomography of quan-
tum many-body systems, New J. Phys. 15, 015024 (2013), arXiv:1204.5735.
[38] I. Bloch, J. Dalibard, and S. Nascimbene, Quantum simulations with ultracold quan-
tum gases, Nat. Phys. 8, 267 (2012).
[39] P. Kok, W. J. Munro, K. Nemoto, T. C. Ralph, J. P. Dowling, and G. J. Milburn,
Linear optical quantum computing with photonic qubits, Rev. Mod. Phys. 79, 135
(2007), arXiv:quant-ph/0512071.
[40] J. Carolan, C. Harrold, C. Sparrow, E. Martín-López, N. J. Russell, J. W. Silverstone,
P. J. Shadbolt, N. Matsuda, M. Oguma, M. Itoh, G. D. Marshall, M. G. Thompson,
J. C. F. Matthews, T. Hashimoto, J. L. O’Brien, and A. Laing, Universal linear
optics, Science 349, 711 (2015), arXiv:1505.01182.
[41] N. J. Russell, L. Chakhmakhchyan, J. L. O’Brien, and A. Laing, Direct dialling of
Haar random unitary matrices, New J. Phys. 19, 033007 (2017), arXiv:1506.06220.
[42] F. G. S. L. Brandão, A. W. Harrow, and M. Horodecki, Local random quantum
circuits are approximate polynomial-designs, Comm. Math. Phys. 346, 397 (2016),
arXiv:1208.0692.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 53
[43] M.-D. Choi, Completely positive linear maps on complex matrices, Lin. Alg. App. 10,
285 (1975).
[44] A. Jamiolkowski, Linear transformations which preserve trace and positive semidefi-
niteness of operators, Rep. Math. Phys. 3, 275 (1972).
[45] A. Gilchrist, N. K. Langford, and M. A. Nielsen, Distance measures to compare real
and ideal quantum processes, Phys. Rev. A 71, 062310 (2005).
[46] J. Watrous, Semidefinite programs for completely bounded norms, Theory of Comput-
ing 5, 217 (2009), arXiv:0901.4709.
[47] A. Ben-Aroya and A. Ta-Shma, On the complexity of approximating the diamond
norm, arXiv:0902.3397.
[48] J. Watrous, Simpler semidefinite programs for completely bounded norms,
arXiv:1207.5726.
[49] P. Delsarte, J. Goethals, and J. Seidel, Spherical codes and designs, Geom. Dedicata
6, 363 (1977).
[50] J. M. Renes, R. Blume-Kohout, A. J. Scott, and C. M. Caves, Symmetric infor-
mationally complete quantum measurements, J. Math. Phys. 45, 2171 (2004), quant-
ph/0310075.
[51] A. Ambainis and J. Emerson, Quantum t-designs: t-wise independence in the quantum
world, in Computational Complexity, 2007. CCC ’07. Twenty-Second Annual IEEE
Conference on (2007) pp. 129–140, quant-ph/0701126.
[52] D. Gross, K. M. R. Audenaert, and J. Eisert, Evenly distributed unitaries: on the
structure of unitary designs, J. Math. Phys. 48, 052104 (2007), quant-ph/0611002.
[53] C. Dankert, R. Cleve, J. Emerson, and E. Livine, Exact and approximate unitary
2-designs and their application to fidelity estimation, Phys. Rev. A 80, 012304 (2009),
quant-ph/0606161.
[54] E. Candès and B. Recht, Exact matrix completion via convex optimization, Found.
Comput. Math. 9, 717 (2009), arXiv:0805.4471.
[55] B. Recht, M. Fazel, and P. A. Parrilo, Guaranteed minimum-rank solutions of lin-
ear matrix equations via nuclear norm minimization, SIAM Rev. 52, 471 (2010),
arXiv:0706.4138.
[56] E. J. Candès and Y. Plan, Tight oracle bounds for low-rank matrix recovery from
a minimal number of random measurements, IEEE Trans. Inform. Theory 57, 2342
(2011), arXiv:1001.0339.
[57] A. Y. Kitaev, A. Shen, and M. N. Vyalyi, Classical and quantum computation, Vol. 47
(American Mathematical Society, 2002).
[58] J. Watrous, CS 766 Theory of quantum information, Available online at https://cs.
uwaterloo.ca/~watrous/LectureNotes.html (2011).
[59] R. Kueng, Low rank matrix recovery from few orthonormal basis measurements, in
Sampling Theory and Applications (SampTA), 2015 International Conference on
(2015) pp. 402–406.
[60] M. Grant and S. Boyd, CVX: Matlab software for disciplined convex programming,
version 2.1, http://cvxr.com/cvx (2014).
[61] M. Grant and S. Boyd, in Recent advances in learning and control, Lecture Notes in
Control and Information Sciences, edited by V. Blondel, S. Boyd, and H. Kimura
(Springer-Verlag Limited, 2008) pp. 95–110, http://stanford.edu/~boyd/graph_
dcp.html.
[62] J. A. Tropp, Convex recovery of a structured signal from independent random linear
measurements, in Sampling Theory, a Renaissance, edited by E. G. Pfander (Springer,
2015) pp. 67–101, arXiv:1405.1102.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 54
[63] U. Michel, M. Kliesch, R. Kueng, and D. Gross, Note on the saturation of the norm
inequalities between diamond and nuclear norm, IEEE Trans. Inf. Theory 64, 7443
(2018), arXiv:1612.07931.
[64] A. Steffens, C. A. Riofrío, W. McCutcheon, I. Roth, B. A. Bell, A. McMillan, M. S.
Tame, J. G. Rarity, and J. Eisert, Experimentally exploring compressed sensing quan-
tum tomography, Quantum Sci. Technol. 2, 025005 (2017), arXiv:1611.01189.
[65] J. Haah, A. W. Harrow, Z. Ji, X. Wu, and N. Yu, Sample-optimal tomography of
quantum states, IEEE Trans Inf. Th. 63, 5628 (2017), arXiv:1508.01797.
[66] Y. Shi, Both Toffoli and controlled-NOT need little help to do universal quantum com-
putation, quant-ph/0205115.
[67] A. Barenco, C. H. Bennett, R. Cleve, D. P. DiVincenzo, N. Margolus, P. Shor,
T. Sleator, J. A. Smolin, and H. Weinfurter, Elementary gates for quantum com-
putation, Phys. Rev. A 52, 3457 (1995), quant-ph/9503016.
[68] M. A. Nielsen and I. L. Chuang, Quantum computation and quantum information
(Cambridge university press, 2010).
[69] A. Y. Kitaev, Quantum computations: algorithms and error correction, Russian Math.
Surv. 52, 1191 (1997).
[70] E. Knill, R. Laflamme, and W. H. Zurek, Resilient quantum computation: error
models and thresholds, Proc. R. Soc. A 454, 365 (1998), arXiv:quant-ph/9702058.
[71] D. Aharonov and M. Ben-Or, Fault-tolerant quantum computation with constant error
rate, in 29th ACM Symp. on Theory of Computing (STOC) (New York, 1997) pp.
176–188.
[72] S. T. Flammia and Y.-K. Liu, Direct fidelity estimation from few Pauli measurements,
Phys. Rev. Lett. 106, 230501 (2011), arXiv:1104.4695.
[73] J. Emerson, R. Alicki, and K. Życzkowski, Scalable noise estimation with random
unitary operators, J. Opt. B 7, S347 (2005), arXiv:quant-ph/0503243.
[74] J. J. Wallman and S. T. Flammia, Randomized benchmarking with confidence, New J.
Phys. 16, 103032 (2014), arXiv:1404.6025.
[75] R. Kueng, D. M. Long, A. C. Doherty, and S. T. Flammia, Comparing experiments to
the fault-tolerance threshold, Phys. Rev. Lett. 117, 170502 (2016), arXiv:1510.05653.
[76] S. Mendelson, Learning without concentration, J. ACM 62, 21:1 (2015),
arXiv:1401.0304.
[77] V. Koltchinskii and S. Mendelson, Bounding the smallest singular value of a random
matrix without concentration, International Mathematics Research Notices , rnv096
(2015), arXiv:1312.3580.
[78] R. Goodman and N. R. Wallach, Representations and invariants of the classical groups,
Vol. 68 (Cambridge University Press, 2000).
[79] J. A. Tropp, User-friendly tools for random matrices: An introduction, Tech. Rep.
(DTIC Document, 2012).
[80] U. Schollwöck, The density-matrix renormalization group in the age of matrix product
states, Ann. Phys. 326, 96 (2011), January 2011 Special Issue, arXiv:1008.3477.
[81] S. Gharibian, Z. Landau, S. W. Shin, and G. Wang, Tensor network non-zero testing,
Quant. Inf. Comp. 15, 885 (2015), arXiv:1406.5279.
[82] V. Chandrasekaran, B. Recht, P. Parrilo, and A. Willsky, The convex geometry of
linear inverse problems, Found. Comput. Math. 12, 805 (2012), arXiv:1012.0621.
[83] R. A. Low, Pseudo-randomness and learning in quantum computation, Ph.D. thesis,
University of Bristol (2010), arXiv:1006.5227.
[84] E. Knill, Fermionic linear optics and matchgates, quant-ph/0108033.
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 55
[85] J.-F. Cai, E. J. Candes, and Z. Shen, A singular value thresholding algorithm for
matrix completion, SIAM J. Opt. 20, 1956 (2010).
[86] S. Chatterjee, Matrix estimation by universal singular value thresholding, Ann. Statist.
43, 177 (2015), arXiv:1212.1247.
[87] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, Distributed optimization and
statistical learning via the alternating direction method of multipliers, Found. Trends
Mach. Learn. 3, 1 (2011).
[88] A. Kyrillidis, A. Kalev, D. Park, S. Bhojanapalli, C. Caramanis, and S. Sanghavi,
Provable compressed sensing quantum state tomography via non-convex methods, npj
Quantum Information 4, 36 (2018), arXiv:1711.02524.
[89] V. Voroninski, Quantum tomography from few full-rank observables, arXiv:1309.7669.
[90] A. Acharya, T. Kypraios, and M. Guta, Statistically efficient tomography of
low rank states with incomplete measurements, New J. Phys. 18, 043018 (2016),
arXiv:1510.03229.
[91] A. Acharya and M. Guta, Statistical analysis of compressive low rank tomography with
random measurements, J. Phys. A 50, 195301 (2017), arXiv:1609.03758.
[92] E. J. Candès, X. Li, and M. Soltanolkotabi, Phase retrieval from coded diffraction
patterns, Appl. Comput. Harmon. Anal. 39, 277 (2015), arXiv:1310.3240.
[93] D. Gross, F. Krahmer, and R. Kueng, Improved recovery guarantees for phase re-
trieval from coded diffraction patterns, Appl. Comput. Harmon. Anal. 41, 37 (2016),
arXiv:1402.6286.
[94] M. Ziman, Process positive-operator-valued measure: A mathematical framework for
the description of process tomography experiments, Phys. Rev. A 77, 062112 (2008),
arXiv:0802.3862.
[95] R. Vershynin, Introduction to the non-asymptotic analysis of random matrices, in
Compressed Sensing: Theory and Applications (Cambridge University Press, 2012)
pp. 210–268, arXiv:1011.3027.
[96] S. Foucart and H. Rauhut, A mathematical introduction to compressive sensing
(Springer, 2013).
[97] Symmetric group:s4, http://groupprops.subwiki.org/wiki/Symmetric_group:S4
(2016), accessed: 2016-08-17.
[98] Linear representation theory of symmetric group:S4, http://groupprops.subwiki.
org/wiki/Linear_representation_theory_of_symmetric_group:S4 (2016), ac-
cessed: 2016-08-17.
[99] B. Simon, Representations of finite and compact groups, 10 (Am. Math. Soc., 1996).
Accepted in Quantum 2019-07-16, click title to verify. Published under CC-BY 4.0. 56