On maximum-likelihood decoding with circuit-level errors
Leonid P. Pryadko
Department of Physics & Astronomy, University of California, Riverside, California 92521, USA
(Dated: July 15, 2020)
Error probability distribution associated with a given Clifford measurement circuit is described
exactly in terms of the circuit error-equivalence group, or the circuit subsystem code previously
introduced by Bacon, Flammia, Harrow, and Shi. This gives a prescription for maximum-likelihood
decoding with a given measurement circuit. Marginal distributions for subsets of circuit errors are
also analyzed; these generate a family of related asymmetric LDPC codes of varying degeneracy.
More generally, such a family is associated with any quantum code. Implications for decoding
highly-degenerate quantum codes are discussed.
I. INTRODUCTION
Quantum computation offers exponential algorithmic
speed-up for some classically hard problems. This
promise is conditional in a fundamental way upon the
use of quantum error correction (QEC). However, despite
an enormous progress achieved both in theory and ex-
periment during the quarter century since the invention
of QEC[1], universal repetitive QEC protecting against
both phase and amplitude errors has not yet been demon-
strated in any of the qubit systems constructed to date.
This illustrates the enormous difficulty of building quan-
tum computers (QCs) with sufficiently small error rates.
Given also the engineering difficulties with scaling up
the number of qubits[2], it is important that on the algo-
rithmic side one tries to achieve every optimization possi-
ble. Among other things, we would like to maximize the
probability of successful syndrome-based decoding with
QEC. Given the available hardware, this requires choos-
ing the optimal code, the optimal measurement circuit,
and the optimal decoder. In particular, a decoder should
be designed for the specific syndrome measurement cir-
cuit, as it must be aware of the associated correlations
between the errors[3–5]—at sufficiently high error proba-
bilities such correlations are present even in fault-tolerant
(FT) circuits designed to prevent a single fault from af-
fecting multiple data qubits.
In contrast, the standard approach is to use a de-
coder optimized for the underlying code, regardless of
the actual circuit used for syndrome measurement. In
some cases, e.g., in the case of surface codes with
minimum-weight perfect matching (MWPM) decoder,
some leading-order correlations can be included in the
edge weights of the graph used for decoding[4, 6–8]. How-
ever, such a scheme is limited to codes and measure-
ment circuits where MWPM can be done efficiently, e.g.,
surface codes with single-ancilla measurement circuits[9],
and certain other classes or topological codes[10, 11].
Further, there is necessarily a decoding accuracy loss
when measurement circuit is not simply repeated, e.g.,
with code deformation or lattice surgery[12].
A feasible approach to building a decoder optimized for
the specific measurement circuit is to train a neural net-
work (NN) using extensive simulation data[13–20]. How-
ever, as is commonly the case with NNs, there is always
a question whether training has been sufficient to achieve
optimal decoding performance.
A parameterization of the probability distribution of
correlated quantum errors in terms of a spin model has
been recently considered by Chubb and Flammia[5]. In
particular, they describe how such a formalism can be
used for maximum-likelihood (ML) decoding in the pres-
ence of circuit-level noise. However, Chubb and Flam-
mia focused on larger circuits composed of measurement
blocks. Errors are assumed uncorrelated between the
blocks, while a model of error correlations at the output
of each block has to be constructed offline, e.g., using
error propagation for a Clifford measurement circuit. In
particular, Chubb and Flammia stopped short of analyz-
ing circuit-level errors, and only considered numerically
a toy model of correlated errors.
The goal of this work is to give an explicit numeri-
cally efficient algorithm for analyzing error correlations
resulting from a given qubit-based Clifford measurement
circuit, and for constructing decoders optimized for such
a circuit. Main result is that such correlations can be
accounted for by using phenomenological error model
(no error correlations) with the circuit-associated subsys-
tem code constructed by Bacon, Flammia, Harrow, and
Shi[21, 22]. Thus, any generic decoder capable of deal-
ing with uncorrelated data and syndrome measurement
errors in highly-degenerate sparse-generator subsystem
codes can be rendered to account for circuit-level error
correlations. Error correlations needed for implementing
the scheme of Chubb and Flammia are recovered by cal-
culating the marginals of the constructed distribution,
which can be done in practice using Ising model star-
polygon transformations[23]. The construction naturally
includes additional correlations between errors in differ-
ent fault locations on the circuit.
An immediate application is for designing decoders
approaching true ML decoding for Clifford circuits, to
be used in quantum memory. In particular, more ac-
curate decoding and optimization, with the ability to
account for error rates’ variations on individual qubits,
could help improve QEC to the level sufficient to pass
the break-even point and show coherent lifetimes longer
than that of an unprotected qubit in present-day or near-
future devices. Related techniques could also be more
arXiv:1909.06732v4 [quant-ph] 31 Jul 2020
2
widely applicable for design and analysis of FT protocols
and control schemes. Examples include error correction
with FT gadgets like flag error correction[24–26], schemes
for protected evolution, e.g., using code deformations
where conventional approaches may result in a reduced
distance[12], and optimized single-shot error-correction
protocols[27–29].
In addition, analysis of marginal error distributions
and associated families of asymmetric codes of varying
degeneracy could become an important tool in the theory
of QECCs. In particular, it could help constructing bet-
ter decoders for highly-degenerate quantum LDPC codes.
Also, thorough analytical understanding of error correla-
tions could be useful for fundamental analysis of thresh-
olds, e.g., in order to extend and/or improve the bounds
in Refs. 30 and 31.
Paper outline: After this Introduction, an overview
of relevant notations and results in quantum codes and
multi-variable Bernoulli distributions is given in Sec. II.
Pauli error channels with correlated errors, including
circuit-level correlations, are discussed in Sec. III. Cor-
responding marginal distributions are constructed in
Sec. IV, followed by a discussion of possible implications
for exact and approximate ML decoding with circuit-level
errors in Sec. V. Sec. VI gives the concluding remarks.
II. BACKGROUND
A. Quantum codes
Generally, an n-qubit quantum code Q is a subspace
of the n-qubit Hilbert space H
n
2
. A quantum [[n, k, d]]
stabilizer code is a 2
k
-dimensional subspace specified as a
common +1 eigenspace of all operators in an Abelian sta-
bilizer group S P
n
, 11 6∈ S, where the n-qubit Pauli
group P
n
is generated by tensor products of single-qubit
Pauli operators. The stabilizer is typically specified in
terms of its generators, S = hS
1
, . . . , S
nk
i. The weight
of a Pauli operator is the number of qubits that it af-
fects. The distance d of a quantum code is the minimum
weight of a Pauli operator E P
n
which commutes with
all operators from the stabilizer S, but is not a part of
the stabilizer, E 6∈ S. Such operators act non-trivially in
the code and are called logical operators.
Subsystem codes[32, 33] are a generalization of sta-
bilizer codes where only some of the logical qubits are
used. More precisely, given a stabilizer group S, the sta-
bilized subspace Q
S
is further decomposed into a tensor
product of two subspaces, Q
S
= Q
L
Q
G
, where Q
L
is a 2
k
-dimensional “logical” subspace used to store the
quantum information, while we do not care about the
state of the “gauge” qubits in the subspace Q
G
after the
recovery. Logical operators of the original code which
act non-trivially only in Q
G
, together with the operators
from the stabilizer group S, generate the non-Abelian
gauge group G which fully characterizes the subsystem
code. In particular, the center Z(G) is formed by the
elements of the original stabilizer S, up to a phase i
m
,
m {0, 1, 2, 3}.
A Pauli error E that anticommutes with an element
of the stabilizer, ES = SE, S S, is called de-
tectable. Such an error results in a non-zero syndrome
s = {s
1
, . . . , s
r
} whose bits s
i
{0, 1} are obtained by
measuring the eigenvalues (1)
s
i
of the chosen set of sta-
bilizer generators S
i
, i = {1, . . . , r}, r n k. Unlike
in the case of classical codes, there may be many equiv-
alent (mutually degenerate) errors resulting in the same
syndrome. For a subsystem code, errors E
0
degenerate
with E, denoted E
0
' E, have the form E
0
= EG, where
G G is an element of the gauge group; such errors
can not and need not be distinguished. Non-trivial logi-
cal operators of the subsystem code are logical operators
of the original stabilizer code which act non-trivially in
Q
L
. A bare logical operator U acts trivially in Q
G
and
commutes with any element of G. Such restrictions do
not apply to dressed logical operators which are only re-
quired to commute with elements of the stabilizer S. In
any case, multiplication by a logical operator U gives an
error with the same syndrome but from a different equiv-
alence class, EU 6' E.
Analysis of error correction is conveniently done using
quaternary, or an equivalent binary, representation of the
Pauli group[34, 35]. A Pauli operator U i
m
X
u
Z
v
,
where u, v {0, 1}
n
and X
u
= X
u
1
1
X
u
2
2
. . . X
u
n
n
, Z
v
=
Z
v
1
1
Z
v
2
2
. . . Z
v
n
n
, is mapped, up to a phase, to a length-2n
binary vector e = (u|v). Two Pauli operators U
1
, U
2
commute iff the symplectic inner product
e
1
? e
T
2
e
1
Σ e
T
2
, Σ
I
n
I
n
, (1)
of the corresponding binary vectors is zero, e
1
? e
T
2
= 0.
Here I
n
is the n × n identity matrix. Thus, if H =
(H
Z
|H
X
) is an m × 2n binary matrix whose rows rep-
resent stabilizer generators, and
e
H H Σ = (H
X
|H
Z
),
then the syndrome s of an error U
e
with binary represen-
tation e = (u|v) is given by s
T
= He
T
. If we similarly
denote G = (G
X
|G
Z
) an r × 2n matrix formed by the
gauge group generators (for a stabilizer code, G =
e
H),
the errors with binary representation e and e
0
= e + αG
are equivalent, e
0
' e, for any length-r binary string
α F
m
2
. Generally, GH
T
= 0, and
rank H = n k κ, rank G = n k + κ, (2)
where κ is the number of gauge qubits. Matrices G and
H can be viewed as generator matrices of an auxiliary
length-2n CSS code which encodes 2k qubits. These cor-
respond to a basis set of 2k independent logical oper-
ators of the original subsystem code. It will be conve-
nient to introduce a logical generating matrix L such that
LH
T
= 0, rank L = 2k. Rows of L map to basis logical
operators; a non-zero linear combination of the rows of
L must be linearly independent from the rows of G.
Gauge or stabilizer generators can be measured with
a Clifford circuit which consists of ancillary qubit initial-
3
ization and measurement in the preferred (e.g., Z) ba-
sis, and a set of unitary Clifford gates, e.g., single-qubit
Hadamard H and phase P gates and two-qubit CNOT.
Generally, a Clifford unitary U maps Pauli operators to
Pauli operators, E
0
= U
EU , E, E
0
P
n
. Ignoring the
overall phase (for complete description, see Refs. 36 and
37), this corresponds to a linear map of the correspond-
ing binary vectors, (e
0
)
T
= Ce
T
, where C C
U
is a
symplectic matrix with the property C
T
ΣC = Σ.
B. Bernoulli distribution
Multi-variate Bernoulli distribution describes a joint
probability distribution of m single-bit variables x
i
F
2
,
e.g., components of a binary vector x F
m
2
. Most gener-
ally, such a distribution can be specified in terms of 2
m
probabilities p
x
0, with normalization
P
xF
m
2
p
x
= 1.
A convenient representation of such a distribution as an
exponent of a polynomial of m binary variables with real
coefficients was given by Dai et al. in Ref. 38. Namely, for
a single variable x {0, 1}, we can write
P
(x) = p
1x
0
p
x
1
as a product of two terms, where p
0
+ p
1
= 1 are the out-
come probabilities. For m variables we have, similarly,
the product of 2
m
terms,
P
(x) = p
(1x
1
)(1x
2
)...(1x
m
)
00...0
×p
(1x
1
)(1x
2
)...(1x
m1
)x
m
00...01
. . . p
x
1
x
2
...x
m
11...1
, (3)
where the probabilities are assumed positive, p
x
> 0. For
any given x F
m
2
only one exponent is non-zero so that
the result is p
x
. Taking the logarithm and expanding,
one obtains the corresponding “energy” E ln
P
(x)
as a polynomial of m binary variables. The correspond-
ing coefficients can be viewed as binary cumulants[38];
presence of high-degree terms indicates a complex prob-
ability distribution with highly non-trivial correlations.
For applications it is more convenient to work with
spin variables s
i
= (1)
x
i
1 2x
i
±1, and rewrite
the energy function using the general Ising representation
first introduced by Wegner[39],
E E ({s
j
}) =
X
b
K
b
R
b
+ const, R
b
=
Y
i
s
θ
ib
i
, (4)
parameterized by the binary spin-bond incidence matrix
θ with m rows, and the bond coefficients K
b
. While most
general m-variate Bernoulli distribution requires 2
m
1
bonds, in the absence of high-order correlations signifi-
cantly fewer terms may be needed. Of particular inter-
est are distributions with sparse matrices θ, e.g., with
bounded row and column weights, which also limits the
number of columns. In the simplest case of independent
identically-distributed (i.i.d.) bits with equal set proba-
bilities p
1
= p, p
0
= 1 p, we can take θ = I
m
, the iden-
tity matrix, and all coefficients equal, K = ln[(1p)/p]/2.
The logarithm ln[(1 p)/p] is commonly called a log-
likelihood ratio (LLR); the coefficient K here and K
b
in
Eq. (4) are thus called half-LLR coefficients.
III. ERROR CORRELATIONS IN A CLIFFORD
MEASUREMENT CIRCUIT
A. Pauli error channel with correlations.
Consider the most general Pauli error channel
ρ
X
eF
2n
2
P
(e)E
e
ρE
e
, (5)
where
P
(e) is the probability of an error E
e
with binary
representation e, with the irrelevant phase disregarded.
Technically,
P
(e) describes a 2n-variate Bernoulli distri-
bution. The probability
P
(e) can be parameterized in
terms of a 2n×m binary coupling matrix θ with m < 2
2n
columns, and a set of coefficients K
b
, b {1, . . . , m},
P
(e; θ, {K
b
}) = Z
1
exp
X
b
K
b
(1)
[e θ]
b
!
, (6)
where [e θ]
b
in the exponent is the corresponding compo-
nent of the row-vector e θ. The normalization constant
Z Z(θ, {K
b
}) in Eq. (6) is the partition function of the
Ising model in Wegner’s form, cf. Eq. (4),
Z(θ, {K
b
}) =
X
{s
i
∈±1}
Y
b
e
K
b
R
b
. (7)
In the simplest case of independent X and Z errors, θ =
I
2n
, the identity matrix, while e
2K
b
= (1 p
X
)/p
X
for
b n, and the corresponding expression with p
X
p
Z
for n < b 2n. In the case of the depolarizing channel
with error probability p, we have, instead,
θ =
I
n
0 I
n
0 I
n
I
n
, e
4K
= 3(1 p)/p, (8)
where the additional column block is to account for corre-
lations between X and Z errors. Additional correlations
between the errors can be introduced by adding columns
to matrix θ and the corresponding coefficients K
b
.
Given a probability distribution in the form (6), it is
easy to construct an expression for probability of an error
equivalent to e in a subsystem code with gauge generator
matrix G, extending the approach of Refs. 6, 40, and
41, and reproducing some of the results from Ref. 5. A
substitution e e + αG and a summation over α gives
P(e
0
F
2n
2
|e
0
' e) = 2
rank Gr
Z
, {K
b
(1)
[e θ]
b
}
Z (θ, {K
b
})
.
(9)
Here G is an r × 2n binary matrix, cf. Eq. (2), and the
prefactor accounts for a possible redundancy in the sum-
mation. Notice that the partition function in the nu-
merator has the same number of bonds m as that in the
denominator, but with the signs of the coefficients K
b
corresponding to non-zero elements of the binary vector
e θ flipped. When both the gauge generator matrix
G and the error correlation matrix θ are sparse, the inci-
dence matrix Θ in the numerator of Eq. (9) must
also be sparse.
4
B. Clifford measurement circuit and associated
input/output codes.
Let us now consider error correlations resulting from
a Clifford measurement circuit. Specifically, following
Refs. 21 and 22, consider an n-qubit circuit, n = n
0
+n
a
,
with n
0
data qubits and n
a
ancillary qubits. First, an-
cillary qubits are initialized to |0i, second, a collection
of Clifford gates forms a unitary U, and finally the an-
cillary qubits are measured in the Z basis, see Fig. 1.
In the absence of errors and in the event of all measure-
ments returning +1 (zero syndrome), the corresponding
post-selected evolution is described by the matrix
V = (I
n
0
h0|
n
a
) U (I
n
0
|0i
n
a
), (10)
where I is a single-qubit identity operator. The circuit is
assumed to be a good error-detecting circuit (good EDC),
namely, V
V be proportional to the projector onto a sub-
space Q
0
H
n
0
2
,
V
V = cΠ
0
, c > 0 (11)
see Def. 4 in Ref. 22. Here Q
0
, called the input code,
is an [[n
0
, k, d
0
]] stabilizer code encoding k qubits with
distance d
0
.
|ψi
/
n
0
U
/
n
0
|ψ
i
|0i
/
n
a
/
n
a
Z
FIG. 1. Generic circuit with n
0
data and n
a
ancillary qubits
initialized to |0i and measured in Z basis. In practice, an-
cillary qubit can be measured during evolution, and subse-
quently reused after initialization. However, there is no mech-
anism for adapting the gates to the measurement results.
A good EDC also defines an output code Q
0
0
H
n
0
2
which encodes the same number of qubits k. Indeed,
since V
V = cΠ
0
, matrix V has only one non-zero singu-
lar value,
c; this immediately gives V V
= cΠ
0
0
, with
the projector onto a space Q
0
0
H
n
0
2
of the same di-
mension 2
k
, the output code. Moreover, for any input
state |ψi Q
0
, the output |ψi
0
V |ψi Q
0
0
is in the
output code, and the corresponding transformation is a
(scaled) Clifford unitary.
Even though the map between Q
0
and Q
0
0
is unitary,
the distance d
0
0
of the output code does not necessarily
equals d
0
. In particular, adding a unitary decoding cir-
cuit on output data qubits may be used to render d
0
0
= 1.
C. Errors in a Clifford circuit.
Using standard circuit identities, any circuit error E
can be propagated forward to the output of the circuit,
thus giving an equivalent data error E
0
0
(E) P
n
0
and
the (gauge) syndrome σ
0
(E) F
n
a
2
corresponding to the
measurement results. Clearly, there is a big redundancy
even if phases are ignored, as many circuit errors can
result in the same or equivalent E
0
0
(E) and σ
0
(E). The
goal is to find the conditional probability distribution for
the equivalence class of the output error E
0
0
(E) given the
measured value of σ
0
(E).
In a given Clifford circuit, consider N possible error lo-
cations, portions of horizontal wires starting and ending
on a gate or an input/output end of the wire. For exam-
ple, the circuit in Fig. 2 has N = 15 error locations. A
Pauli error may occur on any of these locations. A circuit
error E is a set of N single-qubit Pauli operators with-
out the phase, P
i
{I, X, Y, Z}, i {1, . . . , N }. When
two circuit errors are applied sequentially, the result is
a circuit error whose elements are pointwise products of
Pauli operators with the phases dropped. The algebra
defined by such a product is isomorphic to the N -qubit
Pauli group without the phase. This is an Abelian group
which also admits a representation in terms of length-2N
binary vectors e (u|v) F
2N
2
. Multiplication of two
circuit errors amounts to addition of the corresponding
binary vectors e.
1
1
6
I
11
2
2
7
12
3
3
I
8
13
|0i
4 9 14
Z
|0i
5 10 15
Z
FIG. 2. Circuit measuring generators Z
1
Z
2
and Z
2
Z
3
of a
three-qubit toy code. Digits indicate distinct circuit locations.
Identity operators I are inserted so that the number of circuit
locations along each qubit wire be odd.
With these definitions, error propagation through a
Clifford circuit can be described as products of the orig-
inal circuit error E with trivial circuit errors which have
no effect on the outcome. The collection of all such errors
forms error equivalence group (EEG) of the circuit. The
generators of this group involved in error propagation are
trivial errors, each formed as a union of a single-qubit
Pauli P
i
, i {1, . . . , N} (but not in the output for data
qubits, or right before the measurements for ancillary
qubits) with the result of error propagation of P
i
across
the subsequent gate. For example, for the circuit in
Fig. 2, some of the EEG generators, with identity opera-
tors dropped, are {Z
1
, Z
6
}, {X
4
, X
9
}, {X
1
, X
6
, X
9
}, and
{Z
4
, Z
6
, Z
9
} (propagation across the leftmost CNOT),
and {X
3
, X
8
}, {Z
3
, Z
8
} (propagation across the leftmost
identity gate labeled I).
In addition, for a circuit which includes qubits’ ini-
tialization and measurement, the list of EEG generators
includes errors Z
j
on ancillary qubits right after initial-
ization to |0i and right before the Z-basis measurement.
For the circuit in Fig. 2, these are {Z
4
}, {Z
5
} (ancillary
qubits right after initialization), and {Z
14
}, {Z
15
} (an-
5
cillary qubits just before measurement). It is important
that stabilizer group of the input code, after its elements
are promoted as circuit errors, forms a subgroup of thus
constructed full circuit EEG[22]. Same applies to the
stabilizer group of the output code.
In the binary representation, a single-qubit Hadamard
gate connecting circuit positions i and i
0
corre-
sponds to a pair of generators with non-zero elements
(u
i
, v
i
|u
i
0
, v
i
0
) = (10|01) and (01|10) (X
i
propagates into
Z
i
0
and v.v.), a phase gate similarly corresponds to
(10|10) and (01|11) (Z
i
propagates into Z
i
0
and X
i
into
Y
i
0
), and a CNOT gate (10 00|10 00) (input Z
i
on the con-
trol and an output Z
i
0
on the same wire), (01 00|01 01)
(input X
i
on the control and X
i
0
, X
j
0
on the out-
puts), (00 01|00 01) (target X pass-through), and, finally,
(00 10|10 10). The generators for the single-qubit trivial
errors are even simpler, since a Z
j
maps to the weight-one
vector (u
j
, v
j
) = (10).
For a given circuit error E with binary form e F
2N
2
,
any equivalent error can be obtained by adding a linear
combination of thus constructed circuit EEG generators
g
j
. It is convenient to combine the corresponding gen-
erators into a generator matrix G with 2N columns. As
discussed in the Appendix A, for a good EDC with the
rank of the input-code stabilizer group r
0
= n
0
k and
the constant c = 1/2
κ
, see Eq. (10) and below, the gen-
erator matrix has the rank
rank G = 2N 2n
0
f, f n
a
κ r
0
0. (12)
Generally, a non-zero value f > 0 indicates a measure-
ment redundancy.
With the circuit EEG generator matrix in place, it is
easy to construct a formal expression for the probability
of an error in a given equivalence class. Namely, if the
circuit error probability distribution is given by an analog
of Eq. (6), the probability of an error equivalent to a given
e F
2N
2
is proportional to the Ising partition function,
P(e
0
F
2N
2
|e
0
' e) = Const Z(, {K
b
(1)
[eθ]
b
}), (13)
cf. Eq. (9). It is also easy to find the conditional probabil-
ity distribution of output errors with a given syndrome.
Namely, given the binary form of an output data error
e
0
0
F
2n
0
2
and a syndrome vector σ
0
F
n
a
2
, we need
to form the corresponding vector e e(e
0
0
, σ
0
) F
2N
0
,
filling only the components corresponding to the output
data qubits and ancillary X
j
just before the measure-
ments which give non-zero syndrome bits in σ
0
.
For ML decoding, we compare thus computed prob-
abilities for all inequivalent errors consistent with the
measured syndrome σ
0
. In particular, these include er-
rors that differ by a logical operator of the input (or,
equivalently, the output) code, since non-trivial logical
operators are outside the circuit EEG. In other words,
length-2N binary vectors corresponding to logical opera-
tors are linearly independent from the rows of the circuit
EEG generator matrix G. It is convenient to define a
binary logical generator matrix L of dimension 2k ×2N,
whose rows correspond to mutually inequivalent logical
operators of the input code.
For the ease of decoding, it is also convenient to intro-
duce the parity-check matrix H, also with 2N columns,
whose rows are orthogonal to the rows of both matrices
G and L, and whose rank satisfies
rank H = 2N rank G rank L = n
a
κ + r
0
. (14)
Clearly, H is dual to a matrix combing the rows of G and
L. As in the case of a subsystem code, see Eq. (2), these
matrices can be seen as forming a half of a CSS code,
with stabilizer generator matrices G
X
= G, G
Z
= H,
and X-logical operator generator matrix L
X
= L.
The orthogonality requirement for rows of H can also
be interpreted in terms of the N -qubit Pauli group as-
sociated with the circuit. Namely, the Pauli operators
corresponding to the rows of the symplectic dual matrix
e
H = HΣ, see Eq. (1) and below, must commute with
generators of the circuit EEG. This guarantees that each
of these operators be a spackle, i.e., a circuit error where
the single-qubit Pauli operators in any time layer can be
obtained by error propagation from those in the previos
time layer, see Ref. 22. Respectively, row weights of H
scale linearly with the circuit depth.
D. Circuit subsystem code
The discussion in Ref. 22 focused on the special case
of good EDCs where each qubit line is required to have
an odd number of locations. In this special case, the
circuit EEG can be seen as the gauge group of a quan-
tum subsystem code of length N which encodes the same
number of qubits k as the input/output codes, and has
a distance not exceeding the corresponding distances,
d min(d
0
, d
0
0
). Respectively, rows of the matrix
e
H
which correspond to generators of the stabilizer group of
the subsystem code are necessarily given by linear com-
binations of the rows of G. In addition, circuit errors
corresponding to the rows of the matrix L can be seen as
dressed logical operators, while the bare logical operators
which commute with the elements of the circuit EEG can
be constructed as spackles.
In practice, any circuit can be easily modified to satisfy
this additional requirement by inserting a null (identity)
gate into each qubit line with an even number of loca-
tions, see Fig. 2 for an example. While it is not strictly
necessary to work with circuits that satisfy this require-
ment, it is convenient, as the additional structure of the
subsystem code can be used to verify the validity of the
constructed matrices.
However, once the matrices G, H, and L are con-
structed, there is no need to refer to the subsystem code.
In fact, the generator matrix row-reduction transforma-
tion described in the following Section IV preserves the
orthogonality and the relation Eq. (14) between the ranks
inherent in the CSS code map, but not the structure of
the circuit subsystem code.
6
E. Circuit code distance.
How good can a measurement protocol be? What are
the bounds on the distance d of the subsystem code as-
sociated with the circuit?
Generally, if d
0
and d
0
0
are the distances of the input
and the output codes, the distance of the corresponding
circuit code satisfies d min(d
0
, d
0
0
). This follows from
the fact that a logical operator of the input code, e.g.,
is naturally mapped to a (dressed) logical operator of
the circuit code. An important result in Ref. [22] is that
one can always design a fault-tolerant circuit so that the
distance d of the corresponding subsystem code be as
good as that of the input code, d = d
0
.
Unfortunately, circuit-code distance d does not have a
direct relation to the probability distribution of the out-
put errors; even single-qubit output errors may remain
undetected. This is a well known “feature” of quan-
tum error-correcting codes operating in a fault-tolerant
regime, even for codes with single-shot properties[27–29].
Indeed, regardless of the circuit structure, errors on the
data qubits in the locations just before the circuit output
will not be detected.
In comparison, with a formally defined circuit code,
such an error can be propagated back to the input layer
and (when it is detectable, e.g., if its weight is smaller
than the distance d
0
0
of the output code) it would nec-
essarily anticommute with one or more combination(s)
Z
g
of the ancillary qubits. The original error would thus
be detectable in the circuit code. Causality does not
permit such an operation with actual circuit evolution.
Formally, this functionality is removed due to assumed
ancillary qubit initialization to |0i.
Of course, even if a small-weight error goes undetected,
it may get corrected after one or more additional mea-
surement rounds. In practice, when an error-correcting
code is analyzed in a fault-tolerant setting, the standard
numerical procedure is to add a layer of perfect stabilizer
measurements (no measurement errors). This guarantees
that all small-weight errors at the end of the simulation
be detected, and thus recovers the distance d > 1 of the
circuit code, without the need to violate causality.
IV. MARGINAL DISTRIBUTIONS FOR
CORRELATED ERRORS
The circuit EEG fully describes correlations between
the circuit errors. However, it also contains a lot of ex-
cessive information: for the purpose of error correction,
we are only interested in the distribution of the output
errors and the syndrome, which are all supported at the
rightmost locations of the circuit. In addition, the large
size and sparsity of the circuit generator matrix G makes
decoding difficult, except with the simplest circuits.
Present goal is to reduce the number of independent
variables, by constructing the marginal distribution for a
given subset of the variables. This amounts to a summa-
tion over the variables one is not interested in, e.g.,
P(e
s+1
, . . . , e
m
) =
X
e
1
X
e
2
. . .
X
e
s
P(e
1
, e
2
, . . . , e
m
). (15)
In the case of binary variables e
i
{0, 1}, both the orig-
inal and the resulting marginal distributions are multi-
variate Bernoulli distributions, and each can be described
in terms of the Ising energy function (4).
A. Row-reduction transformation
1. Generator matrix and the coupling coefficients
Given an n-variate Bernoulli distribution described by
the coupling matrix Θ, e.g., Θ = in Eq. (9), and a set
of half-LLR coefficients K
b
, 1 b m, what are the cor-
responding parameters of the marginal distribution (15)?
In the equivalent Ising-model representation (4), the goal
is to describe the couplings between the remaining spins
after a partial summation. Such a star-polygon transfor-
mation for a general Ising model has been constructed in
Ref. 23. The transformation includes two special cases
long known in the Ising model literature: the Onsager’s
star-triangle transformation[42] and the (inverse) deco-
ration transformation[43, 44].
It is convenient to derive the result directly, focusing
on the marginal distribution after a summation over just
one spin variable s
i
±1 corresponding to i th row of
Θ. Without limiting generality, assume that the chosen
row has w non-zero elements in positions 1, 2, . . . , w,
decompose the corresponding bond variables R
b
= s
i
T
b
,
T
b
±1, 1 b w, and perform the summation explic-
itly (with the additional one-half factor for convenience),
B
τ
1
2
X
s
i
=±1
exp
s
i
w
X
b=1
K
b
T
b
= cosh
w
X
b=1
K
b
T
b
, (16)
where τ F
w
2
is a composite index with elements τ
b
such that T
b
= (1)
τ
b
. To exponentiate this expression,
rewrite B
τ
by analogy with Eq. (3),
B
τ
= B
1+τ
1
2
1+τ
2
2
...
1+τ
w
2
00...0
B
1+τ
1
2
1+τ
2
2
...
1+τ
w1
2
1τ
w
2
00...01
×. . . B
1τ
1
2
1τ
2
2
...
1τ
w
2
11...1
, (17)
where the coefficients B
...
in the base of the exponents
are the hyperbolic cosines (16) of the sum of coefficients
±K
b
with the signs fixed, and matching exactly the signs
in the exponents. As in Eq. (3), after a substitution
of any given τ F
w
2
, only one term with the correct
index τ remains in the product. The modified bonds
and the corresponding coefficients K
0
b
are obtained by
expanding the polynomial in the exponent of Eq. (17).
Because of the symmetry of the hyperbolic cosine, only
even-weight products of the original bonds result from
this expansion. Thus, in general, for an original row of
weight w, the corresponding w columns are combined to
7
produce w
0
= 2
w1
1 even-weight column combinations,
a change of w = 2
w1
w 1 columns.
Specifically, for a row of weight w = 1, the transfor-
mation amounts to simply dropping the row and the cor-
responding column of Θ. The values of K
b
remain the
same, except for the one value that is dropped.
For a row of weight w = 2, only the sum of the corre-
sponding columns is retained in Θ, with the coefficient
K
0
1,2
=
1
2
ln
cosh(K
1
+ K
2
)
cosh(K
1
K
2
)
1
2
ln
B
00
B
01
,
cf. Eq. (16). Equivalently, tanh K
0
1,2
= tanh K
1
tanh K
2
.
For a row of weigth w = 3, the three columns of the
original matrix Θ are replaced by their pairwise sums,
with the coefficient
K
0
1,2
=
1
4
ln
B
000
B
001
B
010
B
011
for the combination of the first two columns. The re-
maining coefficients K
0
2,3
and K
0
3,1
can be obtained with
cyclic permutations of the indices.
For a row of weight w = 4, the four columns of the
original matrix Θ are replaced with six pairwise sums and
the seventh column combining all four original columns,
with the coefficients, e.g.,
K
0
1,2
=
1
8
ln
B
0000
B
0001
B
0010
B
0011
B
0100
B
0101
B
0110
B
0111
,
K
0
1,2,3,4
=
1
8
ln
B
0000
B
0011
B
0101
B
0110
B
0001
B
0010
B
0100
B
0111
.
In general, the coefficient K
0
J
in front of the product of
T
b
with indices b in an (even) subset J {1, 2, . . . , w} is
given by the sum of logarithms of the hyperbolic cosines
B
τ
with τ
1
= 0 (this accounts for symmetry of hyperbolic
cosine), with the coefficients ±1/2
w1
, where the sign is
determined by the parity of the weight of the subset τ [J]
restricted to J. It is easy to check that the numbers
of positive and negative coefficients always match. Re-
spectively, the coefficients for high-weight products are
typically small in magnitude.
2. Transformation for a parity check matrix
The row-reduction transformation can also be writ-
ten in terms of the parity-check matrix H, also with m
columns, and dual to Θ, such that
T
= 0 and rank H = m rank Θ. (18)
To this end, consider the row-reduction of Θ as a combi-
nation of the following elementary column steps:
(i) The 1st column of Θ is added to columns 2, 3, . . . ,
w of Θ; as a result the i th row of Θ has a non-zero
element only in the first position.
(ii) The 1st column of thus modified Θ is dropped,
which leaves the i th row zero—it may be dropped
as well;
(iii) If w > 2, 2
w1
w combinations of two, three, . . . ,
w 1 columns with indices b w 1 are added
to the matrix Θ. These can be sorted by weight
so that each added column be a combination of
exactly two existing columns in the modified Θ.
The corresponding steps for H are:
(i
0
) Columns b {2, 3, . . . , w} of H are added to its 1st
column, which becomes identically zero as a result.
(ii
0
) Drop all-zero 1st column from thus modified H.
(iii
0
) For each column, e.g., b
0
, added to Θ as a linear
combination of two existing columns b
1
and b
2
, H
acquires a new row with the support on {b
1
, b
2
, b
0
}
to express this linear relation.
It is easy to check that row orthogonality,
T
= 0, is
preserved. Also, the rank of Θ is reduced by one, while
the increase of the rank of H matches the number of
columns added in step (iii
0
), so that the exact duality
(18) is preserved.
B. Marginal distribution for error equivalence
classes
This analysis is easily carried over to the problem of
syndrome-based ML decoding for an [[n, k]] subsystem
code under a Pauli channel characterized by a 2n × m
matrix θ and a set of m half-LLR coefficients {K
b
}, see
Sec. III A. Given a gauge generator matrix G, the prob-
ability of an error equivalent to e is given by Eq. (9).
Generally, for ML decoding we need to choose the largest
of the 2
2k
partition functions
Z(Gθ, {K
b
(1)
[e
0
θ]
b
), e
0
= e + αL, α F
2k
2
. (19)
Typically, this needs to be done for a large number of er-
ror vectors e. Can the calculation be simplified en masse
by doing partial summation over the spins corresponding
to all rows of as described in the previous section?
The structure of the logical operators can be accounted
for by extending the rows of the generator matrix which
now has two row blocks,
Θ =
!
,
˜
K
b
= K
b
(1)
[eθ]
b
, (20)
and the half-LLR coefficients
˜
K
b
, b {1, 2, . . . , m}. A
matching parity check matrix H is dual to Θ, see Eq. (18).
1. Independent X and Z errors
Let us first consider the simpler case of independent
X and Z errors, where matrix θ = I
2n
. In this case H =
H =
e
HΣ, where
e
H is a generator matrix of the code’s
stabilizer group, see Eq. (2). Marginal distribution being
independent of the choice of the generator matrix, use
8
row transformations and column permutations to render
Θ =
G
L
!
=
I
r
A B
I
2k
C
!
, (21)
H =
B
T
+ C
T
A
T
C
T
I
`
, (22)
where matrices B and C have ` 2n r 2k columns,
and r = rank(G). The matrices Θ and H are mutually
dual as can be immediately verified.
Row-reduction operations applied to each row in the
upper block of Θ correspond to:
(i
00
) Column operations to set both blocks A and B to
zero, and conjugate column operations on H to set
its left-most column block to zero.
(ii
00
) Drop the upper row-block of the obtained Θ, as
well as the left-most column blocks of the resulting
Θ and H.
(iii
00
) Add an extra column block M
1
+ CM
2
to the re-
sulting Θ, where columns of M
1
and M
2
specify
the linear combinations of the columns in its two
remaining blocks, and a matching row-block to H.
As a result, the transformed matrices acquire the form
Θ
0
=
I
2k
C M
1
+ CM
2
, (23)
H
0
=
C
T
I
`
0
M
T
1
M
T
2
I
!
, (24)
where double vertical lines are used to separate the newly
added columns.
When the described transformation is applied to any
of the original partition functions (19), the result is just
an exponential exp
P
b
K
(fin)
b
of the sum of the final
half-LLR coefficients. Can identical columns of the final
generator matrix (23) be similarly combined to simplify
the structure of the final marginal distribution for error
equivalence classes? The answer is yes, as long as we
account for the effect of the error e on the values of the
coefficients K
(fin)
b
.
In fact, it is easy to check that the row-reduction trans-
formations in Sec. IV A are such that the additional signs
in Eq. (19) only affect the signs of the coefficients K
(fin)
b
.
Moreover, these signs correspond to the bits of the trans-
formed error vector, cf. Eq. (23),
e
(fin)
=
ε
1
ε
2
ε
1
M
1
+ ε
2
M
2
, (25)
where the vector ε
1
selects the equivalence class, and
ε
2
= s + ε
1
C, with s eH
T
the original syndrome.
Clearly, the right-most blocks in Eqs. (23) and (25) are
obtained from ε (ε
1
|ε
2
) with 2k + ` = 2n r com-
ponents as a right product with the combined matrix
M =
M
1
M
2
. All `
0
2n r components of ε being inde-
pendent, there are
m
0
max
= 2
`
0
1 (26)
non-trivial combinations. Combining identical columns
in the transformation from ε to e
(fin)
, we can ensure that
the final matrices contain no more than m
0
max
columns.
With Eq. (2), r = nk+κ, so that `
0
= n+kκ, where
κ is the number of gauge qubits in the subsystem code.
For a stabilizer code, κ = 0, thus `
0
= n + k. Clearly, the
latter is just the number of logical generators, 2k, plus
the number of independent syndrome bits, n k.
2. A more general Pauli channel
Instead of considering most general situation, consider
an important case of a Pauli channel where any single-
qubit error has a non-zero probability. Then, the inci-
dence matrix can be chosen to have an identity-matrix
block, θ = (I
2n
|T ), where T is an (m 2n) × 2n binary
matrix. As a result, the matrix Θ and the half-LLR co-
efficients in Eq. (20) both acquire additional blocks of
linearly-dependent components, while the parity-check
matrix dual to Θ can be chosen in the form
H =
H 0
T
T
I
m2n
!
. (27)
Since the relevant error in Eq. (20) is eθ = (e|eT ), the
lower row-block of H gives a zero syndrome, just like the
lower row-block in Eq. (24). Basically, after degeneracy
is taken into account, the number of independent com-
ponents of eθ is 2n r, the same as for e; final matrices
Θ
0
and H
0
can always be constructed to have m
0
m
0
max
components given by Eq. (26). Of course, the actual re-
sulting matrices, as well as the final half-LLR coefficients
do depend on the assumed error model.
How much freedom is there to choose the matrices Θ
0
and H
0
? For the purpose of ML decoding, we need to
go over the entire linear space C
Θ
0
generated by the rows
of Θ
0
; the choice of basis is irrelevant. The same is true
regarding the parity check matrix H
0
. These are the same
symmetries as for a generator and a parity-check matrices
of a binary code.
In essence, the original quantum subsystem code with
gauge G and logical L generator matrices has been trans-
formed into a classical binary code, with the transfor-
mation dependent in a non-trivial fashion on the error
probability distribution. This binary code has length
m
0
m
0
max
, and it encodes 2k bits.
Further, any non-trivial linear combination of rows of
with rows of has weight lower-bounded by the dis-
tance of the quantum code, which gives the lower bound
d
0
d on the distance of this binary code. In general,
given the structure of the row-reduction transformation,
the distance may be quite a bit larger, possibly scaling
linearly with m
0
. Notice, however, that with highly non-
uniform error probability distribution, more relevant pa-
rameter is not the distance, but the corresponding quan-
tity weighted with half-LLR coefficients K
(fin)
b
, related
to the probability of the most-likely logical error. By
9
construction, this quantity is exactly the same as for the
original quantum code.
Let us consider an important case of a sparse origi-
nal parity check matrix H, e.g., with row and column
weights bounded. This requires a low-density parity-
check (LDPC) code with a sparse stabilizer generator
matrix
e
H = HΣ, and an error channel with the gen-
erator matrix θ = (I
2n
|T ) also sparse. When acting
on the parity check matrix, each row-reduction trans-
formation drops a column of H, and may also add one
or more rows of weight 3, see Sec. IV A 2. Thus, the
parity-check matrix of the classical binary code describ-
ing the marginal probability distribution of error equiv-
alence classes must also be sparse, with row weights not
exceeding w
0
max(w, 3), where w is the maximum row
weight of H in Eq. (27).
C. Marginal distribution for output errors in a
good measurement circuit
This discussion also applies to the code associated with
the error equivalence group of a good EDC. In this case
the matrices G and H have 2N columns each, where N is
the number of circuit locations, and their ranks are given
by Eqs. (12) and (14). Just as any circuit error can be
pushed all the way to the right, row-reduction can also be
done starting with the bits at the beginning of the circuit
and pushing toward its output. This way, a circuit error
equivalence class can be characterized by
`
1
= 2N rank G = 2n
0
+ f = (n
0
+ k) + (n
a
κ) (28)
bits, where n
0
+ k is the number of linearly-independent
error equivalence classes in an [[n
0
, k]] stabilizer code and
n
a
κ is the number of syndrome bits measured in the
circuit. Alternatively, as in a subsystem code with κ
gauge qubits, `
0
= n
0
+k κ is the exponent in Eq. (26),
and n
a
is the number of additional error positions right
before the measurements. As in the previous section, this
gives an upper bound on the maximum number M
0
of
columns in the matrices Θ
0
and H
0
that may be necessary,
M
0
M
0
max
= 2
`
1
1. (29)
After row-reduction for all generators of the circuit EEG,
we get a classical [M
0
, 2k, d
0
] binary code, where k is the
number of encoded qubits, and d
0
d.
Is this an LDPC code? This question has not been
answered in the previous section: the parity check matrix
H of the circuit code is not necessarily sparse as its row
weights scale linearly with circuit depth. To analyze the
sparsity of the output-error parity-check matrix H
0
, write
it in a block form similar to Eq. (27),
H
0
=
H
0
0
0
H
00
0
H
00
1
!
, (30)
where the upper row-block contains only the original
columns of the circuit EEG parity check matrix H re-
maining after row-reduction steps (i
0
) and (ii
0
), while any
row with exactly three non-zero entries added in a step
(iii
0
) goes to the lower row-block. Further, if we assume
circuit EEG H in the form (27), with the lower row-
block of bounded weight (e.g., w = 3 for depolarizing
errors), any potentially large-weight row in H
0
0
must cor-
respond to (i) a stabilizer generator of the output code,
or (ii) a generator of the group H
0
, see Ref. 22. All of
these have bounded weights for any bounded-weight sta-
bilizer LDPC code with a measurement circuit where sta-
bilizer generators are measured independently and with a
bounded number of ancillary qubits. On the other hand,
these conditions do not generally hold for any family of
concatenated codes based on a given code, or for sub-
system codes where a stabilizer generator may have an
unbounded weight or cannot be expressed as a product of
gauge generators with a bounded number of terms. Nev-
ertheless, even in these cases, given a potentially very
large number of columns (29), it is reasonable to expect
the final H
0
to be a sparse matrix, with only a small frac-
tion of non-zero elements.
V. DECODING STRATEGIES
A. Decoding based on circuit EEG
The present approach is to analyze error propagation
in a Clifford measurement circuit in terms of its circuit
EEG characterized by a logical generator matrix L and
either a generator matrix G or a parity check matrix H.
With a minor constraint on the circuit, these correspond
to the circuit subsystem code constructed in Refs. 21 and
22. More generally, these matrices form (a half of) a CSS
code, with generator matrices G
X
= G and G
Z
= H,
while rows of L define X-type logical operators. Simply
put, any decoder capable of dealing with a CSS code with
uncorrelated X-type errors can now be used to account
for error correlations in a given measurement circuit.
Additional correlations can also be accounted for. As-
suming a Pauli error channel (with or without correla-
tions between different circuit locations) characterized by
a coupling matrix θ and a set of half-LLR coefficients K
b
(one per column), probability of a circuit error equivalent
to a given one is given by the Ising partition function (13),
with the spin-bond coupling matrix .
As already discussed, for ML decoding we need to find
the largest of the Ising partition functions (19). Such
a calculation can be expensive. Indeed, given a code
encoding k qubits, we need to compute and choose the
largest of all 2
2k
partition functions corresponding to all
non-trivial sectors; this can only be done in reasonable
time for a code with k small. Previously, a computa-
tionally efficient ML decoder using this approach and 2D
tensor network contraction for computing the partition
functions has been constructed for surface codes[45] in
the channel model (perfect syndrome measurement).
Feasible approaches for evaluating the partition func-
tions (19) include tensor network contraction (see, e.g.,
10
in Ref. 46, for a 3D network contraction with complexity
scaling as
9
, where χ is the bond dimension) and
Monte-Carlo (MC) methods constructed specifically for
efficient calculations of free energy differences, e.g., the
non-equilibrium dynamics method [47] or the classical
Bennett acceptance ratio[48]. In application to surface
codes in FT regime, MC calculations are in essence sim-
ulations of bond-disordered 3D Ising model; such calcula-
tions can be done using GPU[49], FPGA[50], or TPU[51]
hardware acceleration.
Notice also that the circuit code is extremely degener-
ate: with Hadamard, Phase, and CNOT gates the rows
of the generator matrix G have (quaternary) weights not
exceeding three. On the other hand, the row weights
of the parity-check matrix H scale linearly with the cir-
cuit depth. By these reasons, iterative decoders like be-
lief propagation (BP) are expected to fare even worse
than with the usual (not so degenerate) quantum LDPC
codes[52–54].
B. Generator-based decoding via marginal
distributions
The calculation of the Ising partition functions needed
for ML decoding can be simplified via partial summation
over a subset of the spins. Technically, this corresponds
to using a marginal distribution for the subset of variables
needed, as discussed in Sec. IV.
Denote V the set of rows of the original generator ma-
trix G [also, rows in the upper block of Θ in Eq. (20)].
Then, row-reduction in an increasing sequence of subsets
I
0
I
1
I
2
. . . I
s
V (31)
defines a sequence of mutually-dual pairs of matrices
{Θ
(j)
, H
(j)
} with m
j
columns, where
Θ
(j)
=
G
(j)
L
(j)
!
, j {0, . . . , s}. (32)
The ranks of logical generator matrices remain the same,
rank L
(j)
= 2k, while the sequence r
j
rank G
(j)
is
decreasing with increasing j, ending at r
s
= 0. In
essence, this defines a sequence of asymmetric (half) CSS
codes [[m
j
, 2k]], with generator matrices G
(j)
X
= G
(j)
and
G
(j)
Z
= H
(j)
, where rows of L
(j)
define X-type logical
operators. The sequence ends with a CSS code with an
empty X-generator matrix, i.e., a classical binary code
with the parity check matrix H
(s)
.
For each of the codes in the sequence (32), there is
also a set of half-LLR coefficients {K
(j)
b
, 1 b m
j
}.
Given an error vector e F
m
j
2
matching the syndrome,
ML decoding can be done by choosing the largest of the
2
2k
Ising partition functions [cf. Eq. (19)]
Z
G
(j)
, {K
(j)
b
(1)
e
0
b
}
, e
0
= e + αL
(j)
, α F
2k
2
.
(33)
By construction, the result should be the same for ev-
ery j s. However, the complexities for computing the
partition functions may differ dramatically.
One possibility for exact ML decoding is thus to choose
a subset I V to optimize the partition function calcula-
tion. In particular, one option is to choose I
1
such that
all rows of weights one, two, and three are eliminated
from the corresponding matrix G
(1)
. This minimizes the
number of columns in the exact generator matrix of the
marginal-distribution. Even though all rows in the origi-
nal circuit EEG generator matrix G have row weights not
exceeding three, row-reduction on rows of weight three
tends to create higher-weight rows. Thus, in general,
r
1
> 0: the resulting partition function is not expected
to be trivial.
To quote some numbers, Tables I and II give the dimen-
sions of generator matrices and row-reduced generator
matrices for two code families: the toy codes [[n
0
, 1, d
X
=
n
0
/d
Z
= 1]] with stabilizer generators Z
i
Z
i+1 mod n
0
,
0 i < n
0
, and the rotated toric codes (Ex. 11 in Ref. 55)
[[n
0
, 1, 2t + 1]] with n
0
= t
2
+ (t + 1)
2
and stabilizer gen-
erators Z
i
X
i+t mod n
0
X
i+t+1 mod n
0
Z
i+2t+1 mod n
0
, where
0 i < n
0
and t = 1, 2, . . ., with up to three measure-
ments cycles. The simplest examples of the measurement
circuits used for these two code families are shown in
Figs. 3 and 4, respectively. In particular, while the origi-
nal generator matrix for the circuit EEG of the [[13, 1, 5]]
code with measurements repeated n
cyc
= 3 times has di-
mensions 1066 × 1092, row-reduction of rows of weight
up to three reduces the dimension to 182 × 247.
FIG. 3. Single-cycle circuit measuring the overcomplete set
{Z
1
Z
2
, Z
2
Z
3
, Z
3
Z
1
} of stabilizer generators for the toy code
with n
0
= 3 data and n
a
= 3 ancillary qubits.
Second possibility for exact ML decoding is to perform
a summation over all spins, as described in Sec. IV C.
This gives a probability distribution directly for the er-
ror equivalence classes; the logarithm of probability of
an error equivalent to e is given, up to an additive con-
stant, by a weighted sum of the half-LLR coefficients
2
P
b
e
b
K
(fin)
b
. Unfortunately, such an exact expression
is expected to have an exponentially long list of coeffi-
cients, see Eqs. (28) and (29) for an upper bound. The
corresponding column numbers are large even for the rel-
atively simple circuits in Tables I and II. In fact, the
11
circuit parameters generator matrix dimensions remaining columns m
n
0
n
a
n
cyc
d
0
orig w = 2 w = 3 final w
fin
= 10
2
= 10
1
`
1
3 3 1 3 30 × 36 3 × 10 0 × 10 0 × 10 0 10 10 7
5 5 1 5 50 × 60 5 × 16 0 × 16 0 × 16 0 16 16 11
7 7 1 7 70 × 84 7 × 22 0 × 22 0 × 22 0 22 22 15
3 6 2 3 72 × 78 9 × 19 3 × 19 2 × 43 21 38 21 10
5 10 2 5 120 × 130 15 × 31 5 × 31 3 × 79 21 69 34 16
7 14 2 7 168 × 182 21 × 43 7 × 43 4 × 115 21 99 48 22
3 9 3 3 102 × 108 15 × 28 6 × 27 4 × 75 35 69 31 13
5 15 3 5 170 × 180 25 × 46 10 × 45 7 × 116 20 104 50 21
7 21 3 7 238 × 252 35 × 64 14 × 63 10 × 157 20 140 69 29
TABLE I. Parameters of the original and row-reduced generator matrices for repetition code circuits as in Fig. 3 but with n
cyc
measurement cycles, n
0
data and n
a
= n
cyc
n
0
ancillary qubits. Also shown are dimensions of row-reduced generator matrices
with rows of weights w = 2 and w = 3 (and smaller) eliminated; w
fin
is the minimum row-weight of the final generator matrix
with the smallest number of rows remaining. Remaining columns m
is the number of columns after columns with |K
b
| < are
dropped from the final generator matrix, assuming p = 0.05 corresponding to a half-LLR value K 1.472. The last column
gives the value of `
1
in the upper bound (29) on the number of columns.
circuit parameters generator matrix dimensions remaining columns m
n
0
n
a
n
cyc
d
0
orig w = 2 w = 3 w = 4 final w
fin
= 10
3
= 10
2
= 10
1
`
1
5 5 1 3 130 × 140 25 × 35 20 × 35 16 × 43 13 × 75 10 65 47 23 11
5 10 2 3 280 × 290 55 × 65 45 × 65 37 × 81 32 × 139 10 112 82 46 16
5 15 3 3 410 × 420 85 × 95 70 × 95 58 × 119 51 × 203 10 159 118 70 21
13 13 1 5 338 × 364 65 × 91 52 × 91 40 × 115 32 × 163 9 163 120 61 27
13 26 2 5 728 × 754 143 × 169 117 × 169 93 × 217 83 × 291 9 273 213 118 40
13 39 3 5 1066 × 1092 221 × 247 182 × 247 146 × 319 134 × 419 9 384 305 182 53
25 25 1 7 650 × 700 125 × 175 100 × 175 76 × 223 58 × 331 9 331 234 116 51
25 50 2 7 1400 × 1450 275 × 325 225 × 325 177 × 421 157 × 555 9 528 413 223 76
TABLE II. Same as in Tab. I but for rotated toric codes [[t
2
+ (t + 1)
2
, 1, 2t + 1]] with t = 1, 2, 3, represented as single-generator
cyclic codes, see Example 11 in Ref. 55.
|q
0
i
H
H
|q
1
i
H
H
|q
2
i
H
H
|q
3
i
H
H
|q
4
i
H
H
|q
5
i |0i
|q
6
i |0i
|q
7
i |0i
|q
8
i |0i
|q
9
i |0i
FIG. 4. Cingle-cycle measurement circuit for the five-qubit
code with n
a
= 5 ancillary qubits.
Mathematica program (which was not written for effi-
ciency) failed to complete full row-reduction except for
the simplest repetition codes with n
cyc
= 1 round of mea-
surements.
In practice, the list of coefficients K
(fin)
b
often con-
tains a large number of entries with small magnitudes.
This suggests a range of approximations, where only suf-
ficiently large coefficients are preserved, e.g., |K
(fin)
b
| > ,
for a given > 0. For incompletely reduced matrices in
Tables I and II, with = 0.1, the number of columns is re-
duced, roughly, by a factor of two. On general grounds,
the reduction factor is expected to be much larger for
fully-reduced generator matrices.
Alternatively, only a fixed number χ of the largest
in magnitude coefficients may be preserved. This latter
approach is similar in spirit to approximate tensor net-
work contraction using singular value decomposition and
a fixed maximum bond dimension. Notice that if only the
coefficients corresponding to columns of the matrix H
0
0
in
Eq. (30) are preserved, we get an approximation similar
to the conventional phenomenological noise model.
The “history code” decoding algorithm suggested by
Chubb and Flammia[5], can be seen as a special case of
generator-based decoding. Here the measurement circuit
is assumed to have a block structure, and summation is
12
done over all circuit locations with the exception of those
at the output of each block. With the circuits in Tables
I and II, a block corresponds to a single measurement
cycle. The full probability distribution is then recov-
ered using Bayes rules, assuming no correlations between
measurements in different blocks. Evidently, even in this
case, the complexity of the probability distribution ac-
counting for full error correlations could be prohibitive
for exact ML decoding.
C. Parity-based decoding via marginal
distributions
Summations over the spins in the subsets I
j
with in-
creasing j from a sequence like (31) give marginal distri-
butions which account exactly for increasing numbers of
alternative spin configurations. Respectively, the degen-
eracy of the corresponding row-reduced half CSS codes
decreases with increasing j, down to a classical code with
no degeneracy at the end of the sequence, j = s. A gen-
eral expectation is that the accuracy of minimum-energy
(ME) decoding would be increasing with increasing j.
ME decoding becomes strictly equivalent to ML decod-
ing for the end-of-the-sequence classical code.
Formally, given a parity-check matrix H and a set of
LLR weights K
b
, ME decoding aims to find an error vec-
tor e which gives the correst syndrome eH
T
= s and
maximizes the error likelihood
P
b
(1)
e
b
K
b
or, equiva-
lently, minimizes the error energy E =
P
b
e
b
K
b
. Com-
pared to generator-based decoding, an obvious advantage
is that there is no need to go over all 2
2k
logical oper-
ators. Unfortunately, for generic codes, even the rela-
tively simple problem of ME decoding has an exponential
complexity[56]. Given that the intermediate codes in the
sequence (31) tend to be long, this makes it unpractical
to use generic ME decoding algorithms with exponential
complexity, e.g., the information subset[57, 58] or the
classical Viterbi[59] algorithm.
Notice, however, that the sparsity of parity-check ma-
trices H
(j)
increase with j. The final matrix H
(s)
is ex-
pected to be sparse whenever the output code is an LDPC
code. Ideally, this classical code could be decoded with a
linear complexity using a variant of belief propagation
(BP) algorithm[60–63]. Assuming a sufficiently small
fraction of failed convergence cases, the result would be
equivalent to ML decoding of the correlated errors.
Unfortunately, this does not look so promising given
the fact that this final code is expected to have an expo-
nential length, see Eq. (29). Additionally, as confirmed
with limited numerical simulations[64], having a large
number of small in magnitude LLR coefficients tends to
reduce the convergence rate. Using approximate decod-
ing schemes with reduced number of LLR coefficients as
discussed in Sec. V B is expected to help with both is-
sues. Notice, however, that for such a reduction certain
columns in the generator matrix Θ are merely dropped
(puncturing). The corresponding transformation for the
parity-check matrix H is shortening[65], which may re-
duce the sparsity of the resulting matrix and, in turn,
negatively affect the convergence of BP decoding.
VI. DISCUSSION
Improving the accuracy of syndrome-based decoding in
the presence of circuit-level error correlations would both
increase the threshold to scalable quantum computation
and improve the finite-size performance of quantum er-
ror correction. Present results make two steps in this
direction. First, the observation that ML decoding un-
der these conditions amounts to decoding the code[21, 22]
associated with the circuit EEG, in the absence of cor-
relations. Second, the structure of this latter code can
be significantly simplified using row-reduction transfor-
mations, while leaving the probability of ML decoding
unchanged. A variety of approximate decoding schemes
naturally follow, which interpolate between the exact ML
decoding and the decoding within a relatively simple phe-
nomenological error model, with an additional handle on
the degeneracy of the actual code to be used for decoding.
Designing practical decoding algorithms, e.g., in appli-
cation to surface codes with well-developed near-optimal
syndrome measurement circuits, would require a substan-
tial additional effort. However, this does not look like an
unsolvable problem.
Decoding quantum LDPC codes is a major problem in
the theory of QECCs, especially in a fault-tolerant regime
with syndrome measurement errors present. While a
substantial progress has been made in recent years[66–
70], this problem remains open in application to generic
highly-degenerate codes. Transformations discussed in
Sec. IV change the degeneracy of a quantum code, and
can even map from a quantum to a classical code. This
opens new avenues to explore in decoding, in particular,
new ways to apply existing iterative decoding algorithms
to highly degenerate codes.
Circuit optimization: Traditional approach to quan-
tum error correction is to start with a code, come up
with an FT measurement circuit, compile it to a set of
gates available on a specific quantum computer, and then
finally design a decoder. Instead, one could start with the
list of permitted two-qubit gates on a particular device
and enumerate all good error-detecting circuits, increas-
ing circuit depth and the number of gates. Given the cir-
cuit, it is easy to find the parameters of the input/output
codes, as well as construct the associated circuit code.
While at the end we would still need to evaluate and
compare the performance of thus constructed codes, such
a procedure could offer a shortcut to circuit optimization
for specific hardware.
Acknowledgment: This work was supported in part
by the NSF Division of Physics via grant No. 1820939.
13
Appendix A: Ranks of the matrices
1. Rank of the circuit-EEG generator matrix
Consider a good error-detecting circuit in Fig. 1 with
the constant c = 1/2
κ
in Eq. (11) and the input (or out-
put) code encoding k = n
0
r
0
qubits, where r
0
is the
rank of the input-code stabilizer group. Here κ is an
integer[22]. Also assume that f measurements are re-
dundant, so that the total number of ancillary qubits is
n
a
= κ + r
0
+ f . Generators of the circuit EEG can be
used to propagate any circuit error all the way to the
output layer on the right, which requires 2N 2(n
0
+n
a
)
independent generators, where N is the number of circuit
locations. In addition, there are n
a
ancillary Z
j
gener-
ators on the input and n
a
on the output layers. How-
ever, not all of them are independent. Indeed, when
the n
a
ancillary Z
j
operators are propagated to the
right, we get r
0
independent operators, each containing
a product of ancillary Z
j
and an element of the stabi-
lizer group, κ independent operators containing ancillary
X
j
, and f additional combinations that are redundant
except for a product of ancillary Z
j
. Overall, this gives
rank G = 2N 2(n
0
+ n
a
) + n
a
+ κ + r
0
, which is the
same as in Eq. (12).
2. Rank of the circuit stabilizer group
The circuit stabilizer group must commute with any
EEG generator and also with any logical operator of
the output code (say). Necessarily, its elements must
be spackles[22]. Any spackle is uniquely determined by
its support on the input layer, thus, there are total of
2(n
0
+ n
a
) independent spackles. We also have to en-
sure commutativity with ancillary Z
j
generators on the
left (drop n
a
spackles) and on the right (drop κ spack-
les). Additional r
0
spackles have to be removed to en-
sure commutativity with the elements of the output code
stabilizer generators, and 2k spackles to ensure commu-
tativity with the output code logical operators. Overall,
this leaves
rank H = 2(n
0
+ n
a
) n
a
κ r
0
2(n
0
r
0
)
= n
a
κ + r
0
,
which is exactly the result in Eq. (14).
[1] P. W. Shor, “Scheme for reducing decoherence in quan-
tum computer memory,” Phys. Rev. A 52, R2493 (1995).
[2] C. G. Almudever, L. Lao, X. Fu, N. Khammassi,
I. Ashraf, D. Iorga, S. Varsamopoulos, C. Eichler,
A. Wallraff, L. Geck, A. Kruth, J. Knoch, H. Bluhm,
and K. Bertels, “The engineering challenges in quantum
computing,” in Design, Automation Test in Europe Con-
ference Exhibition (DATE), 2017 (2017) pp. 836–845.
[3] P. Aliferis, D. Gottesman, and J. Preskill, “Quan-
tum accuracy threshold for concatenated distance-3
codes,” Quantum Inf. Comput. 6, 97–165 (2006), quant-
ph/0504218.
[4] David S. Wang, Austin G. Fowler, and Lloyd C. L. Hol-
lenberg, “Surface code quantum computing with error
rates over 1%,” Phys. Rev. A 83, 020302 (2011).
[5] Christopher T. Chubb and Steven T. Flammia, “Statisti-
cal mechanical models for quantum codes with correlated
noise,” (2018), unpublished, 1809.10704.
[6] E. Dennis, A. Kitaev, A. Landahl, and J. Preskill,
“Topological quantum memory,” J. Math. Phys. 43, 4452
(2002).
[7] Austin G. Fowler, Adam C. Whiteside, and Lloyd C. L.
Hollenberg, “Towards practical classical processing for
the surface code,” Phys. Rev. Lett. 108, 180501 (2012).
[8] A. G. Fowler, M. Mariantoni, J. M. Martinis, and A. N.
Cleland, “Surface codes: Towards practical large-scale
quantum computation,” Phys. Rev. A 86, 032324 (2012).
[9] Austin G. Fowler, Adam C. Whiteside, Angus L.
McInnes, and Alimohammad Rabbani, “Topological
code autotune,” Phys. Rev. X 2, 041003 (2012).
[10] Christopher Chamberland, Guanyu Zhu, Theodore J. Yo-
der, Jared B. Hertzberg, and Andrew W. Cross, “Topo-
logical and subsystem codes on low-degree graphs with
flag qubits,” Phys. Rev. X 10, 011022 (2020).
[11] Christopher Chamberland, Aleksander Kubica,
Theodore J Yoder, and Guanyu Zhu, “Triangular
color codes on trivalent graphs with flag qubits,” New
Journal of Physics 22, 023019 (2020).
[12] Christophe Vuillot, Lingling Lao, Ben Criger, Car-
men Garc´ıa Almud´ever, Koen Bertels, and Barbara M.
Terhal, “Code deformation and lattice surgery are gauge
fixing,” New Journal of Physics 21, 033028 (2019).
[13] Giacomo Torlai and Roger G. Melko, “Neural decoder for
topological codes,” Phys. Rev. Lett. 119, 030501 (2017).
[14] S. Krastanov and L. Jiang, “Deep neural network prob-
abilistic decoder for stabilizer codes,” Scientific Reports
7, 11003 (2017), 1705.09334.
[15] N. P. Breuckmann and X. Ni, “Scalable neural network
decoders for higher dimensional quantum codes,” Quan-
tum 2, 68 (2018), 1710.09489.
[16] Zhih-Ahn Jia, Yuan-Hang Zhang, Yu-Chun Wu, Liang
Kong, Guang-Can Guo, and Guo-Ping Guo, “Efficient
machine-learning representations of a surface code with
boundaries, defects, domain walls, and twists,” Phys.
Rev. A 99, 012307 (2019).
[17] Paul Baireuther, Thomas E. O’Brien, Brian Tarasin-
ski, and Carlo W. J. Beenakker, “Machine-learning-
assisted correction of correlated qubit errors in a topo-
logical code,” Quantum 2, 48 (2018).
[18] Christopher Chamberland and Pooya Ronagh, “Deep
neural decoders for near term fault-tolerant experi-
ments,” Quantum Science and Technology 3, 044002
(2018).
[19] P. Baireuther, M. D. Caio, B. Criger, C. W. J. Beenakker,
and T. E. O’Brien, “Neural network decoder for topolog-
ical color codes with circuit level noise,” New Journal of
14
Physics 21, 013003 (2019).
[20] Nishad Maskara, Aleksander Kubica, and Tomas
Jochym-O’Connor, “Advantages of versatile neural-
network decoding for topological codes,” Phys. Rev. A
99, 052351 (2019).
[21] D. Bacon, S. T. Flammia, A. W. Harrow, and J. Shi,
“Sparse quantum codes from quantum circuits,” in Pro-
ceedings of the Forty-Seventh Annual ACM on Sympo-
sium on Theory of Computing, STOC ’15 (ACM, New
York, NY, USA, 2015) pp. 327–334, 1411.3334.
[22] D. Bacon, S. T. Flammia, A. W. Harrow, and J. Shi,
“Sparse quantum codes from quantum circuits,” IEEE
Transactions on Information Theory 63, 2464–2479
(2017).
[23] Jozef Streˇcka, “Generalized algebraic transformations
and exactly solvable classical-quantum models,” Physics
Letters A 374, 3718 3722 (2010).
[24] Christopher Chamberland and Michael E. Beverland,
“Flag fault-tolerant error correction with arbitrary dis-
tance codes,” Quantum 2, 53 (2018), 1708.02246.
[25] C. Chamberland and A. W. Cross, “Fault-tolerant magic
state preparation with flag qubits,” Quantum 3, 143
(2019), 1811.00566.
[26] Rui Chao and Ben W. Reichardt, “Quantum error correc-
tion with only two extra qubits,” Phys. Rev. Lett. 121,
050502 (2018).
[27] H´ector Bomb´ın, “Single-shot fault-tolerant quantum er-
ror correction,” Phys. Rev. X 5, 031043 (2015).
[28] Benjamin J. Brown, Naomi H. Nickerson, and Dan E.
Browne, “Fault-tolerant error correction with the gauge
color code,” Nature Communications 7, 12302 (2016).
[29] Earl T. Campbell, “A theory of single-shot error correc-
tion for adversarial noise,” Quantum Science and Tech-
nology 4, 025006 (2019), 1805.09271.
[30] I. Dumer, A. A. Kovalev, and L. P. Pryadko, “Thresh-
olds for correcting errors, erasures, and faulty syndrome
measurements in degenerate quantum codes,” Phys. Rev.
Lett. 115, 050502 (2015), 1412.6172.
[31] A. A. Kovalev, S. Prabhakar, I. Dumer, and L. P.
Pryadko, “Numerical and analytical bounds on threshold
error rates for hypergraph-product codes,” Phys. Rev. A
97, 062320 (2018), 1804.01950.
[32] David Poulin, “Stabilizer formalism for operator quan-
tum error correction,” Phys. Rev. Lett. 95, 230504
(2005).
[33] Dave Bacon, “Operator quantum error-correcting subsys-
tems for self-correcting quantum memories,” Phys. Rev.
A 73, 012340 (2006).
[34] Daniel Gottesman, Stabilizer Codes and Quantum Error
Correction, Ph.D. thesis, Caltech (1997).
[35] A. R. Calderbank, E. M. Rains, P. M. Shor, and
N. J. A. Sloane, “Quantum error correction via codes
over GF(4),” IEEE Trans. Info. Theory 44, 1369–1387
(1998).
[36] Jeroen Dehaene and Bart De Moor, “Clifford group, sta-
bilizer states, and linear and quadratic operations over
GF(2),” Phys. Rev. A 68, 042318 (2003).
[37] Scott Aaronson and Daniel Gottesman, “Improved sim-
ulation of stabilizer circuits,” Phys. Rev. A 70, 052328
(2004).
[38] Bin Dai, Shilin Ding, and Grace Wahba, “Multivariate
Bernoulli distribution,” Bernoulli 19, 1465–1483 (2013).
[39] F. Wegner, “Duality in generalized Ising models and
phase transitions without local order parameters,” J.
Math. Phys. 2259, 12 (1971).
[40] A. J. Landahl, J. T. Anderson, and P. R. Rice, “Fault-
tolerant quantum computing with color codes,” (2011),
presented at QIP 2012, December 12 to December 16,
arXiv:1108.5738.
[41] A. A. Kovalev and L. P. Pryadko, “Spin glass re-
flection of the decoding transition for quantum error-
correcting codes,” Quantum Inf. & Comp. 15, 0825
(2015), arXiv:1311.7688.
[42] Lars Onsager, “Crystal statistics. I. a two-dimensional
model with an order-disorder transition,” Phys. Rev. 65,
117–149 (1944).
[43] Shigeo Naya, “On the spontaneous magnetizations of
honeycomb and Kagom´e Ising lattices,” Progress of The-
oretical Physics 11, 53–62 (1954).
[44] Michael E. Fisher, “Transformations of Ising models,”
Phys. Rev. 113, 969–981 (1959).
[45] Sergey Bravyi, Martin Suchara, and Alexander Vargo,
“Efficient algorithms for maximum likelihood decoding
in the surface code,” Phys. Rev. A 90, 032326 (2014).
[46] Markus Hauru, Clement Delcamp, and Sebastian Miz-
era, “Renormalization of tensor networks using graph-
independent local truncations,” Phys. Rev. B 97, 045111
(2018).
[47] M. de Koning, Wei Cai, A. Antonelli, and S. Yip,
“Efficient free-energy calculations by the simulation of
nonequilibrium processes,” Computing in Science Engi-
neering 2, 88–96 (2000).
[48] Charles H. Bennett, “Efficient estimation of free energy
differences from Monte Carlo data,” Journal of Compu-
tational Physics 22, 245268 (1976).
[49] Tobias Preis, Peter Virnau, Wolfgang Paul, and Jo-
hannes J. Schneider, {GPU} accelerated monte carlo
simulation of the 2d and 3d ising model,” Journal of Com-
putational Physics 228, 4468 4477 (2009).
[50] A. Gilman, A. Leist, and K. A. Hawick, “3D lat-
tice Monte Carlo simulations on FPGAs,” in Proceed-
ings of the International Conference on Computer De-
sign (CDES) (The Steering Committee of The World
Congress in Computer Science, Computer Engineering
and Applied Computing (WorldComp), 2013).
[51] Kun Yang, Yi-Fan Chen, Georgios Roumpos, Chris
Colby, and John Anderson, “High performance Monte
Carlo simulation of Ising model on TPU clusters,”
(2019), unpublished, 1903.11714.
[52] D. Poulin and Y. Chung, “On the iterative decoding of
sparse quantum codes,” Quant. Info. and Comp. 8, 987
(2008), arXiv:0801.1241.
[53] Ye-Hua Liu and David Poulin, “Neural belief-
propagation decoders for quantum error-correcting
codes,” Phys. Rev. Lett. 122, 200501 (2019), 1811.07835.
[54] Alex Rigby, J. C. Olivier, and Peter Jarvis, “Modi-
fied belief propagation decoders for quantum low-density
parity-check codes,” Phys. Rev. A 100, 012330 (2019),
1903.07404.
[55] A. A. Kovalev, I. Dumer, and L. P. Pryadko, “Design
of additive quantum codes via the code-word-stabilized
framework,” Phys. Rev. A 84, 062319 (2011).
[56] Pavithran Iyer and David Poulin, “Hardness of decoding
quantum stabilizer codes,” IEEE Transactions on Infor-
mation Theory 61, 5209–5223 (2015), arXiv:1310.3235.
[57] E. A. Kruk, “Decoding complexity bound for linear block
codes,” Probl. Peredachi Inf. 25, 103–107 (1989), (In
Russian).
15
[58] J. T. Coffey and R. M. Goodman, “The complexity of
information set decoding,” IEEE Trans. Info. Theory 36,
1031 –1037 (1990).
[59] Andrew J. Viterbi, “Error bounds for convolutional codes
and an asymptotically optimum decoding algorithm,”
IEEE Transactions on Information Theory 13, 260–269
(1967).
[60] R. G. Gallager, Low-Density Parity-Check Codes (M.I.T.
Press, Cambridge, Mass., 1963).
[61] M. P. C. Fossorier, “Iterative reliability-based decoding
of low-density parity check codes,” IEEE Journal on Se-
lected Areas in Communications 19, 908–917 (2001).
[62] Thomas J. Richardson and R¨udiger L. Urbanke, “The ca-
pacity of low-density parity-check codes under message-
passing decoding,” Information Theory, IEEE Transac-
tions on 47, 599–618 (2001).
[63] David Declerq, Marc Fossorier, and Ezio Biglieri, eds.,
Channel Coding. Theory, Algorithms, and Applications
(Academic Press