Product Decomposition of Periodic Functions in Quan-
tum Signal Processing
Jeongwan Haah
Microsoft Quantum and Microsoft Research, Redmond, Washington, USA
20 Septemb er 2019
We consider an algorithm to approximate complex-valued periodic functions
f(e
) as a matrix element of a product of SU(2)-valued functions, which under-
lies so-called quantum signal processing. We prove that the algorithm runs in
time O(N
3
polylog(N/)) under the random-access memory model of computation
where N is the degree of the polynomial that approximates f with accuracy ;
previous eﬃciency claim assumed a strong arithmetic model of computation and
lacked numerical stability analysis.
1 Introduction
Quantum signal processing [13] refers to a scheme to construct an operator V from a more
elementary unitary W where V =
P
θ
f(e
) |θihθ| and W = e
|θihθ| share the eigenvectors
but the eigenvalues of V are transformed by a function f from those of W . The transformation
requires only one ancilla qubit, and is achieved by implementing control-W and control-W
,
interspersed by single-qubit rotations on the control, and ﬁnal post-selection on the control.
1
This technique produced gate-eﬃcient quantum algorithms for, e.g., Hamiltonian simulations,
which is asymptotically optimal when the Hamiltonian is sparse and given as a blackbox, or
as a linear combination of oracular unitaries [5, 6]. Furthermore, this technique with rigorous
error bounds appears to be useful and competitive even for explicitly described, rather than
oracular, local Hamiltonian simulation problems [7, 8]. It is also promised to be useful in
solving linear equations [3, 4, 9].
However, in quantum signal processing the classical preprocessing to ﬁnd interspersing
single-qubit rotations for a given transformation function f has been so numercially unstable
that it has been unclear whether it can be performed eﬃciently. In fact, Ref. [7, App. H.3]
reports that the computation time is “prohibitive” to obtain sequences of length & 30 of
interspersing unitaries for Jacobi-Anger expansions that we explain in Section 5. The true
usefulness of quantum signal processing hinges upon the ability to compute long sequences of
interspersing single-qubit rotations.
It has been asserted that there exists a polynomial time classical algorithm in Refs. [2, 4],
but these work do not consider numerical instability. If a computational model assumes
1
Sometimes it is possible to avoid controlled version of W [3, 4], but we contend ourselves with this
implementation for its simplicity in presentation. The result of this paper is applicable for the ancilla-free
variant; the only change one may have to do is to replace a variable e
iθ/2
with e
.
Accepted in Quantum 2019-09-27, click title to verify 1
arXiv:1806.10236v3 [quant-ph] 27 Sep 2019
that any real arithmetic with arbitrarily high precision can be done in unit time, then an
unavoidable conclusion is that not only can one factor integers in time that is linear in the
number of digits [10], but also solve NP-hard problems in polynomial time [11].
2
In a seemingly
mundane problem involving real numbers, the number of required bits during the computation
can be a priori very large. For example, it is still an open problem whether one can decide the
larger between
P
k
j=1
a
j
and
P
k
j=1
p
b
j
, which are sums of square roots of positive integers
a
j
, b
j
that are smaller than n, in time poly(k log n) on a Turing machine; see e.g. [12, 13] for
recent results.
The numerical instability of previous methods may be attributed to expansions of large
degree polynomials that are found by roots of another polynomial. Crudely speaking, there
are two problems in this approach. First, the polynomial expansions can be regarded as
the computation of convolutions,
3
which, when naively iterated, may suﬀer from numerical
instability. Second, although the root ﬁnding is a well-studied problem, to use the roots to
construct another polynomial one has to understand the distribution of the roots to keep
track of the loss of precision. These problems were not addressed previously.
Here, we reﬁne a classical algorithm to ﬁnd interspersing single-qubit rotations and bound
the number of required bits in the computation for a desired accuracy to a ﬁnal answer. We
conclude that the classical preprocessing can indeed be done in polynomial time on a Tur-
ing machine. We generally adopt the methods of Ref. [2], but make manipulations easier by
avoiding trigonometric polynomials. Some generalizations are obtained with simpler calcula-
tions. For the numerical stability and analysis, our algorithm avoids too small numbers by
sacriﬁcing approximation quality in the initial stage, and replaces polynomial expansions by a
Fourier transform. These modiﬁcations enable us to handle the problems that are mentioned
above. However, it should be made clear that our reﬁnement also requires high precision
arithmetic. Speciﬁcally, we show that O(N log(N/)) bits of precision during the execution
of our algorithm is suﬃcient to produce a reliable ﬁnal decomposition, where N is the degree
of the polynomial that approximates a given transformation function f of eigenvalues up to
additive error . Previously no such bound was known. On a sequential (rather than parallel)
random access machine, our algorithm runs in time O(N
3
polylog(N/)). In our rudimen-
tary numerical experiment we were able to produce a sequence of length over 2000 for the
Jacobi-Anger expansion on a laptop by a few hours of computation.
We will start by reviewing quantum signal processing in the next section, and then develop
an algorithm and analyze it. In two short sections later we provide self-contained treatment
of polynomial approximations for Hamiltonian simulation and matrix inversion problems.
The section on Hamiltonian simulation contains some running time data of our algorithm.
Throughout the paper, we use U (1) to denote the group of all complex numbers of unit
modulus. Sometimes we will refer to U(1) as the unit circle. As usual, i =
1, and
X = |1ih0| + |0ih1|, Y = i |1ih0| i |0ih1|, Z = |0ih0| |1ih1| are Pauli matrices.
2
The basic idea in Ref. [11] is to relate the expansion of a Boolean formula in a conjunctive form into a
disjunctive form with the expansion of a polynomial from its factors. Then, the satisﬁability translates to the
positiveness of certain coeﬃcients in the expansion. The variable in the polynomial is set to a suﬃciently large
number, eﬀectively encoding the entire polynomial in a single big number. An important elementary fact that
underlies the power of arithmetic model is that we only need O(n) compositions of squaring operation x 7→ x
2
to reach a 2
n
-bit number.
3
A polynomial
P
j
a
j
t
j
can be identiﬁed with a “coeﬃcient function” j 7→ a
j
. The coeﬃcient function of
the product of two polynomials is the convolution of the two coeﬃcient functions.
Accepted in Quantum 2019-09-27, click title to verify 2
2 Quantum Signal Processing
To understand how the eigenvalue transformation (signal processing) works, it is convenient
to consider the action of control-W restricted to an arbitrary but ﬁxed eigenstate |θi of W .
The eigenvalue e
associated with |θi is kicked back to the control qubit to induce a unitary
|0ih0| + e
|1ih1| on the control qubit. Conjugating the control-W by a single qubit unitary
on the control qubit, we see that |0i and |1i can be any orthonormal basis vectors |0
0
i, |1
0
i of
the control qubit. If we allow ourselves to implement the inverse of the control-W , which is
reasonable, we can also implement
|0
0
ih0
0
| + e
|1
0
ih1
0
|

|0
00
ih0
00
| + e
|1
00
ih1
00
|
(1)
=
e
iθ/2
|0
0
ih0
0
| + e
iθ/2
|1
0
ih1
0
|
e
iθ/2
|0
00
ih0
00
| + e
iθ/2
|1
00
ih1
00
|
where {|0
0
i, |1
0
i} and {|0
00
i, |1
00
i} are arbitrary orthonormal bases.
4
When we alternate an
even number of control-W and control-W
, this trick allows us to assume that an imple-
mentable unitary on the control qubit is a product of primitive matrices
E
P
(t) = tP + t
1
(I P ) = tP + t
1
Q (2)
where t = e
iθ/2
and P = I Q is a projector of rank 1. Thus, an even number n of control-W
and control-W
, together with an extra unitary E
0
independent of t, induces
F (t) = E
0
E
P
1
(t)E
P
2
(t) ···E
P
n
(t) (3)
on the control qubit. The product F (t) can be thought of as an SU (2)-valued function over
the unit circle in the complex plane. By the same formulas, we deﬁne E
P
(t) and F (t) on the
entire complex plane except the origin t = 0. Now if h+|F (t) |+i is close to f(t
2
= e
) for
all t U(1), then post-selection on |+i of the control qubit enacts V . Here, the choice of
|+i is a convention, as it can be any other state due to E
0
. The success probability of the
post-selection depends on the magnitude of f(e
). A natural question is then what F (t) is
achievable in the form of Eq. (3). Note that it makes no diﬀerence to insert many unitaries
that are independent of t in between E
P
j
(t)’s, rather than a single E
0
at the front, because
UE
P
(t)U
= E
UP U
(t) for any unitary U. The answer to the achievability question turns out
to be quite simple, as we show in the next section.
3 Polynomial functions U (1) SU (2)
Deﬁnition 1. For any integer n 0, let P
n
be the set of all Laurent polynomials F (t) =
P
n
j=n
C
j
t
j
in t with coeﬃcients C
j
in 2-by-2 complex matrices, such that F (η) SU(2) for
all complex numbers η of unit modulus. We say that F (t) P
n
has degree n if C
n
6= 0 or
C
n
6= 0. We deﬁne E
n
to be the subset of P
n
consisting of all F (t) where the exponents
of t in F (t) belong to {−n, n + 2, n + 4, . . . , n 2, n}. Note that P
0
= E
0
= SU(2), and
4
The square root function e
7→ e
iθ/2
has a branch cut, but it hardly matters to us as long as we are
consistent that e
iθ/2
denotes the inverse of the square root.
Accepted in Quantum 2019-09-27, click title to verify 3
for any orthogonal projector P we have E
P
(t) E
1
. We deﬁne F
(t) to be
P
j
C
j
t
j
.
5
In a
set-theoretic notation, the deﬁnitions are as the following.
P
n
=
F (t) =
n
X
j=n
C
j
t
j
Mat(2; C)[t, t
1
]
η C, |η| = 1 F (η) SU(2)
(4)
E
n
=
F (t) =
n
X
j=n
C
j
t
j
P
n
k 2Z + 1, C
n+k
= 0
(5)
Note that for any F (t) P
n
, we have F (t)F
(1/t) = F
(1/t)F (t) = I; this is true for
every t on the unit circle, an inﬁnite set, and any (Laurent) polynomial is determined by its
values on an inﬁnite set.
Theorem 2. Any n-fold product E
P
1
(t) ···E
P
n
(t) belongs to E
n
. Conversely, every F (t) E
n
of degree n has a unique decomposition into primitive matrices and a unitary, as in Eq. (3).
If F (t) = C
n
t
n
+ ··· + C
n
t
n
, then P
n
= C
n
C
n
/ Tr(C
n
C
n
).
This completely characterizes polynomial functions U (1) SU(2) and covers all previous
results on “achievable functions” in quantum signal processing [2, 4]. Indeed, for any Laurent
polynomial function U(1) 3 z 7→ F (z) SU (2), the Laurent polynomial function t 7→ F (t
2
)
belongs to E
n
for some n and has a unique product decomposition of the theorem. Our version
is slightly more general since previous results implicitly assume that Tr(P
j
Z) = 0.
Proof. The ﬁrst statement is trivial by deﬁnition. The proof of the converse is by induction
in n where the base case n = 0 is trivial. The induction step is proved as follows. We are
going to prove that for any F (t) E
n
of degree n > 0 there exists a unique E
K
(t) such that
F (t)E
K
(t) E
n1
.
6
Consider F (t) =
P
n
j=n
C
j
t
j
as a 2-by-2 matrix of four Laurent polynomials. The deﬁning
property det F (t) = 1 holds for inﬁnitely many values of t, and therefore it should hold as a
polynomial equation. Taking the leading term, we have t
2n
det C
n
+ (lower order terms) = 1.
Similarly, taking the leading term in t
1
, we have t
2n
det C
n
+ (higer order terms) = 1.
Hence,
det C
n
= 0 = det C
n
. (6)
Similarly, from the equation F
(1/t)F (t) = I = F (t)F
(1/t) we have t
2n
C
n
C
n
+ O(t
2n1
) =
I = t
2n
C
n
C
n
+ O(t
2n1
) and hence
C
n
C
n
= 0 = C
n
C
n
. (7)
Since the degree of F (t) is n, at least one of C
n
and C
n
is nonzero. Suppose C
n
6= 0.
Let K be a rank-1 projector such that C
n
K = 0, and let L = I K. Such K is unique
since C
n
is a two-by-two matrix of rank one; the singular value decomposition of C
n
is |aihb|
5
For an indeterminant t, we do not deﬁne (F (t))
. For a unimodular complex number η, the hermitian
conjugate of the unitary F (η) is (F (η))
= F
(¯η) = F
(1) which is not equal in general to F
(η).
6
A reader might ﬁnd it unusual that the degree of a polynomial is decreasing under multiplication, but in
the algebra of matrices two nonzero matrices may multiply to vanish.
Accepted in Quantum 2019-09-27, click title to verify 4
for some unnormalized vectors |ai, |bi, and K has to annihilate |bi. Then, we claim that
F (t)(tK + t
1
L) E
n1
. Indeed, expanding the left-hand side we have
t
n1
C
n
L + t
n+1
(C
n
K + C
n+2
L) + ··· + t
n1
(C
n
L + C
n2
K) + t
n+1
C
n
K. (8)
(This is the only place we use E instead of P.) If C
n
= 0, this implies the claim. If
C
n
6= 0, then, considering the singular value decomposition of C
n
, we have K C
n
C
n
and therefore C
n
L = 0, implying the claim. The case C
n
6= 0 is completely parallel.
Actually, for any F (t) E
n
of degree n, C
n
6= 0 if and only if C
n
6= 0: C
n
is a product
of n rank-one operators E
0
P
1
···P
n
= E
0
|p
1
ihp
1
|p
2
i···hp
n1
|p
n
ihp
n
|, where hp
j
|p
j+1
i 6= 0
for all j, and this implies Q
j
Q
j+1
= (I P
j
)(I P
j+1
) 6= 0 for all j, which is to say that
C
n
= E
0
Q
1
···Q
n
6= 0.
3.1 Parity constraints
Any member of SU(2) can be written as aI +biX +ciY + diZ where the real numbers a, b, c, d
satisfy a
2
+ b
2
+ c
2
+ d
2
= 1, and this decomposition is unique. (The group SU(2) is identiﬁed
with the group of all unit quaternions.) Thus, a member F (z) P
n
can be written uniquely as
F (z) = a(z)I + b(z)iX + c(z)iY + d(z)iZ. Here, a(z), b(z), c(z), d(z) are Laurent polynomials
such that a(z)
2
+ b(z)
2
+ c(z)
2
+ d(z)
2
= 1, and each takes real values on U(1).
Recall that under the standard representation of Pauli matrices, Z is diagonal, and X, Y
are oﬀ-diagonal. Suppose an SU(2)-valued function θ 7→ F (e
) = a(e
)I + b(e
)iX +
c(e
)iY + d(e
)iZ has even functions (reciprocal in t = e
) in the diagonal and odd (anti-
reciprocal in t = e
) in the oﬀ-diagonal. That is, a(t) = a(1/t), d(t) = d(1/t), b(t) = b(1/t),
and c(t) = c(1/t). We claim that if F (t) is such an element of E
n
, then the primitive
matrix E
P
n
(t) factored from F (t) by Theorem 2 has a property that Tr(ZP
n
) = 0. Since
for any projector P =
1
2
(I + p
x
X + p
y
Y + p
z
Z) where (p
x
, p
y
, p
z
) R
3
has norm 1, the
primitive matrix E
P
(t) equals tP + t
1
(I P ) =
t+t
1
2
I +
tt
1
2
(p
x
X + p
y
Y + p
z
Z), the
condition Tr(ZP ) = 0 is to say p
z
= 0. This implies that E
P
n
(t) has reciprocal diagonal and
anti-reciprocal oﬀ-diagonal. To prove the claim we observe that
Z = F (t)F
(1/t)Z
= F (t)(a(1/t) b(1/t)iX c(1/t)iY d(1/t)iZ)Z
= F (t)(a(t) + b(t)iX + c(t)iY d(t)iZ)Z (9)
= F (t)Z(a(t) b(t)iX c(t)iY d(t)iZ)
= F (t)ZF
(t).
It follows that 0 = Tr(Z) = Tr(F (t)ZF
(t)) = t
2n
Tr(C
n
ZC
n
) + ··· + t
2n
Tr(C
n
ZC
n
) as
a polynomial in t, and hence Tr(C
n
C
n
Z) = 0 = Tr(C
n
C
n
Z) when n > 0. The matrix
C
±n
C
±n
is proportional to the projector P
n
or I P
n
in the factor E
P
n
(t) as shown in
Theorem 2, and therefore we have Tr(ZP
n
) = 0.
Moreover, it then follows that F (t)E
P
n
(1/t) also has reciprocal diagonal and anti-reciprocal
oﬀ-diagonals. Therefore, the unique projectors P
1
, . . . , P
n
in the decomposition, which deﬁne
the primitive matrices E
P
j
(t), have zero Z-component. This means that all projectors P
j
are
of form
P
j
= e
iZφ
j
/2
|+ih+|e
iZφ
j
/2
(10)
Accepted in Quantum 2019-09-27, click title to verify 5
where φ
j
R is some angle. In fact, Ref. [2] exclusively considered E
P
(t) of this form. This
is contrasted to the general case where P
j
is identiﬁed with a point on the Bloch sphere. The
constraint Tr(ZP
j
) = 0 forces P
j
to lie on the equator of the Bloch sphere.
Note that E
0
, the residual SU(2) factor in the decomposition, has generally nonzero Z-
component. Since E
P
(1) = I for any projector P , we know E
0
= F (1) = a(1)I + b(1)iX +
c(1)iY + d(1)iZ, but b(1) = c(1) = 0 due to their anti-reciprocity. This implies that E
0
=
e
iZφ
0
/2
for some angle φ
0
. Hence, under the parity constraint of this subsection, F (t) E
n
is
uniquely speciﬁed by n + 1 angles φ
0
, φ
1
, . . . , φ
n
. Note that if d(1)
2
= 1 a(1)
2
= O(), then
d(1) = O(
) and kE
0
Ik = O(
). Hence, there would be quadratic loss in accuracy to
omit E
0
(i.e., to set φ
0
= 0), even though a(1) 1 suggests that one would not need E
0
.
3.2 Complementing polynomials
Quantum signal processing does not use F (t) itself, but rather a certain matrix element of it.
Hence, it is important to know what a matrix element can be. Let us ﬁrst introduce classes
of Laurent polynomials.
Deﬁnition 3. A (Laurent) polynomial with real R coeﬃcients is called a real (Laurent)
polynomial. The degree of a Laurent polynomial is the maximum absolute value of the
exponent of the variable whose coeﬃcient is nonzero. A Laurent polynomial f(z) is reciprocal
if f(z) = f(1/z), or anti-reciprocal if f(z) = f(1/z). A Laurent polynomial function
f : C \ {0} C is real-on-circle if f(z) R for all z C of unit modulus. A real-on-circle
Laurent polynomial f (z) is pure if f(z) is real reciprocal or if(z) is real anti-reciprocal.
The term “pure” is because a Laurent polynomial f (z) with complex coeﬃcients is real-
on-circle if and only if both
f
+
(z) :=
f(z) + f(1/z)
2
and f
(z) :=
f(z) f(1/z)
2i
(11)
are real Laurent polynomials. (Proof : Write f(e
) =
P
j
a
j
e
ijθ
, with the complex conjugate
being
P
j
a
j
e
ijθ
=
P
j
a
j
e
ijθ
. Thus, a
j
= a
j
, and the claim follows.) This simply rephrases
the fact that a real-valued function θ 7→ f(e
) has a trigonometric function series with real
coeﬃcients. Hence, for any real-on-circle Laurent polynomial f(z), it is real if and only if it
is reciprocal. In addition, a real and reciprocal Laurent polynomial is real-on-circle. That is,
among the three properties, real, real-on-circle, and reciprocal, any two imply the third. Note
that a real-on-circle Laurent polynomial is not necessarily real, and a real Laurent polynomial
is not necessarily real-on-circle. Also, note that real-on-circle Laurent polynomials form an
algebra over the real numbers.
We are now ready to state a suﬃcient condition under which a complex polynomial qualiﬁes
to be a matrix element of some F (t) P
n
. We think of a(z) and b(z) below as the real and
imaginary parts of a complex function, respectively. A reader might want to compare the
following lemma with the ﬁrst paragraph of Section 3.1.
Lemma 4. Let a(z) and b(z) be real-on-circle Laurent polynomials of degree at most n such
that a(η)
2
+ b(η)
2
< 1 for all η U (1). If a(z)
2
+ b(z)
2
is reciprocal (e.g., a(z) and b(z) are
pure), then there exist pure real-on-circle Laurent polynomials c(z) = c
+
(z) and d(z) = id
(z)
of degree at most n such that a(z)
2
+ b(z)
2
+ c(z)
2
+ d(z)
2
= 1.
Accepted in Quantum 2019-09-27, click title to verify 6
The conditions in the lemma on reciprocity are due to a technical reason in the proof.
If there were some other reason under which one can guarantee the existence of the comple-
menting polynomials, it would be possible to use them in the algorithm below, and the scope
of the input functions in our algorithm would be enlarged; the existence of the complementing
polynomials is more important than the reciprocity constraints. However, we note that the
reciprocity conditions are not severe restrictions since any periodic function is the sum of an
even and an odd function, and one can “combine” two functions by “ﬂexible” quantum signal
processing [14].
Proof. The Laurent polynomial 1 a(z)
2
b(z)
2
is reciprocal real of degree n
0
that is at most
2n; the leading terms of a(z)
2
and b(z)
2
might cancel each other so that n
0
< 2n. Due to the
reciprocity, there are 2n
0
roots in total with multiplicity taken into account and any root r
must come in a pair (z, z
1
) where one is inside the unit disk and the other outside the unit
disk, but neither is on the unit circle. We collect all the roots inside the unit disk:
D =
h
r C : 1 a(r)
2
b(r)
2
= 0, |r| < 1
i
. (12)
This is a list rather than a set as we take the multiplicities into account; D has exactly n
0
elements. Consider a factor of 1 a(z)
2
b(z)
2
:
e(z) = z
−bn
0
/2c
Y
r∈D
(z r). (13)
The monomial in front of the product is to balance the greatest exponent of z with the least
exponent; the degree of e(z) is dn
0
/2e. The list D is closed under complex conjugation due to
the reality of 1 a(z)
2
b(z)
2
, and hence e(z) is a real Laurent polynomial.
Then, the product e(z)e(1/z) is real reciprocal and has degree n
0
.
7
Now the two Laurent
polynomials e(z)e(1/z) and 1 a(z)
2
b(z)
2
have the same roots. Therefore they diﬀer by a
factor of cz
k
for some nonzero number c and an integer k, but the reciprocity ﬁxes k = 0 and
the reality puts c into R. That is,
α =
1 a(z)
2
b(z)
2
e(z)e(1/z)
R.
Evaluating this expression at z = 1, we see that α is positive. Thus, we ﬁnish the proof by
observing
1 a(z)
2
b(z)
2
= αe(z)e(1/z) =
e(z) + e(1/z)
2
α
2
+
e(z) e(1/z)
2i
α
2
. (14)
Both the reciprocal (c(z)) and anti-reciprocal (d(z)) combinations have degree dn
0
/2e
n.
Note that given a(z) and b(z) the complementing Laurent polynomials c(z), d(z) are not
unique in general. As long as the joint of a conjugate-closed list D and its reciprocal [1/r
C : r D] is the list of all the roots of 1 a(z)
2
b(z)
2
, we can construct c(z) and d(z)
satisfying the conditions in the lemma.
7
The degree of a Laurent polynomial in our deﬁnition is only subadditive under multiplication of two
Laurent polynomials. For example, the product (z 1)(z
1
2) has degree one.
Accepted in Quantum 2019-09-27, click title to verify 7
4 Eﬃcient implementation with bounded precision
In this section we consider an algorithm to ﬁnd interspersing single-qubit unitaries given a
complex function A(e
)+iB(e
). The algorithm consists of two main parts: ﬁrst, we have to
ﬁnd an SU(2)-valued function of ϕ such that a particular matrix element is the input function.
It suﬃces to ﬁnd a good approximation. Second, we have to decompose the SU(2)-valued
function into a product of primitive matrices. We have already given constructive proofs
for both the steps, but we tailor the construction so that numerical error is reduced and
traceable. We will outline our algorithm ﬁrst, deferring certain details to the next subsection.
The computational complexity will be analyzed subsequently.
Input: A real parameter (0,
1
100
), a list of 2N + 1 complex numbers ζ
k
(k = N, . . . , N)
speciﬁed using at most log
2
(100N/) bits in the ﬂoating point representation, and two
bits p
re
and p
im
.
Here, the list is the Fourier coeﬃcients ζ
k
for frequencies between N and N of
a complex-valued 2π-periodic function A(e
) + iB(e
) =
P
N
k=N
ζ
k
e
ikϕ
subject to
conditions that (i) each of real-valued functions A(e
) and B(e
) has deﬁnite par-
ity (even or odd parity as functions of ϕ), recorded in the two bits p
re
, p
im
, and (ii)
A(e
)
2
+ B(e
)
2
1 for any real ϕ.
The function ϕ 7→ A(e
) + iB(e
) must be suﬃciently close to an ultimate target
function, where the latter is, strictly speaking, not a part of the input for us. This
approximation has nothing to do with the algorithm below, but should be analyzed inde-
pendently for each quantum signal processing problem.
Output: A unitary E
0
SU(2) and an ordered list of 2 ×2 hermitian matrices P
1
, . . . , P
2n
where
n N.
Here, each P
m
is a (approximate) rank-one projector represented by Ω(log(N/)) bits
of precision and deﬁnes the primitive matrix E
m
(t) = tP
m
+ t
1
(I P
m
). When t =
e
U(1), it holds that
A(t
2
) + iB(t
2
) h+|E
0
E
1
(t) ···E
2n
(t) |+i
30.
Time: The computational time complexity is O(N
3
polylog(N/)) on a random-access memory
machine.
Alternatively, an input may be a list of function values from which Fourier coeﬃcients can
be computed. Having Fourier coeﬃcients for frequencies between N and N is equivalent to
having a Laurent polynomial A(z) + iB(z) of degree at most N .
Our Lemma 4 would allow more general input functions where the real and imaginary
parts are not necessarily of deﬁnite parity, but we restrict our algorithm to inputs of deﬁnite
parity, since we require A
2
+ B
2
to be of deﬁnite parity which may not be satisﬁed after the
rational approximation in Step 1 below if individual components are not of deﬁnite parity. As
mentioned before, this parity constraint is not too restrictive since quantum signal processing
can be easily adopted to handle functions of indeﬁnite parity [14].
4.1 Algorithm
1. Compute rational real-on-circle Laurent polynomials a(z), b(z) from (1 10)A(z), (1
10)B(z) by taking rational number approximation for each coeﬃcient up to an additive
Accepted in Quantum 2019-09-27, click title to verify 8
error of /N. If a coeﬃcient is smaller than /N in magnitude, then it must be replaced
by zero. The polynomials a(z) and b(z) are stored as lists of rational numbers (not
ﬂoating point numbers). This step in general decreases the Laurent polynomial degree
from N to n since some small coeﬃcients may be approximated by zero.
R
where R & 2N log
2
(N/), ﬁnd all roots of 1 a(z)
2
b(z)
2
.
See Eq. (37) for a rigorous bound on R. The roots are stored as ﬂoating point numbers.
From now on all real arithmetic will be performed using R-bit ﬂoating point numbers.
3. Evaluate the complementary polynomials computed from the roots of Step 2 according
to Lemma 4 at points of T where
T := {e
2πik/D
| k = 1, . . . , D} (15)
and D is a power of 2 that is larger than 2n + 1. One should not expand c(z), d(z)
before evaluation, but should substitute numerical values for z with accuracy 2
R
in
the factorized form of e(z) in Eq. (13), and then read oﬀ the real (c(z)) and imaginary
(d(z)) parts.
4. Set F (z) = a(z)I+b(z)iX+c(z)iY +d(z)iZ. Compute the discrete fast Fourier transform
of the function value list to obtain
C
(2n)
2j
=
Z
2π
0
dθ
2π
e
ijθ
F (e
) (16)
for j = n, n+1, . . . , n1, n. (In exact arithmetic, we would have F (z) =
P
n
j=n
C
(2n)
2j
z
j
.)
5. Set F
(2n)
(t) =
P
n
j=n
C
(2n)
2j
t
2j
. For m = 2n, 2n 1, . . . , 2, 1 sequentially in decreasing
order, (i) compute a primitive matrix E
m
(t) by
E
m
(t) = tP
m
+ t
1
(I P
m
) = t(I Q
m
) + t
1
Q
m
(17)
where P
m
=
C
(m)
m
C
(m)
m
Tr(C
(m)
m
C
(m)
m
)
, Q
m
=
C
(m)
m
C
(m)
m
Tr(C
(m)
m
C
(m)
m
)
,
and (ii) compute the coeﬃcient list of F
(m1)
(t) = F
(m)
(t)E
m
(1/t) by
C
(m1)
k
= C
(m)
k1
Q
m
+ C
(m)
k+1
P
m
(18)
where k = m + 1, m + 3, . . . , m 3, m 1.
6. Output E
0
= C
(0)
0
and P
1
, . . . , P
2n
using log
2
(20N/)-bit ﬂoating numbers. Then,
h+|E
0
E
1
(t) ···E
2n
(t) |+i (19)
is 30-close to A(t
2
) + iB(t
2
) for all t U(1).
Accepted in Quantum 2019-09-27, click title to verify 9
4.2 Further details and analysis
Step 1. Let the rational approximation be performed by truncating the binary expressions
of real numbers. To inherit parities, the rational approximation should be done only for terms
with nonnegative powers of z, from which we should infer the negative power terms. Then,
a(z) and b(z) satisfy
(i) a(z)
2
+ b(z)
2
is a real reciprocal Laurent polynomial,
(ii) |a(z) + ib(z) A(z) iB(z)| 26 for z U(1),
(iii) a(z)
2
+ b(z)
2
1 for z U(1), and
(iv) every nonzero coeﬃcient in a(z) or b(z) has magnitude /N.
The ﬁrst and the fourth conditions are clear by construction. The second condition is because
|A(z)a(z)| |A(z)(110)A(z)|+|(110)A(z)a(z)| 13 and similarly |B(z)b(z)|
13. The third is because |a(z)+ib(z)| |(110)(A(z)+iB(z))a(z)+ib(z)|+(110) 1.
The reason we speak of rational Laurent polynomials is mainly for the convenience of
analysis, as its evaluation can be made arbitrarily accurate since the coeﬃcients of a(z) and
b(z) are exact; each coeﬃcient is stored as a rational number, a pair of integers, rather than a
ﬂoating point number. In a concrete implementation of our algorithm, ﬂoating point numbers
that are carefully handled may substitute rational numbers. If µ = f × 2
d
is a real number
where
1
2
f < 1 and d Z, then the rational approximation of µ may be ˜µ = 0.b
1
b
2
···b
p
×2
d
for some p where b
j
are bits in the binary representation of f. Some care may be needed in
order not to discard any bits of ˜µ in arithmetic. For example, when ˜µ is added to another
ﬂoating number of p
0
> p bits of precision, then ˜µ should be mapped to a ﬂoating number of
whatever needed bits of precision by padding zeros. In practice, this should cause hardly any
complication since most high precision arithmetic libraries (e.g. The GNU Multiple Precision
Arithmetic Library) treat inputs as exact numbers, but control the precision of the result
according to other rules set by a user. The number of bits to represent all the rational
coeﬃcients is O(N log(N/)).
Step 2. There exists a root-ﬁnding algorithm with computational complexity
˜
O(n
3
+ n
2
R)
under the assumption that all the roots have modulus at most 1 [15]. In our case, the rational
Laurent polynomial p(z) = 1 a(z)
2
b(z)
2
does not satisfy the modulus condition; however,
this is a minor problem. Every coeﬃcient of p(z) is the Fourier coeﬃcient of the periodic
function p(e
) < 1, and hence is bounded by 1. By the condition (iv) of Step 1, the leading
coeﬃcient of p(z) is Ω((/N)
2
) in magnitude. (The reason is as follows. Since our rational
approximation is by truncating binary expressions of real numbers, the denominator of any
coeﬃcient is a power of 2 and is at most 2
dlog
2
(N/)e
. Hence, 4
dlog
2
(N/)e
p(z) has integer
coeﬃcients.) Say the polynomial p(z) = 1 a(z)
2
b(z)
2
has degree n
0
, which may be less
than 2n even if a(z) and b(z) have degree n. Converting p(z) into a monic polynomial q(z)
(after multiplying by z
n
0
and a normalization factor), we have q(z) = z
2n
0
+
P
2n
0
1
j=0
q
j
z
j
with
|q
j
| O((N/)
2
). Note that q(z) has (exactly represented) rational coeﬃcients. If q(z
0
) = 0
with |z
0
| > 1, then
|z
0
|
2n
0
2n
0
1
X
j=0
|q
j
||z
0
|
j
O(N
2
n
0
|z
0
|
2n
0
1
/
2
) (20)
Accepted in Quantum 2019-09-27, click title to verify 10
implying |z
0
| O(N
3
/
2
). (If |z
0
| 1, this is trivial.) We can use the algorithm of Ref. [15]
after rescaling z by a known factor. The overhead due to the potential loss of precision from
the rescaling is negligible since R = Ω(N log(N/)).
Step 3. We need to evaluate e(z) of Eq. (13) that is deﬁned by the roots found in Step 2.
For z U(1), the following Lemma 5 guarantees that any root of 1 a(z)
2
b(z)
2
is at
least /(4N
2
)-away (or Ω(1/N
2
)-away if |1 A(z)
2
B(z)
2
| = O()) from the unit circle. In
particular, with the prescribed accuracy 2
R
there is no numerical ambiguity to determine
whether a root is inside the unit disk. That is, the list D of Eq. (12) can be obviously
computed under our bounded precision arithmetic.
Let us analyze the evaluation accuracy of e(z) for z U (1) more closely. When evaluating
a linear factor z r of e(z) where both z and r are accurate up to additive error 2
R
,
the number of lost signiﬁcant bits is O(log(N/)) which is negligible compared to R. The
function value of e(z) is thus evaluated accurately up to relative error 2
R+O(log(N/))
. Hence,
the function value of c(z) (the real part of e(z)
α) and d(z) (the imaginary part of e(z)
α)
by Eq. (14) are determined up to additive error 2
R+O(log(N/))
.
More concretely but still loosely, let us assume 24N
2
2
R
1
1/(16N). The relative
error of a linear factor z r is at most 2 · 2
R
(4N
2
/). The factor of e(z) for the complex
roots can be evaluated to relative error at most 3·2·2
R
(4N
2
/). There are at most N factors
in e(z), so the value of e(z) is determined up to relative error δ = (1 + 24N
2
2
R
1
)
N
1
48N
3
2
R
1
1/8. This in turn gives an upper bound 200N
3
2
R
1
of the real part c(z) and the imaginary part d(z) since they have magnitude at most 1 on the
unit circle.
Lemma 5. If a real-on-circle Laurent polynomial f(z) of degree d 1 satisﬁes 0 < m
f(z) M for all z U(1), then every zero of f is at least m/(4Md
2
)-away from U(1).
Proof. Pick any root z
0
and choose the closest point u U(1) so that f(z
0
= u + η) = 0. We
will lower bound the magnitude of η. If |η| 1/(2d), then there is nothing to prove, so we
assume |η| < 1/(2d). Since f is analytic except at z = 0, the Taylor series of f at u converges
at z
0
= u + η.
0 = f(u + η) =
X
k0
f
(k)
(u)
k!
η
k
. (21)
Let us estimate the magnitude of derivatives at u U (1). Since the coeﬃcients a
j
of the
polynomial f(u) =
P
d
j=d
a
j
u
j
are the Fourier coeﬃcients, meaning that a
j
is a “weighted”
average of the function values on the unit circle, we know |a
j
| M. Thus
|f
(k)
(u)| 2d · M · d(d + 1) ···(d + k 1); (22)
there are 2d terms (or one more if k = 0 in which case the inequality is true anyway) and the
maximum absolute value of the exponent increases by at most 1 every time we diﬀerentiate
due to the negative degree term. Therefore,
m |f(u)|
X
k1
2dM
d + k 1
k
!
|η|
k
= 2dM
1
(1 |η|)
d
1
2dM · 2d|η| (23)
Accepted in Quantum 2019-09-27, click title to verify 11
where the ﬁrst inequality is the assumption, the second inequality is by rearranging Eq. (21)
and applying triangle inequality, and in the last inequality we use the fact that (1 x)
d
1
is a convex increasing function of positive x and valued at most 1 at x = 1/(2d) for d 1.
Step 4. This is essentially expanding the polynomials c(z) and d(z) found in the root
ﬁnding step, but we use the fast Fourier transform (FFT) for its better accuracy. It has been
shown [16] that the FFT on a k-component input F where each component (that is assumed
to be a complex number in [16]) is accurate to relative error δ, gives Fourier coeﬃcients
˜
ˆ
F
ω
with error
max
ω
|
˜
ˆ
F
ω
ˆ
F
ω
| O(k
1/2
log k) · δ
s
1
k
X
ω
|
ˆ
F
ω
|
2
(24)
where
ˆ
F
ω
= k
1
P
k
=1
e
iω/k
F (e
i/k
) is the true Fourier spectrum. (Note the normalization
factor k
1
here.) In our case, F consists of 2-by-2 matrices, and Eq. (24) also holds with
Frobenius or operator norms in place of absolute values. Since the input “vector” F in this
Step is a list of unitary matrices, the root-mean-square factor is O(1), and the distinction
between relative and absolute error is immaterial. By the analysis of Step 3 above, δ is
2
R+O(log(N/))
. Thus, the (additive) error δ
2n
in any Fourier coeﬃcient C
(2n)
2j
is at most
2
R+O(log(N/))
. The error here is the operator norm of the diﬀerence between the computed
and the true C
(2n)
2j
.
Crude and concrete bounds can be obtained more directly (without using Eq. (24)): The
coeﬃcients of a(z) and b(z) are exactly known. Those of c(z) and d(z) are computed from
the value table of e(z) from the previous Step, which has entry-wise additive error δ
48N
3
2
R
1
. The (slow) discrete Fourier transform on a k-component vector v is the matrix
multiplication by V = k
1/2
U for a k × k unitary U . (Hence, kV k k
1/2
.) If we compute
the input-independent trigonometric factors in the FFT, often called the twiddle factors,
accurately up to additive error 2
R
, then k
1/2
U is accurate up to additive error δ
F F T
k
1/2
2
R
in operator norm. (This bound is loose compared to that in the previous paragraph,
since in this paragraph we do not consider the fast Fourier transform that is numerically more
stable.) Since k 2N + 1, the conversion from
-norm to 2-norm incurs a factor of at
most
2N + 1. Hence, the additive error δ
2n
in C
(2n)
2j
for any j is at most k
˜
V ˜v V vk
max
k
˜
V ˜v V vk
2
k
˜
V V k · k˜vk
2
+ kV k · k˜v vk
2
δ
F F T
2N + 1 + δ. That is,
δ
2n
400N
3
1
2
R
. (25)
Step 5. This is an implementation of Theorem 2. Thanks to the condition (iv) of Step 1,
we know
C
(2n)
±2n
/N. (26)
Put F (t
2
) = E
0
E
1
(t) ···E
2n
(t). Then, for any m = 1, 2, . . . , 2n, we observe that C
(m)
±m
is
the product
C
(m)
m
= E
0
P
1
P
2
···P
m
, C
(m)
m
= E
0
Q
1
Q
2
···Q
m
. (27)
Accepted in Quantum 2019-09-27, click title to verify 12
In particular, by the submultiplicative rule of operator norm we have
C
(m)
±m
C
(2n)
±2n
/N ; (28)
that is, the leading coeﬃcients never become smaller in norm through the loop over m.
Let δ
m
be the maximum additive error of C
(m)
k
for any k. We assume that δ
m
/(2N).
For brevity, let C = C
(m)
±m
, and let
˜
C be the approximate C due to numerical error. Then,
δ
m
/(2N) kCk/2 and kCk/2 kCk δ
m
˜
C
kCk + δ
m
2kCk. Now,
˜
C
˜
C C
C
˜
C
˜
C C
+
˜
C
C
kCk
2kCk · δ
m
+ δ
m
· kCk (29)
= 3kCkδ
m
Tr(
˜
C
˜
C) Tr(C
C)
1
X
j=0
hj|
˜
C
˜
C C
C |ji
6kCkδ
m
(30)
Tr(C
C) kCk
2
(31)
Tr(
˜
C
˜
C)
˜
C
2
1
4
kCk
2
(32)
Hence, we see that the additive error β
m
in P
(m)
(or Q
(m)
) is
˜
P
(m)
P
(m)
˜
C
˜
C C
C
Tr(
˜
C
˜
C)
+
C
C
Tr(C
C) Tr(
˜
C
˜
C)
Tr(
˜
C
˜
C) Tr(C
C)
4
kCk
2
· 3kCkδ
m
+
4
kCk
2
·
1
kCk
2
· kCk
2
· 6kCkδ
m
36N
1
δ
m
. (33)
In turn, assuming β
m
1 we have
δ
m1
= max
k
˜
C
(m)
k1
˜
Q
m
+
˜
C
(m)
k+1
˜
P
m
C
(m)
k1
Q
m
C
(m)
k+1
P
m
max
k
˜
C
(m)
k1
˜
Q
m
C
(m)
k1
˜
Q
m
+
C
(m)
k1
˜
Q
m
C
(m)
k1
Q
m
+ (similar terms)
2(2δ
m
+ β
m
)
76N
1
δ
m
. (34)
Combining Eqs. (25) and (34) we conclude that
β
0
:= δ
0
400N
3
1
(76N
1
)
2N
2
R
=: Γ. (35)
Step 6. Now we have approximate E
0
0
and P
m
up to β
m
(m =
1, . . . , 2n). The evaluation of E
m
(t) is then accurate up to 2β
m
for t U (1), and that of
a(t
2
) + ib(t
2
) = h+|E
0
E
1
(t) ···E
2n
(t) |+i is accurate up to (suppressing t for brevity)
˜
E
0
···
˜
E
2n
E
0
···E
2n
2n
X
m=0
E
0
···E
m1
˜
E
m
···
˜
E
2n
E
0
···E
m
˜
E
m+1
···
˜
E
2n
2n
X
m=0
2β
m
2n
Y
=m+1
(1 + 2β

)
. (36)
Accepted in Quantum 2019-09-27, click title to verify 13
We want this to be smaller than . A suﬃcient condition is thus
(2N + 1)2Γ (1 + 2Γ)
2N+1
(37)
or R & 2N log
2
(N/). (38)
All the (ad hoc) assumptions in the course of error estimation above, are satisﬁed by this
choice of R.
The correctness of the algorithm is clear from the construction and error analysis above.
The ﬁnal error guarantee is from the condition (ii) of Step 1 and Eq. (37).
4.3 Computational complexity
All arithmetic in the algorithm above operates with at most R-bit numbers, where R is
chosen to be Θ(N log(N/)). The number of elementary bit operations (AND, OR, N OT ) to
perform one basic arithmetic operation (+, , ×, /) on u-bit numbers is upper bounded by
O(u polylog(u)) [17].
8
Let us count the number of arithmetic operations.
Step 1 : Assuming that the coeﬃcients of the true function A(z) and B(z) are given
to accuracy O(/N), it takes O(N log(N/)) arithmetic operations to ﬁnd rational approx-
imations. There could be O(polylog(N/)) additional cost in full complexity in simplifying
rational numbers by Euclid’s algorithm.
Step 2 : The root ﬁnding takes time O(N
3
+ N
2
R) O(N
3
log(N/)) [15]; this count
includes all the cost of bit operations for high precision arithmetic.
Step 3 : Selecting roots inside the unit disk requires O(N) absolute value evaluations and
O(N) comparisons. The polynomial function evaluation involves O(N) arithmetic operations.
We need O(N ) values, resulting in O(N
2
) arithmetic operations.
Step 4 : The FFT requires O(N log N) arithmetic operations given O(log N) trigonometric
function values, which can be computed by Taylor expansions of order R, invoking O(R log N)
arithmetic operations. These result in arithmetic complexity O(N log(N ) log(N/)) for the
FFT.
Step 5 : Updating the Fourier coeﬃcient list C
(m)
k
involves O(N) arithmetic operations,
which we do for O(N) times, resulting in O(N
2
) arithmetic operations.
Overall, the computational complexity is O(N
3
polylog(N/)), under the random-access
memory model of computation.
Practically, it may be useful to run the algorithm with arithmetic precision with, say,
R
0
= 64 bits initially, test the ﬁnal decomposition on O(N)-th roots of unity (which takes
only O(N
2
) operations with O(log(N/))-bit arithmetic), and repeat the whole process under
an exponential scheduling on the number R
r+1
= 2R
r
of bits of precision in the arithmetic,
until the test reveals that the answer is acceptable. In this way the arithmetic uses no more
than twice the number of bits of precision that is actually needed to handle the numerical
instability of our algorithm for a given input, without knowing a tailored number beforehand.
For the upper bound R = O(N log(N/)) in the worst case, there will be at most r
max
=
O(log N + log log(N/)) rounds, but the overall time complexity is a constant multiple of the
last round’s due to the exponential scheduling.
8
This reference is concerned with multiplications, but the division has essentially the same cost using the
Newton’s method [18, 4.3.3 R].
Accepted in Quantum 2019-09-27, click title to verify 14
5 Application to Hamiltonian simulation
In this section, we review and redo existing analysis [1, 19], complemented with a minor
modiﬁcation to our algorithm exploiting a certain structure of the eigenvalue transformation
function.
Suppose we are given with a unitary W whose eigenvalues e
±
λ
are associated with
those λ of a Hermitian matrix H =
P
λ |λihλ| with kHk 1 as
sin θ
λ
= λ. (39)
The correspondence between W and H might seem contrived at this stage, but when H is
represented as a linear combination of unitaries [5, 6], it is possible to construct such W as
a quantum circuit [3]. The relation of Eq. (39) is in fact common whenever quantum walk is
used [20, 21]. (See Appendix A for some detail.) So, the desired transformation is
f : e
±
λ
7→ e
sin θ
λ
h+|F (e
λ
/2
) |+i (40)
where F (t) should be constructed by quantum signal processing [1]. This will implement
e
H
. Since the product of n primitive matrices yields a Fourier component of frequency at
most n/2, F (t) must consist of at least 2τ factors. (The factor of 2 is due to the half-angle in
the argument of F .) Note that the success probability of the post-selection is close to 1 since
|f(e
λ
)| = 1.
With e
= z, we write
exp( sin ϕ) = exp
τ
z z
1
2
!
=
X
kZ
J
k
(τ)z
k
(41)
where J
k
are the Bessel functions of the ﬁrst kind; one can take Eq. (41) as a deﬁnition of the
Bessel functions. This is the Fourier series of ϕ 7→ exp( sin ϕ). The substitution τ τ
and z z together with the uniqueness of the Fourier series implies J
k
(τ) = (1)
k
J
k
(τ).
Similarly, the substitution z 1/z implies J
k
(τ) = (1)
k
J
k
(τ). We separate the reciprocal
and anti-reciprocal parts of the expansion as
exp
τ
z z
1
2
!
=
X
k2Z
J
k
(τ)
z
k
+ z
k
2
| {z }
A(z)
+ i
X
k2Z+1
J
k
(τ)
z
k
z
k
2i
| {z }
B(z)
. (42)
This expansion, called the Jacobi-Anger expansion, converges absolutely at a superexponential
rate. We can use the steepest descent method [22] which is generally applicable. Expressing
the Fourier transform as a contour integral we see
J
k
(τ) =
1
2πi
Z
C
dz
z
k+1
e
(τ/2)(zz
1
)
where C is the unit circle. Since the integrand is analytic except for z = 0, we may deform C.
For 2k > τ > 0, z 2k > 1 is a saddle point for the absolute value of the integrand. We
take C to be a circle through this point, to have that
|J
k
(τ)|
τ
2k
k
Z
2π
0
dθ
2π
exp[(k (τ
2
/4k)) cos θ]
2k
k
for 2k > τ > 0,
|J
k
(τ)|
e|τ|
2|k|
|k|
for k Z \ {0}, τ R. (43)
Accepted in Quantum 2019-09-27, click title to verify 15
It is important that the convergence of the series depends on the size of the region on which
the function is analytic this is a general fact [22]. For the Jacobi-Anger expansion the
function is almost entire, and the convergence is superexponential. Note that [23, 9.1.62]
asserts |J
k
(τ)| |τ/2|
|k|
/|k|! for any τ R and k Z which is tighter than Eq. (43).
Now, a partial sum of the Jacobi-Anger expansion can be written as
X
k:|k|≤N
J
k
(τ)z
k
= J
0
(τ) +
X
even k:2kN
J
k
(τ)(z
k
+ z
k
)
| {z }
˜
A(z)
+ i
X
odd k:1kN
J
k
(τ)
z
k
z
k
i
| {z }
˜
B(z)
.
(44)
This is -close to the full expansion if N = Ω(|τ| + log(1/)) by Eq. (43) for any z U(1).
9
Numerical experiments suggest that the bound is quite tight and it suﬃces to choose
N
e
2
|τ| + ln(1/) 1.36|τ| + 2.30 log
10
(1/). (45)
The Laurent polynomials
˜
A(z) and
˜
B(z) are pure real-on-circle. Applying our algo-
rithm, we obtain real-on-circle Laurent polynomials a(z) = a
+
(z), b(z) = ib
(z), c(z) =
ic
(z), d(z) = d
+
(z). The pure Laurent polynomials c(z), d(z) are calculated by Lemma 4
where we choose c(z) to be anti-reciprocal and d(z) reciprocal. (This choice is to have our
results in the same convention as those of Ref. [2].) Note that every exponent of z of the
polynomial 1 a(z)
2
b(z)
2
here, whose roots must be computed, is even since a(z) has only
even exponents and b(z) has only odd exponents, so it is always better to feed a Laurent
polynomial g of degree n, instead of 2n, where
g(z
2
) = 1 a(z)
2
b(z)
2
(46)
into the root ﬁnding routine. Given the expanded form of 1 a(z)
2
b(z)
2
, it takes no eﬀort
to ﬁnd g. In this case, the intermediate polynomial e(z) in Eq. (13) of Lemma 4 is
e(z) =
Y
cC : g(c)=0, |c|<1
z
c
z
. (47)
We have implemented our algorithm with constants chosen as above using Wolfram Math-
ematica 11, and measured the running time as a function of τ for two ﬁxed values of . The
result is shown in Fig. 1. The running time scales asymptotically as the cubic of τ as expected.
We used internal routines of Mathematica for the rational approximation, high precision arith-
metic, root ﬁnding, and Fourier transform. The computing was by Microsoft Surface Book
with Intel Core i7-6600U at 2.6 GHz and 16 GB of RAM.
6 Application to Matrix inversion
While there are slightly more eﬃcient implementations of matrix inversion problems [9] using
quantum signal processing [4], here we contend ourselves with an eigenvalue transformation
9
The proof of this is along the same lines as proving the convergence of Taylor series for e.g. the exponential
function, and is left to the reader.
Accepted in Quantum 2019-09-27, click title to verify 16
10 50 100 500 1000
Tau0.1
1
10
100
1000
10
4
(sec)
Figure 1: Running time of our algorithm for the Jacobi-Anger expansion as a function of τ , implemented in
Wolfram Mathematica 11. The upper red data set has = 10
9
and the lower blue = 10
4
. The number
of decimal digits in the intermediate steps is (n/3) ln(n/) where n is the degree of the input Laurent
polynomial and the factor of 3 is empirically chosen to make the numerical error negligible. The straight
lines represent functions const · τ
γ
of exponents γ = 3.11 for = 10
9
(red) and γ = 3.17 for = 10
4
(blue). The top right data point has 2172 primitive matrices in the decomposition.
perspective. The techniques of Refs. [4] reduces the number of ancilla qubits by one or two,
and hence relieves some burden of implementing controlled unitary, but the underlying mathe-
matics, regarding polynomial approximations and ﬁnding interspersing single-qubit unitaries,
is unchanged.
Suppose a hermitian matrix H of norm 1 that we wish to invert is block-encoded in a
unitary W so that W has eigenvalues e
±
λ
associated with an eigenvalue λ of H. This
encoding is the same as in the Hamiltonian simulation above. The condition for H being
hermitian is not too restrictive since, for any matrix M, an enlarged matrix |0ih1| M +
|1ih0|M
is always hermitian. Then, we want eigenvalue transformation e
±
λ
7→ 1/ sin θ
λ
.
As we should not invert a singular matrix, we assume that eigenvalues of H are bounded away
from zero by 1 where κ 1 is the condition number of H. (Stricly zero eigenvalues are
ﬁne if we are interested in a pseudo-inverse.) Thanks to the condition number assumption,
we need to ﬁnd a polynomial approximation to the function e
±
λ
7→ 1/ sin θ
λ
that is good
for values sin θ
λ
away from zero by 1. For this purpose, there is a useful polynomial [24]:
Lemma 6. Let > 0, κ 1 and z U(1). Suppose integers b b
0
1 satisfy b κ
2
ln(2/)
and b
0
p
b ln(8/). For sin ϕ = (z z
1
)/(2i) R with |sin ϕ| 1 > 0, we have
2i
κ(z z
1
)
2i
2
2b
κ(z z
1
)
b
0
X
k=b
0
2b
b + k
!
(1 z
2k
)
| {z }
f(z)
. (48)
Accepted in Quantum 2019-09-27, click title to verify 17
Moreover, for all z U(1) we have |f(z)| 2b
0
.
The function f(z) is a genuine Laurent polynomial
10
since the sum vanishes at z = ±1.
Proof. First, let b b
0
1 be any integers, and let z = e
U(1) be any complex number.
Then,
1
z + z
1
2
!
2b
| {z }
g(z)
1
2
2b
b
0
X
k=b
0
2b
b + k
!
(1 z
2k
)
| {z }
h(z)
4e
b
02
/b
(49)
because
2
2b
P
k : |k|>b
0
2b
b+k
(1 z
2k
)
2
2b+1
P
k : |k|>b
0
2b
b+k
and Hoeﬀding’s inequality
on the tail of binomial probability distributions implies Eq. (49). If sin ϕ =
zz
1
2i
with
|sin ϕ| 1 > 0, then
|1 g(z)| =
z + z
1
2
!
2b
e
b/κ
2
, (50)
since (1 sin
2
ϕ)
b
e
b/κ
2
whenever |sin ϕ| 1.
Thus, for large b we see that 1 ((z + z
1
)/2)
2b
vanishes when sin ϕ = 0 (i.e., z = ±1),
but is close to 1 for |sin ϕ| 1. By Eq. (49) this function can be replaced with a lower
degree polynomial function. Indeed, for |sin ϕ| 1 we have
2i
κ(z z
1
)
2ih(z)
κ(z z
1
)
|1 h(z)| e
b/κ
2
+ 4e
b
02
/b
, (51)
which implies Eq. (48).
For the last claim, we observe a chain of (in)equalities: For any integer k 1 we have
(z z
1
)(z
k1
+ z
k3
+ ··· + z
k+3
+ z
k+1
) = z
k
z
k
,
|(z
k
z
k
)/(z z
1
)| k for z U(1),
|sin
2
kϕ/ sin ϕ| |sin kϕ/ sin ϕ| k for ϕ R,
|(2 z
2k
z
2k
)/(z z
1
)| 2k for z U(1).
The function f(z) is the “average” over k of
(2 z
2k
z
2k
)i
(z z
1
)κ
with respect to a subnormalized probability distribution. Therefore the claim follows.
10
The polynomial of Eq. (48) is the same as that in [24, Lemma 17-19]. The bound there is similar to
ours, but the polynomial degree is worse than ours by a factor of log κ. This diﬀerence is due to a diﬀerent
normalization we approximate 1/(κx) rather than 1/x. Our analysis might look simpler, but it is not due
to a “new” approach; the “diﬀerence” is only in the usage of exponential functions rather than trigonometric
functions.
Accepted in Quantum 2019-09-27, click title to verify 18
Choosing b
0
= dκ ln(8/)e, and feeding a real-on-circle anti-reciprocal Laurent polynomial
f(z)/ ln(8/), which is at most 1 in magnitude by the last claim of Lemma 6, into our algo-
rithm, we obtain a desired eigenvalue inversion quantum algorithm. The success probability
can be as small as Ω(1/(κ log(1/))
2
), and hence we had better amplify the amplitude for
post-selection, enlarging the quantum gate complexity by a factor of O(κ log(1/)). Overall,
the quantum gate complexity is proportional to the product of the degree of the Laurent
polynomial above and the number of iterations for the amplitude ampliﬁcation.
7 Discussion
We have determined the scope of SU(2)-valued periodic polynomial functions and their de-
composition (Theorem 2), and analyzed the algorithmic aspects. Our algorithm for the de-
composition is not numerically stable in a usual sense a numerically stable algorithm should
only require polylog(N/) bits of precision, rather than poly(N log(1/)). The instability ap-
pears to be unavoidable in any method that reduces polynomial degree iteratively by one at a
time (such as our Step 5), at least in the early stage of the polynomial degree reduction. The
numerical error arises due to the small norm of leading coeﬃcients in our algorithm, and the
small leading coeﬃcients of an input polynomial are necessary if a nonpolynomial function
admits converging polynomial approximations. However, it might be the case that the lead-
ing coeﬃcient matrix C
(m)
m
becomes large in norm rather quickly in Step 5 of our algorithm,
in which case our analysis could be loose. For a schematic example, consider an identity
t
2n
P + t
2n
(I P ) =
tP + t
1
(I P )
2n
for any projector P . The numerical instability to
decompose the left-hand side into the right-hand side is far less severe than that in the worst
case of Step 5, and similar situations might occur during the execution of Step 5 for some
class of input functions. This deserves further investigation.
Acknowledgments
I thank Guang Hao Low for valuable discussions, and Robin Kothari for useful comments on
the manuscript.
A Jordan’s Lemma and block encoding of Hamiltonians
The following is a well-known fact, but we include it here for completeness.
Lemma 7. Let P and Q be arbitrary self-adjoint projectors on a ﬁnite dimensional complex
vector space V . Then, V decomposes into orthogonal subspaces V
j
invariant under P and Q,
where each V
j
has dimension 1 or 2.
For any hermitian operator H, let supp(H) denote the subspace support of H, i.e., the
orthogonal complement of the kernel of H. Clearly, supp(H) is an invariant subspace of H,
and H is invertible within supp(H).
Proof. We will ﬁnd a subspace W of dimension at most 2 that is invariant under both P and
Q. This is suﬃcient since the orthogonal complement of W is also invariant and the proof
will be completed by induction in the dimension of V .
Accepted in Quantum 2019-09-27, click title to verify 19
Put P
0
= I P and Q
0
= I Q, and consider the identities
P QP + P Q
0
P + P
0
QP
0
+ P
0
Q
0
P
0
= I, (52)
supp(P QP ) + supp(P Q
0
P ) + supp(P
0