Fermat's Library | q-2021-02-11-395 annotated/explained version.

Rapid mixing of path integral Monte Carlo for 1D

stoquastic Hamiltonians

Elizabeth Crosson

1

and Aram W. Harrow

2

1

Center for Quantum Information and Control, University of New Mexico.

2

Center for Theoretical Physics, Massachusetts Institute of Technology.

Path integral quantum Monte Carlo (PIMC) is a method for estimating thermal

equilibrium properties of stoquastic quantum spin systems by sampling from a classical

Gibbs distribution using Markov chain Monte Carlo. The PIMC method has been

widely used to study the physics of materials and for simulated quantum annealing, but

these successful applications are rarely accompanied by formal proofs that the Markov

chains underlying PIMC rapidly converge to the desired equilibrium distribution.

In this work we analyze the mixing time of PIMC for 1D stoquastic Hamiltonians,

including disordered transverse Ising models (TIM) with long-range algebraically decay-

ing interactions as well as disordered XY spin chains with nearest-neighbor interactions.

By bounding the convergence time to the equilibrium distribution we rigorously justify

the use of PIMC to approximate partition functions and expectations of observables

for these models at inverse temperatures that scale at most logarithmically with the

number of qubits.

The mixing time analysis is based on the canonical paths method applied to the

single-site Metropolis Markov chain for the Gibbs distribution of 2D classical spin mod-

els with couplings related to the interactions in the quantum Hamiltonian. Since the

system has strongly nonisotropic couplings that grow with system size, it does not fall

into the known cases where 2D classical spin models are known to mix rapidly.

1 Introduction

The application of Markov chain Monte Carlo methods to thermal states of a limited class of quan-

tum systems was ﬁrst proposed by Suzuki, Miyashita and Kuroda 40 years ago [43]. Their method

applied to Hamiltonians with real and nonpositive oﬀ-diagonal matrix elements in a particular

basis, a property that was called “being free of the sign problem” for many years until the term

“stoquastic” was eventually adopted [12] to emphasize a connection with nonnegative matrices.

Since then there have been a growing number of works on rigorous classical simulations for various

stoquastic systems [8, 9, 11, 15]. Many quantum systems of physical and computational interest

are stoquastic, including spinless particles moving on arbitrary graphs with position dependent

interactions, as well as generalized transverse Ising models, which are particularly notable for their

use in quantum annealing [1, 19, 31].

Elizabeth Crosson: crosson@unm.edu

Aram W. Harrow : aram@mit.edu

1

arXiv:1812.02144v3 [quant-ph] 5 Feb 2021

Statement of results. A general n-qubit 1D stoquastic Hamiltonian with nearest-neighbor in-

teractions has the form,

H =

n



j=1

H

j,j+1

, (1)

where each H

j,j+1

acts non-trivially on (at most) the sites j, j + 1 (with n + 1 identiﬁed with 1)

and each ∥H

j,j+1

∥ ≤ 1. Deﬁne the “computational basis” {|1⟩, |−1⟩} for each qubit to be the

eigenstates of the σ

z

operator. We assume H is stoquastic in the computational basis [11, 12],

which means that in this basis each H

j,j+1

is a real symmetric matrix with nonpositive oﬀ-diagonal

entries,

⟨z|H

j,j+1

|z

′

⟩ ≤ 0 for all z, z

′

∈ {−1, 1}

n

with z ̸= z

′

. (2)

For a given inverse temperature β we consider the quantum partition function Z

β

and the quantum

Gibbs state ρ

β

,

ρ

β

:=

e

−βH

Z

β

, Z

β

:= tr



e

−βH



, (3)

and also the expectation value of observables Q in the thermal state, ⟨Q⟩

β

:= tr[Qρ

β

]. For the

special class of systems called 1D generalized transverse Ising models (TIM) we go beyond (1)

and allow systems with long-range interactions that decay suﬃciently quickly (faster than inverse

square) with distance,

H = −

n



j=1

Γ

j

σ

x

j

+

n



j=1

K

z

j

σ

z

j

+

n



j,k=1

K

zz

jk

σ

z

j

σ

z

k

, Γ := min

j=1,...,n

Γ

j

> 0, (4)

K

z

j

∈ [−1, 1] , and |K

zz

jk

| ≤ |i − j|

−(2+ξ)

for ξ > 0. (5)

The form (5) simpliﬁes both the presentation of the PIMC method and our proof that the mixing

time of PIMC for these models is at most poly(n, e

β

, Γ

−1

). We also show the mixing time of PIMC

is at most poly(n, e

β

, Γ

−1

) for generalized TIM with σ

x

σ

x

and σ

y

σ

y

interactions,

H = −

n



j=1

Γ

j

σ

x

j

−

n



j=1

K

xx

j

σ

x

j

σ

x

j+1

−

n



j=1

K

yy

j

σ

y

j

σ

y

j+1

+

n



j=1



K

zz

j

σ

z

j

σ

z

j+1

+ K

z

j

σ

z

j



, (6)

K

zz

j

, K

z

j

∈ [−1, 1] , K

xx

j

≥ 0 , K

yy

j

∈ [−K

xx

j

, K

xx

j

] and Γ := min

j=1,...,n

Γ

j

> 0.

The mixing time bounds can be used to construct an FPRAS (fully polynomial-time randomized

approximation scheme) for the partition function of models of the form (5) and (6) and inverse

temperatures β = O(log n). In addition to the partition function there is a similar FPRAS for the

expectation value of observables that are sparse and row-computable (in the computational basis)

in the thermal state ρ

β

for β = O(log n). For general 1D stoquastic Hamiltonians (1) we prove

that the PIMC mixing time is at most quasi-polynomial in the system size, which yields a quasi-

polynomial time approximation algorithm for the partition function and for the approximation of

observables.

Theorem 1. There is an algorithm which takes as input an inverse temperature β and an n-qubit

1D stoquastic Hamiltonian H and outputs an estimate

˜

Z

β

of the quantum partition function Z

β

.

If also given an observable Q ⪰ 0 that is sparse and row-computable in the computational basis,

the algorithm outputs an estimate

˜

Q of its expectation value in the state ρ

β

.

2

These estimates satisfy

|

˜

Z

β

− Z

β

| ≤ δ

mult

Z

β

+ δ

add

(7)

|

˜

Q − tr[Qρ

β

]| ≤ δ

mult

tr[Qρ

β

] + δ

add

. (8)

with probability ≥ 1 −δ

fail

. The algorithm runs in time at most

poly(n, e

β

, Γ

−1

, δ

−1

mult

, log



δ

−1

add



, log(δ

−1

fail

))

for models of the form (5) and (6), and in time at most

poly(2

β



log

n

δ

mult



2

, log(1/δ

add

), log(1/δ

fail

))

for models of the form (1).

Relation to previous work. A variety of quantum Monte Carlo methods have been devised for

estimating equilibrium properties of stoquastic Hamiltonians, which can be broadly divided into

diﬀusion methods [21, 27, 40] and path integral methods [2, 36, 39, 42, 43]. Diﬀusion Monte Carlo

methods use Hamiltonian matrix elements to deﬁne transition rates for a substochastic random

process, along with birth and death population dynamics or importance sampling to control sta-

tistical ﬂuctuations. Examples of proven eﬃcient convergence for diﬀusion Monte Carlo methods

include simulating frustration-free stoquastic adiabatic computation [11] and simulating the “go

with the winners” method [26]. Disadvantages of diﬀusion methods include quantum amplitude

vs quantum probability obstructions for some variants [13, 27], as well as the need for importance

sampling with trial wave functions that may in general not be available [8].

PIMC methods, in contrast, are based on Markov chains deﬁned on an enlarged state space con-

sisting of many coupled copies of a high-temperature classical system along a so-called “imaginary-

time” direction [43]. PIMC has been used to give an FPRAS for the ferromagnetic case of TIM [8]

on general interaction graphs by a reduction to a classic result on sampling the Gibbs distribution

of classical ferromagnetic Ising models [28]. Also, the Glauber dynamics for an inﬁnite dimensional

variant of PIMC has been shown to have optimal temporal mixing for ferromagnetic TIM at high

temperature [14] and for ferromagnetic TIM on regular trees [35]. Here “ferromagnetic” means of

the form in (5) but with each K

z

j

= 0 and each K

zz

j

≤ 0, i.e. lower energy is achieved by aligning

adjacent spins. Since such interactions cannot be frustrated, these models are considered easier

to analyze. Disadvantages of PIMC methods include the signiﬁcant overhead created by adding

the imaginary-time direction, and also the possibility of slow mixing due to topological obstruc-

tions [3, 23]. The general presence of obstructions to rapid mixing for the standard versions of both

diﬀusion and path integral methods also motivates the continuing development of new algorithms

for simulating stoquastic Hamiltonians [2, 9].

To our knowledge, the simulations in this work are the ﬁrst provably eﬃcient Monte Carlo

methods for a large class of generalized TIM systems, which diﬀer from ferromagnetic TIM by

allowing couplings of both signs between neighboring spins, as well as local ﬁelds of both signs.

A generalized TIM in 2D or higher dimensions can encode NP-complete constraint satisfaction

problems, which is the basis for the use of these models in quantum annealing [1, 19, 31]. Besides

their relative simplicity from the standpoint of experimental implementation, generalized TIM on

interaction graphs of degree 3 are also known to be universal for stoquastic quantum annealing [10,

3

17]. The application of Monte Carlo methods to stoquastic quantum annealing is called simulated

quantum annealing, and a signiﬁcant open question is whether some variant of simulated quantum

annealing can eﬃciently simulate stoquastic quantum annealing. This open question has motivated

several recent proposals for non-stoquastic quantum annealing [16, 25, 41, 46]. Some recent progress

has been made by analyzing speciﬁc models for which both quantum annealing and simulated

quantum annealing can be exponentially faster than classical simulated annealing [6, 7, 15, 20, 30,

32, 37, 38].

Another class of simulation methods for 1D quantum systems are those based on matrix product

states [45]. These methods do not require the Hamiltonian to be stoquastic and have been applied

in rigorous classical simulations of systems with limited entanglement [4, 22, 33]. Their runtime

scales exponentially with spectral gap, while our runtime bound for PIMC scales exponentially

with the inverse temperature, which is qualitatively similar. These kinds of simulations are not

expected to obsolete PIMC in general, however, as the latter is used in practice to simulate highly

entangled stoquastic systems. Furthermore, the version of PIMC that we consider is closely related

to the practical implementations of PIMC, and so this work can be seen as a rigorous analysis of

a heuristic approximation algorithm that was originally developed and applied with great success

by the physics community.

Organization of the remaining sections. Section 2 reviews the PIMC method and sketches

the proof of our mixing time analysis. Section 3.1 deﬁnes the Suzuki-Trotter approximation to

the partition function which is used to map 1D stoquastic Hamiltonians onto 2D classical spin

systems. Section 3.2 and Section 3.3 describe the use of samples from the Gibbs distribution of

the 2D classical system to approximate quantum observables and the quantum partition function.

Section 3.4 proves a concentration theorem for PIMC that justiﬁes the analysis of a restricted state

space in later sections. Section 4.1 reviews the canonical paths method that we use to analyze the

PIMC mixing time. This method is then applied to generalized TIM in Section 4.3, generalized

TIM with XX interactions in Section 4.4, and general 1D stoquastic Hamiltonians in Section 4.5.

2 Overview

In the PIMC method the Hamiltonian (1) is related by Suzuki’s quantum-to-classical mapping

[42, 44] to a system of {±1} classical spins on a 2D lattice Λ = {1, . . . , L} × {1, . . . , n}. The new

site index {1, . . . , L} is known in physics as the “imaginary-time” direction, and it is often useful

to think of the set of spin conﬁgurations Ω := {−1, 1}

n×L

as consisting of “time slices” along the

L direction, i.e. for z ∈ Ω we write z := (z

1

, . . . , z

L

) with each z

i

∈ {−1, 1}

n

. The mapping

introduces a local energy function E

β

: Ω → R on the classical spin conﬁgurations in such a way

that an estimate of the classical partition function Z :=



z∈Ω

e

−βE

β

(z)

can be used to estimate the

quantum partition function Z, and so that samples from the Gibbs distribution π(z) := e

−βE

β

(z)

/Z

can be used to estimate expectation values in the quantum Gibbs state. For the nearest-neighbor

version of the transverse Ising models in (5) the energy function E

β

is

E

β

(z) =



(i,j)∈Λ



K

zz

j,j+1

L

z

i,j

z

i,j+1

+

K

z

j

L

z

i,j

− β

−1

J

i

z

i,j

z

i+1,j



, z

L+1,j

:= z

1,j

, (9)

4

where z

i,j

denotes the spin in conﬁguration z at the site (i, j) ∈ Λ, and J

i

:=

1

2

log coth



βΓ

i

L



(see

Section 3.1). The couplings along the imaginary-time direction are ferromagnetic, but the couplings

along the spatial direction can have varying signs as well as local ﬁelds. Our choices of parameters

are such that L = poly(n), and so the general case considers J

i

with magnitude up to O(log n).

Therefore the couplings of the 2D model are highly anisotropic, as in ﬁgure 1.

space

imaginary-time

Figure 1: The local ﬁelds and the couplings along the spatial direction can be either positive

or negative but have magnitudes reduced by 1/L (dotted lines), whereas the couplings along

the imaginary-time direction can have magnitude growing logarithmically with system size

but are always ferromagnetic (solid lines).

The PIMC method proceeds by attempting to sample from π and approximate Z using the

Markov chain Monte Carlo method with single-site Metropolis transition probabilities. With this

in mind, the primary question is to determine the mixing time τ for this Markov chain, which is

the minimum number of steps of this Markov chain needed to sample from a distribution that is

close to π. We will use the method of canonical paths that was originally introduced by Jerrum and

Sinclair (reviewed in [29, 34]) in order to bound the mixing time. The idea of the method is that

for every pair of states x, y ∈ Ω the Markov chain P needs to route an amount of “traﬃc” equal to

π(x)π(y) along some path γ

x,y

which connects x to y using the transitions of the Markov chain.

Deﬁne the congestion across a particular transition e = (z, z

′

) to be the ratio of the total traﬃc

through it (i.e.



x,y:e∈γ

x,y

π(x)π(y)) to the amount of probability ﬂow across it at equilibrium,

which equals π(z)P (z, z

′

). The mixing time can then be upper bounded in terms of the maximum

congestion through any transition, using standard arguments which we review in Section 4.1.

Our problem now reduces to ﬁnding good sets of paths between every pair of conﬁgurations

x, y ∈ Ω. We use a set of paths inspired by those used to analyze the (unweighted) random walk

on the boolean hypercube [29]. Speciﬁcally we will replace the entries of x with entries of y one

at a time. A crucial requirement is that the energies of the intermediate states do not get too

large, as this would create a large congestion. For the model (9) the paths ﬁrst update the classical

spins corresponding to the site of qubit 1 in consecutive imaginary-time order, and then proceed

to do the same for the sites of qubits 2 through n. For each qubit j ∈ {1, . . . , n} the line of

sites {(1, j), (2, j), . . . , (L, j)} is called the worldline of the j-th qubit, and so these canonical paths

have the form of updating each qubit worldline consecutively, always ﬁnishing one worldline before

starting the next.

An example of an intermediate step along this path is depicted in Fig. 2. Notice that intermedi-

ate steps have only introduced a constant number of broken bonds between spins that are neighbors

5

y

1,5

y

1,4

y

1,3

y

1,2

y

1,1

y

2,1

y

2,2

y

2,3

y

2,4

y

2,5

y

3,1

y

3,2

x

3,3

x

3,4

x

3,5

x

4,5

x

4,4

x

4,3

x

4,2

x

4,1

x

5,1

x

5,2

x

5,3

x

5,4

x

5,5

x

6,5

x

6,4

x

6,3

x

6,2

x

6,1

space n

imaginary-time L

Figure 2: An intermediate point along a canonical path from x to y. We have ﬁnished updating qubits

1 and 2 and are in the middle of updating qubit 3. The red line indicates the domain wall formed

between the x and y conﬁgurations. The change in the classical energy created by this domain wall

determines the Markov chain congestion for this set of canonical paths.

in the imaginary time direction, each contributing an amount of energy that is O(β

−1

log n), and

at most L broken bonds between spins that are neighbors in the space direction, each contributing

energy O(1/L). The relatively low energy of the intermediate conﬁgurations along the path is the

key to the formal proof of rapid mixing for PIMC applied to 1D generalized TIM in Section 4.3.

For 1D generalized TIM with long-range interactions, the energy of the intermediate conﬁguration

z will include contributions from spins that are arbitrarily far away from the cut. Using the as-

sumption on the rate of decay of the magnitude of these interactions, the total contribution to the

energy is upper bounded by

n/2



j=1

n



k=

n

2

+1

|j − k|

−(2+ξ)

= O(ξ

−2

+ n

−ξ

),

where the bound is proven by viewing the summation as a right Riemann sum of the corresponding

integral, which yields an upper bound because the integrand is a monotonically decreasing function.

Since we regard ξ as a constant this contribution to the energy is O(1).

For Hamiltonians of the form (6) the 2-local oﬀ-diagonal terms give rise to a classical energy

function with interactions involving four spins at a time,

E

β

(z) :=



(i,j)∈Λ



K

zz

j

L

z

i,j

z

i,j+1

+

K

z

j

L

z

i,j

− β

−1

h

i,j

(z

i,j

, z

i,j+1

, z

i+1,j

, z

i+1,j+1

)



, (10)

The h

i,j

terms (which are implicitly deﬁned by the Gibbs distribution (22)) correspond to strong

ferromagnetic interactions between neighboring pairs of spins in neighboring time slices, and violat-

ing them increases the energy by an amount that is O(log n). If z

i,j

̸= z

i+1,j

and z

i,j+1

̸= z

i+1,j+1

then we say there is a 2-local jump between time slice i and time slice i + 1, because the result-

ing energy contribution from the 4-body interaction will depend on the magnitude of the 2-local

oﬀ-diagonal quantum terms i.e. K

xx

j

and K

yy

j

.

6

The canonical paths we have described so far will lead to intermediate conﬁgurations that

potentially create or destroy a non-constant number of 2-local jumps across the domain wall pictured

in ﬁgure 2. If K

xx

j

is larger than K

x

j

, then conﬁgurations x and y which contain a non-constant

number of 2-local jumps will result in the canonical paths described so far overloading some edges

with large congestion. To overcome this we include additional path segments along the canonical

paths that remove 2-local jumps from the worldlines near the domain wall in ﬁgure 2. This greatly

increases the number of paths passing through each edge, but we are still able to obtain a poly(n, e

β

)

upper bound on the congestion by noticing that most of these paths have endpoints with very small

stationary probabilities.

For Hamiltonians of the form (1) it is possible that the oﬀ-diagonal terms in (1) do not allow

for the PIMC Markov chain with single-site transitions to be ergodic. In this situation some

conﬁgurations z ∈ Ω will have π(z) = 0, which creates a new obstacle to the canonical paths we

have described so far. To restore ergodicity under single-site updates we add a ﬁctitious transverse

ﬁeld to the Hamiltonian with a weak coupling that goes to zero with system size:

H → H − Γ

n



i=1

σ

x

i

(11)

with Γ ≤ δ

mult

/n so that the subsequent estimate for the partition function will not be further than

the allowed error tolerance away from its intended value.

Once the ﬁctitious ﬁeld is added it once again makes sense to attempt to use the same canonical

paths that were applied to systems of the form (6). But now the fact that the oﬀ-diagonal matrix

elements of the Hamiltonian are no longer symmetric under the global operation of ﬂipping all

spins means that removing a single (1-local or 2-local) jump from a given worldline can decrease

the stationary probability of that conﬁguration by a factor scaling exponentially in the number of

jumps remaining in that worldline. To overcome this problem we show in Section 3.4 that typical

conﬁgurations distributed according to π will have at most O(β log n) jumps in any particular

worldline. This means that the worst-case intermediate conﬁgurations just described may have a

stationary probability reduced by n

O(β log n)

, and this factor bottlenecks the mixing time estimate.

Using these facts we describe canonical paths in Section 4.5 within a restricted state space Ω

∗

in

which every conﬁguration has O(β log n) jumps per worldline, and by either restricting the PIMC

Markov chain to Ω

∗

or applying the leaky random walk conductance bound [15] we obtain a bound

on the mixing time within the subset Ω

∗

that establishes theorem 1.

3 Path integral Monte Carlo

Sections 3.1 through 3.3 rigorously present the underpinnings of the standard PIMC method: the

quantum-to-classical mapping, and its application to approximating expectation values of observ-

ables as well as the partition function. Section 3.4 proves a general concentration of measure bound

which shows that the number of jumps along any worldline in PIMC is O(β log n) with high proba-

bility. Note that this result applies to general stoquastic Hamiltonians (not just the 1D case). This

result is used to analyze a restricted state space in Section 4.5.

3.1 Quantum-to-classical mapping

The ﬁrst step of the quantum-to-classical mapping is to split H as in (1) into a sum of commuting

Hamiltonians. For each local term H

i

we construct local operators A

i

, B

i

, C

i

such that −βH

i

=

7

A

i

+ B

i

+ C

i

and [A

i

, A

j

] = [B

i

, B

j

] = [C

i

, C

j

] = 0 for all i, j. Let A =



i

A

i

, B =



i

B

i

,

C =



i

C

i

, and hereafter let A

i

match the diagonal part of H

i

in the computational basis, while

B

i

and C

i

contain the oﬀ-diagonal terms of H

i

when i is odd or even, respectively. For positive

integers L deﬁne the Suzuki-Trotter approximation of the partition function Z = tr



e

A+B+C



,

Z

β,L

:= tr





e

A

2L

e

B

L

e

A

2L

e

C

L



L



. (12)

The fact that lim

L→∞

Z

β,L

= Z

β

is the content of the 1875 Lie product formula, while for ﬁnite L

the Suzuki-Trotter approximation is related to the quantum partition function (3) by

e

−δ

Z

β

≤ Z

β,L

≤ e

δ

Z

β

, (13)

for some δ that is O



∥H∥

3

/L

3



. This error bound follows from two facts about the matrix expo-

nential: a variant of the usual Suzuki-Trotter expansion, and a continuity bound. Both appear in

[8] for transverse Ising models, and the proof for three commuting layers A, B, C is similar and so

we omit the details.

Before considering the general case of H as in (1) it is useful to consider the simpliﬁed case of

(12) for generalized TIM of the form (5) by taking A = −β diag(H), B = −β (H − diag(H)), and

C = 0 so that the Suzuki-Trotter approximation reduces to

Z

β,L

= tr





e

A

2L

e

B

L

e

A

2L



L



= tr





e

A

L

e

B

L



L



. (14)

For transverse Ising models, the next step is to expand the Suzuki-Trotter approximation (14)

to the partition function by inserting several complete sets of states,

Z

β,L

=



z

1

∈{−1,1}

n

⟨z

1

|



e

A

L

e

B

L



L

|z

1

⟩ (15)

=



z

1

,...,z

L



i=1

⟨z

i

|e

A

L

e

B

L

|z

i+1

⟩ , z

L+1

:= z

1

(16)

=



z

1

,...,z

L



i=1

e

A(z

i

)

L

⟨z

i

|e

B

L

|z

i+1

⟩. (17)

where in the last step we have used the fact that A = −β diag(H) is diagonal in the computational

basis and deﬁned A(z

i

) := ⟨z

i

|A|z

i

⟩. The Gibbs distribution of the classical 2D spin system is

deﬁned by equating e

−βE

β

(z

1

,...,z

L

)

with the summand of (17). The closed form (9) is obtained

applying the identity e

ωσ

x

= cosh(ω)I + sinh(ω)σ

x

to

⟨z

i

|e

B

L

|z

i+1

⟩ = ⟨z

i

|e

−

βΓ

L



n

j=1

σ

x

j

|z

i+1

⟩ =

n



j=1

⟨z

i,j

|e

−

βΓ

L

σ

x

j

|z

i+1,j

⟩.

For a general 1D stoquastic Hamiltonian of the form in (1) we will apply (12) with

A = −β diag(H) , B = −β



i∈Odd

[H

i

− diag(H

i

)] , C = −β



i∈Even

[H

i

− diag(H

i

)] , (18)

8

Expanding (12) with 2L complete sets of states,

Z

β,L

=



z

1

∈{−1,1}

n

⟨z

1

|



e

A

2L

e

B

L

e

A

2L

e

C

L



L

|z

1

⟩ (19)

=



z

1

,...,z

2L

2L−1



i∈Odd

⟨z

i

|e

A

2L

e

B

L

|z

i+1

⟩⟨z

i+1

|e

A

2L

e

C

L

|z

i+2

⟩ (20)

=



z

1

,...,z

2L

e



2L

i=1

A(z

i

)

2L

2L−1



i∈Odd

⟨z

i

|e

B

L

|z

i+1

⟩⟨z

i+1

|e

C

L

|z

i+2

⟩, (21)

where periodic boundary conditions z

2L+1

:= z

1

are taken in (20). At this point the essential

diﬀerence from the transverse Ising case is clear: the oﬀ-diagonal part of the Hamiltonian is split

into two commuting layers, these layers alternate along the imaginary-time direction, and this

causes the coupling between classical spins at neighboring time slices to alternate as well. For ease

of notation deﬁne B

i

to be 2B if i is odd, and 2C if i is even, and rescale L → L/2 to obtain a

distribution π from (21),

π(z) :=

1

Z

β,L

e



L

i=1

A(z

i

)

L



i=1

⟨z

i

|e

B

i

L

|z

i+1

⟩. (22)

3.2 Approximating observables

In this section we consider an observable O for which we would like to estimate ⟨O⟩

β

:= tr[Oρ

β

].

We justify this by demonstrating closeness of ⟨O⟩

β

to a value ⟨O⟩

β,L

that is deﬁned in terms of the

Suzuki-Trotter approximation in equation (34). The expectation ⟨O⟩

β

can be expressed as the

derivative of a partition function by deﬁning

Z

β

(ζ) := tr[e

−βH+ζO

], (23)

and using the fact that

⟨O⟩

β

=

1

Z

β

(0)

∂Z

β

∂ζ

(0). (24)

The next proposition shows that ⟨O⟩

β

can be approximated by a ﬁnite diﬀerence.

Proposition 2. Suppose that

ˆ

Z

β

(0) and

ˆ

Z

β

(ζ) satisfy

e

−δ

≤

ˆ

Z

β

(0)

Z

β

(0)

,

ˆ

Z

β

(ζ)

Z

β

(ζ)

≤ e

δ

,

for ζ =

√

3δ/∥O∥. Then the estimator



⟨O⟩

β

:=

ˆ

Z

β

(ζ) −

ˆ

Z

β

(0)

ζ

ˆ

Z

β

(0)

. (25)

satisﬁes the bound

|



⟨O⟩

β

− ⟨O⟩

β

| ≤ 2

√

δ∥O∥ (26)

9

Proof. We use ﬁrst Taylor’s theorem, then the bound tr AB ≤ ∥A∥∥B∥

1

to obtain:

⃓

Z

β

(ζ) − Z

β

(0) − ζ

∂Z

β

∂ζ

(0)

⃓

≤

ζ

2

max

ζ

′

∈[0,ζ]

∂

2

Z

β

∂ζ

2

(ζ

′

) (27a)

≤ ζ

2

∥O∥

2

max

ζ

′

∈[0,ζ]

Z

β

(ζ

′

) (27b)

≤ ζ

2

∥O∥

2

Z

β

(0)e

ζ∥O∥

. (27c)

Dividing both sides of (27) by ζZ

β

(0) we have

⃓

⟨O⟩

β

−

Z

β

(ζ) − Z

β

(0)

ζZ

β

(0)

⃓

=

⃓

⟨O⟩

β

−



Z

β

(ζ)

ζZ

β

(0)

−

1

ζ



⃓

≤ ζ∥O∥

2

e

ζ∥O∥

. (28)

Then



⟨O⟩

β

− ⟨O⟩

β

=

ˆ

Z

β

(ζ)

ζ

ˆ

Z

β

(0)

−

1

ζ

− ⟨O⟩

β

(29)

≤ e

2δ

Z

β

(ζ)

ζZ

β

(0)

−

1

ζ

− ⟨O⟩

β

(30)

≤ (e

2δ

− 1)

Z

β

(ζ)

ζZ

β

(0)

+ ζ∥O∥

2

e

ζ∥O∥

(31)

≤ (3δ/ζ + ζ∥O∥

2

)e

ζ∥O∥

(32)

≤ 2

√

δ∥O∥ (33)

In the last step we have used the assumption that δ ≤ 1/21 to simplify the numerical constants. A

similar argument shows that



⟨O⟩

β

− ⟨O⟩

β

≥ −2

√

δ∥O∥, thus completing the proof.

By (25), (13), and the triangle inequality, the expectation ⟨O⟩

β

can be approximated by

⟨O⟩

β,L

=

1

Z

β,L(0)

∂

∂ζ

Z

β,L

(0), (34)

where Z

β,L

(ζ) is a Suzuki-Trotter approximation to (23). If O is diagonal use the form in Section 3.1

with A → A + ζO,

Z

β,L

(ζ) =



z∈Ω

e

1

L



L

i=1

(A(z

i

)+ζO(z

i

))

L



i=1

⟨z

i

|e

B

i

L

|z

i+1

⟩. (35)

Evaluating the derivative at ζ = 0,

1

Z

β,L

(0)

∂

∂ζ

Z

β,L

(0) =

1

Z

β,L



z∈Ω



1

L



i=1

O(z

i

)



e

1

L



L

i=1

A(z

i

)

L



i=1

⟨z

i

|e

B

i

L

|z

i+1

⟩ (36)

=



z∈Ω

π(z)



1

L



i=1

O(z

i

)



=

1

L



i=1

⟨O(z

i

)⟩

π

= ⟨O(z

1

)⟩

π

(37)

10

where the ﬁnal expression follows by the symmetry of π under cyclic permutations of the time

slices. For oﬀ-diagonal O,

Z

β,L

(ζ) =



z∈Ω

e

1

L



L

i=1

A(z

i

)

L



i=1

⟨z

i

|e

1

L

(B

i

+ζO/2)

|z

i+1

⟩. (38)

Evaluating the derivative at ζ = 0,

⟨O⟩

β,L

=

1

Z

β,L

(0)



z∈Ω

e

1

L



L

i=1

A(z

i

)

L



k=1

⟨z

k

|



O

2L



e

B

k

L

|z

k+1

⟩

L



i=1

i̸=k

⟨z

i

|e

B

i

L

|z

i+1

⟩ (39)

=



z∈Ω

⎛

⎝

1

2L

L



k=1

⟨z

k

|Oe

B

k

L

|z

k+1

⟩

⟨z

k

|e

B

k

L

|z

k+1

⟩

⎞

⎠

π(z) (40)

=

1

2L

L



k=1



⟨z

k

|Oe

B

k

L

|z

k+1

⟩

⟨z

k

|e

B

k

L

|z

k+1

⟩



π

(41)

=

1

2



z∈Ω

⎡

⎣

⟨z

1

|Oe

B

1

L

|z

2

⟩

⟨z

1

|e

B

1

L

|z

2

⟩

+

⟨z

2

|Oe

B

2

L

|z

3

⟩

⟨z

2

|e

B

2

L

|z

3

⟩

⎤

⎦

π(z) (42)

where the last step follows by symmetry of π under even cyclic permutations of the time slices.

Since the summand of (42) is eﬃciently computable and has bounded magnitude we can estimate

it using a standard Monte Carlo procedure (see lemma 3) once one has an eﬃcient meanns of

obtaining independent samples from π.

3.3 Approximating partition functions

In Section 4 we will prove that for any tolerance δ > 0 the PIMC method can eﬃciently output an

element of Ω sampled from a distribution ˜π which is within |˜π −π| ≤ δ. Here we explain how these

samples can be used to approximate the partition function with an application of the standard

telescoping product technique [5, 28]

1

. We express the partition function Z(β) as a product of

terms involving the partition function at higher temperatures,

Z(β) = Z(0)

Z(β

1

)

Z(β

0

)

. . .

Z(β

k

)

Z(β

k−1

)

(43)

where the temperature schedule β

0

< β

1

< . . . < β

k

with β

0

= 0 and β

k

= β is chosen so that each

ratio in the telescoping product is bounded. Deﬁning w

β

i

(z) := Z(β

i

)π

β

i

(z), note that



w

β

i

w

β

i−1



π

β

i−1

=



z∈Ω

w

β

i

(z)

w

β

i−1

(z)

π

β

i−1

(z) =

Z(β

i

)

Z(β

i−1

)

. (44)

1

The techniques used in this section to estimate the partition function (or equivalently the free energy) are

necessary because the entropy term in the free energy cannot be directly estimated as an observable. If one is only

interested in the estimation of speciﬁc observables then it suﬃces to use the end result(42).

11

At inﬁnite temperature the quantum partition function is Z(0) = 2

n

. Further, a uniform schedule

with k = O(β∥H∥log(β∥H∥)) ensures that

1

e

≤

Z(β

i

)

Z(β

i−1

)

≤ 1. (45)

Note that in inﬁnite temperature limit β = 0 the classical eﬀective systems (9) and (10) have

inﬁnite coupling strength between spins in the imaginary-time direction, and so the Markov chain

we analyze in Section 4 will not be ergodic. This is ununsual from the perspective of classical spin

systems, but it happens in this case because of the temperature dependence of the couplings. To

deal with this the ﬁrst expectation ⟨w

β

1

/w

β

0

⟩

π

β

0

in the telescoping product can be computed using

the fact that π

β

0

is equivalent to the uniform distribution on the subset Ω

classical

⊆ Ω consisting of

conﬁgurations (z

1

, . . . , z

L

) which have z

i

= z

j

for all i, j = 1, . . . , L.

Finally, in order to bound the number of samples needed to compute the expectation values in

the telescoping product we will make use of the Hoeﬀding bound.

Lemma 3 ([24]). Let X

1

, . . . , X

t

be independent random variables satisfying |X

i

| ≤ 1 and E[X

i

] =

¯

X. Then

P[

⃓

1

t



i=1

X

i

−

¯

X

⃓

≥ δ] ≤ 2e

−tδ

2

/2

(46)

Deﬁne M (z) = e

−2δ

w

β

i

(z)/w

β

i+1

(z), so that |M(z)| ≤ 1. Observe that |⟨M⟩

π

− ⟨M⟩

˜π

| ≤

∥π −˜π∥

1

∥M∥ ≤ δ. We will estimate ⟨M⟩

˜π

by drawing t samples from ˜π, which we call z

(1)

, . . . , z

(t)

.

Our estimator will simply be the sample mean, namely

ˆ

¯

M :=

1

t



i=1

M(z

i

). (47)

By Lemma 3 we have

P[|

ˆ

¯

M − ⟨M⟩

˜π

| ≥ δ] ≤ 2e

−tδ

2

/2

. (48)

We conclude that taking t = O(1/δ

2

) samples yields an O(δ) approximation to ⟨M⟩

˜π

with high

probability. Therefore a polynomial number of samples will suﬃce to estimate each term of the

telescoping product.

3.4 Restricting the PIMC state space

In this section we justify restricting PIMC to a subset of the state space that limits the number of

jumps in any particular worldline to be O(β log n) (this restricted state space will be used for the

mixing time analysis in Section 4.5). For a conﬁguration z = (z

1

, ..., z

L

) the number of jumps in

the worldline of the j-th qubit can be expressed in terms of the Pauli operator X

j

,

d

j

(z) := |{i : z

i,j

̸= z

i+1,j

}| =

L



i=1

⟨z

i

|X

j

|z

i+1

⟩.

The strategy is to bound the k-th moment of the random variable d

j

with respect to the QMC

stationary distribution π and use the fact that

Pr[d

j

≥ a]

π

≤

π

d

k

j

a

k

.

12

These moments satisfy

π

d

k

j

:=



z∈Ω

d

k

j

(z)π(z)

=



z∈Ω



L



i=1

⟨z

i

|X

j

|z

i+1

⟩



k

π(z)

=



z∈Ω

L



i

1

,...,i

k

=1

k



r=1

⟨z

i

r

|X

j

|z

i

r

+1

⟩π(z)

=

1

Z

β,L



z∈Ω

L



i

1

,...,i

k

=1

k



r=1

⟨z

i

r

|X

j

|z

i

r

+1

⟩



e



L

i=1

A(z

i

)

L



i=1

⟨z

i

|e

B

i

L

|z

i+1

⟩⟩



. (49)

where A, B

i

are deﬁned in Section 3.1. To merge the two products in the expression above into a

simpler trace expression, we want to show the following inequality for any neighboring time slices

z

i

and z

i+1

,

⟨z

i

|X

j

|z

i+1

⟩ ≤ 4

β

L

⟨z

i

|X

j

e

B

i

/L

|z

i+1

⟩

⟨z

i

|e

B

i

/L

|z

i+1

⟩

, (50)

where we assume the local terms are normalized so that for some Γ > 0

Γ ≤ |⟨z

i

|H|z

i+1

⟩| ≤ 1 ∀z

i

, z

i+1

: |z

i

− z

i+1

| = 1.

If ⟨z

i

|X

j

|z

i+1

⟩ = 0 then (50) is satisﬁed. Otherwise if ⟨z

i

|X

j

|z

i+1

⟩ = 1 then the numerator on the

RHS of (50) satisﬁes

⟨z

i

|X

j

e

B

i

/L

|z

i+1

⟩ = ⟨z

i+1

|e

B

i

/L

|z

i+1

⟩ ≥ e

−β∥H∥/L

≥

1

2

. (51)

To upper bound the denominator of the RHS of (50) , the fact that B

i

is a commuting Hamiltonian

implies

⟨z

i

|e

B

i

/L

|z

i+1

⟩ = ⟨z

i,j

z

i,j+1

|e

B

i,j

/L

|z

i+1,j

, z

i+1,j+1

⟩

=

∞



q=1

1

q!

⟨z

i,j

z

i,j+1

|(B

i,j

/L)

q

|z

i+1,j

, z

i+1,j+1

⟩

≤ e

β∥H

j

∥

L

− 1

≤

1

ln(2)

β

L

,

where in the last step we have used ∥H

j

∥ ≤ 1,

β

L

≤ ln(2) and the convexity of e

x

. Using ln(2)/2 ≥

1/4 we have established (50).

To apply this result to (49) we will use the permutation symmetry of the i

1

, ..., i

k

variables to

restrict to a sum over ordered k-tuples (i

1

, ..., i

k

) which satisfy i

1

≤ i

2

≤ . . . i

k

.

π

d

k

j

≤

k!

Z



z∈Ω



1≤i

1

≤...≤i

k

≤L

k



r=1

⟨z

i

r

|X

j

|z

i

r

+1

⟩



e



L

i=1

A(z

i

)

L



i=1

⟨z

i

|e

B

i

L

|z

i+1

⟩⟩



(52)

≤

4

k

k!β

k

ZL

k



z∈Ω



1≤i

1

≤...≤i

k

≤L

k



r=1

⟨z

i

r

|X

j

e

B

i

L

|z

i

r

+1

⟩

⟨z

i

r

|e

B

i

L

|z

i

r

+1

⟩



e



L

i=1

A(z

i

)

L



i=1

⟨z

i

|e

B

i

L

|z

i+1

⟩⟩



(53)

13

Deﬁning t

r

= (i

r

− i

r−1

)β/L (as well as t

1

= r

1

β/L, t

k

= (L − i

k

)β/L) this expression can be seen

as the trace of a product of operators,

π

d

k

j

≤

k!(4β)

k

ZL

k

L



t

1

,...,t

k

t

1

+...+t

k

=β

tr [M

t

1

R

t

1

X

j

L

t

2

M

t

2

R

t

2

X

j

...X

j

L

t

k

M

t

k

] (54)

where

M

t

i

:=



e

B

even

/L

e

A/L

e

B

odd

/L

e

A/L

e

B

even

/L



⌊t

i

β/L⌋−2

and L

i

,R

t

i

can each denote one of several products of operators depending on t

i

, which is determined

by where the X

j

operators disrupt the sequence,

L

t

i

=

⎧

⎪

⎨

⎪

⎩

e

A/L

e

A/L

e

B

odd

/L

e

A/L

e

B

odd

/L

e

A/L

e

A/L

e

B

odd

/L

e

A/L

e

B

even

/L

, R

t

i

=

⎧

⎪

⎨

⎪

⎩

e

B

even

/L

e

A/L

e

B

even

/L

e

B

odd

/L

e

A/L

e

B

even

/L

e

A/L

e

B

odd

/L

e

A/L

e

B

even

/L

.

The purpose of this decomposition is that each M

t

i

is a PSD operator raised to an positive integer

power, and the rest of the Matrices L

t

i

, R

t

i

, X

j

are all either PSD operators or are explicit products

of 2,3, or 4 PSD operators. Now the traces in (54) can be bounded using classic inequality due to

Ky Fan [18, Theorem 1]. Let P

1

...P

N

satisfy P

i

⪰ 0 for each i, and let σ(P

i

) be a diagonal matrix

with the eigenvalues of P

i

along the diagonal in descending order. Then

tr [P

1

...P

N

] ≤ tr [σ(P

1

)...σ(P

N

)]

For our purposes deﬁne the notation σ(R

t

i

) to be the product of diagonal matrices of eigenvalues

of the operators that make up R

t

i

e.g. if R

t

i

= e

A/L

e

B

even

/L

then σ(R

t

i

) = σ



e

A/L



σ



e

B

even

/L



.

Furthermore we may take B

even

, B

odd

⪰ 0 without loss of generality (by shifting the overall Hamil-

tonian by a multiple of the identity), so that ⪰ σ(R

t

i

), σ(L

t

i

) for each t

i

. Therefore each trace in

(54) satisﬁes

tr [M

t

1

R

t

1

X

j

L

t

2

M

t

2

R

t

2

X

j

...X

j

L

t

k

M

t

k

]

≤ tr [σ (M

t

1

) σ (R

t

1

) σ (X

j

) σ (L

t

2

) σ (M

t

2

) σ (R

t

2

) σ (X

j

) ...σ (X

j

) σ (L

t

k

) σ (M

t

k

)] (55)

≤ tr [σ (M

t

1

) σ (M

t

2

) ...σ (M

t

k

)] (56)

= tr





e

B

even

/L

e

A/L

e

B

odd

/L

e

A/L

e

B

even

/L



L−2k



= Z

β,L−2k

where the fact that ⪰ σ(R

t

i

), σ(L

t

i

) is used to go from (55) to (56). Now for k ≪ L we have

Z

β,L

≈ Z

β,L−2k

, so this implies that

π

d

k

j

≤ k!β

k

. Therefore taking k = c log n for some c > 0 and

using k! ≤ (k/e)

k

yields

Pr

π

[d

j

≥ cβ log n] ≤

k!β

k

c

k

β

k

log

k

b

=

k!

k

≤ e

−k

= n

−c

.

14

4 Proofs of rapid mixing

Having established the correctness of the PIMC method in section 3 we now procede to analyze

the mixing time of the associated Markov chains. The mixing time analysis is divided into cases.

In section 4.1 we review the main techniques that are common to each case in our analysis. Section

4.3 contains the mixing analysis for transverse Ising models, while sections 4.4 and 4.5 analyze

Hamiltonians of the form (6) and (1) , respectively.

4.1 Markov chains, mixing times, and canonical paths

Let P be the transition matrix of a reversible Markov chain deﬁned on a state space Ω with

stationary distribution π. Let z be the state of the chain at time t = 0, and deﬁne P

t

(z, ·) to be

the distribution of the state of the chain at time t. The variation distance of the chain at time t

starting from the initial state z at time t = 0 is

d

z

(t) := max

A⊆Ω

|P

t

(z, A) − π(A)| =

1

2



z

′

∈Ω

|P

t

(z, z

′

) − π(z

′

)| (57)

Deﬁne τ(ϵ) to be the worst-case time needed to be within variation distance ϵ of the stationary

distribution,

τ(ϵ) := max

z∈Ω

min

t

{t : d

z

(t

′

) ≤ ϵ ∀t ≥ t

′

}. (58)

Deﬁne the mixing time to be τ

mix

:= τ(1/4), and note that τ(ϵ) = τ

mix

log ϵ

−1

[34].

To bound the mixing time of P we use the method of canonical paths [29]. A path from x ∈ Ω

to y ∈ Ω is a sequence γ

x,y

= (v

1

, . . . , v

k

) of states with v

1

= x, v

k

= y and P (v

i

, v

i

+ 1) > 0

for i = 1, . . . , k − 1. A set P = {γ

x,y

} containing a path that connects every pair of states is

called a set of canonical paths. If ℓ = max

γ∈P

|γ| is the maximum length of any path in P and

π

min

:= min

z∈Ω

π(z), then the mixing time τ

mix

is O(R ℓ ln π

−1

min

), where the congestion R is deﬁned

as

R := max

(v,v

′

)∈E(Ω)

1

π(v)P (v, v

′

)



γ

x,y

∈P

(v,v

′

)∈γ

x,y

π(x)π(y) (59)

In order to compute the congestion we will use a standard technique called encoding. For

an edge e = (v, v

′

), let P(e) be the set of paths in P which pass through e. An encoding is an

assignment of an injective function η

e

: P(e) → Ω to every edge e ∈ E(Ω). More generally, it can be

useful to consider injective encodings η

e

: P(e) → Ω × Θ that map at most G = |Θ| paths through

the e to each state in Ω [28]. Let η

e

(γ

x,y

) = (η

e

(x, y), θ

e

(x, y)), and suppose there is some M such

that

π(x)π (y) ≤ Mπ (v) π (η

e

(x, y)) ∀x, y ∈ Ω, (60)

15

then the congestion satisﬁes

R = max

(v,v

′

)∈E(Ω)

1

π(v)P (v, v

′

)



γ

x,y

∈P

(v,v

′

)∈γ

x,y

π(x)π(y) (61)

≤ max

(v,v

′

)∈E(Ω)

M · G

π(v)P (v, v

′

)



γ

x,y

∈P

(v,v

′

)∈γ

x,y

π(v)π(η

e

(x, y)) (62)

≤ M · G · P

−1

min

(63)

where P

min

:= min

(z,z

′

)∈E(Ω)

P (z, z

′

), (62) follows from property (60), and (63) follows from the

injectivity property of the encoding.

4.2 PIMC transition probabilities

To sample from the classical Gibbs distribution (22) the PIMC method uses a Markov chain which

chooses a site in the classical lattice Λ uniformly at random and proposes to ﬂip the bit at that

site to make a transition z → z

′

with an acceptance probability P(z, z

′

) given by the Metropolis

rule [34],

P (z, z

′

) :=

⎧

⎨

⎩

1

2nL

min



1,

π(z

′

)

π(z)



, z ̸= z

′

1 −



z

′′

̸=z

P (z, z

′′

), z = z

′

. (64)

As long as the state space is a connected graph, the transition matrix P with transitions satisfying

(64) will converge to the stationary distribution π.

For the sake of lower bounding the minimum transition probability P

min

from the previous

section it therefore suﬃces to lower bound π(z

′

)/π(z) where z and z

′

diﬀer only on a single site.

Suppose z diﬀers from z

′

at position (i, j), then after canceling common factors in (22) we have

π(z

′

)

π(z)

= e

1

L

[

A(z

′

i

)−A(z

i

)

]

⟨z

i−1

|e

B

i−1

L

|z

′

i

⟩⟨z

′

i

|e

B

i

L

|z

i+1

⟩

⟨z

i−1

|e

B

i−1

L

|z

i

⟩⟨z

i

|e

B

i

L

|z

i+1

⟩

. (65)

Depending on the oﬀ-diagonal matrix elements in B

i

, the numerator of (65) can potentially be

zero. This means that the state space Ω may not be connected by single bit ﬂips, because the

stationary distribution does not have support on all of Ω. This can be avoided in general by adding

a 1-local transverse ﬁeld ﬁeld with suﬃciently small magnitude Γ to avoid disturbing the resulting

approximation. With this we can lower bound the ratio (65) using the fact that B

i

is a nonnegative

matrix, which implies

⟨z|



I +

B

i

L



|z

′

⟩ ≤ ⟨z|e

B

i

L

|z

′

⟩ for all z, z

′

∈ {−1, 1}

n

. (66)

The denominator of (65) is upper bounded by 1, and (66) implies that the numerator is at least

(βΓ/L)

2

, therefore

π(z

′

)

π(z)

≥ e

−

4β

L



βΓ

L



2

. (67)

16

4.3 Transverse Ising models

In this section we will describe a set of canonical paths which will be used to show rapid mixing

of PIMC applied to Hamiltonians of the form (5). Let x, y ∈ Ω be spin conﬁgurations with

x = (x

1,1

, . . . , x

L,1

, . . . , x

1,n

. . . , x

L,n

) and y = (y

1,1

, . . . , y

L,1

, . . . , y

1,n

. . . , y

L,n

).

It will be convenient to choose an ordering on the space [L] × [n]. We order bits ﬁrst by the

second coordinate and then the ﬁrst, so that (r, l) ⪯ (s, m) if l ≤ m or if r ≤ s and l = m. For any

1 ≤ r ≤ L and 1 ≤ m ≤ n deﬁne

[≤ (r, m)] := ({1, . . . , L} × {1, . . . , m − 1}) ∪ ({1, . . . , r} × {m}) (68)

[> (r, m)] := ({r + 1, . . . , L} × {m}) ∪ ({1, . . . , L} × {m + 1, . . . , n}) . (69)

The canonical path γ

x,y

from x to y consists of simply replacing each bit of x with the corre-

sponding bit of y following the above ordering. This can be thought of as n steps, each of which

consists of L substeps that correspond to transitions of the Markov chain. The r-th substep of the

m-th step consists of ﬂipping the bit in the r-th Trotter slice of the m-th qubit if x

r,m

̸= y

r,m

or

leaving it alone if x

r,m

= y

r,m

. In the former case, this contributes an edge (v, v

′

) ∈ E(Ω) with

v = (y

1,1

, . . . , y

r−1,m

, x

r,m

, x

r+1,m

, . . . , x

L,n

) (70)

v

′

= (y

1,1

, . . . , y

r−1,m

, y

r,m

, x

r+1,m

. . . , x

L,n

). (71)

In the latter case (i.e. x

r,m

= y

r,m

), this step does not add an edge to the path. Thus the number

of edges in γ

x,y

equals the number of positions on which x and y disagree, which is always ≤ nL.

In Section 4.1 we deﬁned an encoding of a path to be an injective map η

e

from the set of paths

through e to the state space Ω. The encoding we’ll use for this path is

η

(v,v

′

)

(γ

x,y

) := x · y · v, (72)

where · to denote elementwise multiplication. To see that x, y are uniquely determined by v

together with η = η

(v,v

′

)

(γ

x,y

), observe that v · η = x · y, which speciﬁes the bits that are ﬂipped

along the path γ

x,y

. Moreover, the edge (v, v

′

) speciﬁes the location (r, m) of the bit being ﬂipped

in the transition from v to v

′

. Let (v · η)

≤(r,m)

denote the vector which equals v · η = x · y

on the coordinates in [≤ (r, m)] and equals 1 on the coordinates in [> (r, m)]; deﬁne (v · η)

>(r,m)

analogously. Then we can recover x by calculating v

′

·(v ·η)

≤(r,m)

, and can recover y by calculating

v

′

· (v · η)

>(r,m)

.

For any subset of the lattice sites A ⊆ Λ, deﬁne the restriction E

A

β

of the classical energy

function E

β

by restricting the sum in (9) to sites (i, j) ∈ A, so that for all z ∈ Ω we have

E

A

β

(z) =



(i,j)∈A

⎛

⎝

K

z

j

L

z

i,j

− β

−1

J

i

z

i,j

z

i+1,j

+



k∈A

K

zz

jk

L

z

i,j

z

i,k

⎞

⎠

. (73)

Now we compute

E

β

(v) = E

≤(r,m)

β

(y) + E

>(r,m)

β

(x) + B (74)

E

β

(η) = E

≤(r,m)

β

(x) + E

>(r,m)

β

(y) + B

′

(75)

17

where

B := β

−1

J

m

(y

1,m

x

L,m

+ y

r,m

x

r+1,m

) +

1

L

r



i=1

K

z

m

y

i,m

+

1

L



i=r

K

z

m

x

i,m

+

1

L

r



i=1

K

zz

m,m+1

y

i,m

x

i,m+1

+

1

L



i=r

K

zz

m−1,m

y

i,m−1

x

i,m

+

1

L



i=1

m−1



j=1

n



k=m

K

zz

jk

y

j

x

k

,

B

′

:= β

−1

J

m

(x

1,m

x

L,m

+ x

r,m

x

r+1,m

) +

1

L

r



i=1

K

z

m

x

i,m

+

1

L



i=r

K

z

m

y

i,m

+

1

L

r



i=1

K

zz

m,m+1

x

i,m

y

i,m+1

+

1

L



i=r

K

zz

m−1,m

x

i,m−1

y

i,m

+

1

L



i=1

m−1



j=1

n



k=m

K

zz

jk

x

j

y

k

.

We also need

E

β

(x) = E

≤(r,m)

β

(x) + E

>(r,m)

β

(x) + B

X

(76)

E

β

(y) = E

≤(r,m)

β

(y) + E

>(r,m)

β

(y) + B

Y

(77)

where

B

X

:=

L



i=1



K

z

m

L

x

i,m

+ β

−1

J

m

x

i,m

x

i+1,m



+

1

L

r



i=1

K

zz

m,m+1

x

i,m

x

i,m+1

+

1

L



i=r

K

zz

m−1,m

x

i,m−1

x

i,m

+

1

L



i=1

m−1



j=1

n



k=m

K

zz

jk

x

j

x

k

,

and B

Y

is deﬁned similarly. The value of M needed to satisfy (60) for this encoding can now be

determined by

π(x)π(y)

π(v)π(η)

= e

−β

(

E

β

(x)+E

β

(y)−E

β

(v)−E

β

(η)

)

= e

β(B+B

′

−B

X

−B

Y

)

(78)

To upper bound the exponent in the worst-case deﬁne J := max

m=1,...,n

J

m

, then

β(B + B

′

− B

X

− B

Y

) ≤ 8J + 4β + 4β

n/2



j=1

n



k=

n

2

+1

K

zz

jk

.

To upper bound the sum we use the assumption K

zz

jk

≤ |j − k|

−(2+ξ)

and regard it as a lower

Riemann sum for an integral,

n/2



j=1

n



k=

n

2

+1

|j − k|

−(2+ξ)

≤



j=

n

2

j=1



k=n

k=

n

2

+1

djdk |j − k|

−(2+ξ)

=

1 − 2

ξ+1

n

−ξ

+ (n − 1)

−ξ

ξ

2

+ ξ

,

18

and since ξ > 0 it suﬃces in (63) to take M = O



e

8J+4β



. If the 1D quantum system has

periodic boundary conditions connecting qubit 1 to qubit n, then the same analysis would show

that M = O



e

8(J+β)



suﬃces for (60). Therefore by (61)-(63) the congestion R satisﬁes

R ≤ M · P

−1

min

= O



e

8(β+J)

· 2nL · e

2

(

J+

β

L

)

· max

(z,z

′

)



π(z

′

π(z)





the factor of 2nL comes from the chain being lazy and the choice of random single-site updates. Note

that max

(z,z

′

)

(π(z

′

/π(z)) is O(L

2

Γ

−2

) from (67) and also ℓ := max

x,y

|γ

x,y

| = nL and ln π

−1

min

=

ln(Z ·e

nLJ+nβ

) ≤ nL(J + β + ln 2), using Z ≤ 2

nL

. In Section 3.1 we show that L = poly(n, δ

−1

mult

)

suﬃces to approximate the partition function with the desired precision. Therefore the mixing

time bounds in Section 4.1 imply that the PIMC method generates samples as was assumed in

Section 3.3, which completes the proof of theorem 1 for models of the form (5).

4.4 Polynomial time mixing in theorem 1

In this section we will prove rapid mixing of PIMC for Hamiltonians of the form (6) by describing

paths between arbitrary states x, y ∈ Ω. It will be convenient to represent states in the form

z = [¯z

1

, . . . , ¯z

n

] with ¯z

j

:= (z

1,j

, . . . , z

L,j

) denoting the spin values along the j-th worldline in

conﬁguration z ∈ Ω. We say that a “double jump” occurs in conﬁguration z at position (i, j) if

(1 + z

i,j

z

i+1,j

) (1 + z

i,j+1

z

i+1,j+1

) ̸= 0. For any consecutive pair of worldlines j, j + 1, deﬁne the

number of double jumps d(¯z

j

, ¯z

j+1

),

d(¯z

j

, ¯z

j+1

) :=

1

4

L



i=1

(1 + z

i,j

z

i+1,j

) (1 + z

i,j+1

z

i+1,j+1

) . (79)

For any conﬁguration z and worldline j we will deﬁne a path segment to a new conﬁguration

[¯z

1

, . . . , ¯z

j−2

, ¯z

′

j−1

, ¯z

′′

j

, ¯z

′

j+1

, ¯z

j+2

, . . . , ¯z

n

] with all of the double jumps removed from worldline j,

d(¯z

′

j−1

, ¯z

′′

j

) = d(¯z

′′

j

, ¯z

′

j+1

) = 0 (80)

The double prime notation indicates that the j-th worldline does not contain any double jumps,

while the single prime indicates a worldline that shares no jumps with its double primed neighboring

worldline.

There are many possible paths to choose from for removing double jumps in the i-th world-

line. Due to the form of the Hamitonian (6), changing the placement of a double jump along

the imaginary-time direction has no eﬀect on the energy of the eﬀective classical system, except

possibly for the part which arises from the diagonal part of the Hamiltonian. Similar reasoning

holds for the ordering of the single and double jumps; changing the ordering can only change the

eﬀective classical energy through the diagonal part of the Hamiltonian. Similarly, changing the

imaginary-time position of the double jumps in ¯x

i−1

, ¯x

i

will have no eﬀect on the contributions to

the eﬀective classical energy that comes from double jumps occuring in ¯x

i−2

, ¯x

i−1

. Finally, note

that changing the position of a double jump along the imaginary-time direction can only aﬀect

the classical energy through the diagonal part of the Hamiltonian, reducing the probability of the

conﬁguration by at most e

−4β

(the factor of 4 is due to the potential change from the diagonal part

of the Hamiltonian when 3 worldlines are changed arbitrarily).

19

From the above considerations it follows that the imaginary-time position of the double jumps in

a particular worldline can be changed freely with the stationary weight of the resulting conﬁguration

decreased by at most a factor of e

−4β

(the factor of 4 bounds the potential change from the

diagonal part of the Hamiltonian when 3 worldlines are changed arbitrarily). Furthermore, since

double jumps arising from the term σ

x

i

σ

x

i+1

are simply bit-ﬂips, two such jumps will cancel out

(“annihilate”) if they coincide. This allows for paths that remove these double jumps in pairs simply

by moving them around, with the additional guarantee that this will not lower the stationary weight

by more than e

−4β

. Therefore the path from [¯z

1

, . . . , ¯z

n

] to [¯z

1

, . . . , ¯z

j−2

, ¯z

′

j−1

, ¯z

′′

j

, ¯z

′

j+1

, ¯z

j+2

, . . . , ¯z

n

]

is deﬁned to remove double jumps in pairs by shifting the position of the double jump at the least

imaginary-time position along the imaginary-time direction until it encounters the next double

jump and the pair is annihilated.

The canonical path from x to y proceeds through the worldlines in order 1, . . . , n, prepping

each pair of consecutive worldlines to remove double jumps, then updating the j-th worldline from

¯x

′′

j

to ¯y

′′

j

. The full update of the ﬁrst worldline proceeds as,

[¯x

1

, ¯x

2

, . . . , ¯x

n

] → [¯x

′′

1

, ¯x

′

2

, ¯x

3

, . . . , ¯x

n

] → [¯y

′′

1

, ¯x

′

2

, ¯x

3

, . . . , ¯x

n

] → [¯y

′′

1

, ¯x

′′

2

, ¯x

′

3

, ¯x

4

, . . . , ¯x

n

]

→ [¯y

′′

1

, ¯y

′′

2

, ¯x

′

3

, ¯x

4

, . . . , ¯x

n

] → [¯y

1

, ¯y

′

2

, ¯x

′

3

, ¯x

4

, . . . , ¯x

n

].

The purpose of this series of steps (each of which consists of many single-site updates) is to avoid

passing through intermediate conﬁgurations with too many additional double jumps occuring across

the 1D domain wall formed near the worldlines which are being updated. The worst case interme-

diate conﬁgurations along the paths will have the form,

z = [¯y

1

, . . . , ¯y

j−2

, ¯y

′

j−1

, (y

′′

1,j

, . . . , y

′′

r,j

, x

′′

r+1,j

, . . . , x

′′

L,j

), ¯x

′

j+1

¯x

j+2

, . . . , ¯x

n

]. (81)

This system of paths is highly degenerate (a particular conﬁguration such as (81) will appear in

many paths). To bound the congestion we will use an encoding function η

(z,z

′

)

: P(z, z

′

) → Ω × Θ.

If η

(z,z

′

)

(γ

xy

) = (η, θ), then

η = [¯x

1

, . . . , ¯x

j−2

, ¯x

′

j−1

, (x

′′

1,j

, . . . , x

′′

r,j

, y

′′

r+1,j

, . . . , y

′′

L,j

), ¯y

′

j+1

, ¯y

j+2

, . . . , ¯y

n

]. (82)

and θ contains all of the information needed to reconstruct ¯x

j−1

, ¯x

j

, ¯x

j+1

, ¯y

j−1

, ¯y

j

, ¯y

j+1

from ¯x

′

j−1

, ¯x

′′

j

,

¯x

′

j+1

, ¯y

′

j−1

, ¯y

′′

j

, ¯y

′

j+1

. This lost information must specify the location of all of the jumps that have

been removed from these 6 worldlines. This means that the encoding sends exponentially paths to

each state, and so to obtain a polynomial time mixing bound we have to slightly generalize (63)

to account for the fact that the stationary probability on the endpoints of most of these paths is

suﬃciently small.

For any v ∈ Ω deﬁne d

j

(v) := d(¯v

j−1

, ¯v

j

) + d(¯v

j

, ¯v

j+1

) (which is just the total number of double

jumps in the j-th worldline of v). Let q be the number of additional double jumps in the j-th

worldlines of x and y which are not present in the j-th worldlines of z and η,

q = d

j

(x) + d

j

(y) − d

j

(z) − d

j

(η) (83)

Deﬁne P

q

z,z

′

to be the subset of canonical paths passing through (z, z

′

) with endpoints x and y that

satisfy (83). With these deﬁnitions the congestion (59) becomes

R ≤ max

(z,z

′

)

1

π(z)P (z, z

′

)

L



q=0



γ

xy

∈P

q

z,z

′

π(x)π(y), (84)

20

Let Θ = ∪

6L

q=1

Θ

q

where G(q) = |Θ

q

| =



6L

q



is the number of ways to restore q jumps to the

undetermined worldlines. Now suppose we ﬁnd M(q) such that

π(x)π(y) ≤ M(q) · π(z)π(η), (85)

for all γ

xy

∈ P

q

z,z

′

, then this implies

R ≤ P

−1

min

L



q=0

M(q)



γ

xy

∈P

q

z,z

′

π(η) (86)

≤ P

−1

min

L



q=0

M(q)G(q)



η∈Ω

π(η) (87)

= P

−1

min

L



q=0

M(q)G(q) (88)

To satisfy (85) we need M(q) to satisfy

M(q) ≥

π(x)π(y)

π(z)π(η)

(89)

=

π([¯x

j−2

, ¯x

j−1

, ¯x

j

, ¯x

j+1

, ¯x

j+2

])π([¯y

j−2

, ¯y

j−1

, ¯y

j

, ¯y

j+1

, ¯y

j+2

])

π([¯y

j−2

, ¯y

′

j−1

, ¯x

′

j

+ ¯y

′

j

, ¯x

′

j+1

, ¯x

j+2

])π([¯x

j−2

, ¯x

′

j−1

, ¯x

′

j

+ ¯y

′

j

, ¯y

′

j+1

, ¯y

j+2

])

(90)

≥ e

4β

π([¯x

j−1

, ¯x

j

, ¯x

j+1

])π([¯y

j−1

, ¯y

j

, ¯y

j+1

])

π([¯y

′

j

, ¯x

′

j

+ ¯y

′

j

, ¯x

′

j+1

])π([¯x

′

j−1

, ¯x

′

j

+ ¯y

′

j

, ¯y

′

j+1

])

(91)

where (90) to (91) follows because the the terms in the classical energy function which correspond

to oﬀ-diagonal parts of the Hamiltonian involving qubits j − 2 and j + 2 are unaﬀected by the

potentially changed spin values in the primed j − 1 and j + 1 worldlines. The factor of e

4β

comes

from upper bounding the change in the classical energy function due to the diagonal terms of the

Hamiltonian between qubits j − 2, j − 1 and j + 1, j + 2. The eﬀect of the diagonal part of the

Hamiltonian on the ratio expression remaining in (91) can be bounded by a factor of e

8β

. The

additional single jumps introduced near the domain wall can increase the ratio by at most L

12

(using (67), and noting that the worst case occurs when 2 new single jumps are introduced in

each of 3 worldlines, in each of z and η). Each of the double jumps in the numerator which are

not present in the denominator decreases the ratio by at least (2β/L). Therefore we may take

M(q) = e

12β

(2β)

q

L

−q

and (88) becomes

R ≤ e

12β

L

12

· P

−1

min

L



q=0



6L

q





2β

L



q

≤ L

12

· e

20β

· P

−1

min

. (92)

Using the fact that P

−1

min

is poly(n, β, Γ

−1

) from (67), and the fact that the maximum length of the

canonical paths used in this section is O(n

2

L

2

), this shows that τ

mix

is poly(n, e

β

, Γ

−1

) as required

to complete the proof of theorem 1.

21

4.5 Quasipolynomial time mixing in theorem 1

In this section we also work with a restricted state space Ω

∗

, which is deﬁned as the set of conﬁg-

urations with fewer than 2B := 4β log(n) jumps in each of the worldlines,

Ω

∗

= {z = [¯z

1

, . . . , ¯z

n

] ∈ Ω : |{i : z

i,j

̸= z

i+1,j

}| ≤ 2B for j = 1, . . . , n}. (93)

In Section 3.4 we show that Z

∗

=



z∈Ω

∗

e

−βE

β

(z)

satisﬁes |Z −Z

∗

| ≤ δZ. The PIMC dynamics can

be restricted to Ω

∗

by rejecting moves which would leave Ω

∗

, or alternatively the canonical paths in

this section bound the mixing time within Ω

∗

by the leaky random walk conductance bound in [15].

To show rapid mixing of PIMC within Ω

∗

we assign a canonical path γ

xy

to every x, y ∈ Ω

∗

. The

simple case for these paths occurs when x, y are in a subset Ω

∗

inner

⊆ Ω

∗

which only allows half as

many jumps per worldline as Ω

∗

does,

Ω

∗

inner

= {z = [z

1

, . . . , z

n

] ∈ Ω : |{i : z

i,j

̸= z

i+1,j

}| ≤ B for j = 1, . . . , n}. (94)

If x, y ∈ Ω

∗

inner

then we use the same paths which were used to show rapid mixing for TIM in

Section 4.3. Various intermediate points v along this path γ

x,y

may leave Ω

∗

inner

but will remain in

Ω

∗

since

v = (y

1,1

, . . . , y

L,1

, . . . , y

L−r−1,m

, x

L−r,m

, x

L−r+1,m

. . . , x

1,n

, . . . , x

L,n

), (95)

may have as many as B jumps in the ﬁrst r time slices of ¯y

m

and B jumps in the L −r latter time

slices of ¯x

m

, so ¯v

m

will have at most 2B jumps in total and so v ∈ Ω

∗

.

For points x, y ∈ Ω

∗

which are not in Ω

∗

inner

, we must not produce intermediate points with

too many jumps in any worldline. To accomplish this the paths will have a “clean up” step similar

to the one which was used in the previous section, to reduce the number of 2-local jumps in a

worldline. Consider an intermediate point of the path which has so far updated the worldlines of

the ﬁrst m qubits (which could be m = 0), v = [¯y

1

, . . . , ¯y

m

, ¯x

m+1

, . . . , ¯x

n

]. If ¯x

m+1

has more than B

double jumps then we remove these in imaginary-time order. Since we can no longer “annihilate”

these transitions in pairs as was done in the previous section, we instead dismantle them into 1-local

jumps (relying on the ﬁctitious transverse ﬁeld) and then annihilate those against one another. The

eﬀect of this process on the stationary weight of the conﬁguration will be analyzed after we have

given the encoding function for these paths. The net result of this process results is a worldline

¯x

′

m+1

with fewer than B jumps. Similarly let ¯y

′

m+1

be deﬁned from ¯y

m+1

with all but the ﬁrst B

jumps removed. Next we update ¯x

′

m+1

to ¯y

′

m+1

in the usual way, and since ¯x

′

m+1

and ¯y

′

m+1

each

have fewer than B jumps, the intermediate conﬁgurations along this path will have less than 2B

jumps in every worldline and therefore remain in Ω

∗

.

The encoding η : P

(z,z

′

)

→ Ω

∗

× Θ for these paths is again deﬁned by (82). This time the

maximum degeneracy of the encoding is



4L

B



, since there are at most 4L positions to which at

most B 2-local jumps can be restored in the undetermined worldlines. Unfortunately, this time we

cannot have a inequality analogous to (85), because the additional double jumps present in x and

y can completely change the bit-values in wordlines j −1 and j + 1. Unlike the case in the previous

section these altered spin values can reduce the classical energy drastically because of the h terms

acting on worldlines j − 2, j −1 and j + 1, j + 2. The stationary weight of z and η can be lowered

from x and y by a factor which scales exponentially in the number of double jumps that x and y

22

have in worldlines j − 2, j − 1 and j + 1, j + 2. Therefore,

M ≥

π(x)π(y)

π(z)π(η)

(96)

=

π([¯x

i−2

, ¯x

i−1

, ¯x

i

, ¯x

i+1

, ¯x

i+2

])π([¯y

i−2

, ¯y

i−1

, ¯y

i

, ¯y

i+1

, ¯y

i+2

])

π([¯y

i−2

, ¯y

′

i−1

, ¯x

′

i

+ ¯y

′

i

, ¯x

′

i+1

, ¯x

i+2

])π([¯x

i−2

, ¯x

′

i−1

, ¯x

′

i

+ ¯y

′

i

, ¯y

′

i+1

, ¯y

i+2

])

(97)

≥ e

4β

·



βΓ

L



−4B

π([¯x

i−1

, ¯x

i

, ¯x

i+1

])π([¯y

i−1

, ¯y

i

, ¯y

i+1

])

π([¯y

′

i−1

, ¯x

′

i

+ ¯y

′

i

, ¯x

′

i+1

])π([¯x

′

i−1

, ¯x

′

i

+ ¯y

′

i

, ¯y

′

i+1

])

(98)

≥ e

12β

·



βΓ

L



−8B

, (99)

The ﬁctitious transverse ﬁeld has magnitude δ

mult

/n, so the congestion satisﬁes

R ≤ P

−1

min

M · G (100)

≤ P

−1

min

· e

12β

·



poly(n) · δ

−1

mult



8βΓ log n

·



4L

4βΓ log n



, (101)

from which it follows that the mixing time τ

mix

is O(2

β(log nδ

−1

mult

)

2

) and we obtain the overall PIMC

runtime as claimed in theorem 1.

Acknowledgments

Part of this work was completed while EC was funded by the Institute for Quantum Information

and Matter, an NSF Physics Frontiers Center (NSF Grant PHY-1125565) with support of the

Gordon and Betty Moore Foundation (GBMF-12500028). AWH was funded by NSF grants CCF-

1452616, CCF-1629809, CCF-1729369 and ARO contract W911NF-17-1-0433. The research is

based upon work partially supported by the Oﬃce of the Director of National Intelligence (ODNI),

Intelligence Advanced Research Projects Activity (IARPA), via the U.S. Army Research Oﬃce

contract W911NF-17-C-0050. The views and conclusions contained herein are those of the authors

and should not be interpreted as necessarily representing the oﬃcial policies or endorsements,

either expressed or implied, of the ODNI, IARPA, or the U.S. Government. The U.S. Government

is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any

copyright annotation thereon.

References

[1] Tameem Albash and Daniel A. Lidar. Adiabatic quantum computation. Rev. Mod. Phys., 90:

015002, Jan 2018. DOI: 10.1103/RevModPhys.90.015002.

[2] Tameem Albash, Gene Wagenbreth, and Itay Hen. Oﬀ-diagonal expansion quantum monte

carlo. Physical Review E, 96(6):063309, 2017. DOI: 10.1103/PhysRevE.96.063309.

[3] Evgeny Andriyash and Mohammad H Amin. Can quantum Monte Carlo simulate quantum

annealing?, 2017, arXiv:1703.09277.

[4] Itai Arad, Zeph Landau, Umesh Vazirani, and Thomas Vidick. Rigorous RG algorithms and

area laws for low energy eigenstates in 1D. Communications in Mathematical Physics, 356(1):

65–105, 2017. DOI: 10.1007/s00220-017-2973-z.

23

[5] Ivona Bez´akov´a, Daniel

ˇ

Stefankoviˇc, Vijay V. Vazirani, and Eric Vigoda. Accelerating sim-

ulated annealing for the permanent and combinatorial counting problems. SIAM Journal on

Computing, 37(5):1429–1454, 2008. DOI: 10.1137/050644033.

[6] Lucas T Brady and Wim van Dam. Quantum Monte Carlo simulations of tunneling in quantum

adiabatic optimization. Physical Review A, 93(3):032304, 2016, arXiv:1509.02562. DOI:

10.1103/PhysRevA.93.032304.

[7] Lucas T Brady and Wim van Dam. Spectral-gap analysis for eﬃcient tunneling in quantum

adiabatic optimization. Physical Review A, 94(3):032309, 2016, arXiv:1601.01720. DOI:

10.1103/PhysRevA.94.032309.

[8] Sergey Bravyi. Monte Carlo simulation of stoquastic Hamiltonians. Quantum Information &

Computation, 15(13-14):1122–1140, 2015, arXiv:1402.2295. DOI: 10.5555/2871363.2871366.

[9] Sergey Bravyi and David Gosset. Polynomial-time classical simulation of quantum ferromag-

nets. Physical review letters, 119(10):100503, 2017, arXiv:1612.05602. DOI: 10.1103/Phys-

RevLett.119.100503.

[10] Sergey Bravyi and Matthew Hastings. On complexity of the quantum Ising model. Communi-

cations in Mathematical Physics, 349(1):1–45, 2017, arXiv:1410.0703. DOI: 10.1007/s00220-

016-2787-4.

[11] Sergey Bravyi and Barbara M. Terhal. Complexity of stoquastic frustration-free Hamiltonians.

SIAM J. Comput., 39(4):1462–1485, 2009, arXiv:0806.1746. DOI: 10.1137/08072689X.

[12] Sergey Bravyi, David P. DiVincenzo, Roberto I. Oliveira, and Barbara M. Terhal. The com-

plexity of stoquastic local Hamiltonian problems. Quant. Inf. Comp., 8(5):0361–0385, 2006,

arXiv:quant-ph/0606140. DOI: 10.5555/2011772.2011773.

[13] Jacob Bringewatt, William Dorland, Stephen P. Jordan, and Alan Mink. Diﬀusion monte carlo

approach versus adiabatic computation for local hamiltonians. Phys. Rev. A, 97:022323, Feb

2018. DOI: 10.1103/PhysRevA.97.022323.

[14] Alessandra Cipriani and Paolo Dai Pra. Decay of correlations for quantum spin systems with

a transverse ﬁeld: A dynamic approach, 2010, arXiv:1005.3547.

[15] Elizabeth Crosson and Aram W Harrow. Simulated quantum annealing can be exponentially

faster than classical simulated annealing. In Foundations of Computer Science (FOCS), 2016

IEEE 57th Annual Symposium on, pages 714–723. IEEE, 2016, arXiv:1601.03030. DOI:

10.1109/FOCS.2016.81.

[16] Elizabeth Crosson, Edward Farhi, Cedric Yen-Yu Lin, Han-Hsuan Lin, and Peter

Shor. Diﬀerent strategies for optimization using the quantum adiabatic algorithm, 2014,

arXiv:1401.7320.

[17] Toby S. Cubitt, Ashley Montanaro, and Stephen Piddock. Universal quantum hamil-

tonians. Proceedings of the National Academy of Sciences, 115(38):9497–9502, 2018,

arXiv:1701.05182. ISSN 0027-8424. DOI: 10.1073/pnas.1804949115. URL https://www.

pnas.org/content/115/38/9497.

[18] Ky Fan. Maximum properties and inequalities for the eigenvalues of completely continuous

operators. Proceedings of the National Academy of Sciences, 37(11):760–766, 1951. ISSN

0027-8424. DOI: 10.1073/pnas.37.11.760.

[19] Edward Farhi, Jeﬀrey Goldstone, Sam Gutmann, Joshua Lapan, Andrew Lundgren, and

Daniel Preda. A quantum adiabatic evolution algorithm applied to random instances of an

np-complete problem. Science, 292(5516):472–475, 2001. DOI: 10.1126/science.1057726.

24

[20] Edward Farhi, Jeﬀrey Goldstone, and Sam Gutmann. Quantum adiabatic evolution algorithms

versus simulated annealing, 2002, arXiv:quant-ph/0201031.

[21] WMC Foulkes, L Mitas, RJ Needs, and G Rajagopal. Quantum monte carlo simulations of

solids. Reviews of Modern Physics, 73(1):33, 2001. DOI: 10.1103/RevModPhys.73.33.

[22] M. B. Hastings. Quantum adiabatic computation with a constant gap is not useful in one

dimension. Phys. Rev. Lett., 103:050502, 2009, arXiv:0902.2960. DOI: 10.1103/Phys-

RevLett.103.050502.

[23] Matthew B Hastings and MH Freedman. Obstructions to classically simulating the quan-

tum adiabatic algorithm. Quantum Information & Computation, 13(11-12):1038–1076, 2013,

arXiv:1302.5733. DOI: 10.5555/2535639.2535647.

[24] Wassily Hoeﬀding. Probability inequalities for sums of bounded random vari-

ables. Journal of the American Statistical Association, 58(301):13–30, 1963. DOI:

10.1080/01621459.1963.10500830.

[25] Layla Hormozi, Ethan W. Brown, Giuseppe Carleo, and Matthias Troyer. Nonstoquastic

hamiltonians and quantum annealing of an ising spin glass. Phys. Rev. B, 95:184416, May

2017. DOI: 10.1103/PhysRevB.95.184416.

[26] Michael Jarret and Brad Lackey. Substochastic Monte Carlo algorithms, 2017,

arXiv:1704.09014.

[27] Michael Jarret, Stephen P Jordan, and Brad Lackey. Adiabatic optimization versus diﬀusion

monte carlo methods. Physical Review A, 94(4), 2016, arXiv:1607.03389. DOI: 10.1103/Phys-

RevA.94.042318.

[28] Mark Jerrum and Alistair Sinclair. Polynomial-time approximation algorithms for the Ising

model. SIAM Journal on Computing, 22:1087–1116, 1993. DOI: 10.1137/0222066.

[29] Mark Jerrum and Alistair Sinclair. The Markov chain Monte Carlo method: An approach to

approximate counting and integration. In Dorit S. Hochbaum, editor, Approximation Algo-

rithms for NP-hard Problems, pages 482–520. PWS Publishing Co., Boston, MA, USA, 1997.

ISBN 0-534-94968-1. DOI: 10.5555/241938.241950.

[30] Zhang Jiang, Vadim N Smelyanskiy, Sergei V Isakov, Sergio Boixo, Guglielmo Mazzola,

Matthias Troyer, and Hartmut Neven. Scaling analysis and instantons for thermally assisted

tunneling and quantum Monte Carlo simulations. Physical Review A, 95(1):012322, 2017,

arXiv:1603.01293. DOI: 10.1103/PhysRevA.95.012322.

[31] Tadashi Kadowaki and Hidetoshi Nishimori. Quantum annealing in the transverse Ising

model. Phys. Rev. E, 58:5355–5363, Nov 1998, arXiv:cond-mat/9804280. DOI: 10.1103/Phys-

RevE.58.5355.

[32] Linghang Kong and Elizabeth Crosson. The performance of the quantum adiabatic algorithm

on spike Hamiltonians. Int. J. Quantum Inform., 15(1750011), 2015, arXiv:1511.06991. DOI:

10.1142/S0219749917500113.

[33] Zeph Landau, Umesh Vazirani, and Thomas Vidick. A polynomial-time algorithm for the

ground state of 1D gapped local Hamiltonians. Nature Physics, 2013, arXiv:1307.5143.

DOI: 10.1038/nphys3345.

[34] D.A. Levin, Y. Peres, and E.L. Wilmer. Markov Chains and Mixing Times. American Math-

ematical Soc., 2008. ISBN 9780821886274. DOI: 10.1090/mbk/107.

[35] Fabio Martinelli and Marc Wouts. Glauber dynamics for the quantum Ising model in a

transverse ﬁeld on a regular tree. Journal of Statistical Physics, 146(5):1059–1088, 2012,

arXiv:1105.5970. DOI: 10.1007/s10955-012-0436-7.

25

[36] Roman Marton´ak, Giuseppe E. Santoro, and Erio Tosatti. Quantum annealing by the path-

integral monte carlo method: The two-dimensional random ising model. Phys. Rev. B, 66:

094203, Sep 2002. DOI: 10.1103/PhysRevB.66.094203.

[37] Siddharth Muthukrishnan, Tameem Albash, and Daniel A Lidar. Tunneling and speedup in

quantum optimization for permutation-symmetric problems. Physical Review X, 6(3):031010,

2016. DOI: 10.1103/PhysRevX.6.031010.

[38] Ben W Reichardt. The quantum adiabatic optimization algorithm and local minima. In

Proceedings of the thirty-sixth annual ACM symposium on Theory of computing, pages 502–

510. ACM, 2004. DOI: 10.1145/1007352.1007428.

[39] Anders W Sandvik. Stochastic series expansion method with operator-loop update. Physical

Review B, 59(22):R14157, 1999. DOI: 10.1103/PhysRevB.59.R14157.

[40] Lorenzo Stella and Giuseppe E Santoro. Quantum annealing of an Ising spin-glass by Green’s

function Monte Carlo. Physical Review E, 75(3):036703, 2007, arXiv:cond-mat/0608420.

DOI: 10.1103/PhysRevE.75.036703.

[41] Yuki Susa, Yu Yamashiro, Masayuki Yamamoto, and Hidetoshi Nishimori. Exponential

speedup of quantum annealing by inhomogeneous driving of the transverse ﬁeld. Journal

of the Physical Society of Japan, 87(2):023002, 2018. DOI: 10.7566/JPSJ.87.023002.

[42] Masuo Suzuki. Quantum statistical Monte Carlo methods and applications to spin sys-

tems. Journal of Statistical Physics, 43(5-6):883–909, 1986. ISSN 0022-4715. DOI:

10.1007/BF02628318.

[43] Masuo Suzuki, Seiji Miyashita, and Akira Kuroda. Monte Carlo simulation of quantum spin

systems. Prog. Theor. Phys., 58(5):1377–1387, 1977. DOI: 10.1143/PTP.58.1377.

[44] S. Suzuki, J. Inoue, and B.K. Chakrabarti. Quantum Ising Phases and Transitions in Trans-

verse Ising Models. Lecture notes in physics. Springer, 2013. ISBN 9783642330391. DOI:

10.1007/978-3-642-33039-1.

[45] Frank Verstraete, Valentin Murg, and J Ignacio Cirac. Matrix product states, projected entan-

gled pair states, and variational renormalization group methods for quantum spin systems. Ad-

vances in Physics, 57(2):143–224, 2008, arXiv:0907.2796. DOI: 10.1080/14789940801912366.

[46] Walter Vinci and Daniel A Lidar. Non-stoquastic hamiltonians in quantum annealing via

geometric phases. npj Quantum Information, 3(1):1–6, 2017. DOI: 10.1038/s41534-017-0037-

z.

26

Comments

Products

Project