There are a lot of algorithms that can prove mathematical identitie...
There are many representations of $e$ in this link: [List of repres...
Read more about this algorithm here: [Meet in the middle](https://w...
Gradient descent (a.k.a steepest descent) is one of the best-known ...
Euler's equation is often called the most beautiful equation in mat...
To see the hydrogen spectral lines Rydberg was observing, see this ...
It can be shown that the digits of an irrational number do not term...
One can read more about generalized continued fractions, including ...
Super-exponential means that it grows even faster than an exponenti...
Since these are new results (conjectures) if you are able to prove ...
Fermat’s last theorem (a.k.a Fermat’s conjecture) states that no th...
Another cool example of such a collaborative project is SETI, searc...
We believe that sometimes the road can be just as important and ill...
This is probably one of the most important comments in this work an...
The Riemann hypothesis (a.k.a Riemann conjecture) is probably consi...
This figure shows one of the most powerful outcomes of the result. ...
Apery’s constant is the value of the Riemann zeta function at zeta=...
The idea here is that just like mathematicians study a lot of formu...
There are so many important constants in mathematics and other fiel...
The Ramanujan Machine: Automatically Generated
Conjectures on Fundamental Constants
Gal Raayoni
1
, George Pisha
1
, Yahel Manor
1
, Uri Mendlovic
2
, Doron Haviv
1
, Yaron
Hadad
1
, and Ido Kaminer
1
1
Technion - Israel Institute of Technology, Haifa 3200003, Israel
2
Google Inc., Tel Aviv 6789141, Israel
Abstract
Fundamental mathematical constants like
e
and
π
are ubiquitous in diverse fields of science, from
abstract mathematics and geometry to physics, biology and chemistry. Nevertheless, for centuries
new mathematical formulas relating fundamental constants have been scarce and usually discovered
sporadically. In this paper we propose a novel and systematic approach that leverages algorithms
for deriving new mathematical formulas for fundamental constants and help reveal their underlying
structure. Our algorithms find dozens of well-known as well as previously unknown continued fraction
representations of
π
,
e
, and the Riemann zeta function values. Two new conjectures produced by our
algorithm, along with many others, are:
e = 3 +
1
4 +
2
5+
3
6+
4
7+...
,
4
π 2
= 3 +
1 · 3
5 +
2·4
7+
3·5
9+
4·6
11+...
We present two algorithms that proved useful in finding new results: a variant of the Meet-In-
The-Middle (MITM) algorithm and a Gradient Descent (GD) tailored to the recurrent structure
of continued fractions. Both algorithms are based on matching numerical values and thus find
new conjecture formulas without providing proofs and without requiring prior knowledge on any
mathematical structure. This approach is especially attractive for fundamental constants for which no
mathematical structure is known, as it reverses the conventional approach of sequential logic in formal
proofs. Instead, our work presents a new conceptual approach for research: computer algorithms
utilizing numerical data to unveil new internal structures and conjectures, thus playing the role of
mathematical intuition of great mathematicians of the past, providing leads to new mathematical
research.
Code available at: http://www.ramanujanmachine.com/ or https://github.com/AnonGit90210/RamanujanMachine.
1
1 Introduction
Fundamental mathematical constants such as
e
,
π
, the golden ratio
ϕ
, and many others play an instrumental
part in diverse fields such as geometry, number theory, calculus, fundamental physics, biology, and ecology
[
1
]. Throughout history simple formulas of fundamental constants symbolized simplicity, aesthetics, and
mathematical beauty. A couple of well-known examples include Euler’s identity
e
+ 1 = 0 or the
continued fraction representation of the Golden ratio:
ϕ =
1
1 +
1
1+
1
1+...
. (1)
The discovery of such Regular Formulas (RFs)
1
was often sporadic and considered an act of math-
ematical ingenuity or profound intuition. One prominent example is Gauss’ ability to see meaningful
patterns in numerical data that led to new fields of analysis such as elliptic and modular functions and
to the hypothesis of the Prime Number Theorem. He is even famous for saying “I have the result, but I
do not yet know how to get it” [
2
], which emphasizes the role of identifying patterns and RFs in data as
enabling acts of mathematical discovery.
In a different field but in a similar manner, Johannes Rydberg’s discovery of his formula of hydrogen
spectral lines [
3
], resulted from his data analysis of the spectral emission by chemical elements:
λ
1
=
R
H
(
n
2
1
n
2
2
), where
λ
is the emission wavelength,
R
H
is the Rydberg constant,
n
1
and
n
2
are the
upper and lower quantum energy levels respectively. This insight, emerging directly from identifying
patterns in the data, had profound implications on modern physics and quantum mechanics.
Unlike measurements in physics and all other sciences,
mathematical
constants can be calculated
to an arbitrary precision (number of digits) with an appropriate formula, thus providing an
absolute
ground truth
. In this sense, mathematical constants contain an unlimited amount of data (e.g. the
infinite sequence of digits in an irrational number), which we propose to use as a ground truth for finding
new RFs. Since the fundamental constants are universal and ubiquitous in their applications, finding such
patterns can reveal new mathematical structures with broad implications, e.g. the Rogers-Ramanujan
continued fraction (which has implications on modular forms) and the Dedekind η and j functions [4, 5].
Consequently, having a
systematic
method to derive new RFs can help research in many fields of science.
In this paper, we establish a novel method to learn mathematical relations between constants and we
present a list of new conjectures found using this method. While the method can be leveraged for many
forms of RFs, we demonstrate its potential with equations of the form of generalized continued fractions
(GCFs):
x = a
0
+
b
1
a
1
+
b
2
a
2
+
b
3
a
3
+...
, (2)
where
a
n
, b
n
Z
for
n
= 1
,
2
, . . .
are partial numerators and denominators respectively. GCFs in which
the partial numerators and denominators follow a closed-form expression like a polynomial have been of
interest to mathematicians for centuries and still are today, e.g. William Broucker’s π representation [6]
or [1, 7, 8].
We demonstrate our approach by finding identities between a GCF and the value of a rational function
at a fundamental constant. For simplicity, enumeration and expression aesthetics, we limit ourselves to
integer polynomials on both sides of the equality. We propose two search algorithms. The first algorithm
uses the Meet-In-The-Middle (MITM) algorithm to a relatively small precision in order to reduce the
search space and eliminate mismatches. It increases the precision with a larger number of GCF iterations
on the remaining hits to validate them as new conjectured RFs, and is therefore called MITM-RF. The
second algorithm uses an optimization-based method, which we call Descent&Repel, converging to integer
lattice points that define new conjectured RFs.
1
By regular formulas we refer to any mathematical expression or equality that is infinite in nature but can be encapsulated
using a finite expression.
2
Our MITM-RF algorithm was able to produce several novel conjectures, for example:
Conjectures 1-4:
Sample of automatically generated conjectures for mathematical formulas of funda-
mental constants, as generated by our proposed Ramanujan Machine by applying the MITM-RF algorithm.
All these results are previously unknown conjectures to the best of our knowledge. Both results for
π
converge exponentially and both results for
e
converge super-exponentially. See Table 3 in the Appendix for
additional results from our algorithms along with their convergence rates, which we separate to previously
known formulas and new formulas.
One may wonder whether the conjectures discovered by this work are indeed mathematical identities or
merely mathematical coincidences that breakdown once enough digits are calculated. However, the method
employed in this work makes it fairly unlikely for the conjectures to breakdown. For an enumeration space
of 10
9
and result accuracy of more than 50 digits, the probability of finding a random match is smaller
than 10
40
. This minuscule probability makes us believe that the new conjectures are truths awaiting a
rigorous proof by the mathematical community. In the past, the development of such proofs led to new
discoveries, such as the consequences on number theory of the proof of Fermat’s last theorem [
9
]. We
believe and hope that proofs of these new conjectures will lead to new discoveries in the future.
After discovering dozens of GCFs we observed empirically that there is a relationship between the
ratio of the polynomial order of
a
n
and
b
n
and the rate at which the formula converges as a function of
the number of iterations. This relationship was also proven rigorously in the Appendix.
In contrast to the method we present, many known RFs of fundamental constants were discovered by
conventional mathematical proofs, i.e. sequential logical steps derived from known properties of these
constants. For example, several RFs of
e
and
π
were generated using the Taylor expansion of the exponent
and the trigonometric functions and using Euler’s continued fraction formula [
10
], which connects an
infinite sum and an infinite GCF. In our work, we aim to reverse this process, finding new RFs for
the fundamental constants using their numerical data alone, without any prior knowledge about their
mathematical structure. Each RF may enable reverse-engineering of the mathematical structure that
produces the RF, and provide new insight on the field. Our approach is especially powerful in cases of
empirical constants, such as the Feigenbaum constant from chaos theory (Table 2), which are derived
numerically from simulations and have no analytic representation.
Given the success of our approach in finding new RFs, there are many additional avenues for more
advanced algorithms and future research. Inspired by worldwide collaborative efforts in mathematics such as
the Great Internet Mersenne Prime Search (GIMPS) we launch the initiative
www.RamanujanMachine.com
,
dedicated for finding new RFs for fundamental constants. The general community can donate computational
time to find RFs, propose mathematical proofs for conjectured RFs, or suggest new algorithms for finding
3
them (see Appendix Section 4).
2 Related Work
Stated in an oversimplified manner, the process of mathematical research usually includes two main steps:
conjecturing and proving (as in Fig. 1). It is the latter step that was studied extensively in the computer
science literature and is known as Automated Theorem Proving (ATP), which focuses on proving
existing
conjectures
- fundamentally different from our work that focuses on generating
new conjectures
. In
ATP, algorithms already proved many theorems [
11
], most notably the Four Color Theorem [
12
], the
Lorenz attractor problem [
13
], the Keppler Conjecture on the density of sphere packing [
14
], as well as
proving a conjectured identity for
ζ
(4) [
15
], and recent results [
16
,
17
]. In contrast, it is the "conjecturing"
step of the process of mathematical research that is the focus of this paper: automatic conjecturing of
RFs.
Figure 1:
Conceptual flow of our Ramanujan Machine. First, through various approaches of pattern
learning & generalization (Section 5.1), we can generate a space of RF conjectures, e.g. GCFs. We
then apply a search algorithm, validate potential conjectures, and remove redundant results. Finally,
the validated results form mathematical conjectures that need to be proven analytically, thus closing a
complete research endeavour from pattern generation to a proof and potentially a new mathematical
insight.
Proposing conjectures is often times more significant than proving them. For this reason some of the
most original mathematicians and scientists are known for their famous
unsolved conjectures
rather
than for their solutions to other problems, like Fermat’s last theorem, Hilbert’s problems, Landau’s
problems, Hardy-Littlewood prime tuple conjecture, Birch-Swinnerton-Dyer conjecture, and of course
the Riemann Hypothesis [
9
,
18
,
19
,
20
,
21
]. Maybe the most famous example is Ramanujan, who posed
dozens of conjectures involving fundamental constants and considered them to be revelations from one of
his goddesses [
22
]. In our work,
we aim to automate the process of conjectures generation
. We
demonstrate this concept by providing new conjectures for fundamental constants.
3 The Meet-In-The-Middle-RF Algorithm
Given a fundamental constant c (e.g. c = π), our goal is to learn a set of four polynomials (α,β,γ,δ):
γ(c)
δ(c)
= f
i
(GCF(α, β)) (3)
for
{f
i
}
a given set of functions (e.g.
f
1
(
x
) =
x, f
2
(
x
) =
1
x
, . . .
), where
GCF
(
α, β
) means the generalized
continued fraction with the partial numerator and denominator
a
n
=
α
(
n
) ;
b
n
=
β
(
n
) respectively as
defined in Eq. (2). α, β, γ and δ are integer polynomials.
As showcased in Fig. 2, we start by enumerating over the two sides of Eq.
(3)
and successively
4
Figure 2:
The Meet-In-The-Middle (MITM-RF) method: first we enumerate RHS to a low precision,
values are stored in a hash-table. Then we enumerate LHS to a low precision and search for matches.
The matches are reevaluated to a higher precision and compared again. The process is repeated until a
specified, arbitrary decimal precision is reached, thus reducing false positives. The final results are then
posed as new conjectures.
generating many different integer polynomials for
α, β, γ, δ
2
. We calculate the Right-Hand-Side (RHS) of
each instance up to a limited number of iterations and store the results in a hash-table. We continue by
evaluating the Left-Hand-Side (LHS) up to a pre-selected decimal point. We attempt to then locate each
result from the LHS in the hash-table with the RHS results, where successful attempts are considered as
candidate solutions, and will be referred to as "hits".
Since the LHS and RHS calculations are performed up to a limited precision, several of the hits are
bound to be false positives. We then eliminate these false positives by calculating the rational function to
an arbitrary precision to reduce the likelihood that the equality is coincidental as shown in Fig. 3.
A naive enumeration method is very computationally intensive with time complexity of
O
(
MN
), where
M
and
N
are the LHS and RHS space size respectively, and space complexity of
O
(1). Since calculating
the RHS is more computationally costly, we store the RHS in the hash-table in order to significantly
reduce computation time at the expense of space. This makes the algorithm’s time complexity
O
(
M
+
N
)
and its space complexity
O
(
N
). Moreover, the hash-table of the GCF (RHS) can be saved and reused for
further LHS enumerations, reducing future enumeration durations by a significant amount.
We also generalize the aforementioned algorithm to allow for
α
and
β
to be interlaced sequences, i.e.
they may consist of multiple integer polynomials. The most simple example of a non-trivial interlaced
sequence is a sequence for which even values of
n
are equal to one polynomial and odd values of
n
are
equal to a different polynomial. For results and details see Appendix Section 1 and the MITM-RF code
on www.RamanujanM achine.com.
2
We delete instances which produce trivial results like
γ
= 3
· δ
or
α
= 0 and instances whose
β
polynomial has zero roots,
which result in a finite GCF, necessarily representing a rational number.
5
Figure 3:
Convergence rate of the GCFs found by our method. Plots of the absolute difference between
the GCFs and the corresponding fundamental constant (i.e. the error) vs. the depth of the GCF. All the
results were found by our MITM-GCF algorithm. On the left are GCFs that converge exponentially/super-
exponentially (validated numerically), and on the right are GCFs which converge polynomially. The vast
majority of previously known results for
π
converge polynomially, while all of our newly found results
converge exponentially or super-exponentially.
With our proposed MITM algorithm we were able to discover new regular GCFs for fundamental
constants other than those previously known. However, seeing how successful our algorithm was despite
being relatively simplistic, we believe there is still ample room for new results, which should follow by
leveraging more sophisticated algorithms and more precise techniques, thus discovering hidden truths
about fundamental constants that may be considered to be more exotic than
π
and
e
perhaps with
formulas that are more complex than the GCFs that were used in this work.
3.1 Other Fundamental Constants with the MITM-RF
We also studied other fundamental constants of more exotic nature than
π
and
e
and found two new GCFs
for Apéry’s constant
ζ
(3) =
P
n=1
1
n
3
. Note that the MITM-RF algorithm does not need to use any prior
knowledge on the fundamental constant. However, there is a vast body of research on the properties of
many fundamental constants from which various structures can be inferred. Hence when aiming for such a
constant, one promising way to utilize such prior knowledge is to study other formulas of the fundamental
constant, in attempt to find a common element and use that as a prior for the MITM-RF algorithm.
Such an approach can reduce dramatically the enumeration space and the computational complexity, thus
improving the chance for finding possible solutions. As a proof of concept of this approach for the Apéry
constant, consider formulas 1 & 2.
6
Formulas 1 & 2
: New formulas for
ζ
(3) discovered using prior knowledge incorporated into MITM. The
top formula converges polynomially while to bottom converges exponentially. See Appendix Section 2 and
the attached code for details.
4 Descent&Repel
We propose a GD optimization method and demonstrate its success in finding RFs, and compare it with
the MITM-RF method. The MITM-RF method, although proved successful, is not trivially scalable. This
issue can be targeted by either a more sophisticated variant or by switching to an optimization based
method, as is done by the following algorithm.
As explained in Section 3, we want to find integral solutions to Eq.
(3)
. This can also be written as
the following constrained optimization problem:
minimize
α, β, γ, δ
L =
γ(π)
δ(π)
GCF(α, β)
.
subject to {α, β, γ, δ} Z[x]
(4)
Solving this optimization problem with GD appears implausible, since we are only satisfied with global
minima without any error and the solutions must be integers. However, we found an important feature
of the loss landscape of the described problem that allows it to be solved with a slightly modified GD
that we name ’Descent&Repel’ (Fig. 4, example of results in Table 1). The minima are not 0-dimensional
points but (
d
1)-dimensional manifolds with
d
being the number of optimization variables as would be
expected given the single constraint. Moreover, we observed empirically that all minima are global and
their errors are zero, therefore any GD process will result in a solution with
L
= 0. It is well known that
any real number can be expressed as a simple continued fraction [
23
], and the aforementioned feature
hints that this may also be true for GCFs with integer polynomials.
Convergence Known / New Formula Polynomials
Exponential known
4
π
= 1 +
1
3+
4
5+
9
7+...
a
n
= 1 + 2n, b
n
= n
2
Super-Exponential new e = 3 +
1
4+
2
5+
3
6+...
a
n
= 3 + n, b
n
= n
Table 1: RFs for π and e found in a proof-of-concept run of the Descent&Repel algorithm.
We chose the variables of the optimization problem as the coefficients of the
α, β, γ, δ
polynomials in
Eq.
(4)
. The optimization problem is initiated with a large set of points, specifically in the examples we
7
present all initial conditions were set on a line, as is showcased in Fig. 4. We iteratively perform GD for
each point and then force all points to repel from one another via a "Coulomb"-like repulsion. To find
integer solutions, we finalize our algorithm by GD steps toward the integer lattice and toward the minima
curve, thus returning only solutions that lie on the integer lattice.
Figure 4:
Schematic diagram of our Descent&Repel algorithm for finding new RFs for fundamental
constants, by relying on GD optimization. The key observation that enables this method is that all minima
are global (
L
= 0) and appear as (
d
1)-dimensional manifolds, where
d
is the number of optimization
variables. Starting with our initial conditions (in this example, consisting of 600 points) on a line in (a),
we perform ordinary GD alternating with "Coulomb" repulsion between all the points (b,c). Finally, to
arrive to grid points, we perform GD toward integral points (with the loss function
sin
2
(
πx
)) and toward
the minimum curves, alternately (d). Lastly, we check whether any point satisfies the equation.
5 Discussion
5.1 Hypotheses Generation
Our results so far point to new interesting questions and hypotheses about fundamental constants: For
example, we found many more continued fractions for
e
than for other constants we tested, despite the
much smaller space tested for it with our algorithm. Why does it seem that some fundamental constants
have more RFs compared to others? More generally, which fundamental constants can even be expressed
with polynomial GCFs? Could there be constants for which RFs don’t exist at all (also in Section 4)? It
is intriguing that the novel research method we propose with the Ramanujan Machine not only finds new
conjectures about RFs of fundamental constants, but also about the intrinsic mathematical structures.
A new conjecture about the mathematical structure of GCFs that emerged from this research and
that we successfully proved concerns the rate of convergence of a GCF as a function of the degrees of
the
α, β
polynomials. We observed and later proved that when
deg(β)
deg(α)
>
2, then the convergence is
8
always polynomial in the GCF depth. When the ratio is smaller than 2, then the convergence is always
super-exponential. When the ratio is precisely 2, then the convergence can be exponential, depending on
more subtle conditions (see Appendix Section 3 for details). This result allowed us to further improve
MITM-RF algorithm.
We propose a systematic way of generating a space of candidate RF conjectures, generalizing beyond
the examples that we explored above. To establish new candidate mathematical conjectures, we envision
harvesting the scientific literature (e.g., arXiv.org containing over 1.5M papers) and generalizing RFs
with machine learning algorithms such as clustering methods. The rich dataset available online should
provide a strong ground truth for candidate RFs, which can be explored using algorithms similar to the
ones described in this work. Such approach may discover many new mathematical conjectures that go far
beyond GCFs and can be explored in a future work.
5.2 Applications
New RF conjectures could have intriguing applications. Fast converging GCFs and other identities are
being utilized for efficient calculation of different constants, for example, one of the most efficient methods
to compute
π
is based on a formula by Ramanujan [
24
]. More generally, new RFs could help us calculate
other constants faster, like the super-exponential convergence that was demonstrated above for
e
. Another
potential application of new RFs is for proving intrinsic properties of fundamental constants. An example
is Apéry’s proof that
ζ
(3) is irrational, done by representing it as a GCF [
25
], which led to similar proofs
for other constants.
5.3 The Universality of Fundamental Constants
Field Name Decimal Expansion
Related to Continued Fractions Lévy’s constant γ = 3.275822 . . .
Khinchin’s constant K
0
= 2.685452 . . .
Physics First Feigenbaum constant δ = 4.669201 . . .
Second Feigenbaum constant α = 2.502907 . . .
Laplace Limit λ = 0.662743 . . .
Number Theory Twin Prime constant Π
2
= 0.660161 . . .
Meissel Mertens constant M = 0.261497 . . .
Landau–Ramanujan constant Λ = 0.764223 . . .
Cominatorics Euler–Mascheroni constant γ = 0.577215 . . .
Catalan’s constant G = 0.915965 . . .
. . . . . . . . .
Table 2:
A sample of fundamental constants from different fields, which are all relevant targets for our
method, a wider list is available in [
1
] and [
5
]. For all of these, new RF conjectures will point to deep
underlying connections. There are thousands more constants for which enough numerical data exists
and our method is applicable. With further improvement on our suggested approaches, along with new
algorithms provided by the community, we expect that such expressions will be found. Note, that some
constants in the table like the Feigenbaum constants have no analytical expression what-so-ever, and so
far can only be computed using numerical simulation. Therefore, having a RF for them will reveal a
hidden truth not only about the constant, but also about the entire field to which it relates.
We have so far only provided the groundwork for a far more comprehensive study into fundamental
constants and their underlying mathematical structure. With our proposed algorithms and their extensions,
we were able to find RFs for the constants
π
,
e
, and
ζ
(3). Table 2 presents a selection of additional
fundamental constants of particular interest to our approach. For part of them, e.g. Feigenbaum constants,
9
no RF is known
. We also list a few examples of constants with intrinsic connections to the theory of
GCF. Potentially the most interesting constants for further research are the ones coming from other fields,
like number theory (not so ironically, some of them are also named after Ramanujan) and various fields of
physics. With such constants, any new RF can point to a new hidden connection between fields of science.
With further improvements and new algorithms, applied on the thousands of fundamental constants in
the literature, we expect many new RFs to be found.
References
[1]
Steven R Finch and Jet Wimp. Reviews-mathematical constants. Mathematical Intelligencer,
26(2):70–73, 2004.
[2]
Jonathan Borwein and David Bailey. Mathematics by experiment: Plausible reasoning in the 21st
century. AK Peters/CRC Press, 2008.
[3] Niels Bohr. Rydberg’s discovery of the spectral laws. 1954.
[4]
Goro Shimura. Modular forms of half integral weight. In Modular Functions of One Variable I, pages
57–74. Springer, 1973.
[5] Wolfram. Mathworld, 2019.
[6]
Joseph Frederick Scott. The mathematical work of John Wallis (1616-1703). Taylor and Francis,
1938.
[7]
Thomas J Pickett and Ann Coleman. Another continued fraction for
π
. The American Mathematical
Monthly, 115(10):930–933, 2008.
[8]
Dawei Lu, Lixin Song, and Yang Yu. Some new continued fraction approximation of euler’s constant.
Journal of Number Theory, 147:69–80, 2015.
[9]
Andrew Wiles. Modular elliptic curves and fermat’s last theorem. Annals of mathematics, 141(3):443–
551, 1995.
[10] Leonhard Euler. Introductio in analysin infinitorum, volume 2. MM Bousquet, 1748.
[11]
Marko Petkovšek, Herbert S Wilf, and Doron Zeilberger. A= b, ak peters ltd. Wellesley, MA, 30,
1996.
[12]
Kenneth I Appel and Wolfgang Haken. Every planar map is four colorable, volume 98. American
Mathematical Soc., 1989.
[13]
Warwick Tucker. The lorenz attractor exists. Comptes Rendus de l’Académie des Sciences-Series
I-Mathematics, 328(12):1197–1202, 1999.
[14] Thomas C Hales. A proof of the kepler conjecture. Annals of mathematics, pages 1065–1185, 2005.
[15]
David H Bailey, Jonathan M Borwein, and Roland Girgensohn. Experimental evaluation of euler
sums. Experimental Mathematics, 3(1):17–30, 1994.
[16]
William YC Chen, Qing-Hu Hou, and Doron Zeilberger. Automated discovery and proof of congruence
theorems for partial sums of combinatorial sequences. Journal of Difference Equations and Applications,
22(6):780–788, 2016.
[17]
Kshitij Bansal, Sarah M Loos, Markus N Rabe, Christian Szegedy, and Stewart Wilcox. Holist: An
environment for machine learning of higher-order theorem proving (extended version). arXiv preprint
arXiv:1904.03241, 2019.
10
[18]
Steve Smale. Mathematical problems for the next century. The mathematical intelligencer, 20(2):7–15,
1998.
[19]
Godfrey H Hardy, John E Littlewood, et al. Some problems of ‘partitio numerorum’; iii: On the
expression of a number as a sum of primes. Acta Mathematica, 44:1–70, 1923.
[20]
John Tate. On the conjectures of birch and swinnerton-dyer and a geometric analog. Séminaire
Bourbaki, 9(306):415–440, 1965.
[21] Edmund Landau. Vorlesungen über zahlentheorie. I. Leipzig, 1927.
[22] Bruce C Berndt. Ramanujan’s notebooks. Springer Science & Business Media, 2012.
[23]
William B Jones and WJ Thron. Survey of continued fraction methods of solving moment problems
and related topics. In Analytic theory of continued fractions, pages 4–37. Springer, 1982.
[24]
Jonathan M Borwein, Peter B Borwein, and David H Bailey. Ramanujan, modular equations, and
approximations to pi or how to compute one billion digits of pi. The American Mathematical Monthly,
96(3):201–219, 1989.
[25] Roger Apéry. Irrationalité de ζ (2) et ζ (3). Astérisque, 61(11-13):1, 1979.
11
A Additional Results by the MITM-RF Algorithm
In this section we show a sample of generalized continued fractions (GCFs) that were all found by
our MITM-RF algorithm, and are listed here in addition to the ones presented in the main text. Our
MITM-RFs algorithm was able to reproduce previously known and proven results, along with new RFs
which we present here as conjectures.
Convergence Known / New Formula Polynomials
Super-exponential new
1
e1
= 1 +
1
1+
1
1+
3
1+...
a
n
1
= 1, b
n
1
= 1
a
n
2
= 1, b
n
2
= n
known e 1 = 1 +
2
2+
3
3+
4
4+...
a
n
= 1 + n, b
n
= 1 + n
Exponential new
4
π
= 1 +
1·(32·1)
1+3·1+
2·(32·2)
1+3·2+
3·(32·3)
1+3·3+...
a
n
= 1 + 3n, b
n
= n(3 2n)
new
2
π
= 1 +
1·(12·1)
1+3·1+
2·(12·2)
1+3·2+
3·(12·3)
1+3·3+...
a
n
= 1 + 3n, b
n
= n(1 2n)
new
2
π+4
= 1 +
1·(32·1)
13·1+
2·(32·2)
13·2+
3·(32·3)
13·3+...
a
n
= 1 3n, b
n
= n(3 2n)
new
2
π+2
= 2 +
1·(12·1)
2+3+
2·(12·2)
2+3·2+
3·(12·3)
2+3·3+...
a
n
= 2 + 3n, b
n
= n(1 2n)
known
4
π
= 1 +
1
2
3+
2
2
5+
3
2
7+...
a
n
= 1 + 2n, b
n
= n
2
Polynomial known
4
π
= 1 +
1
2
2+
3
2
2+
5
2
2+...
a
n
= 2, b
n
= (2n 1)
2
known π + 3 = 6 +
1
2
6+
3
2
6+
5
2
6+...
a
n
= 6, b
n
= (2n 1)
2
known
2
2π
= 3
2·3
1
1·2
3
4·5
1...
a
n
1
= 3, b
n
1
= 2n(2n + 1)
a
n
2
= 1, b
n
2
= n(n 1)
known
6
π
2
6
= 1 +
1
2
1+
1·2
1+
2
2
1+...
a
n
1
= 1, b
n
1
=
(n + 1)
2
2
a
n
2
= 1, b
n
2
=
n
n
2
+ 1
2
Table 3: (Conjectures 5-9)
Sample of automatically generated conjectures for mathematical formulas
of fundamental constants, as generated by our proposed Ramanujan Machine by applying the Meet-In-
The-Middle Regular Formula (MITM-RF) algorithm. We mark which results are previously unknown
conjectures to the best of our knowledge. For each RF we provide the convergence rates and polynomials.
B Structured MITM-RF
From previously known representations for Apéry’s constant as infinite sums, and by deriving GCFs from
infinite sums using Euler’s continued fraction identity, we found formulas for Apéry’s constant, and noted
that they are commonly constructed with high degree polynomials as partial numerators and partial
denominators (with the
b
n
polynomial having double the degree of the
a
n
polynomial). Yet, they can be
transformed into sparse polynomials. Then, by enumerating only on sparse integer polynomials in the
MITM algorithm, we were able to find the RFs for Apéry’s constant (Table 4).
12
Formula Polynomials
1
ζ(3)
= 0
3
+ 1
3
1
6
1
3
+2
3
2
2
2
3
+3
3
3
6
3
3
+4
3
...
a
n
= n
3
+ (n + 1)
3
b
n
= n
6
5
2ζ(3)
= 2 +
2·1
5
·1
2+1·3·7+
2·2
5
·3
2+1·4·10+
2·3
5
·5
2+1·5·13+...
a
n
= 2 + n(2 + n)(4 + 3n)
b
n
= 2n
5
(2n 1)
Table 4:
Two Apéry GCF and their polynomials, matching formulas 1 & 2 in the main text. The top
converges with polynomial rate, and the bottom converges with exponential rate.
C GCF Convergence Rate
The method detailed in Section 3 requires estimating the expected accuracy from finite approximation of
GCFs. In this section we characterize the convergence rate of the GCFs, as well as a trick that improves
this convergence rate for the exponential case.
For two sets of numbers
{a
n
}
n=0
, {b
n
}
n=1
we define the generalized continued fraction (GCF)
generated by them as
[a
0
; (b
1
, a
1
) , (b
2
, a
2
) , . . .]
:
= a
0
+
b
1
a
1
+
b
2
a
2
+...
and the partial GCF as
η
n
:
= [a
0
; (b
1
, a
1
) , (b
2
, a
2
) , . . . , (b
n
, a
n
)]
if the limit exists, we define:
η
:
= lim
n→∞
η
n
We also define the tail:
τ
n
:
= [a
n
; (b
n+1
, a
n+1
) , (b
n+2
, a
n+2
) , . . .]
From there it follows that:
η =
p(τ
n
)
q(τ
n
)
where
p, q
are polynomials of degree
n
1 whose coefficients depend on
{a
i
}
n1
i=0
, {b
i
}
n
i=1
. Specifically, for
a
i
, b
i
Z we have p, q Z[x].
It was shown (Jones & Thron, 1982) that the partial GCF
η
n
can be computed as a series of
Matrix-Vector multiplications:
p
0
p
1
:
=
a
0
1
q
0
q
1
:
=
1
0
p
n+1
q
n+1
p
n
q
n
:
=
a
n
b
n
1 0
p
n
q
n
p
n1
q
n1
p
n+1
= a
n
p
n
+ b
n
p
n1
q
n+1
= a
n
q
n
+ b
n
q
n1
13
From which η
n
can be calculated like so:
η
n
=
p
n
q
n
In the following sections we discuss GCFs with integer polynomials for a
n
and b
n
:
a (x) , b (x) Z [x]
a
n
= a (n)
b
n
= b (n)
which we will abbreviate as GCFPs.
C.1 GCFP Error Bound
Taking the determinant of the above linear equation, we can deduce the following expression for the
matrix determinant:
p
n+1
q
n
q
n+1
p
n
= (1)
n
n
Y
i=1
b
i
η
n+1
η
n
= (1)
n
Q
n
i=1
b
i
q
n+1
q
n
If b
i
has constant sign for all i > k from some k, we get a the following Leibniz series:
X
i=k
η
n+1
η
n
Since:
η
n+1
= η
k
+
X
i=k
η
i+1
η
i
= η
k
+
X
i=k
(1)
n
Q
n
i=1
b
i
q
n+1
q
n
the following relation is achieved:
n k η
2n+κ
lim
n→∞
η
2n+κ
lim
n→∞
η
2n+κ1
η
2n+κ1
where κ {0, 1}, depending on the sign of
Q
k
i=1
b
i
and k mod 2. Hence we get that η and
|η η
n
| ≤
Q
n
i=1
b
i
q
n+1
q
n
C.2 1-Periodic GCF
A GCFP is called k-periodic if n N a
n
= a
n+k
, b
n
= b
n+k
. A 1-periodic GCFP is one of the form:
a +
b
a +
b
a+...
hence for (w
n
) = (p
n
) or (w
n
) = (q
n
) :
w
n+1
w
n
:
=
a b
1 0
| {z }
promoter
matrix
w
n
w
n1
14
For a
2
> 4b, we find that the promoter matrix is real-diagonalizable
eigvals
a b
1 0
=
(
a ±
a
2
+ 4b
2
)
= {λ
±
}
eigvecs
a b
1 0
=

λ
±
1

= {v
±
}
w
0
w
1
= κ
+
v
+
+ κ
v
w
n
w
n1
= λ
n
+
κ
+
v
+
+ λ
n
κ
v
And from there, the decomposition for p, q is:
p
0
p
1
=
a
0
1
=
λ
+
a
2
+ 4b
v
+
λ
a
2
+ 4b
v
q
0
q
1
=
1
0
=
v
+
v
a
2
4b
Thus
p
n1
q
n1
=
1
a
2
+4b
λ
n+1
+
λ
n+1
1
a
2
+4b
λ
n
+
λ
n
=
λ
n+1
+
λ
n+1
λ
n
+
λ
n
For a > 0 we get that |λ
+
| > |λ
| 0, hence:
p
n1
q
n1
= λ
+
1
λ
λ
+
n+1
1
λ
λ
+
n
lim
n→∞
η
n
= λ
+
While in the case a < 0 we have |λ
| > |λ
+
| 0, which in turn results in:
lim
n→∞
η
n
= λ
Note that if we knew
lim
n→∞
η
n
then:
η
=
a
+
b
η
, yielding a quadratic equation with the same
results.
C.3 Types Of Convergence
Not every continued fraction converges. In the case it does, its rate of convergence is either: exponential,
super-exponential, or sub-exponential (which seems to be at a polynomial rate, however it is yet to be
proven). When the continued fraction does not converge, it may oscillate between a set of values or
“converge” to a discrete oscillating cycle, meaning that for a
k
-oscillation with values
{o
i
}
k1
i=0
, we have
lim
n→∞
|η
n
o
n
| = 0.
15
In the following parts, we analyze the GCFP behaviour with regard to its defining polynomials
a, b
.
We’ll use the following notation:
d
a
:
= deg (a)
d
b
:
= deg (b)
a (x) =
d
a
X
j=1
α
j
x
j
b (x) =
d
b
X
i=0
β
j
x
i
For an easier analysis of the GCFP behavior, we use the equivalence transformation and define its
semi-canonical form as
3
:
n N c
n
:
=
b
n
a
n1
a
n
a
0
1 +
b
1
a
0
a
1
1 +
b
2
a
1
a
2
1+...
=
:
[a
0
; (c
1
, c
2
, . . .)]
From there it follows:
η
n
= [a
0
; (c
1
, . . . , c
n
)]
Unless stated otherwise we’ll regard only the main part of the above GCFP:
1 +
b
1
a
0
a
1
1 +
b
2
a
1
a
2
1+...
= 1 +
c
1
1 +
c
2
1+...
We now recognize 3 distinct cases. In the first case, denoted as the exponential case we have:
d
b
= 2d
a
lim
n→∞
c
n
=
β
d
b
α
2
d
a
The second case, denoted as the super-exponential case we have:
d
b
< 2d
a
lim
n→∞
c
n
= 0
And finally, the third case, denoted as the sub-exponential or the polynomial case we have:
d
b
> 2d
a
lim
n→∞
c
n
= sign (β
d
b
) ·
For all cases, from some point,
c
n
β
d
b
α
2
d
a
n
d
b
2d
a
, meaning that
sign (c
n
)
=
sign (β
d
b
)
. Thus
|q
n
|
=
|q
n1
+ c
n
q
n2
| is monotonically increasing.
Based on the observation that
η
is a rational function of
τ
n
, it’s enough to show that the above claims
for the convergence rate apply for the tail τ
n
for some n.
3
This is well defined, as we’re examining the tail’s behavior. Therefore neglect n’s for which a
n
= 0, as they are finite.
16
C.3.1 Exponential
From some point
c
n
β
d
b
α
2
d
a
=
:
c
, therfore
q
n
increases as a generalized Fibonacci series. Specifically, it
does not change sign. Therefore we can now refer to the 1-periodic case, as equivalent results can be
derived here similarly to Section C.2, since:
q
n+1
q
n
=
1 c
1 0
nk
q
k
q
k1
So with the same condition on the determinant, which here translate to:
4c + 1 > 0
m
4β
d
b
> α
2
d
a
We get that the GCFP converges to
λ
+
(WLOG, we assume that
|λ
+
| > |λ
|.
The
λ
case is similar),
and therefore:
η
n
κλ
+
1
λ
λ
+
n+1
1
λ
λ
+
n
where κ is a constant arising from the point k at which we assume c
i
c. We then receive:
|η η
n
| ≈ κλ
+
1
1
λ
λ
+
n+1
1
λ
λ
+
n
= κλ
+
λ
λ
+
n
1 +
λ
λ
+
1
λ
λ
+
n
κλ
+
1 +
λ
λ
+
λ
λ
+
n
Since |λ
| < |λ
+
| we get an exponentially decreasing value, hence:
|η η
n
|
Q
n
i=1
c
i
|q
n+1
q
n
|
κ
3
4c
4c + 2 + 2
4c + 1
(nk)
|η η
n
| κ
3
exp (κ
4
n)
Notice that this result is dependent on
κ
, which makes it an upper bound for the error rather then
the exact error (who’s calculation is equivalent to the calculation of
η
). With a more careful calculation
the exponent parameters can be found to get a tighter bound on the error. In sake of brevity, these
calculations are omitted.
C.3.2 Super-exponential
Again we assume WLOG that β
d
b
> 0. Threfore for all n k for some k :
c
n
β
d
b
α
2
d
a
n
<0
z }| {
d
b
2d
a
17
Also, since n > k : c
n
> 0. From there we get that n > k : q
n
> q
k
. Which in turn results in:
|η η
n
| ≤
Q
n
i=1
c
i
q
n+1
q
n
Q
n
i=1
c
i
q
2
k
κ
1
β
d
b
α
2
d
a
nk
n!
k!
d
b
2d
a
= κ
2
β
d
b
α
2
d
a
nk
(n!)
2d
a
d
b
Since
exp(n)
n!
is decreasing super-exponentially, the desired result is obtained.
C.3.3 Sub-Exponential
The case satisfying the determinant constraint 4
β
d
b
> α
2
d
a
can be seen as a limit of the exponential
convergence case with
c
, therefore the derived convergence is sub-exponential. We believe this
sub-exponential convergence to be polynomial.
C.3.4 Tail Estimation
In the case of an exponentially converging GCFP, we found that from some point the tail is approximately:
1 +
c
1 +
c
1+...
We calculated the convergence value of 1-periodic GCFs like this earlier. Therefore, we can improve a
GCFP calculation by substituting this tail at the final step. The accuracy improvement wasn’t analyzed,
but empiric results display an improvement of a fixed number of digits (for any large
n
). This in turn
allowed us to improve the complexity of the MITM-RF algorithm.
D Collaborative Algorithm-Enhanced Mathematics
The Ramanujan Machine in its most general sense can be seen as a methodology to generate conjectures on
fundamental constants. The more computational power and the more time the algorithm runs on a selected
space of parameters, the more conjectures it may generate. Moreover, since the Ramanujan Machine
produces conjectures on fundamental constants but not their proofs, we realize that computational power
as well as proving power (i.e. time spent by an intelligent being trying to prove or refute a conjecture) are
key assets for making the Ramanujan Machine more prolific. It is the goal of this section to discuss how
one may leverage these facts about the Ramanujan Machine methodology to inspire the wider community
about mathematics and number theory.
We created the Ramanujan Machine as an open source project that is fully available to the community
on
www.RamanujanMachine.com
. Soon, with our ongoing development, individuals around the world
would be able to donate their computational power to the mission of discovering new mathematical
structures and mathematical equations by downloading the Ramanujan Machine screen saver on the
website. Similarly to SETI (Search for Extraterrestrial Intelligence), we plan to have the Ramanujan
Machine screen saver distribute via BOINC the various computational tasks to every computer in the
network, so when a computer is idle, the Ramanujan Machine is initiated.
We believe this methodology can inspire the greater community about mathematics. In order to
achieve this goal, the site
www.RamanujanMachine.com
includes an up-to-date record of some (and in a
short time every) conjecture generated by the machine. When a specific computer in the network discovers
a new conjecture, the owner of the laptop will receive the credit for contributing his or her computer
power to discover the conjecture and the credit is maintained in a leadership board. Since the Ramanujan
Machine is a conjecture-generating machine (similarly to much of the work of Ramanujan himself), we let
18
the community suggest proofs for each conjecture, thus honorably claiming affiliation to Ramanujan’s
legacy, and introducing an algorithm-enhanced approach for collaborative research.
It is important to emphasize that the methodology introduced in this work can be expanded far beyond
continued fractions, number theory or mathematics. The Ramanujan Machine is an example of a broader
methodology that has three core elements in its pipeline, as shown in the main text (Fig. 1).
E Further Information about the Descent&Repel Method and
Results
This section provides an additional example of the Descent&Repel optimization process (in Fig. 5), in
addition to providing Table 5 with further information about the process presented in the main text (in
Fig. 4). The parameters chosen for Fig. 4 illustrate the optimization steps relatively clearly, however
without converging to any real solution. In Fig. 5, we present a similar illustration (Fig. 5), presenting
the convergence to e = 3 +
1
4+
2
5+
3
6+...
.
Figure 5:
Descent&Repel illustration, as in Fig. 4. Here the showcased scenario is of that of the
restoration of our previous result (new, found by the MITM-GCF algorithm)
e
= 3 +
1
4+
2
5+
3
6+...
. The
converging point is the one at (x, y) = (4, 1).
Below are the parameters required to reproduce the results in Fig. 4 and Fig. 5.
19
Fig. Number Parameters Values
Fig. 4 a(n) n
b(n) n
2
+ ny + x
x range [20, 20]
y range [20, 20]
fraction depth 10
constant π
initial points 600, uniform at y [15, 15] and x = 10
Fig. 5 a(n) n + x
b(n) n + y
x range [30, 20]
y range [40, 10]
fraction depth 20
constant e 3
initial points 500, uniform at y = 20 and x [10, 5]
Table 5:
Execution settings required to reproduce the Descent&Repel maps (Fig. 4, Fig. 5). Here,
a, b
are similar to the polynomials α, β which define the GCF, but the RF is of the form:
b
0
a
0
+
b
1
a
1
+...
.
20

Discussion

One can read more about generalized continued fractions, including Euler’s work on them, on this Wiki page: [Generalized continued fraction](https://www.wikiwand.com/en/Generalized_continued_fraction). Euler's equation is often called the most beautiful equation in mathematics, as it relates what are probably the five most ubiquitous numbers in mathematics: e, pi, the imaginary number i, 1 (the multiplication identity) and 0 (the additive identity). It turns out that the Golden ratio can also be written as another infinite sequence of 1’s, namely $\varphi = \sqrt{1+\sqrt{1+\sqrt{1+\sqrt{1+\dots}}}}$. Gradient descent (a.k.a steepest descent) is one of the best-known methods to find a minimum or a maximum of a function. The algorithm travels along the gradient of a function (direction of steepest slope) in order to reach its local extremum. To see the hydrogen spectral lines Rydberg was observing, see this link: [Hydrogen Spectral Series](https://www.wikiwand.com/en/Hydrogen_spectral_series). Fermat’s last theorem (a.k.a Fermat’s conjecture) states that no three positive integers $a$, $b$ and $c$ can satisfy the equation $a^n+b^n=c^n$ for any integer $c>2$. It was scribbled as a side note in one of Fermat’s notebooks and to this day it is unknown whether Fermat just conjectured it or actually proved it himself, although most mathematicians believe the former is true. After 358 years of many efforts by mathematicians, Andrew Wiles proved it by a tour de force that involves elliptic curves and modularity.

 Oh yeah, Fermat’s library is named after the same Fermat who discovered this famous theorem! =)
 Read more about this algorithm here: [Meet in the middle](https://www.geeksforgeeks.org/meet-in-the-middle/) Since these are new results (conjectures) if you are able to prove any of them that would be a new proof of a mathematical identity, some of which are as aesthetic and beautiful as Euler’s continued fractions! How can one resist?
 =) This is probably one of the most important comments in this work and what makes it fundamentally unique. In this work, the machine is discovering new mathematics, not just proving it!
 Another cool example of such a collaborative project is SETI, search for extraterrestrial intelligence, which monitors electromagnetic radiation for signs of transmissions from civilizations on other planets. The idea here is that just like mathematicians study a lot of formulas and mathematical theories and start developing intuition to what may hold true or not, perhaps by leveraging machine learning computer algorithms can also use the existing knowledge base to discover new mathematics. This figure shows one of the most powerful outcomes of the result. The formulas discovered by the Ramanujan Machine converge extremely fast to most constants. This means they provide a better way to calculate the constants than most traditional methods. Apery’s constant is the value of the Riemann zeta function at zeta=3. It is considered to be of fundamental importance because it arises naturally in physics, specifically in Quantum Electrodynamics (QED), solid state physics and of the Debye model and blackbody radiation. Super-exponential means that it grows even faster than an exponential function. There are a lot of algorithms that can prove mathematical identities once they are discovered. This paper is not trying to prove formulas, instead, it finds new mathematical conjectures and relationship that were previously unknown to mathematicians. It can be shown that the digits of an irrational number do not terminate and do not repeat. In other words, they do not contain an infinitely repeating pattern in them. There are so many important constants in mathematics and other fields of science and the codebase available for this project can help find many new formulas. You can find the code on [RamanujanMachine.com](https://www.RamanujanMachine.com). We highly encourage you to run it yourself, so you can discover new math! The Riemann hypothesis (a.k.a Riemann conjecture) is probably considered the most famous open problem in mathematics. The Riemann conjecture states that all the zeros of the Riemann Zeta function has its zeros only at the negative even integers and complex numbers with real part $\frac{1}{2}$. If it's indeed true, as most believe, it would have great ramifications on number theory (as the Riemann Zeta function is related to prime numbers). The Clay Mathematic Institute offers a $1,000,000 prize to whoever solves it. [Riemann Hypothesis](https://www.claymath.org/millennium-problems/riemann-hypothesis) We believe that sometimes the road can be just as important and illuminating as the destination. The accumulating work to prove the Riemann Hypothesis resulted in numerous new findings in Number Theory alongside progress towards the hypothesis itself. Same goes here, providing proofs for the results shown in our paper may discover hidden truths about the fundamental constants themselves in addition to solving an open conjecture. There are many representations of $e$ in this link: [List of representations of e](https://www.wikiwand.com/en/List_of_representations_of_e
) Other well-known continued fractions are available in this link: [Continued fraction](https://www.wikiwand.com/en/Continued_fraction#/Continued_fraction_expansions_of_%CF%80)