The choice of n
1
and n
2
is done by a form of exponential backoff algorithm. Most of our continued
fractions converge at least at a polynomial rate, with the slowest convergence rate thus being ϵ
n
∼ 1/n.
Such a convergence rate implies that some continued fractions will present the same incorrect digits over a
wide range of terms. This phenomenon makes it more challenging to distinguish digits that match the limit
from those that do not, and as a result makes evaluating K harder. To obtain a good approximation of
K, we want a significant difference between PCF
n
1
and PCF
n
2
, since this provides higher confidence in the
accuracy of the digits of PCF
n
1
that are still equal to PCF
n
2
, and therefore a tighter bound on K.
We sample the continued fraction at depths n
1
, n
2
, ..., choosing n
l+1
−n
l
∝ l, which matches the sampling
rate to the convergence rate in the slowest case of ϵ
n
∼ 1/n. This sampling rate makes the calculated decimal
approximation satisfies the requirement for substantial difference between subsequent PCF
n
l
, resulting in a
tight bound of K. As a result of this process, we attain a decimal approximation of the continued fraction’s
limit, along with the number of digits we refer to as “correct”.
Identifying over-fitted formulas discovered by PSLQ
To match a linear fractional transformation (i.e., a Mobïus transform) of a constant to a decimal
approximation, we utilize PSLQ to find integer coefficients ⃗a = (a
0
, a
1
, a
2
, a
3
) that satisfy the equation
a
0
+a
1
η
a
2
+a
3
η
= PCF
n
for a given constant η up to a given precision. Using the method described in the previous
section, we employ finite approximations of the continued fraction to evaluate the first K correct digits of
the limit. Given the approximation error of the continued fraction (∼ 10
−K
), we require PSLQ to find ⃗a
that satisfies the formula to the same accuracy.
The structure of linear fractional transformations is highly expressive. By using large enough coefficients
⃗a, the algorithm can always find a formula that matches any constant to any continued fraction up to any
finite precision. We should thus distinguish between true results and false-positives. A true result implies
that the formula will remain accurate when extended to more digits. A false-positive result arises from
over-fitting. For example, the formula
314157+e
100000
= π is correct up to 10
−5
, but is obviously false.
To distinguish over-fitted results, we consider that a decimal approximation using K digits can represent
10
K
different numbers. This amount sets a limit on the size of the PSLQ coefficients ⃗a. If the total number
of digits in the ⃗a found by PSLQ is also K, it indicates that the digits alone can already express 10
K
different numbers. Thus, the match is most-likely a coincidence, indicating a false-positive. Conversely, if
the collective number of digits in ⃗a is significantly smaller then K, it means that the result is more likely
true, as the probability for an accidental match is very small. Specifically, to identify false results, we use
the difference between the number of digits given by the decimal approximation K and the total number of
digits l in the vector ⃗a of PSLQ. A larger value of K − l implies high confidence in the formula (with the
chance of an accidental match scaling as ∼ 10
l−k
), while K ≈ l or K < l lead to disqualifying the conjecture.
This condition can be explained intuitively as expressing the same information of K digits using less
data. Unlike the decimal approximation, which conveys information solely through its digits, the formula
uses the mathematical constant for extra representation capability (added to the digits of ⃗a that define the
transformation). The mathematical constant thus enables a more efficient representation of the information.
In other words, true results can be understood as a means of data compression of the numerically computed
digits using the mathematical constant to provide a compact form. Consider for example the special case
where the limit of the continued fraction is exactly the constant itself. Then, the linear fractional trans-
formation does not need to provide any digits (⃗a = (0, 1, 1, 0)), hence we say that the representation is
maximally compressed - all the information is in the constant. The smaller the coefficients of the linear
fractional transformation, the closer it is to maximal compression.
C Selected families of continued fractions found by the Distributed
Factorial Reduction algorithm
In this section, we show a collection of infinite families of continued fractions discovered by the Distributed
Factorial Reduction algorithm. Some of the families presented in this section are used in Appendix D to
create conservative matrix fields for the corresponding constants.
24