
Statistical Science

2006, Vol. 21, No. 3, 400–403

DOI: 10.1214/088342306000000312

© Institute of Mathematical Statistics, 2006

Isaac Newton as a Probabilist

Stephen M. Stigler

Abstract. In 1693, Isaac Newton answered a query from Samuel Pepys about a problem involving dice. Newton’s analysis is discussed and attention is drawn to an error he made.

On November 22, 1693, Samuel Pepys wrote a letter to Isaac Newton posing a problem in probability. Newton responded with three letters, first answering the question briefly, and then offering more information as Pepys pressed for clarification. Pepys (1633–1703) is best known today for his posthumously published diary covering the intimate details of his life over the years 1660–1669, but Newton would not have been aware of that diary. He would instead have known of Pepys as a former Secretary of Admiralty Affairs who had served as President of the Royal Society of London from 1684 through November 30, 1686, the same period when Newton’s great Principia was presented to the Royal Society and its preparation for the press begun. But Pepys’ letter did not concern scientific matters. He sought advice on the wisdom of a gamble.

1. PEPYS’ PROBLEM

The three letters Newton wrote to Pepys on this problem, on November 26 and December 16 and 23, 1693, are almost all we have bearing on Newton and probability. Some of the letters were published with other private correspondence in Pepys (1825, Vol. 2, pages 129–135; 1876–1879, Vol. 6, pages 177–181) and more completely in Pepys (1926, Vol. 1, pages 72–94). The letters were cited in a textbook by Chrystal (1889, page 563), where he gave Pepys’ problem as an exercise, but they were little known until they were brought to a wide public attention when selections were reprinted with commentary independently by Dan Pedoe (1958, pages 43–48), Florence David (1959; 1962, pages 125–129) and Emil D. Schell (1960). These authors and several others, notably Chaundy and Bullard (1960), Mosteller (1965, pages 6, 33–35) and Gani (1982) have discussed the problem Pepys posed and Newton’s solution. Others accorded it briefer notice, including Sheynin (1971), who dismissively relegated it to a footnote; Westfall (1980, pages 498–499), who gave unwarranted credence to the excuse Pepys opened his first letter with, that the problem had some connection to a state lottery; and Gjertsen (1986, pages 427–428). But none of these or any other writer seems to have noted that a major portion of Newton’s solution is wrong. The error casts an interesting light on how Newton thought about the matter, and it seems useful to revisit the question.

Stephen M. Stigler is Ernest Dewitt Burton Distinguished Service Professor of Statistics, Department of Statistics, University of Chicago, Chicago, Illinois 60637, USA (e-mail: stigler@galton.uchicago.edu).

Since Pepys’ original statement was, as Newton noticed, somewhat ambiguous, I will state the problem in paraphrase as it emerged in the correspondence:

Which of the following three propositions has the greatest chance of success?

A. Six fair dice are tossed independently and at least one “6” appears.

B. Twelve fair dice are tossed independently and at least two “6”s appear.

C. Eighteen fair dice are tossed independently and at least three “6”s appear.

As it emerged in the correspondence, Pepys initially thought that the third of these (C) was the most probable, but when Newton convinced him after repeated questioning by Pepys that in fact A was the most probable, Pepys ended the correspondence and announced he would, using Mosteller’s (1965, page 35) colorful later term, welsh on a bet he had made.

2. NEWTON’S SOLUTION

Newton stated the solution three times during the correspondence: first he gave a simple logical reason for concluding that A is the most probable, then he reported a detailed exact enumeration of the chances in each of the three cases, and finally he returned to the logical argument and gave it in more detail.

Newton’s exact enumeration was elegant and flawless; it is equivalent to the solution as might be presented in an elementary class today. Newton worked from first principles assuming no knowledge of the binomial distribution; we can now express what he found by this calculation in terms of a random variable X with a Binomial (N, p) distribution as follows:

A. P(X ≥ 1) = 31031/46656 = 0.665 when N = 6 and p = 1/6.

B. P(X ≥ 2) = 1346704211/2176782336 = 0.619 when N = 12 and p = 1/6.

C. Here Newton simply stated that, “In the third case the value will be found still less.”

In fact,

P(X ≥ 3) = 60666401980916/101559956668416 = 0.597

when N = 18 and p = 1/6, as another of Pepys’ correspondents (a Mr. George Tollet) found after much labor, while trying to duplicate Newton’s results (Pepys, 1926, Vol. 1, pages 92–94).
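
The three exact values can be reproduced with rational arithmetic. The sketch below is only an illustration, not Newton's own enumeration: the helper name `tail_probability` is hypothetical, and it simply sums the binomial terms for the complement event. Note that Python's `Fraction` reduces to lowest terms, so a printed fraction can look different from the one quoted above while being equal to it.

```python
from fractions import Fraction
from math import comb


def tail_probability(n_dice, k_min, p=Fraction(1, 6)):
    """Exact P(X >= k_min) for X ~ Binomial(n_dice, p), via the complement."""
    below = sum(
        comb(n_dice, k) * p**k * (1 - p) ** (n_dice - k) for k in range(k_min)
    )
    return 1 - below


# Propositions A, B, C: at least m sixes with 6m fair dice, for m = 1, 2, 3.
exact = {name: tail_probability(6 * m, m) for name, m in (("A", 1), ("B", 2), ("C", 3))}

for name, value in exact.items():
    print(f"{name}: {value} ~ {float(value):.3f}")
```
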

Pepys had originally thought that C was the most probable; Newton’s logical arguments and his careful enumeration of chances pointed in the contrary direction. But while the conclusion Newton reached is correct, only the enumeration stands up under scrutiny. To understand why, it will help to develop a heuristic understanding of why A is the most probable.

3. A HEURISTIC VIEW

Pepys’ problem amounts to a comparison of three Binomial (N, p) distributions with p = 1/6, namely those with N = 6, 12 and 18. He desired a ranking of P(X ≥ Np) for the three cases. Now, in all Binomial distributions where the mean Np is an integer, Np is also the median of the distribution (and indeed the mode as well). This is always true, surprisingly even in cases like those under study here, where the distributions are quite skewed and asymmetric. This is a byproduct of a proof that for any N and any p, the difference between the mean and median of a binomial distribution is strictly less than ln(2) < 0.7 (Hamza, 1995). So when the mean Np is an integer the two must agree, and this implies in particular that in all these cases,

P(X ≥ Np) ≥ 1/2 and P(X ≤ Np) ≥ 1/2,

and so in each case P(X ≥ Np) exceeds 1/2 by a fraction of the probability P(X = Np). In fact, in the cases Pepys considered we have to a fair approximation P(X ≥ Np) ≈ 1/2 + (0.4)P(X = Np). The ranking Newton calculated then reflects the fact that the size of the modal probability for a binomial distribution, P(X = Np), decreases as N increases and the distribution spreads out, p being held constant. Indeed, as De Moivre would find by the 1730s, P(X = Np) is well approximated by 1/√(2πNp(1 − p)) ≈ 1.07/√N when p = 1/6. So in particular, the probabilities in A, B, C are about 1/2 + (0.4)(1.07)/√N, an approximation that would give values 0.67, 0.62, 0.60, which agree with the exact values to two places. Chaundy and Bullard (1960) provide a cumbersome rigorous proof that this sequence is decreasing, in some generality.
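
The heuristic can be checked numerically. The following sketch, with hypothetical function names, compares the exact modal probability P(X = Np) with De Moivre's approximation and then evaluates 1/2 + 0.4/√(2πNp(1 − p)) for N = 6, 12, 18. The factor 0.4 is taken from the text as given; the code does not derive it.

```python
from math import comb, pi, sqrt

P_SIX = 1 / 6


def modal_probability(n_dice, p=P_SIX):
    """Exact P(X = Np) for X ~ Binomial(n_dice, p); assumes Np is an integer."""
    k = round(n_dice * p)
    return comb(n_dice, k) * p**k * (1 - p) ** (n_dice - k)


def de_moivre_mode(n_dice, p=P_SIX):
    """De Moivre's approximation to the modal probability."""
    return 1 / sqrt(2 * pi * n_dice * p * (1 - p))


def heuristic_tail(n_dice, p=P_SIX):
    """The text's rule of thumb: P(X >= Np) ~ 1/2 + 0.4 * P(X = Np)."""
    return 0.5 + 0.4 * de_moivre_mode(n_dice, p)


for n in (6, 12, 18):
    print(
        f"N={n:2d}  exact mode={modal_probability(n):.4f}  "
        f"De Moivre={de_moivre_mode(n):.4f}  heuristic tail={heuristic_tail(n):.2f}"
    )
```
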

Note that this approximation depends crucially upon the probabilities P(X ≥ 1), P(X ≥ 2) and P(X ≥ 3) of A, B, C being P(X ≥ Np) [i.e. P(X ≥ E(X))] for the three respective distributions, and the result depends upon this as well. Franklin B. Evans observed this sensitivity already in 1961, finding, for example, that P(X ≥ 1 | N = 6, p = 1/4) = 0.8220 < P(X ≥ 2 | N = 12, p = 1/4) = 0.8416 (Evans, 1961). That is, the ordering of A and B that Newton found for fair dice can fail for weighted dice, and indeed will tend to fail when p is sufficiently greater than 1/6, even though they be tossed fairly and independently.
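
Evans's counterexample is easy to reproduce. This is a minimal sketch with a hypothetical helper `tail`; it evaluates propositions A and B for a fair die (p = 1/6) and for the weighted case p = 1/4 quoted above.

```python
from math import comb


def tail(n, k_min, p):
    """P(X >= k_min) for X ~ Binomial(n, p), via the complement."""
    return 1 - sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min))


for p in (1 / 6, 1 / 4):
    a, b = tail(6, 1, p), tail(12, 2, p)
    print(f"p={p:.4f}  P(A)={a:.4f}  P(B)={b:.4f}  A beats B: {a > b}")
```
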

4. NEWTON’S LOGICAL ARGUMENT

In his first letter to Pepys on November 26, 1693, Newton had been content to give a short logical argument for why the chance of A must be the largest. He dissected the problem carefully, and made it clear that the proposition required that in each case at least the given number of “6”s should be thrown. Newton then restated the question and gave an apparently clear argument as to why the chance for A had to be the largest:

“What is the expectation or hope of A to throw every time one six at least with six dyes?

“What is the expectation or hope of B to throw every time two sixes at least with twelve dyes?

“What is the expectation or hope of C to throw every time three sixes at least with 18 dyes?

“And whether has not B and C as great an expectation or hope to hit every time what they throw for as A hath to hit his what he throws for?

“If the question be thus stated, it appears by an easy computation that the expectation of A is greater than that of B or C; that is, the task of A is the easiest. And the reason is because A has all the chances of sixes on his dyes for his expectation, but B and C have not all the chances on theirs. For when B throws a single six or C but one or two sixes, they miss of their expectations.” (Pepys, 1926, Vol. 1, 75–76; Schell, 1960)

Newton’s conclusion was of course correct but the argument is not. It is easy for us to see that it cannot work because the argument applies equally well for weighted dice, and as we now know, the conclusion fails if, for example, p is 1/4. Any correct argument must explicitly use the fact that 1, 2, 3 are the expectations for A, B, C, and Newton’s does not. His enumeration did do so, but A would equally well have “all the chances of sixes on his dyes” even if the chance of a “6” is 1/4. Newton’s proof refers only to the sample space and makes no use of the probabilities of different outcomes other than that the dice are thrown independently, and so it must fail. But Newton does casually use the word “expectations”; might he not have had something deeper in mind? His subsequent correspondence confirms that he did not.

In his third letter of December 23, 1693, Newton returned to this argument and expanded slightly on it. He personified the choices by naming the player faced with bet A “Peter” and the player faced with bet B “James.” He then considered a “throw” to be six dice tossed at once, so then Peter was to make (at least) one “6” in a throw, while James was to make (at least) two “6”s in two throws.

Newton then wrote, “As the wager is stated, Peter must win as often as he throws a six [i.e., makes at least one “6” among the six dice], but James may often throw a six and yet win nothing, because he can never win upon one six alone. If Peter flings a six (for instance) four times in eight throws, he must certainly win four times, but James upon equal luck may throw a six eight times in sixteen throws and yet win nothing. For as the question in the wager is stated, he wins not upon every single throw with a six as Peter doth, but only upon every two throws wherein he throws at least two sixes. And therefore if he flings but one six in the two first throws, and one in the two next, and but one in the two next, and so on to sixteen throws, he wins nothing at all, though he throws a six twice as often as Peter doth, and by consequence have equal luck with Peter upon the dyes.” (Pepys, 1926, Vol. 1, page 89; Schell, 1960)

Here we can see more clearly how Newton was led astray: Even though in the first letter he had carefully pointed out that “throwing a six” must be read as “throwing at least one six,” here he confused the two statements. His argument might work if “exactly one six” were understood, but then it would not correspond to the problem as he and Pepys had agreed it should be understood. Indeed, Peter will not necessarily register a gain with every “6”: if he has two or more in the first “throw” of six dice, he wins the same as with just one. Newton reduced the problem to single “throws” where each throw is a Binomial (N = 6, p = 1/6), and he lost sight of the multiplicity of outcomes that could lead to a win. Many of Peter’s wins (those with at least two “6”s, which occurs in about 40% of the wins) would be wins for James as well. And in some of James’s wins (those with at least two “6”s in one-half of tosses and none in the other half, about 28% of James’s wins) Peter would not have done so well on “equal luck” (he would have won but half the time). Evidently to make Newton’s argument correct would take as much work as an enumeration!
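
The two percentages quoted above can be recovered from the binomial distribution. The sketch below assumes one reading of the text: Peter throws one group of six dice, James throws two independent groups of six, and the second figure counts James's wins in which one group holds two or more sixes while the other holds none. The helper names are hypothetical.

```python
from fractions import Fraction
from math import comb

P = Fraction(1, 6)


def pmf(n, k):
    """P(X = k) for X ~ Binomial(n, 1/6)."""
    return comb(n, k) * P**k * (1 - P) ** (n - k)


def tail(n, k_min):
    """P(X >= k_min) for X ~ Binomial(n, 1/6)."""
    return 1 - sum(pmf(n, k) for k in range(k_min))


peter_wins = tail(6, 1)   # at least one six in six dice
james_wins = tail(12, 2)  # at least two sixes in twelve dice

# Share of Peter's wins that contain two or more sixes.
peter_multi_share = tail(6, 2) / peter_wins

# Share of James's wins with >= 2 sixes in one half and none in the other.
one_sided = 2 * tail(6, 2) * pmf(6, 0)
james_one_sided_share = one_sided / james_wins

print(f"Peter's wins with two or more sixes: {float(peter_multi_share):.1%}")
print(f"James's one-sided wins:              {float(james_one_sided_share):.1%}")
```
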

5. CONCLUSION

Newton’s logical argument failed, but modern probabilists should admire the spirit of the attempt. It was a simple appeal to dominance, a claim that all sequences of outcomes will favor Peter at least as often as they will favor James. It had to fail because the truth of the proposition depends upon the probability measure assigned to the sequences and the argument did not. But this was 1693, when probability was in its infancy.

Why has apparently no one commented upon this error before? There are several possible explanations, and no doubt each held for at least one reader. (1) The letters were read superficially, with no attempt to parse the somewhat archaic language of the logical proof, which after all points in the right direction. (2) The language was puzzling and unclear to the reader (and Newton was not available to ask), but it was accepted since he was, after all, Isaac Newton, and the calculation clearly showed he was sound on the important fundamentals. (3) The reader may even have seen that it was not a satisfactory argument, but drew back from accusing Newton of error, particularly since he got the numbers right.

In a sense the argument is more interesting because it is wrong. Newton was thinking like a great probabilist, attempting a “eureka” proof that made the issue clear in a flash. When successful, this is the highest form of mathematical art. That it failed is no embarrassment; a simple argument can be wonderful, but it can also create an illusion of understanding when the matter is, as here, deeper than it appears on the surface. If Newton fooled himself, he evidently took with him a succession of readers more than 250 years later. Yet even they should feel no embarrassment. As Augustus De Morgan once wrote, “Everyone makes errors in probabilities, at times, and big ones.” (Graves, 1889, page 459)

REFERENCES

CHAUNDY, T. W. and BULLARD, J. E. (1960). John Smith’s problem. Mathematical Gazette 44 253–260.

CHRYSTAL, G. (1889). Algebra; An Elementary Text-Book for the Higher Classes of Secondary Schools and for Colleges 2. Adam and Charles Black, Edinburgh.

DAVID, F. N. (1959). Mr Newton, Mr Pepys & Dyse [sic]: A historical note. Ann. Sci. 13 137–147. (This is the volume for the year 1957; this third issue, while nominally dated September 1957, was published April 1959, as stated in the volume Table of Contents.)

DAVID, F. N. (1962). Games, Gods and Gambling. Griffin, London.

EVANS, F. B. (1961). Pepys, Newton, and Bernoulli trials. Reader observations on recent discussions, in the series Questions and answers. Amer. Statist. 15 (1) 29.

GANI, J. (1982). Newton on “a question touching ye different odds upon certain given chances upon dice.” Math. Sci. 7 61–66. MR0642167

GJERTSEN, D. (1986). The Newton Handbook. Routledge and Kegan Paul, London.

GRAVES, R. P. (1889). Life of Sir William Rowan Hamilton 3. Hodges, Figgis, Dublin. Reprinted 1975 by Arno Press, New York.

HAMZA, K. (1995). The smallest uniform upper bound on the distance between the mean and the median of the binomial and Poisson distributions. Statist. Probab. Lett. 23 21–25. MR1333373

MOSTELLER, F. (1965). Fifty Challenging Problems in Probability with Solutions. Addison–Wesley, Reading, MA. MR0397810

PEDOE, D. (1958). The Gentle Art of Mathematics. Macmillan, New York. (Reprints the first two of Newton’s letters.) MR0102468

PEPYS, S. (1825). Memoirs of Samuel Pepys, Esq. FRS 1, 2. Henry Colburn, London. (Reprints the first of Pepys’ letters and two of Newton’s replies.)

PEPYS, S. (1876–1879). Diary and Correspondence of Samuel Pepys, Esq. F.R.S 1–6. Bickers, London. (Reprints the first of Pepys’ letters and two of Newton’s replies.)

PEPYS, S. (1926). Private Correspondence and Miscellaneous Papers of Samuel Pepys 1679–1703 in the Possession of J. Pepys Cockerell 1, 2. G. Bell and Sons, London. [This is the fullest reprinting. The portion of this correspondence directly with Newton is fully reprinted in Turnbull (1961) 293–303.]

SCHELL, E. D. (1960). Samuel Pepys, Isaac Newton, and probability. Published as part of the series Questions and answers. Amer. Statist. 14 (4) 27–30. [Schell’s article includes a reprinting of the Newton-Pepys letters. Further comments by readers appeared in Amer. Statist. 15 (1) 29–30.]

SHEYNIN, O. B. (1971). Newton and the classical theory of probability. Archive for History of Exact Sciences 7 217–243.

TURNBULL, H. W., ed. (1961). The Correspondence of Isaac Newton 3: 1688–1694. Cambridge Univ. Press. MR0126329

WESTFALL, R. S. (1980). Never at Rest: A Biography of Isaac Newton. Cambridge Univ. Press. MR0741027

### The Problem
* A. Throwing 6 dice, and getting at least one 6.
* B. Throwing 12 dice, and getting at least two 6.
* C. Throwing 18 dice, and getting at least three 6.

In the combinatorial approach, one counts the favorable outcomes and divides by the number of possible outcomes to get the probability of each scenario. Each die has 6 faces, so throwing $N$ dice gives a total of $6^N$ possible outcomes. Instead of computing the probability of each event directly, we compute the probability of its complement. The dice are thrown independently, so the probabilities of the individual outcomes multiply.
Let $\bar{A}$ be the event of getting no 6 when we throw the 6 dice, and let $D_i$ be the event that die $i$ does not show a 6. For the first scenario we want:
\begin{eqnarray}
\bar{A} = D_1 \cap D_2 \cap \cdots \cap D_6
\end{eqnarray}
These events are independent, so the probability of getting no 6 on any of the six dice is:
\begin{eqnarray*}
P(\bar{A}) = \prod_{i=1}^{6} P(D_i)
\end{eqnarray*}
The probability of *scenario A* then becomes:
\begin{eqnarray}
P(A) &=& 1 - P(\bar{A}) = 1 - \left(\frac{5}{6}\right)^6 \nonumber \\
&=& 0.665
\label{res1}
\end{eqnarray}
For scenario B we reason as for scenario A. Let $B_1$ denote the event of getting no 6 and $B_2$ the event of getting exactly one 6.
\begin{eqnarray*}
\bar{B} = B_1 \cup B_2
\end{eqnarray*}
These are disjoint events, so we can write:
\begin{eqnarray}
P(B) = 1 - P(\bar{B}) = 1 - ( P(B_1) + P(B_2) )
\end{eqnarray}
The probability of $B_1$ follows from the same product rule used for $\bar{A}$, now over 12 dice:
\begin{eqnarray*}
P(B_1) = \prod_{i=1}^{12} P(D_i) = \left(\frac{5}{6}\right)^{12}
\end{eqnarray*}
For $B_2$ we need the probability of getting exactly one 6 and 11 non-6s. Any one of the 12 dice can be the one showing the 6, which gives 12 arrangements, each with probability $\frac{1}{6} \left(\frac{5}{6}\right)^{11}$:
\begin{eqnarray*}
P(B_2) = 12 \left(\frac{1}{6}\right) \left(\frac{5}{6}\right)^{11}
\end{eqnarray*}
The probability of event B can be written:
\begin{eqnarray}
P(B) &=& 1 - P(\bar{B}) = 1 - \left( \left(\frac{5}{6}\right)^{12} + 2 \left(\frac{5}{6}\right)^{11} \right) \nonumber \\
&=& 1 - \frac{17}{6} \left(\frac{5}{6}\right)^{11} \\
&=& 0.619
\label{res2}
\end{eqnarray}
For scenario C we keep the two types of outcome considered for B and add one more. Let $C_1$ denote the event of getting no 6, $C_2$ the event of getting exactly one 6 and $C_3$ the event of getting exactly two 6s.
\begin{eqnarray*}
\bar{C} = C_1 \cup C_2 \cup C_3
\end{eqnarray*}
These are disjoint events, so we can write:
\begin{eqnarray}
P(C) = 1 - P(\bar{C}) = 1 - ( P(C_1) + P(C_2) + P(C_3) )
\end{eqnarray}
For $C_1$ (no 6):
\begin{eqnarray*}
P(C_1) = \left(\frac{5}{6}\right)^{18}
\end{eqnarray*}
For $C_2$ (exactly one 6):
\begin{eqnarray*}
P(C_2) = 18 \left(\frac{1}{6}\right) \left(\frac{5}{6}\right)^{17}
\end{eqnarray*}
For $C_3$ we need the events where we get exactly two 6s. With 18 dice there are $\binom{18}{2} = \frac{18 \cdot 17}{2}$ ways of choosing which two dice show a 6, and the remaining 16 dice must not show a 6.
\begin{eqnarray*}
P(C_3) = \frac{18 \cdot 17}{2} \left(\frac{1}{6}\right)^{2} \left(\frac{5}{6}\right)^{16}
\end{eqnarray*}
One can now write the total probability for scenario C as:
\begin{eqnarray}
P(C) &=& 1 - \left( \left(\frac{5}{6}\right)^{18} + 3 \left(\frac{5}{6}\right)^{17} + \frac{17}{4} \left(\frac{5}{6}\right)^{16} \right) \nonumber \\
&=& 1 - \frac{67}{9} \left(\frac{5}{6}\right)^{16} \nonumber \\
&=& 0.597
\label{res3}
\end{eqnarray}
From \ref{res1}, \ref{res2} and \ref{res3} we conclude that:
\begin{eqnarray*}
P(A) > P(B) > P(C)
\end{eqnarray*}
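
The derivation above can be cross-checked in two ways. The sketch below first evaluates the three complement expressions directly, then estimates the same probabilities by simulation. The function names, the seed and the number of trials are arbitrary choices made for this illustration.

```python
import random


def closed_form():
    """P(A), P(B), P(C) from the complement expressions derived above."""
    q = 5 / 6
    p_a = 1 - q**6
    p_b = 1 - (q**12 + 12 * (1 / 6) * q**11)
    p_c = 1 - (q**18 + 18 * (1 / 6) * q**17 + (18 * 17 / 2) * (1 / 6) ** 2 * q**16)
    return p_a, p_b, p_c


def simulate(n_dice, min_sixes, trials, rng):
    """Monte Carlo estimate of P(at least min_sixes sixes among n_dice fair dice)."""
    wins = 0
    for _ in range(trials):
        # randrange(6) returns 0..5; the value 5 stands for the face "6".
        sixes = sum(1 for _ in range(n_dice) if rng.randrange(6) == 5)
        if sixes >= min_sixes:
            wins += 1
    return wins / trials


rng = random.Random(1693)  # fixed seed so the run is repeatable
exact = closed_form()
estimates = [simulate(6 * m, m, 40_000, rng) for m in (1, 2, 3)]

for name, e, s in zip("ABC", exact, estimates):
    print(f"{name}: closed form = {e:.3f}, simulated = {s:.3f}")
```
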
Am I correct in saying that De Moivre's approximation here is just a special case of the normal approximation to the binomial distribution? This would explain why $P(X=Np) \approx \frac{1}{\sqrt{2\pi}\,\sigma}$, with $\sigma = \sqrt{Np(1-p)}$.
Harvard Professor Joe Blitzstein introducing the Newton-Pepys problem (Harvard class Stats110):
[![Harvard Lecture](http://i.imgur.com/GHL4T5J.png)](https://www.youtube.com/watch?v=P7NE4WF8j-Q&feature=youtu.be&t=17m47s)
Newton and Pepys exchanged a series of 6 letters in late 1693. Pepys in London and Newton in Cambridge discussed a problem about gambling odds. You can read the **transcripts of their correspondence** here (chronological order):

* [Pepys to Newton - 11/22/1693](https://webspace.yale.edu/chem125/125/history99/2Pre1800/SPepysINewton/PepyNewtonPDFs/431P2N_112293.pdf)
* [Newton to Pepys - 11/26/1693](https://webspace.yale.edu/chem125/125/history99/2Pre1800/SPepysINewton/PepyNewtonPDFs/432N2P_112693.pdf)
* [Pepys to Newton - 12/9/1693](https://webspace.yale.edu/chem125/125/history99/2Pre1800/SPepysINewton/PepyNewtonPDFs/433P2N_12993.pdf)
* [Newton to Pepys - 12/16/1693](https://webspace.yale.edu/chem125/125/history99/2Pre1800/SPepysINewton/PepyNewtonPDFs/434N2P_121693.pdf)
* [Pepys to Newton - 12/21/1693](https://webspace.yale.edu/chem125/125/history99/2Pre1800/SPepysINewton/PepyNewtonPDFs/435P2N_122193.pdf)

@RobertAndrewMartin You are correct.
In this specific case, the de Moivre–Laplace theorem shows that the probability mass function of the number of "successes" observed in a series of n independent throws is approximated, for large n, by the probability density function of a normal distribution with the same mean and variance.
You can learn more about it here: [De Moivre–Laplace theorem](https://en.wikipedia.org/wiki/De_Moivre–Laplace_theorem)
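
As a rough numerical illustration of this point, the sketch below compares the exact binomial probability at the mode $k = np$ with the peak of the matching normal density, $1/(\sigma\sqrt{2\pi})$, for $p = 1/6$. The function name `mode_ratio` is hypothetical, and the log-gamma form is used only to avoid overflow for large $n$.

```python
from math import exp, lgamma, log, pi, sqrt


def mode_ratio(n, p=1 / 6):
    """Ratio of the exact Binomial(n, p) pmf at k = np to the normal peak height."""
    k = round(n * p)
    log_pmf = (
        lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
        + k * log(p) + (n - k) * log(1 - p)
    )
    sigma = sqrt(n * p * (1 - p))
    normal_peak = 1 / (sigma * sqrt(2 * pi))
    return exp(log_pmf) / normal_peak


for n in (6, 60, 600, 6000):
    print(f"n={n:5d}  exact / normal = {mode_ratio(n):.4f}")
```

The ratio moves toward 1 as $n$ grows, which is the sense in which the normal curve approximates the binomial here.
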
You write “In the case of the Newton-Pepys problem we need to find the probability $P(k)$ of throwing at least $k$ sixes with $6n$ dice,” and you go on to say:
\[
P(X=k)=\binom{6n}{k} \left( \frac{1}{6} \right)^k \left(\frac{5}{6}\right)^{6n-k}
\]
But this is the probability $P(k)$ of throwing _exactly_ $k$ sixes with $6n$ dice, whereas what _we_ want is $P(X \geq k)$ for $6k$ dice.
Summing these terms from $k$ up to $6k$ would be tedious in the extreme, so the complement-based solution given prior to this comment is computationally preferable for small $k$.
Probability table of obtaining **n** or more 6s when **6n dice** are thrown. The **probability decreases with increasing n**.
![probability table](http://i.imgur.com/V9Ad2Wb.png "probability table")
[Samuel Pepys](https://en.wikipedia.org/wiki/Samuel_Pepys "samuel pepys") was a famous diarist of 17th-century London. He kept **a private diary from 1660 until 1669, which is one of the most important sources for the English Restoration period**.
Pepys was born in modest circumstances and educated as a scholarship student at Cambridge. He was a naturally curious person and undertook at the age of 29 to learn arithmetic. He found the multiplication tables particularly challenging, and used to wake up early and stay up late to study them.
During the restoration of the Stuart monarchy, **Pepys was one of the most influential men in England**. As Secretary of the Affairs of the Admiralty he presided over the recovery of the British Navy and helped make it dominant over those of France and the Netherlands, which had dealt Britain an ignominious naval defeat in 1667. **Despite his lack of science training as a student, he had natural curiosity and became quite interested in science.**
![Samuel Pepys](https://upload.wikimedia.org/wikipedia/commons/thumb/2/21/Samuel_Pepys.jpg/1024px-Samuel_Pepys.jpg "Samuel Pepys")
There is a problem with this part of Newton's argument. He assumes that you can only get one 6 per throw, which does not correspond to this problem. Assume that Peter and James throw groups of six dice. James does not need to throw the dice twice to win. He can win with the first throw by getting two or more 6s. Here Newton makes the **wrong argument that you cannot get more than one 6 with one group of dice.**
@StevePowel Thanks for the heads up! Good point. I had a small typo in my equation. Just revised my previous annotation.
@Micael Thanks for the corrections. There is still a small slip in the equation, because it should say $P(X \geq n) = \cdots $ and not $P(X = k) = \cdots$. It would be good to just mention that you employ the complement strategy, too.
A binomial distribution with parameters $n$ and $p$ is the discrete probability distribution of the number of successes in **a sequence of n independent experiments**, with **success probability p**:
$$P(X=k)={\binom {n}{k}}p^{k}(1-p)^{n-k}$$
for k = 0, 1, 2, ..., n, where:
$${\binom {n}{k}}={\frac {n!}{k!(n-k)!}}$$
is the binomial coefficient.
In the case of the Newton-Pepys problem we need to find the probability **$P(X \geq n)$ of throwing at least n sixes with 6n dice**. Applying the binomial distribution to the complement event (fewer than $n$ sixes), we can write:
$$P(X \geq n)=1 - \sum_{k=0}^{n-1} {\binom {6n}{k}}\left(\frac{1}{6}\right)^{k}\left(\frac{5}{6}\right)^{6n-k}$$
!["probability plot"](http://i.imgur.com/KBYA0hN.png)
Figure: Probability of rolling at least *k* sixes when we roll *6k* dice. The probability decreases as k increases.
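
A table like the one plotted can be regenerated from the complement form of the binomial sum. This is a minimal sketch: the function name `at_least_n_sixes` and the range $n = 1, \ldots, 10$ are choices made for this illustration, not taken from the figure.

```python
from fractions import Fraction
from math import comb


def at_least_n_sixes(n):
    """P(X >= n) for X ~ Binomial(6n, 1/6), computed as 1 - P(X <= n - 1)."""
    p = Fraction(1, 6)
    below = sum(comb(6 * n, k) * p**k * (1 - p) ** (6 * n - k) for k in range(n))
    return 1 - below


table = [(n, float(at_least_n_sixes(n))) for n in range(1, 11)]

for n, prob in table:
    print(f"n={n:2d}  dice={6 * n:2d}  P(X >= n) = {prob:.4f}")
```
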
Newton gave Pepys the correct answer to the problem, although his explanation was wrong. He imagined that A, B and C were tossing their dice in groups of six. For Newton, A was the most favorable because it required a 6 in only one toss, while B and C required a 6 in each of their tosses. This explanation wrongly assumes that a group of dice cannot produce more than one 6, so it does not actually correspond to the original problem posed by Pepys. If Newton's explanation were correct, A would always be the most probable regardless of $p$ (the probability of a die showing a 6). With biased dice (changing $p$), Newton's explanation no longer works. The figure below shows the probability of A, B and C winning the game as a function of the bias. For small bias the ranking is unchanged ($P(A)>P(B)>P(C)$). As the bias increases the ranking inverts, and with a highly biased die the games where more dice are rolled are more likely to be won ($P(C)>P(B)>P(A)$), so Newton's explanation no longer holds.
![Biased dice](http://www.datagenetics.com/blog/february12014/bias.png "biased dice")
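
The inversion can be explored numerically. The sketch below treats $p$ as the probability that a single die shows a 6 and uses bisection to locate a value of $p$ between 1/6 and 1/4 at which games A and B become equally likely. The function names and the bracketing interval are assumptions for this illustration; the interval relies on A beating B at $p = 1/6$ and B beating A at $p = 1/4$.

```python
from math import comb


def win_probability(n_dice, min_hits, p):
    """P(at least min_hits sixes among n_dice dice), each showing a 6 with probability p."""
    return 1 - sum(
        comb(n_dice, k) * p**k * (1 - p) ** (n_dice - k) for k in range(min_hits)
    )


def gap_a_minus_b(p):
    """P(A) - P(B) at bias p; positive while A is still the better bet."""
    return win_probability(6, 1, p) - win_probability(12, 2, p)


def crossover(lo=1 / 6, hi=1 / 4, steps=60):
    """Bisection on the gap, which is positive at lo and negative at hi."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if gap_a_minus_b(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


p_star = crossover()
print(f"A and B are equally likely near p = {p_star:.4f}")

for p in (1 / 6, 0.5):
    a, b, c = (win_probability(6 * m, m, p) for m in (1, 2, 3))
    print(f"p={p:.3f}  P(A)={a:.4f}  P(B)={b:.4f}  P(C)={c:.4f}")
```
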
Samuel Pepys was a gambling man, and the Newton-Pepys problem is related to a wager he planned to make (he planned to stake 10 pounds, equivalent to about $1500 today, on a similar bet). In the correspondence he exchanged with Isaac Newton he wanted to know which of the following three scenarios was the most probable:
* A. Throwing 6 dice and getting at least one six.
* B. Throwing 12 dice and getting at least two sixes.
* C. Throwing 18 dice and getting at least three sixes.
Initially **Pepys thought that C was the most probable** but Newton showed him that **A is the most probable scenario**. We can easily show that Newton was right and A is in fact the most probable scenario. Today this seems like a trivial problem, but the field of [probability theory](https://en.wikipedia.org/wiki/Probability_theory) was in its infancy.