Simpson’s Paradox and the Hot Hand in Basketball

Robert L. WARDROP

A number of psychologists and statisticians are interested in how laypersons make judgments in the face of uncertainties, assess the likelihood of coincidences, and draw conclusions from observation. This is an important and exciting area that has produced a number of interesting articles. This article uses an extended example to demonstrate that researchers need to use care when examining what laypersons believe. In particular, it is argued that the data available to laypersons may be very different from the data available to professional researchers. In addition, laypersons unfamiliar with a counterintuitive result, such as Simpson’s paradox, may give the wrong interpretation to the pattern in their data. This paper gives two recommendations to researchers and teachers. First, take care to consider what data are available to laypersons. Second, it is important to make the public aware of Simpson’s paradox and other counterintuitive results.

KEY WORDS: Hot hand phenomenon; McNemar’s test; Multiple analyses; Simpson’s paradox.

1. INTRODUCTION

Schoolchildren routinely learn to identify optical illusions. It is arguably as important that the general public learn to identify statistical illusions. Many outstanding researchers have addressed this issue. As examples, Diaconis and Mosteller (1989) investigate computing the probabilities of coincidences; Kahneman, Slovic, and Tversky (1983) consider judgments made in the presence of uncertainty; and Tversky and Gilovich (1989) investigate the popular belief in the hot hand phenomenon in basketball. This article examines some of the data presented by Tversky and Gilovich.
Robert L. Wardrop is Associate Professor, Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706. The author thanks the referees and associate editor for helpful comments. The American Statistician, February 1995, Vol. 49, No. 1.

Suppose that a basketball player plans to attempt 20 shots, with each shot resulting in a hit or a miss. A statistician might assume tentatively that the assumptions of Bernoulli trials are appropriate for this experiment. Suppose next that the experiment is performed and the player obtains the following data:

HMHMM MHHHM HHHMM HMHHH

Do these data provide convincing evidence against the tentative assumption of Bernoulli trials? Are the three occurrences of three successive hits convincing evidence of the player having a “hot hand”? These are difficult questions to answer because of the myriad of possible alternatives to Bernoulli trials that exist. It is mathematically and conceptually convenient to restrict attention to alternatives that allow the probability of success on any trial to depend on the outcome of the previous trial or, perhaps, the outcomes of some small number of previous trials. (This restriction may be unrealistic, but that issue will not be addressed in this article.) With the restrictive class of alternatives described here, Tversky and Gilovich devised a clever experiment to obtain convincing evidence that knowledgeable basketball fans are much too ready to detect occurrences of streak shooting (the hot hand) in sequences that are, in fact, the outcomes of Bernoulli trials.
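The streak count quoted for the 20-shot sequence above can be checked mechanically. The following is only a small sketch: the helper name `hit_runs` is hypothetical, and it counts maximal runs of hits, which is one reasonable reading of "three successive hits".

```python
from itertools import groupby

# The 20 outcomes quoted above (H = hit, M = miss), with the spacing removed.
SHOTS = "HMHMM MHHHM HHHMM HMHHH".replace(" ", "")


def hit_runs(sequence):
    """Return the lengths of all maximal runs of consecutive hits."""
    return [len(list(group)) for outcome, group in groupby(sequence) if outcome == "H"]


runs = hit_runs(SHOTS)
hits = SHOTS.count("H")
streaks_of_three = sum(1 for length in runs if length >= 3)

print(f"shots={len(SHOTS)} hits={hits} hit runs={runs}")
print(f"runs of at least three hits: {streaks_of_three}")
```

Counting runs this way says nothing about whether the pattern is unusual under Bernoulli trials; it only reproduces the description of the data.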
Having established that basketball fans detect the hot hand in simulated random data, Tversky and Gilovich next examined three sets of real data. The data sets are: shots from the field during National Basketball Association (NBA) games; pairs of free throws shot during NBA games; and a controlled experiment using college varsity men and women basketball players. Using the restrictive alternatives described above, Tversky and Gilovich found no evidence of the hot hand phenomenon in any of their data sets. In addition, using a test statistic that is sensitive to certain time trends in the probability of success, they again found no evidence of the hot hand phenomenon.

This article examines the free throw data presented by Tversky and Gilovich. Tversky and Gilovich began by asking a sample of 100 “avid basketball fans” from Cornell and Stanford: “When shooting free throws, does a player have a better chance of making his second shot after making his first shot than after missing his first shot?” A “Yes” response was interpreted as indicating belief in the existence of the hot hand phenomenon, and a “No” as indicating disbelief. (Actually, a “No” response combines persons who believe in independence with those who believe in a negative association between shots; but the researchers apparently were not interested in separating these groups.) Sixty-eight of the fans responded “Yes” and the other 32 “No.” Thus, a large majority of those questioned believed in the hot hand phenomenon for free throw shooting. Tversky and Gilovich investigated the above question empirically by examining data they obtained on a small group of well-known and widely viewed basketball players, namely, nine regulars on the 1980-1981 and 1981-1982 Boston Celtics basketball team.

After their analysis of the Celtics data, Tversky and Gilovich concluded that “These data provide no evidence that the outcome of the second shot depends on the outcome of the first.” Section 2 of this article will examine the Celtics data with the goal of reconciling what Tversky and Gilovich found and what their basketball fans believed. In particular, it will be shown that, in a certain sense, the prevalent fan belief in the hot hand is not necessarily at odds with Tversky and Gilovich’s conclusion.

The analysis presented in Section 3 of this paper indicates that several Celtics players were better at their second shots than at their first.
© 1995 American Statistical Association

2. INDEPENDENCE

It is instructive to begin by considering just two of the nine Boston Celtics players who are represented in the free throw data, namely, Larry Bird and Rick Robey. During the 1980-1981 and 1981-1982 seasons, Larry Bird shot a pair of free throws on 338 occasions. Five times he missed both shots, 251 times he made both shots, 34 times he made only the first shot, and 48 times he made only the second shot. These data are presented in Table 1, as are the same data for Rick Robey. Let $\widehat{p}_{hit}$ and $\widehat{p}_{miss}$ denote the proportion of first shot hits that are followed by a hit and the proportion of first shot misses that are followed by a hit, respectively. For Bird, $\widehat{p}_{hit} = 251/285 = .881$ and $\widehat{p}_{miss} = 48/53 = .906$. For Robey, these numbers are .593 and .612, respectively. Note that, contrary to the hot hand theory, each player shot slightly better after a miss than after a hit, although, as shown below, the differences are not statistically significant.

Table 1. Observed Frequencies for Pairs of Free Throws by Larry Bird and Rick Robey, and the Collapsed Table. Rows give the outcome of the first shot; columns give the outcome of the second shot.

Larry Bird:

$$ \begin{array}{|c|c|c|c|} \hline & Hit & Miss & Total \\ \hline Hit & 251 & 34 & 285 \\ \hline Miss & 48 & 5 & 53 \\ \hline Total & 299 & 39 & 338 \\ \hline \end{array} $$

Rick Robey:

$$ \begin{array}{|c|c|c|c|} \hline & Hit & Miss & Total \\ \hline Hit & 54 & 37 & 91 \\ \hline Miss & 49 & 31 & 80 \\ \hline Total & 103 & 68 & 171 \\ \hline \end{array} $$

Collapsed Table:

$$ \begin{array}{|c|c|c|c|} \hline & Hit & Miss & Total \\ \hline Hit & 305 & 71 & 376 \\ \hline Miss & 97 & 36 & 133 \\ \hline Total & 402 & 107 & 509 \\ \hline \end{array} $$
It is possible, of course, to ignore the identity of the player attempting the shots and examine the data in the collapsed table in Table 1. For example, on 509 occasions either Bird or Robey attempted two free throws, on 305 of those occasions both shots were hit, and so on. For the collapsed table, $\widehat{p}_{hit} = .811$ and $\widehat{p}_{miss} = .729$. These values support the hot hand theory: a hit was much more likely than a miss to be followed by a hit.

The data from Bird and Robey illustrate Simpson’s paradox (Simpson 1951), namely, $\widehat{p}_{hit} < \widehat{p}_{miss}$ in each component table, but $\widehat{p}_{hit} > \widehat{p}_{miss}$ in the collapsed table. For further examples and discussion of Simpson’s paradox, see Shapiro (1982), Wagner (1982), the essay by Alan Agresti in Kotz and Johnson (1983), and their references.
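The reversal described above can be reproduced directly from the Table 1 counts. This is a minimal sketch rather than anything from the paper: the function name `conditional_rates` and the tuple layout (hit-hit, hit-miss, miss-hit, miss-miss) are assumptions made for illustration.

```python
# Counts from Table 1 as (hit-hit, hit-miss, miss-hit, miss-miss),
# where the first outcome is the first shot and the second is the second shot.
TABLES = {
    "Bird": (251, 34, 48, 5),
    "Robey": (54, 37, 49, 31),
}


def conditional_rates(hh, hm, mh, mm):
    """Return (p_hit, p_miss): second-shot success rate after a hit and after a miss."""
    return hh / (hh + hm), mh / (mh + mm)


# Collapse the two players by adding their tables cell by cell.
collapsed = tuple(sum(cells) for cells in zip(*TABLES.values()))

for name, counts in {**TABLES, "Collapsed": collapsed}.items():
    p_hit, p_miss = conditional_rates(*counts)
    direction = "after a hit" if p_hit > p_miss else "after a miss"
    print(f"{name:9s} p_hit={p_hit:.3f} p_miss={p_miss:.3f} -> better {direction}")
```

Each player's row reports "better after a miss", while the collapsed row reports "better after a hit", which is the paradox in miniature.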
Figure 1 provides a visual explanation of Simpson’s paradox. The top picture in the figure presents the proportion of second-shot successes after a hit for Bird, Robey and the collapsed table. The bottom picture in the figure presents the same three proportions for second shots attempted after a miss. It is easy to verify algebraically that the proportion of successes for a collapsed table equals the weighted average of the individual player’s proportions, with weights equal to the proportion of data in the collapsed table that comes from the player. For the after-a-hit condition, for example, the weight for Bird is 285/376 = .758, the weight for Robey is 91/376 = .242, and the proportion of successes for the collapsed table, 305/376 = .811, is

$$ \frac{285}{376} \times \frac{251}{285} + \frac{91}{376} \times \frac{54}{91}. $$

In Figure 1, the heights of the four rectangles above the Bird and Robey proportions equal the weights associated with the relevant player-condition pair. For example, the height of the rectangle for Bird in the after-a-hit condition equals .758, in agreement with the computation of the previous paragraph. Thus, the proportion of successes for each collapsed table in the figure is located at the center of gravity of the two rectangles. As a result, even though both Bird and Robey shot better after a miss than after a hit, the collapsed values show the reverse pattern due to the huge variation in weights associated with each player. In short, Simpson’s paradox has occurred because the after-a-miss condition, when compared to the after-a-hit condition, has a disproportionately large share of its data originating from the far inferior shooter Robey.
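The same weighted-average identity can be written out for the after-a-miss condition. The expression below uses only the Table 1 counts; the rounded decimals are my own arithmetic rather than values quoted in the paper.

$$ \frac{53}{133} \times \frac{48}{53} + \frac{80}{133} \times \frac{49}{80} = \frac{97}{133} = .729, \qquad \frac{53}{133} = .398 \ (Bird), \quad \frac{80}{133} = .602 \ (Robey). $$

Robey therefore supplies about 60% of the after-a-miss data but only about 24% of the after-a-hit data, which is what pulls the collapsed after-a-miss proportion down.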
When I first examined the Bird and Robey data several years ago, my immediate reactions were that this is an interesting example of Simpson’s paradox, the analysis of individual tables is “correct,” and the analysis of the collapsed table is “incorrect.” Now I believe these labels were applied too hastily. The reasons I changed my mind are discussed below after the entire data set is examined.
[Figure 1. A Visual Explanation of Simpson’s Paradox for the Free Throw Study. Top panel: After a Hit (collapsed proportion .811). Bottom panel: After a Miss (collapsed proportion .729). Vertical axis: weight.]

Table 2 introduces symbols to represent the various numbers in a 2 × 2 table. The values $n_1$, $n_2$, $m_1$, and $m_2$ denote the marginal totals, and the values of $a$, $b$, $c$, and $d$ denote the cell counts.

Table 2. Standard Notation for a 2 × 2 Table. Rows give the outcome of the first shot; columns give the outcome of the second shot.

$$ \begin{array}{|c|c|c|c|} \hline & Hit & Miss & Total \\ \hline Hit & a & b & n_1 \\ \hline Miss & c & d & n_2 \\ \hline Total & m_1 & m_2 & n \\ \hline \end{array} $$

The null hypothesis states that the outcome of the second shot is statistically independent of the outcome of the first shot. If the null hypothesis is true, then conditional on the values of the marginal totals, the cell count $a$ has a hypergeometric distribution with expectation and variance:

$$ E(a) = \frac{n_1 m_1}{n} \qquad (1) $$

and

$$ var(a) = \frac{n_1 n_2 m_1 m_2}{n^2 (n - 1)}. \qquad (2) $$

The null distribution of

$$ z = \frac{a - E(a)}{\sqrt{var(a)}} \qquad (3) $$

can be approximated by the standard normal curve. For Larry Bird, $a = 251$, $E(a) = 252.12$, and $var(a) = 4.575$. Substituting these values into Equation (3) gives

$$ z = \frac{251 - 252.12}{\sqrt{4.575}} = -.52. $$

Thus, as stated earlier, the results are not statistically significant. For Robey, $z = -.25$, and for the collapsed table, $z = 1.99$. Thus, an analysis of the collapsed table alone would lead one to conclude that there is statistically significant evidence in support of the hot hand theory.
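The three $z$ values quoted above can be recomputed from the cell counts in Table 1. The sketch below assumes the standard hypergeometric mean $n_1 m_1 / n$ and variance $n_1 n_2 m_1 m_2 / (n^2 (n - 1))$ for the cell count $a$ given the margins; the function name `independence_z` is hypothetical.

```python
import math


def independence_z(a, b, c, d):
    """Return (E(a), var(a), z) for a 2 x 2 table with rows (a, b) and (c, d).

    Uses the hypergeometric mean and variance of cell a given the margins,
    then standardizes a as in Equation (3).
    """
    n1, n2 = a + b, c + d      # row totals (first shot: hit, miss)
    m1, m2 = a + c, b + d      # column totals (second shot: hit, miss)
    n = n1 + n2
    expected = n1 * m1 / n
    variance = n1 * n2 * m1 * m2 / (n ** 2 * (n - 1))
    return expected, variance, (a - expected) / math.sqrt(variance)


# Cell counts (a, b, c, d) taken from Table 1.
tables = {
    "Bird": (251, 34, 48, 5),
    "Robey": (54, 37, 49, 31),
    "Collapsed": (305, 71, 97, 36),
}

for name, cells in tables.items():
    expected, variance, z = independence_z(*cells)
    print(f"{name:9s} E(a)={expected:.2f} var(a)={variance:.3f} z={z:.2f}")
```

With these inputs the Bird row reproduces the quoted $E(a) = 252.12$ and $var(a) = 4.575$, which is a useful check that the formulas match the ones the paper used.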
Tversky and Gilovich report data for all nine men who played regularly for the Celtics during 1980-1982. The summaries needed for analysis are given in Table 3. The first column of the table lists the players’ names. The second and third columns list the values of $\widehat{p}_{hit}$ and $\widehat{p}_{miss}$ defined above. The fourth, fifth, and sixth columns list the values of $a$, $E(a)$, and $var(a)$, which are obtained from their data and Equations (1) and (2). The seventh column lists the value of $z$ from Equation (3) for each player. The men are listed in the table by decreasing values of $\widehat{p}_{hit} - \widehat{p}_{miss}$ which, not too surprisingly, also lists them by decreasing values of $z$. Thus, McHale, with a difference of 73 - 59 = 14 percentage points, is listed first and Carr, with a difference of 68 - 81 = -13 percentage points, is listed last. In terms of either the point estimates or the test statistic value, McHale provides the strongest evidence in support of the hot hand theory, and Carr provides the strongest evidence in support of an inverse relationship between the outcomes of the two shots. Note that four players (McHale, Maxwell, Parish, and Archibald) shot better after a hit, while the remaining five players shot better after a miss.
Table 3. Selected Statistics for the Investigation of Independence of Shots for Nine Members of the Boston Celtics

$$ \begin{array}{|c|c|c|c|c|c|c|} \hline Player & \widehat{p}_{hit} & \widehat{p}_{miss} & a & E(a) & var(a) & z \\ \hline Kevin\ McHale & .73 & .59 & 93 & 88.23 & 7.633 & 1.73 \\ \hline Cedric\ Maxwell & .81 & .76 & 245 & 240.20 & 14.667 & 1.25 \\ \hline Robert\ Parish & .77 & .72 & 164 & 160.75 & 13.061 & .90 \\ \hline Nate\ Archibald & .83 & .82 & 203 & 202.26 & 8.380 & .26 \\ \hline Rick\ Robey & .59 & .61 & 54 & 54.81 & 10.257 & -.25 \\ \hline Gerald\ Henderson & .76 & .78 & 77 & 77.58 & 4.858 & -.26 \\ \hline Larry\ Bird & .88 & .91 & 251 & 252.12 & 4.575 & -.52 \\ \hline Chris\ Ford & .71 & .77 & 36 & 37.03 & 3.100 & -.58 \\ \hline M.\ L.\ Carr & .68 & .81 & 39 & 41.20 & 3.620 & -1.16 \\ \hline \end{array} $$

The data for McHale give a one-sided approximate P value of .0418. This is not particularly noteworthy for two reasons:

(1) It is difficult to justify the use of a one-sided alternative, especially given that five players shot better after a miss and four shot better after a hit.

(2) Even if one believes a one-sided alternative is appropriate, on the assumption that all nine players have independence between shots, the approximate probability is $1 - (1 - .0418)^9 = .32$, or about one-third, that at least one of the nine P values would be as small or smaller than McHale’s.
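The arithmetic in reason (2) can be sketched in a few lines. The code assumes the nine tests are independent of one another and that each P value is uniform under the null; both helper names are hypothetical, and the normal tail area is obtained from `math.erfc` rather than from a table.

```python
import math


def upper_tail(z):
    """One-sided P value: area under the standard normal curve to the right of z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def prob_at_least_one(p, tests):
    """Chance that at least one of `tests` independent P values is <= p under the null."""
    return 1 - (1 - p) ** tests


p_mchale = upper_tail(1.73)               # z for McHale from Table 3
family = prob_at_least_one(p_mchale, 9)   # nine players, one test each
print(f"one-sided P = {p_mchale:.4f}, at least one of nine = {family:.2f}")
```

The point of the calculation is that a single P value near .04 is unremarkable once it is known to be the smallest of nine.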
Table 4 presents the observed frequencies and row proportions for the free throw data collapsed over the nine Celtics under investigation. For the collapsed table, the relative frequency of a hit after a hit is 78.9 - 74.3 = 4.6 percentage points higher than the relative frequency of a hit after a miss. Moreover, for the collapsed table, it can be shown that $a = 1{,}162$, $E(a) = 1{,}143.03$, and $var(a) = 72.015$, yielding $z = 2.24$, which is statistically significant.

To summarize, separate analyses of individual players indicate that four players shot better after a hit and five players shot better after a miss, but none of the individual player patterns is convincing. By contrast, the analysis of the collapsed table gives statistically significant evidence in support of the hot hand phenomenon.
Table 4. Observed Frequencies and Row Proportions for Free Throw Data Collapsed Over Nine Celtics. Rows give the outcome of the first shot; columns give the outcome of the second shot.

$$ \begin{array}{|c|c|c|c|} \hline & Hit & Miss & Total \\ \hline Hit & 1{,}162 & 311 & 1{,}473 \\ \hline Miss & 428 & 148 & 576 \\ \hline Total & 1{,}590 & 459 & 2{,}049 \\ \hline \end{array} $$

$$ \begin{array}{|c|c|c|c|} \hline & Hit & Miss & Total \\ \hline Hit & .789 & .211 & 1.000 \\ \hline Miss & .743 & .257 & 1.000 \\ \hline \end{array} $$

In view of the Celtics data, what, if anything, are we to make of the fact that 68 out of 100 of Tversky and Gilovich’s avid basketball fans believe in the hot hand phenomenon for free throw shooting? Perhaps these fans have been watching players who do exhibit the hot hand. Perhaps these fans see patterns in data where no patterns exist. I prefer the following explanation.

I am an avid basketball fan. Over the past 30 years, I have observed several thousand different players shooting free throws. It is difficult to imagine that I (or any other basketball fan) could remember the equivalent of thousands of 2 × 2 tables. Yet these individual tables are exactly what I would need in order to investigate properly the question of the hot hand phenomenon. It is much more reasonable to assume that I have a single 2 × 2 table in my mind, namely, the collapsed table for all players I have seen. Just like the Celtics data, my collapsed table indicates that a success is more likely than a failure to be followed by a success. Thus, there is a pattern in the data that are reasonably available to me and, I conjecture, in the data that are reasonably available to Gilovich and Tversky’s 100 basketball fans. It seems reasonable to suggest to basketball fans that the mental equivalent of Simpson’s paradox could lead to a cognitive statistical illusion that results in their “seeing patterns in the data that do not exist.”
3. STATIONARITY

Tversky and Gilovich correctly concluded that there is no evidence of the hot hand phenomenon in the free throw data. In this section, it is demonstrated, however, that the simple model of Bernoulli trials is also inappropriate. In particular, it is shown that several of the Celtics players shot significantly better on their second free throw, perhaps as a result of the practice afforded by the first shot.

Look at Table 1 again. Larry Bird made 84.3% (285 of 338) of his first shots compared to 88.5% (299 of 338) of his second shots. Thus, there is evidence that he improved on his second shot. The null hypothesis that his probability of success was constant can be investigated with McNemar’s test, which uses the fact that the null distribution of

$$ z_1 = \frac{b - c}{\sqrt{b + c}} \qquad (4) $$

can be approximated by the standard normal curve. (Recall that $b$ and $c$ are defined in Table 2.) For Larry Bird, $b = 34$ and $c = 48$, giving

$$ z_1 = \frac{34 - 48}{\sqrt{34 + 48}} = -1.55. $$
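McNemar’s statistic depends only on the two discordant cells, so it is easy to recompute. In the sketch below the function name `mcnemar_z` is hypothetical, and the one-sided P value is my own addition (the paper does not quote one for Bird); it is the lower-tail normal area, obtained from `math.erfc`.

```python
import math


def mcnemar_z(b, c):
    """McNemar statistic from the two discordant cells of a 2 x 2 table.

    b = pairs with a hit then a miss, c = pairs with a miss then a hit.
    A negative value means the second shot was made more often than the first.
    """
    return (b - c) / math.sqrt(b + c)


# Larry Bird's discordant pairs from Table 1.
z1 = mcnemar_z(34, 48)
# One-sided P value for "better on the second shot" (lower tail of the normal curve).
p_lower = 0.5 * math.erfc(-z1 / math.sqrt(2))
print(f"z1 = {z1:.2f}, one-sided P = {p_lower:.3f}")
```

Pairs in which both shots were made or both were missed carry no information about a first-versus-second difference, which is why $a$ and $d$ do not appear.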
The same analysis can be performed for the other eight Celtics; the results are given in Table 5. The first column of the table lists the players’ names. The second and third columns list, respectively, the relative frequencies of successes on the first and second shots. The remaining columns list the values of $b$ and $c$ from each player’s 2 × 2 table and the value of $z_1$ computed from Equation (4). The players are listed according to the difference in relative frequencies between the first and second shots. Thus, Maxwell, who shot ten percentage points better on the second shot than on the first, is listed first, and McHale, who shot three percentage points better on the first shot, is listed last.

Table 5. Selected Statistics for Comparing the Success Rates on the First and Second Free Throws for Nine Members of the Boston Celtics

$$ \begin{array}{|c|c|c|c|c|c|} \hline Player & \widehat{p}(S_{1}) & \widehat{p}(S_{2}) & b & c & z_{1} \\ \hline Cedric\ Maxwell & .70 & .80 & 57 & 97 & -3.22 \\ \hline Robert\ Parish & .67 & .75 & 49 & 76 & -2.41 \\ \hline Nate\ Archibald & .76 & .83 & 42 & 62 & -1.96 \\ \hline Rick\ Robey & .53 & .60 & 37 & 49 & -1.29 \\ \hline Larry\ Bird & .84 & .88 & 34 & 48 & -1.55 \\ \hline Gerald\ Henderson & .73 & .77 & 24 & 29 & -.69 \\ \hline Chris\ Ford & .70 & .73 & 15 & 17 & -.35 \\ \hline M.\ L.\ Carr & .69 & .72 & 18 & 21 & -.48 \\ \hline Kevin\ McHale & .72 & .69 & 35 & 29 & .75 \\ \hline Total & & & 311 & 428 & z_{2} = -4.30 \\ \hline \end{array} $$

Note the following features of the data.
(1) Eight of nine players had a higher success rate on their second shots.

(2) Three players had one-sided approximate P values below .05: Maxwell (.0006), Parish (.0080), and Archibald (.0250). The interpretation of these P values should take into account that nine tests were performed. If, in fact, each player had a constant success rate on his two shots, the approximate probability of obtaining at least one P value equal to or smaller than .0006 is $1 - (1 - .0006)^9 = .0054$. Similarly, the approximate probability of obtaining at least two P values equal to or smaller than .0080 is .0022. Finally, the approximate probability of obtaining at least three P values equal to or smaller than .0250 is .0012. Thus, the three statistically significant results do not seem to be attributable to the execution of many tests.

(3) McNemar’s test can be viewed as testing that a Bernoulli trial success probability equals .5 based on a sample of size $b + c$. Thus, several of the analyses of individual players presented in Table 5 are based on very little data and, hence, have very low power. To combat this difficulty, it is instructive to combine the data across the nine players. In particular, if the null hypothesis of constant success probability is true for all nine players, then the observed value of

$$ z_2 = \frac{\sum b - \sum c}{\sqrt{\sum b + \sum c}}, $$

where the sum is taken over the nine tables, can be viewed as an observation from a distribution that is approximately the standard normal curve. The observed value of $z_2$ is -4.30, given in the bottom row of Table 5. This value indicates that there is overwhelming evidence against the assumption that all nine null hypotheses are true.
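Both calculations in items (2) and (3) above can be sketched briefly. The first part treats the number of small P values among nine independent tests as binomial, which is my reading of how the quoted probabilities were obtained; the second pools the discordant counts across players. The function name `prob_at_least` is hypothetical.

```python
import math


def prob_at_least(k, p, tests=9):
    """P(at least k of `tests` independent P values are <= p) under the null.

    Each P value falls at or below p with probability p, so the count is
    binomial(tests, p); this subtracts the lower tail from 1.
    """
    lower = sum(math.comb(tests, j) * p ** j * (1 - p) ** (tests - j) for j in range(k))
    return 1 - lower


# (k, threshold) pairs from item (2): Maxwell, Parish, Archibald.
for k, p in [(1, 0.0006), (2, 0.0080), (3, 0.0250)]:
    print(f"at least {k} P value(s) <= {p:.4f}: {prob_at_least(k, p):.4f}")

# Discordant counts (b, c) for the nine players, read from Table 5.
# Carr's b is taken as 18, which is consistent with the column total of 311.
PAIRS = [(57, 97), (49, 76), (42, 62), (37, 49), (34, 48),
         (24, 29), (15, 17), (18, 21), (35, 29)]
sum_b = sum(b for b, _ in PAIRS)
sum_c = sum(c for _, c in PAIRS)
z2 = (sum_b - sum_c) / math.sqrt(sum_b + sum_c)
print(f"sum b = {sum_b}, sum c = {sum_c}, z2 = {z2:.2f}")
```

Pooling the discordant pairs in this way gives the combined test much more data than any single player's test, which is the power argument made in item (3).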
4. SUMMARY

This article puts forth an argument to reconcile what avid basketball fans believe and what Tversky and Gilovich found. It is argued that the fans and the researchers were analyzing different sets of data. While the researcher’s data had no pattern, the fan’s data had a pattern. This pattern, however, was due to the effects of aggregation and not the hot hand phenomenon. This finding indicates that researchers should take care to consider what data are available to laypersons. In addition, this finding underscores the importance of increasing the awareness of statistical fallacies among the general public.

This article also demonstrates that several Celtics players showed a significant improvement in their shooting ability on the second free throw. Thus, while the hot hand phenomenon is not supported by these free throw data, neither is the simple model of Bernoulli trials.

[Received March 1992. Revised November 1993.]
REFERENCES

Diaconis, P., and Mosteller, F. (1989), “Methods for Studying Coincidences,” Journal of the American Statistical Association, 84, 853-861.

Kahneman, D., Slovic, P., and Tversky, A. (1983), Judgement Under Uncertainty: Heuristics and Biases, Cambridge, U.K.: Cambridge University Press.

Kotz, S., and Johnson, N. L. (eds.) (1983), Encyclopedia of Statistical Science (Vol. 3), New York: John Wiley, pp. 24-28.

Shapiro, S. H. (1982), “Collapsing a Contingency Table - A Geometric Approach,” The American Statistician, 36, 43-46.

Simpson, E. H. (1951), “The Interpretation of Interaction in Contingency Tables,” Journal of the Royal Statistical Society, Ser. B, 13, 238-241.

Tversky, A., and Gilovich, T. (1989), “The Cold Facts About the ‘Hot Hand’ in Basketball,” CHANCE: New Directions for Statistics and Computing, 2, 16-21.

Wagner, C. H. (1982), “Simpson’s Paradox in Real Life,” The American Statistician, 36, 46-47.

Discussion

If we recompute these tables for the data referenced in the previous annotations we get:

$$ \begin{array}{|c|c|c|c|} \hline & Hit & Miss & Total \\ \hline Hit & 3447 & 970 & 4417 \\ \hline Miss & 1213 & 745 & 1958 \\ \hline Total & 4660 & 1715 & 6375 \\ \hline \end{array} $$

$$ \begin{array}{|c|c|c|c|} \hline & Hit & Miss & Total \\ \hline Hit & 0.780 & 0.220 & 1.000 \\ \hline Miss & 0.620 & 0.380 & 1.000 \\ \hline \end{array} $$

#### Bernoulli trials

In probability and statistics, a *Bernoulli trial* is an experiment that has exactly two possible outcomes (e.g. flipping a coin), and the probability of each outcome is the same every time the experiment is performed.

$z$ represents the number of standard deviations ($\sqrt{var(a)}$) a given value is from the mean.

A very interesting recent article on this (with some great references and discussion) is from ESPN earlier this year: http://www.espn.com/nba/story/_/page/presents-19573519/heating-fire-klay-thompson-truth-hot-hand-nba The key paper referenced is by Miller and Sanjurjo (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2611987), which finds significant evidence of the hot hand in NBA three-point contests.

In 1989 Tversky and Gilovich published a paper entitled [“The Cold Facts About the “Hot Hand” in Basketball”](http://www.medicine.mcgill.ca/epidemiology/hanley/c323/hothand.pdf). In this paper they investigated the relatively common belief in the “Hot Hand” phenomenon. This term refers to the perceived tendency for success (and failure) in basketball to be self-promoting or self-sustaining (e.g. a player being more likely to score their next field goal if they scored the previous one). In the end, they concluded that, contrary to common belief, a player’s chances of making a field goal are largely independent of the outcome of their previous shots.
### Analyzing over 600k free throws between 2006 and 2016

I found a [public dataset on kaggle of over 600k NBA free throws](https://www.kaggle.com/drgilermo/shooting-percent-over-time/data) from games played between 2006 and 2016. [Here](https://drive.google.com/file/d/19m8GJv-qW2h9CHT584RDWtSEhqwFsINE/view) is a direct link to the data. Below is the collapsed table of the observed frequencies of pairs of free throws for all players in this dataset:

$$ \begin{array}{|c|c|c|c|} \hline & Hit & Miss & Total \\ \hline Hit & 153064 & 39633 & 192697 \\ \hline Miss & 50938 & 18769 & 69707 \\ \hline Total & 204002 & 58402 & 262404 \\ \hline \end{array} $$

As in the paper, the first column holds the outcome of the first free throw and the first row holds the outcome of the second free throw. As we can see, the pattern observed in the paper’s collapsed table holds in this new dataset: the second free throw was made 79.4% of the time after a hit, compared with 73.1% of the time after a miss.

You might be wondering why there are only 262404 pairs of free throws. The reason is that we are only looking at pairs of sequential free throws (we are ignoring, for instance, single free throws).

Below are the observed frequencies for pairs of free throws by player (these are the top 20 players sorted by frequency of Hit, Hit pairs - HH). You can find the data for all players [here](https://gist.github.com/joaobatalha/7757e5079e4b0d837f4e8787e1de4cd3).

$$ \begin{array}{|c|c|c|c|c|c|} \hline Player & HH & HM & MH & MM & Total \\ \hline LeBron\ James & 1854 & 513 & 729 & 220 & 3316 \\ \hline Kevin\ Durant & 1842 & 231 & 284 & 37 & 2394 \\ \hline Kobe\ Bryant & 1591 & 264 & 329 & 50 & 2234 \\ \hline Dirk\ Nowitzki & 1563 & 125 & 212 & 20 & 1920 \\ \hline Carmelo\ Anthony & 1496 & 281 & 365 & 68 & 2210 \\ \hline Dwyane\ Wade & 1414 & 357 & 470 & 150 & 2391 \\ \hline Russell\ Westbrook & 1363 & 257 & 321 & 68 & 2009 \\ \hline James\ Harden & 1356 & 199 & 239 & 43 & 1837 \\ \hline Chris\ Bosh & 1332 & 293 & 333 & 66 & 2024 \\ \hline Paul\ Pierce & 1283 & 196 & 316 & 54 & 1849 \\ \hline Kevin\ Martin & 1208 & 141 & 206 & 18 & 1573 \\ \hline Chris\ Paul & 1182 & 143 & 212 & 32 & 1569 \\ \hline Dwight\ Howard & 1136 & 753 & 854 & 736 & 3479 \\ \hline Pau\ Gasol & 1037 & 265 & 346 & 81 & 1729 \\ \hline Deron\ Williams & 1014 & 174 & 257 & 48 & 1493 \\ \hline LaMarcus\ Aldridge & 984 & 219 & 220 & 72 & 1495 \\ \hline DeMar\ DeRozan & 961 & 183 & 210 & 50 & 1404 \\ \hline Tim\ Duncan & 925 & 293 & 448 & 166 & 1832 \\ \hline Amare\ Stoudemire & 918 & 223 & 270 & 69 & 1480 \\ \hline Tony\ Parker & 903 & 211 & 301 & 82 & 1497 \\ \hline \end{array} $$

I again used the 2006-2016 NBA free throw dataset that I mentioned in a previous annotation to compute a similar table to *Table 3*. I looked at the players for which there were more than 200 pairs of free throws and listed the top 20 by decreasing values of $\widehat{p}_{hit} - \widehat{p}_{miss}$. In total our dataset had 359 players with more than 200 pairs of free throws. Out of those 359, 227 shot better after a hit ($\widehat{p}_{hit} - \widehat{p}_{miss} > 0$) and 132 shot better (or at the same rate) after a miss ($\widehat{p}_{hit} - \widehat{p}_{miss} \leq 0$).
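The per-player comparison described above can be sketched for a few rows of the HH/HM/MH/MM table. This is not the code used for the annotation: the three players are an arbitrary sample, and the names `hit_minus_miss` and `MIN_PAIRS` are hypothetical.

```python
# (HH, HM, MH, MM) counts copied from the per-player table above for three players.
PLAYERS = {
    "LeBron James": (1854, 513, 729, 220),
    "Kobe Bryant": (1591, 264, 329, 50),
    "Dwight Howard": (1136, 753, 854, 736),
}
MIN_PAIRS = 200  # same cutoff as in the annotation


def hit_minus_miss(hh, hm, mh, mm):
    """Difference between second-shot success after a hit and after a miss."""
    return hh / (hh + hm) - mh / (mh + mm)


differences = {
    name: hit_minus_miss(*counts)
    for name, counts in PLAYERS.items()
    if sum(counts) > MIN_PAIRS
}

# Sort by decreasing difference, as in Table 3 of the paper.
for name, diff in sorted(differences.items(), key=lambda item: item[1], reverse=True):
    label = "better after a hit" if diff > 0 else "better (or equal) after a miss"
    print(f"{name:14s} {diff:+.3f}  {label}")
```

Applying the same function to every player with more than 200 pairs would give the 227 versus 132 split reported above, assuming the underlying counts match the linked data.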
[Here](https://gist.github.com/joaobatalha/cd0b8260c3fee43d00c1d637690af676) is the data for those 359 players.

![data](https://i.imgur.com/GwdRbh1.png)

#### Hypergeometric distribution

A discrete probability distribution that describes the probability of $k$ successes in $n$ draws, *without replacement*, from a population of size $N$ that contains $K$ objects with the “success” feature. In contrast, a binomial distribution describes the same scenario, but using draws *with* replacement.

*Example:* You have a bag with red and blue balls. There are 10 balls in total (6 red and 4 blue). You are going to take 3 balls out, one by one, and put them aside. The number of red balls you draw follows a *hypergeometric distribution*.

If once again we look at our new data set (as mentioned in previous annotations) we find that out of 359 players, 328 shoot better on their second free throw ($\widehat{p}(S_{2}) > \widehat{p}(S_{1})$).

$$ \begin{array}{|c|c|c|c|c|c|} \hline Player & \widehat{p}(S_{1}) & \widehat{p}(S_{2}) & b & c & z_{1} \\ \hline Avery\ Bradley & 0.70 & 0.83 & 25 & 53 & -3.17 \\ \hline Michael\ Gilchrist & 0.62 & 0.75 & 43 & 77 & -3.10 \\ \hline Ersan\ Ilyasova & 0.71 & 0.83 & 61 & 117 & -4.20 \\ \hline Marcus\ Morris & 0.63 & 0.75 & 42 & 77 & -3.21 \\ \hline Josh\ Boone & 0.38 & 0.49 & 40 & 64 & -2.35 \\ \hline Marquis\ Daniels & 0.65 & 0.76 & 37 & 62 & -2.51 \\ \hline Eddy\ Curry & 0.55 & 0.66 & 70 & 119 & -3.56 \\ \hline Bismack\ Biyombo & 0.51 & 0.62 & 73 & 114 & -3.00 \\ \hline DeShawn\ Stevenson & 0.66 & 0.77 & 41 & 73 & -3.00 \\ \hline Joe\ Smith & 0.72 & 0.83 & 28 & 56 & -3.06 \\ \hline Rajon\ Rondo & 0.57 & 0.67 & 161 & 242 & -4.03 \\ \hline Kyle\ Korver & 0.85 & 0.95 & 13 & 45 & -4.20 \\ \hline Kelenna\ Azubuike & 0.73 & 0.83 & 28 & 49 & -2.39 \\ \hline Jared\ Dudley & 0.68 & 0.78 & 66 & 117 & -3.77 \\ \hline J.R.\ Smith & 0.68 & 0.78 & 111 & 186 & -4.35 \\ \hline James\ Posey & 0.78 & 0.88 & 20 & 43 & -2.90 \\ \hline Ricky\ Davis & 0.76 & 0.86 & 24 & 46 & -2.63 \\ \hline Shaquille\ O'Neal & 0.47 & 0.57 & 144 & 217 & -3.84 \\ \hline Kenneth\ Faried & 0.61 & 0.71 & 95 & 149 & -3.46 \\ \hline James\ Johnson & 0.65 & 0.75 & 37 & 63 & -2.60 \\ \hline \end{array} $$

### Simpson’s Paradox

**Simpson’s paradox** is a statistical paradox that essentially states that you can draw 2 (or more) different, arguably valid, conclusions from the same data depending on how you “divide” your data. In other words, you might observe a trend in several groups of data (belonging to a bigger dataset), but that same trend disappears (or reverses!) when you look at the dataset in aggregate. The following figure is a good illustration of Simpson’s paradox:

![simpsons paradox](https://upload.wikimedia.org/wikipedia/commons/thumb/4/47/Simpson%27s_paradox_continuous.svg/390px-Simpson%27s_paradox_continuous.svg.png)

In this figure you have a dataset where you can see a positive trend for two separate groups of data (red and blue) when you look at the groups of data individually, but when you combine both groups you actually get a negative trend (- - -).
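The trend reversal in the figure can be imitated with a handful of points. The data below are made up purely for illustration (they are not the points plotted in the figure), and `slope` is a hypothetical helper that computes an ordinary least-squares slope by hand.

```python
def slope(points):
    """Ordinary least-squares slope of y on x for a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


# Made-up illustrative data: each group rises with x, but the second group
# sits lower and further to the right than the first.
blue = [(1, 11), (2, 12), (3, 13), (4, 14)]
red = [(8, 2), (9, 3), (10, 4), (11, 5)]

print(f"blue slope:   {slope(blue):+.2f}")
print(f"red slope:    {slope(red):+.2f}")
print(f"pooled slope: {slope(blue + red):+.2f}")
```

Both groups have a positive slope on their own, while the pooled slope is negative because the difference between the groups dominates the trend within them.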