my third recommendation is that, as re-
searchers, we routinely report effect sizes in the form of
confidence limits. "Everyone knows" that confidence intervals
contain all the information to be found in significance tests
and much more. They reveal not only the status of the trivial
nil hypothesis but also the status of non-nil null hypotheses
and thus help remind researchers about the possible operation
of the crud factor.
Yet they are rarely to be found in the literature. I suspect
that the main reason they are not reported is that they
are so embarrassingly large! But their sheer size should
move us toward improving our measurement by seeking
to reduce the unreliable and invalid part of the variance
in our measures (as Student himself recommended almost
a century ago). Also, their width provides us with the
analogue of power analysis in significance testing—larger
sample sizes reduce the size of confidence intervals as
they increase the statistical power of NHST. A new pro-
gram covers confidence intervals for mean differences,
correlation, cross-tabulations (including odds ratios and
relative risks), and survival analysis (Borenstein, Cohen,
& Rothstein, in press). It also produces Birnbaum's (1961)
"confidence curves," from which can be read all confi-
dence intervals from 50% to 100%, thus obviating the
necessity of choosing a specific confidence level for presentation.
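The relation between sample size and interval width, and the idea of reading off intervals at several confidence levels, can be illustrated with a minimal sketch. It is not the program cited above. It assumes two independent groups of equal size, a common and known standard deviation, and a normal approximation; the function names `mean_diff_ci` and `confidence_curve` and the numbers used are hypothetical. Because a 100% interval is unbounded under this approximation, the illustrative levels stop at 99%.

```python
from math import sqrt
from statistics import NormalDist


def mean_diff_ci(diff, sd, n_per_group, level=0.95):
    """Normal-approximation CI for a difference between two independent means.

    Assumes equal group sizes and a common, known standard deviation `sd`.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    se = sd * sqrt(2 / n_per_group)            # standard error of the difference
    return diff - z * se, diff + z * se


def confidence_curve(diff, sd, n_per_group, levels=(0.50, 0.80, 0.95, 0.99)):
    """Intervals at several confidence levels, keyed by level."""
    return {level: mean_diff_ci(diff, sd, n_per_group, level) for level in levels}


if __name__ == "__main__":
    # Quadrupling the sample size halves the width of the interval.
    for n in (25, 100, 400):
        lo, hi = mean_diff_ci(diff=4.0, sd=16.0, n_per_group=n)
        print(f"n = {n:3d} per group: 95% CI [{lo:.2f}, {hi:.2f}], width {hi - lo:.2f}")

    # Intervals at higher confidence levels enclose those at lower levels.
    for level, (lo, hi) in confidence_curve(4.0, 16.0, 100).items():
        print(f"{level:.0%} CI: [{lo:.2f}, {hi:.2f}]")
```

Under these assumptions the width is proportional to 1 / sqrt(n), so a fourfold increase in sample size is needed to halve an interval.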
As researchers, we have a considerable array of sta-
tistical techniques that can help us find our way to theories
of some depth, but they must be used sensibly and be
heavily informed by informed judgment. Even null hy-
pothesis testing complete with power analysis can be use-
ful if we abandon the rejection of point nil hypotheses
and use instead "good-enough" range null hypotheses
(e.g., "the effect size is no larger than 8 raw score units,
or d = .5"), as Serlin and Lapsley (1993) have described
in detail. As our measurement and theories improve, we
can begin to achieve the Popperian principle of repre-
senting our theories as null hypotheses and subjecting
them to challenge, as Meehl (1967) argued many years ago.
With more evolved psychological theories, we can
also find use for likelihood ratios and Bayesian methods
(Goodman, 1993; Greenwald, 1975). We quantitative behavioral
scientists need not go out of business.
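A "good-enough" range null hypothesis of the kind mentioned above can be sketched as follows. This is a simplified normal-approximation illustration, not Serlin and Lapsley's procedure itself. It assumes two independent groups of equal size and a common, known standard deviation of 16 raw score units, so that a bound of 8 units corresponds to d = .5. The function name `range_null_test` and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def range_null_test(diff, sd, n_per_group, bound, alpha=0.05):
    """One-sided z test of H0: true difference <= bound against H1: difference > bound.

    Assumes equal group sizes and a common, known standard deviation `sd`.
    """
    se = sd * sqrt(2 / n_per_group)        # standard error of the difference
    z = (diff - bound) / se                # distance of the estimate from the bound
    p = 1 - NormalDist().cdf(z)            # upper-tail p value
    # One-sided lower confidence limit; it exceeds `bound` exactly when p < alpha.
    lower_limit = diff - NormalDist().inv_cdf(1 - alpha) * se
    return {"z": z, "p": p, "lower_limit": lower_limit, "reject": p < alpha}


if __name__ == "__main__":
    # With sd = 16, a bound of 8 raw score units corresponds to d = .5.
    for observed in (10.0, 12.0):
        result = range_null_test(observed, sd=16.0, n_per_group=100, bound=8.0)
        print(observed, result)
```

The test and the interval give the same answer: the range null hypothesis is rejected only when the whole one-sided interval lies above the bound, which ties this approach back to the reporting of confidence limits.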
Induction has long been a problem in the philosophy
of science. Meehl (1990a) attributed to the distinguished
philosopher Morris Raphael Cohen the saying "All logic
texts are divided into two parts. In the first part, on de-
ductive logic, the fallacies are explained; in the second
part, on inductive logic, they are committed" (p. 110).
We appeal to inductive logic to move from the particular
results in hand to a theoretically useful generalization.
As I have noted, we have a body of statistical techniques
that, used intelligently, can facilitate our efforts. But given
the problems of statistical induction, we must finally rely,
as have the older sciences, on replication.
