[Book Review] Hypothesis Testing, Jim Frost

An excellent beginner's handbook on the basic mechanics of hypothesis testing.

by Mojan Benham

A year ago

Two-line summary

The truest form of 'back to basics', this book uses conversational explanations (devoid of the underlying math) to deconstruct hypothesis testing into its most atomic components. Its accessible, casual tone and broad-strokes approach suit readers looking to gain an intuitive understanding of experiment fundamentals.

My rating: 4/5 stars

Review

Topics covered

Sections are written in order of increasing complexity, starting with an overview of the basics and followed by a chapter dedicated to each individual concept in greater detail. Topics include:

Test statistics and their sampling distributions
P-values
Confidence intervals
Type I and Type II errors and statistical power
Central limit theorem

The focus then turns to more advanced topics in lesser depth, serving as an introduction for potential future reading:

ANOVA, Poisson, proportions and chi-square tests
Discussions on ordinal, binary, categorical and count data
Pearson's correlation

Pros and cons

All in all, this book has my stamp of approval for being fantastic elementary content. The sequence in which ideas are presented is masterfully chosen and it has the perfect balance of breadth and depth.

The author puts great care into ensuring that each idea is built on a strong foundation of intuition. He is clear, concise and skilled at distilling complex mathematical theorems in a way that frames them as common sense. He even uses mnemonics throughout the book to help the reader remember statistical references. ("If the p-value is low, the null must go!")

Further, Frost consistently maintains the standard of no presumed knowledge. Rather than captioning graphs, he walks through them as an instructor would, emphasizing their notable points. If a parameter is the divisor in a formula, he explains that increasing the parameter decreases the overall value because it appears in the denominator. It may seem like overkill, but it takes a great deal of diligence to avoid treating any of the math as self-evident.

The only con worth pointing out is that this book is self-published, which is often glaringly evident in the lack of refinement in its editing. The content, while easy to understand, is presented with circular, repetitive language that hinders clarity. Take this example from Chapter 7 on the central limit theorem, where a single idea is stated multiple times, each version a distinction without a difference:

Paragraph 1: "As the sample size increases, the sampling distribution converges on a normal distribution where the mean equals the population mean..."

Paragraph 2: "As the sample size (n) increases, the standard deviation of the sampling distribution becomes smaller because the square root of the sample size is in the denominator."

Paragraph 3: "As sample size increases, the sampling distribution more closely approximates the normal distribution..."

This distracts the reader, who must now reread the section to make sense of the subtle differences between the three statements.
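
To see how the three statements fit together, here is a minimal Python sketch. It is not taken from the book. It assumes an Exponential(1) population, chosen only because it is skewed and has a mean and standard deviation of 1; the function name `sampling_distribution`, the repetition count and the seed are all illustrative choices. It draws many samples of size n, records each sample mean, and compares the spread of those means with sigma divided by the square root of n.

```python
import random
import statistics


def sampling_distribution(n, reps=4000, seed=0):
    """Return `reps` sample means, each computed from n draws of an
    Exponential(1) population (assumed population: mean 1, SD 1)."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]


for n in (4, 16, 64):
    means = sampling_distribution(n)
    observed_sd = statistics.stdev(means)
    expected_sd = 1.0 / n ** 0.5  # sigma / sqrt(n), with sigma = 1
    print(
        f"n={n:3d}  mean of sample means={statistics.fmean(means):.3f}  "
        f"SD of sample means={observed_sd:.3f}  sigma/sqrt(n)={expected_sd:.3f}"
    )
```

Under these assumptions the mean of the sample means should stay close to the population mean of 1, while their standard deviation should shrink roughly in line with sigma/sqrt(n) as n grows.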

A few other areas also suffer from the editing, though they matter less: the table of contents is not formatted, the tone oscillates between academic and casual, mathematical symbols are printed improperly, etc.

Recommended audience

This book is well suited for beginners who are seeking a fundamental understanding of the mechanics of hypothesis tests. By design, it does not cover in depth how to design a robust experiment (defining a randomization unit, preventing SUTVA violations, assessing for survivorship bias, etc.), how to build an online platform framework (sample ratio mismatch, A/A tests, etc.) or other finer-detail material. In other words, it will teach you how to properly read and assess the results of an experiment without making you an expert on how to plan and execute one.

As such, Hypothesis Testing is appropriate for entry-level data scientists, statisticians and researchers with minimal prior exposure to this subject matter. It would also serve well as an accessible guide for product managers. A decent litmus test: are you able to explain how to construct a probability distribution from a set of trials and explain where a p-value would fall on that graph? If so, your understanding exceeds the scope of this text.
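
For readers unsure how they would answer that litmus test, the following is a minimal Python sketch of one possible answer. It is not from the book. The experiment is hypothetical (20 coin flips, 15 heads, a null hypothesis of a fair coin), and the function names `binomial_pmf` and `one_sided_p_value` are illustrative. It builds the probability distribution of the number of heads under the null and marks the region whose total area is the one-sided p-value.

```python
from math import comb


def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def one_sided_p_value(observed, n, p=0.5):
    """P(X >= observed) under the null: the upper-tail area of the distribution."""
    return sum(binomial_pmf(k, n, p) for k in range(observed, n + 1))


n_flips, observed_heads = 20, 15  # hypothetical experiment, not from the book

# The null distribution as a text histogram: one row per possible number of heads.
for k in range(n_flips + 1):
    bar = "#" * round(binomial_pmf(k, n_flips) * 200)
    marker = "  <- observed or more extreme" if k >= observed_heads else ""
    print(f"{k:2d} {bar}{marker}")

print(f"one-sided p-value = {one_sided_p_value(observed_heads, n_flips):.4f}")
```

In this example the marked rows sit in the upper tail of the distribution, and their combined probability is about 0.021.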

For more complex, comprehensive coverage, I would recommend the following (in order of increasing difficulty):

Trustworthy Online Controlled Experiments by Ron Kohavi et al.
Design and Analysis of Experiments by Douglas C. Montgomery
Statistics for Experimenters: Design, Innovation, and Discovery by George E. P. Box, J. Stuart Hunter and William G. Hunter

Reader prerequisites

The beauty of this book is that it is intended for readers with even a passing familiarity with basic statistics. It's ideal if you are someone who vaguely remembers their university Stats101 course but not enough to explain the concepts in great detail. For example, you know the context in which p-values are used, and perhaps that smaller means better, but not what they represent or how they are derived.

If you have no prior post-secondary knowledge of statistics, you would likely fare well with the occasional Google search to brush up on terminology, but to maximize the value of the topics covered, I would recommend starting with a statistical primer. The author of this book does have a prequel called Introduction to Statistics (which I have not read and thus cannot vouch for). Alternatively, I would recommend the first four chapters of Probability and Statistics for Engineering and the Sciences by Jay L. Devore.

Most notable excerpts

"To understand why we don't accept the null, consider that you can't prove a negative. A lack of evidence isn't proof that something doesn't exist." (Chapter 2: T-Test Uses, Assumptions and Analyses)
"Extraordinary claims require extraordinary evidence - consider the plausibility of the alternative hypothesis in conjunction with the p-value." (Chapter 4: Interpreting P-values)
"The significance level is an evidentiary standard that you set to determine whether your sample data are strong enough to reject the null hypothesis. [...] You set this value based on your willingness to risk a false positive." (Chapter 5: Types of Errors and Statistical Power)
"Non-significant findings are systematically on the low end of the distribution. By filtering results on statistical significance, you exclude these smaller effects when calculating the mean, which biases the mean upwards." (Chapter 5: Types of Errors and Statistical Power)
"As sample size increases, the sampling distribution more closely approximates the normal distribution, and the spread of that distribution tightens." (Chapter 7: Sample Size Considerations)
"Normal distributions can adequately approximate the binomial distribution when the event probability is near 0.5 or when the number of trials is large." (Chapter 11: Binary Data and Testing Proportions)

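The Chapter 5 excerpt on filtering by statistical significance can be illustrated with a small simulation. The sketch below is not from the book and rests on several assumptions made only for illustration: a true effect of 0.2, a known standard deviation of 1, 25 observations per study, and a two-sided z-test at the 5% level. The function name `simulate_estimates` is hypothetical.

```python
import random
import statistics


def simulate_estimates(true_effect=0.2, n=25, studies=5000, seed=1):
    """Simulate many studies of the same effect and return
    (all estimates, estimates that reached statistical significance)."""
    rng = random.Random(seed)
    standard_error = 1.0 / n ** 0.5  # assumes a known population SD of 1
    z_critical = statistics.NormalDist().inv_cdf(0.975)  # two-sided, alpha = 0.05
    all_estimates, significant = [], []
    for _ in range(studies):
        estimate = statistics.fmean(rng.gauss(true_effect, 1.0) for _ in range(n))
        all_estimates.append(estimate)
        if abs(estimate / standard_error) > z_critical:
            significant.append(estimate)
    return all_estimates, significant


all_estimates, significant = simulate_estimates()
print(f"studies run:                  {len(all_estimates)}")
print(f"studies reaching significance: {len(significant)}")
print(f"mean of all estimates:         {statistics.fmean(all_estimates):.3f}")
print(f"mean of significant estimates: {statistics.fmean(significant):.3f}")
```

Under these assumptions the mean of all estimates should land near the true effect of 0.2, while the mean of only the significant estimates should be noticeably larger, which is the upward bias the excerpt describes.
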
As always, would love to hear your thoughts and feedback in the comments below.
