# Analysis of Biological Data

## Third Edition

Michael C. Whitlock; Dolph Schluter

©2020

## Table of Contents

1.0 Statistics and samples

1.1 What is statistics?

1.2 Sampling populations

1.3 Types of data and variables

1.4 Frequency distributions and probability distributions

1.5 Types of studies

1.6 Summary

Interleaf 1 Correlation does not require causation

2.0 Displaying data

2.1 Guidelines for effective graphs

2.2 Showing data for one variable

2.3 Showing association between two variables and differences between groups

2.4 Showing trends in time and space

2.5 How to make good tables

2.6 How to make data files

2.7 Summary

3.0 Describing data

3.1 Arithmetic mean and standard deviation

3.2 Median and interquartile range

3.3 How measures of location and spread compare

3.4 Cumulative frequency distribution

3.5 Proportions

3.6 Summary

3.7 Quick Formula Summary

4.0 Estimating with uncertainty

4.1 The sampling distribution of an estimate

4.2 Measuring the uncertainty of an estimate

4.3 Confidence intervals

4.4 Error bars

4.5 Summary

4.6 Quick Formula Summary

Interleaf 2 Pseudoreplication

5.0 Probability

5.1 The probability of an event

5.2 Venn diagrams

5.3 Mutually exclusive events

5.4 Probability distributions

5.5 Either this or that: adding probabilities

5.6 Independence and the multiplication rule

5.7 Probability trees

5.8 Dependent events

5.9 Conditional probability and Bayes' theorem

5.10 Summary

6.0 Hypothesis testing

6.1 Making and using hypotheses

6.2 Hypothesis testing: an example

6.3 Errors in hypothesis testing

6.4 When the null hypothesis is not rejected

6.5 One-sided tests

6.6 Hypothesis testing versus confidence intervals

6.7 Summary

Interleaf 3 Why statistical significance is not the same as biological importance

PART 2 PROPORTIONS AND FREQUENCIES

7.0 Analyzing proportions

7.1 The binomial distribution

7.2 Testing a proportion: the binomial test

7.3 Estimating proportions

7.4 Deriving the binomial distribution

7.5 Summary

7.6 Quick Formula Summary

Interleaf 4 Biology and the history of statistics

8.0 Fitting probability models to frequency data

8.1 χ² goodness-of-fit test: the proportional model

8.2 Assumptions of the χ² goodness-of-fit test

8.3 Goodness-of-fit tests when there are only two categories

8.4 Random in space or time: the Poisson distribution

8.5 Summary

8.6 Quick Formula Summary

Interleaf 5 Making a plan

9.0 Contingency analysis: Associations between categorical variables

9.1 Associating two categorical variables

9.2 Estimating association in 2 × 2 tables: relative risk

9.3 Estimating association in 2 × 2 tables: the odds ratio

9.4 The χ² contingency test

9.5 Fisher's exact test

9.6 Summary

9.7 Quick Formula Summary

RP1 Review Problems 1

PART 3 COMPARING NUMERICAL VALUES

10.0 The normal distribution

10.1 Bell-shaped curves and the normal distribution

10.2 The formula for the normal distribution

10.3 Properties of the normal distribution

10.4 The standard normal distribution and statistical tables

10.5 The normal distribution of sample means

10.6 Central limit theorem

10.7 Normal approximation to the binomial distribution

10.8 Summary

10.9 Quick Formula Summary

Interleaf 6 Controls in medical studies

11.0 Inference for a normal population

11.1 The t-distribution for sample means

11.2 The confidence interval for the mean of a sample distribution

11.3 The one-sample t-test

11.4 Assumptions of the one-sample t-test

11.5 Estimating the standard deviation and variance of a normal population

11.6 Summary

11.7 Quick Formula Summary

12.0 Comparing two means

12.1 Paired sample versus two independent samples

12.2 Paired comparison of means

12.3 Two-sample comparison of means

12.4 Using the correct sampling units

12.5 The fallacy of indirect comparison

12.6 Interpreting overlap of confidence intervals

12.7 Comparing variances

12.8 Summary

12.9 Quick Formula Summary

Interleaf 7 Which test should I use?

13.0 Handling violations of assumptions

13.1 Detecting deviations from normality

13.2 When to ignore violations of assumptions

13.3 Data transformations

13.4 Nonparametric alternatives to one-sample and paired t-tests

13.5 Comparing two groups: the Mann-Whitney U-test

13.6 Assumptions of nonparametric tests

13.7 Type I and Type II error rates of nonparametric methods

13.8 Permutation tests

13.9 Summary

13.10 Quick Formula Summary

RP2 Review Problems 2

14.0 Designing experiments

14.1 Lessons from clinical trials

14.2 How to reduce bias

14.3 How to reduce the influence of sampling error

14.4 Experiments with more than one factor

14.5 What if you can't do experiments?

14.6 Choosing a sample size

14.7 Summary

14.8 Quick Formula Summary

Interleaf 8 Data dredging

15.0 Comparing means of more than two groups

15.1 The analysis of variance

15.2 Assumptions and alternatives

15.3 Planned comparisons

15.4 Unplanned comparisons

15.5 Fixed and random effects

15.6 ANOVA with randomly chosen groups

15.7 Summary

15.8 Quick Formula Summary

Interleaf 9 Experimental and statistical mistakes

PART 4 REGRESSION AND CORRELATION

16.0 Correlation between numerical variables

16.1 Estimating a linear correlation coefficient

16.2 Testing the null hypothesis of zero correlation

16.3 Assumptions

16.4 The correlation coefficient depends on the range

16.5 Spearman's rank correlation

16.6 The effects of measurement error on correlation

16.7 Summary

16.8 Quick Formula Summary

Interleaf 10 Publication bias

17.0 Regression

17.1 Linear regression

17.2 Confidence in predictions

17.3 Testing hypotheses about a slope

17.4 Regression toward the mean

17.5 Assumptions of regression

17.6 Transformations

17.7 The effects of measurement error on regression

17.8 Regression with nonlinear relationships

17.9 Logistic regression: fitting a binary response variable

17.10 Summary

17.11 Quick Formula Summary

Interleaf 11 Meta-analysis

RP3 Review Problems 3

PART 5 MODERN STATISTICAL METHODS

18.0 Multiple explanatory variables

18.1 ANOVA and linear regression are linear models

18.2 Analyzing experiments with blocking

18.3 Analyzing factorial designs

18.4 Adjusting for the effects of a covariate

18.5 Assumptions of general linear models

18.6 Summary

Interleaf 12 Using species as data points

19.0 Computer-intensive methods

19.1 Hypothesis testing using simulation

19.2 Bootstrap standard errors and confidence intervals

19.3 Summary

20.0 Likelihood

20.1 What is the likelihood?

20.2 Two uses of likelihood in biology

20.3 Maximum likelihood estimation

20.4 Versatility of maximum likelihood estimation

20.5 Log-likelihood ratio test

20.6 Summary

20.7 Quick Formula Summary

21.0 Survivorship analysis

21.1 Survival curves

21.2 Comparing two survival curves

21.3 Summary

21.4 Quick Formula Summary

BACK MATTER

Statistical tables

Literature cited

Answers to practice problems

Index