# Distribution theory

## Simulating correlated standardized mean differences for meta-analysis

As I’ve discussed in previous posts, meta-analyses in psychology, education, and other areas often include studies that contribute multiple, statistically dependent effect size estimates. I’m interested in methods for meta-analyzing and meta-regressing effect sizes from data structures like this, and studying this sort of thing often entails conducting Monte Carlo simulations. Monte Carlo simulations involve generating artificial data—in this case, a set of studies, each of which has one or more dependent effect size estimates—that follows a certain distributional model, applying different analytic methods to the artificial data, and then repeating the process a bunch of times.

## Sampling variance of Pearson r in a two-level design

Consider Pearson’s correlation coefficient, (r), calculated from two variables (X) and (Y) with population correlation (\rho). If one calculates (r) from a simple random sample of (N) observations, then its sampling variance will be approximately [ \text{Var}® \approx \frac{1}{N}\left(1 - \rho^2\right)^2. ] But what if the observations are drawn from a multi-stage sample? If one uses the raw correlation between the observations (ignoring the multi-level structure), then the (r) will actually be a weighted average of within-cluster and between-cluster correlations (see Snijders & Bosker, 2012).

## The multivariate delta method

The delta method is surely one of the most useful techniques in classical statistical theory. It’s perhaps a bit odd to put it this way, but I would say that the delta method is something like the precursor to the bootstrap, in terms of its utility and broad range of applications—both are “first-line” tools for solving statistical problems. There are many good references on the delta-method, ranging from the Wikipedia page to a short introduction in The American Statistician (Oehlert, 1992).

## Alternative formulas for the standardized mean difference

The standardized mean difference (SMD) is surely one of the best known and most widely used effect size metrics used in meta-analysis. In generic terms, the SMD parameter is defined as the difference in population means between two groups (often this difference represents the effect of some intervention), scaled by the population standard deviation of the outcome metric. Estimates of the SMD can be obtained from a wide variety of experimental designs, ranging from simple, completely randomized designs, to repeated measures designs, to cluster-randomized trials.

## The sampling distribution of sample variances

A colleague and her students asked me the other day whether I knew of a citation that gives the covariance between the sample variances of two outcomes from a common sample. This sort of question comes up in meta-analysis problems occasionally. I didn’t know of a convenient reference that directly answers the question, but I was able to suggest some references that would help (listed below). While the students work on deriving it, I’ll provide the answer here so that they can check their work.