Last night I attended a joint meetup between the Austin R User Group and R Ladies Austin, which was great fun. The evening featured several lightning talks on a range of topics, from breaking into data science to network visualization to starting your own blog. I gave a talk about sandwich standard errors and my clubSandwich R package. Here are links to some of the talks:
Caitlin Hudon: Getting Plugged into Data Science Claire McWhite: A quick intro to networks Nathaniel Woodward: Blogdown Demo!
A colleague recently asked me about how to apply cluster-robust hypothesis tests and confidence intervals, as calculated with the clubSandwich package, when dealing with multiply imputed datasets. Standard methods (i.e., Rubin’s rules) for pooling estimates from multiple imputed datasets are developed under the assumption that the full-data estimates are approximately normally distributed. However, this might not be reasonable when working with test statistics based on cluster-robust variance estimators, which can be imprecise when the number of clusters is small or the design matrix of predictors is unbalanced in certain ways.
In many systematic reviews, it is common for eligible studies to contribute effect size estimates from not just one, but multiple relevant outcome measures, for a common sample of participants. If those outcomes are correlated, then so too will be the effect size estimates. To estimate the degree of correlation, you would need the sample correlation among the outcomes—information that is woefully uncommon for primary studies to report (and best of luck to you if you try to follow up with author queries).
cluster-robust variance estimation.
I’ve recently been working with my colleague Beth Tipton on methods for cluster-robust variance estimation in the context of some common econometric models, focusing in particular on fixed effects models for panel data—or what statisticians would call “longitudinal data” or “repeated measures.” We have a new working paper, which you can find here.
The importance of using CRVE (i.e., “clustered standard errors”) in panel models is now widely recognized. Less widely recognized, perhaps, is the fact that standard methods for constructing hypothesis tests and confidence intervals based on CRVE can perform quite poorly in when you have only a limited number of independent clusters.
I’ve recently been working on small-sample correction methods for hypothesis tests in linear regression models with cluster-robust variance estimation. My colleague (and grad-schoolmate) Beth Tipton has developed small-sample adjustments for t-tests (of single regression coefficients) in the context of meta-regression models with robust variance estimation, and together we have developed methods for multiple-contrast hypothesis tests. We have an R package (called clubSandwich) that implements all this stuff, not only for meta-regression models but also for other models and contexts where cluster-robust variance estimation is often used.
In an earlier post about sandwich standard errors for multi-variate meta-analysis, I mentioned that Beth Tipton has recently proposed small-sample corrections for the covariance estimators and t-tests, based on the bias-reduced linearization approach of McCaffrey, Bell, and Botts (2001). You can find her forthcoming paper on the adjustments here. My understanding is that these small-sample corrections are important because the uncorrected sandwich estimators can lead to under-statement of uncertainty and inflated type I error rates when a given meta-regression coefficient is estimated from only a small or moderately sized sample of independent studies (or clusters of studies).
In a previous post, I provided some code to do robust variance estimation with metafor and sandwich. Here’s another example, replicating some more of the calculations from Tanner-Smith & Tipton (2013). (See here for the complete code.)
As a starting point, here are the results produced by the robumeta package:
library(grid) library(robumeta) data(corrdat) rho <- 0.8 HTJ <- robu(effectsize ~ males + college + binge, data = corrdat, modelweights = "CORR", rho = rho, studynum = studyid, var.
A common problem arising in many areas of meta-analysis is how to synthesize a set of effect sizes when the set includes multiple effect size estimates from the same study. It’s often not possible to obtain all of the information you’d need in order to estimate the sampling covariances between those effect sizes, yet without that information, established approaches to modeling dependent effect sizes become inaccurate. Hedges, Tipton, & Johnson (2010, HTJ hereafter) proposed the use of cluster-robust standard errors for multi-variate meta-analysis.