As I’ve discussed in previous posts, meta-analyses in psychology, education, and other areas often include studies that contribute multiple, statistically dependent effect size estimates. I’m interested in methods for meta-analyzing and meta-regressing effect sizes from data structures like this, and studying this sort of thing often entails conducting Monte Carlo simulations. Monte Carlo simulations involve generating artificial data—in this case, a set of studies, each of which has one or more dependent effect size estimates—that follows a certain distributional model, applying different analytic methods to the artificial data, and then repeating the process a bunch of times.
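The generate/analyze/repeat loop described above can be sketched in base R. This is only a minimal illustration under assumptions that are not from the post: the function names (`simulate_meta`, `estimate_mu`), the parameter values, and the two toy estimators are all hypothetical, and the dependence structure is a simple shared-error model rather than any particular published design.

```r
set.seed(20240101)

# Generate one artificial meta-analysis: J studies, each contributing between
# 1 and max_k effect size estimates. Estimates within a study share a
# study-level true effect and have equi-correlated sampling errors.
# (All parameter values are illustrative assumptions.)
simulate_meta <- function(J = 20, max_k = 5, mu = 0.3, tau = 0.2,
                          v = 0.05, rho = 0.6) {
  k <- sample.int(max_k, size = J, replace = TRUE)  # effects per study
  study <- rep(seq_len(J), times = k)               # study ID for each effect
  theta <- rnorm(J, mean = mu, sd = tau)            # study-level true effects
  shared <- rnorm(J)                                # error common to a study
  unique_err <- rnorm(sum(k))                       # effect-specific error
  err <- sqrt(v) * (sqrt(rho) * shared[study] + sqrt(1 - rho) * unique_err)
  data.frame(study = study, es = theta[study] + err)
}

# Two toy analytic methods for estimating the overall average effect:
# pool every estimate, or average within studies first.
estimate_mu <- function(dat) {
  c(
    pooled = mean(dat$es),
    study_avg = mean(tapply(dat$es, dat$study, mean))
  )
}

# Repeat the generate-then-analyze step many times.
reps <- 500
results <- replicate(reps, estimate_mu(simulate_meta()))

# Summarize performance of each method across replications.
performance <- data.frame(
  bias = rowMeans(results) - 0.3,  # 0.3 is the true mu used above
  sd = apply(results, 1, sd)
)
performance
```

Keeping data generation and estimation in separate functions makes it straightforward to swap in other distributional models or other estimators without touching the replication loop.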
Rmarkdown documents now have a very nifty code folding option, which allows the reader of a compiled HTML document to toggle whether to view or hide code chunks. However, the feature is not supported in blogdown, the popular Rmarkdown-based website/blog creation package. I recently ran across an implementation of code folding for blogdown, developed by Sébastien Rochette. I have been putzing around, trying to get it to work with my blog, which uses the Hugo Academic theme, but alas, to no avail.
At AERA this past weekend, one of the recurring themes was how software availability (and its usability and default features) influences how people conduct meta-analyses. That got me thinking about the R packages that I’ve developed, how to understand the extent to which people are using them, how they’re being used, and so on. I’ve had badges on my GitHub repos for a while now:
clubSandwich, ARPobservation, scdhlm, SingleCaseES (download badges)
These statistics come from the METACRAN site, which makes available data on daily downloads of all packages on CRAN (one of the main repositories for sharing R packages).
About one year ago, the nlme package introduced a feature that allowed the user to specify a fixed value for the residual variance in linear mixed effects models fitted with lme(). This feature is interesting to me because, when used with the varFixed() specification for the residual weights, it allows for estimation of a wide variety of meta-analysis models, including basic random effects models, bivariate models for estimating effects by trial arm, and other sorts of multivariate/multi-level random effects models.
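As a rough sketch of the idea, a basic random effects meta-analysis might be fit as follows. The data here are simulated purely for illustration (the variable names `yi`, `vi`, and `study` and all parameter values are assumptions), the residual standard deviation is fixed through the `sigma` argument of `lmeControl()`, and `method = "ML"` is an arbitrary choice for this example.

```r
library(nlme)

set.seed(42)

# Simulated (hypothetical) meta-analytic data: one effect size per study,
# with known sampling variance vi and between-study variance 0.05.
k <- 30
dat <- data.frame(
  study = factor(seq_len(k)),
  vi = runif(k, min = 0.01, max = 0.1)
)
dat$yi <- rnorm(k, mean = 0.4, sd = sqrt(0.05 + dat$vi))

# Random effects meta-analysis via lme():
# - varFixed(~ vi) makes the residual variance proportional to vi
# - lmeControl(sigma = 1) fixes the residual SD multiplier at 1, so the
#   residual variance for each estimate is exactly vi
fit <- lme(
  yi ~ 1,
  random = ~ 1 | study,
  weights = varFixed(~ vi),
  control = lmeControl(sigma = 1),
  method = "ML",
  data = dat
)

fixef(fit)    # estimated average effect
VarCorr(fit)  # between-study variance; residual SD is held at 1
```

With the residual scale pinned down, the only variance component left to estimate is the between-study heterogeneity, which is what makes the model a meta-analysis rather than an ordinary weighted mixed model.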
In today’s Quant Methods colloquium, I gave an introduction to the logic and purposes of Monte Carlo simulation studies, with examples written in R.
Here are the slides from my presentation. You can find the code that generates the slides here. Here is my presentation on the same topic from a couple of years ago. David Robinson’s blog has a much more in-depth discussion of beta-binomial regression. The data I used is from Lahman’s baseball database.
I have recently been working to ensure that my clubSandwich package works correctly on fitted lme and gls models from the nlme package, which is one of the main R packages for fitting hierarchical linear models. In the course of digging around in the guts of nlme, I noticed a bug in the getVarCov function. The purpose of the function is to extract the estimated variance-covariance matrix of the errors from a fitted lme or gls model.
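For readers who have not used it, here is a small illustration of what getVarCov() returns, using the Orthodont data that ships with nlme. The model is an arbitrary example chosen for this sketch; it does not reproduce the bug mentioned above.

```r
library(nlme)

# A simple random-intercept growth model (illustrative choice of model).
fit <- lme(distance ~ age, random = ~ 1 | Subject, data = Orthodont)

# Marginal variance-covariance matrix of the four observations for one subject:
# the random-intercept variance plus the residual variance on the diagonal.
V <- getVarCov(fit, individuals = "M01", type = "marginal")
V_M01 <- V[[1]]
V_M01
```

Because each subject in Orthodont has four measurements, the extracted matrix is 4 x 4.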
Hadley Wickham’s dplyr and tidyr packages completely changed the way I do data manipulation/munging in R. These packages make it possible to write shorter, faster, more legible, easier-to-interpret code to accomplish the sorts of manipulations that you have to do with practically any real-world data analysis. The legibility and interpretability benefits come from using functions that are simple verbs that do exactly what they say (e.g., filter, summarize, group_by) and from chaining multiple operations together through the pipe operator %>% from the magrittr package.
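A small example of the verb-plus-pipe style, using the built-in mtcars data (the particular filter and summary are arbitrary choices for illustration):

```r
library(dplyr)  # dplyr re-exports the %>% pipe from magrittr

mpg_by_gear <- mtcars %>%
  filter(cyl > 4) %>%                          # keep 6- and 8-cylinder cars
  group_by(gear) %>%                           # one group per number of gears
  summarize(mean_mpg = mean(mpg), n = n())     # average mpg and count per group

mpg_by_gear
```

Read top to bottom, each line names the operation it performs, which is much of what makes the chained form easier to follow than the equivalent nested function calls.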