A PsyArXiv preprint with the admittedly slightly provocative title “Why most experiments in psychology failed: sample sizes required for randomization to generate equivalent groups as a partial solution to the replication crisis” sparked a modest debate about randomization on Facebook (see here; you need to be in the PsychMAP group to access the link, though) and Twitter (see here, here, and here).
John Myles White was nice enough to produce a blog post with an example of why Covariate-Based Diagnostics for Randomized Experiments are Often Misleading (check out his blog; he has other nice entries, e.g. about why you should always report confidence intervals over point estimates).
I completely agree with the example he provides (except that where he says ‘large, finite population of N people’ I assume he means ‘large, finite sample of N people drawn from an infinite population’). This is what puzzled me about the whole discussion: I agreed with (almost) all of the arguments provided, but only a minority of them seemed to concern the paper. So either I’m still missing something, or, as Matt Moehr ventured, we’re talking about different things.
So, hoping to get to the bottom of this, I’ll also provide an example. It probably won’t be as fancy as John’s example, but I have to work with what I have 🙂