The subtitle of Jim Manzi’s “Uncontrolled” is “The Surprising Payoff of Trial-and-Error for Business, Politics, and Society”, but multiple passages of the book actually consist of cautions about how small such payoffs can be. The sociologist Peter Rossi formulated the “Iron Law of Evaluation”: the expected value of any net impact assessment of any large-scale social program is zero. Manzi’s background is in consulting for business rather than social policy, but the same logic applies: there is an abundance of ideas undertaken because they sounded good, which an evaluation would show to have little effect. Manzi phrases things differently: he says questions of human behavior are plagued by high “causal density”, in contrast to the simplicity of questions in physics, which can be controlled in a lab. Mencius Moldbug would claim this is why one must rely on “wisdom” rather than the “cargo cult science” found in academia, but I find Manzi more persuasive. Reality is one, and our methods of obtaining knowledge can work in other fields, even if it is more difficult (as Manzi phrases it: “The experimental revolution is like a huge wave that has lost power as it moved uphill through topics of increasing complexity and holism”). This book isn’t an in-depth introduction to epistemology & the philosophy of science, but it does provide enough of an intro for a layman to understand that such issues exist.

I’ve read this book years after it was published, with the replication crisis having hit social science. I’ve been reading Andrew Gelman on that crisis, and he cautions people about the limits of randomized controlled trials. In brief, academics who want to rack up publications can run multiple analyses on the same data in order to attain “statistical significance”, and the practice of using insufficiently large samples makes it all the more likely that “noise” will dominate any replicable effect (Gelman here would interject to emphasize the importance of better measurement & theory). Manzi cites John Ioannidis’ 2005 paper on highly-cited studies that failed to replicate, distinguishing the roughly 80 percent of non-randomized studies that failed from the roughly 10 percent of “large RFTs” (randomized field trials, in Manzi’s terminology) that did, without explaining how large they have to be.
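The forking-paths problem Gelman describes is easy to demonstrate with a toy simulation (my own sketch, not from the book; the sample sizes, number of subgroups, and significance threshold are arbitrary illustrative choices). Even when there is no effect at all, an analyst who tests several subgroups and reports whichever one cleared the threshold will see “significant” results several times more often than the nominal 5 percent:

```python
import random
import statistics

def t_stat(a, b):
    """Welch t statistic for two independent samples."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / (va / len(a) + vb / len(b)) ** 0.5

random.seed(0)
TRIALS, N, SUBGROUPS = 2000, 20, 5   # small samples, a few analyst choices
naive_hits = flexible_hits = 0
for _ in range(TRIALS):
    # The "treatment" has NO real effect: every group is pure noise.
    stats = [abs(t_stat([random.gauss(0, 1) for _ in range(N)],
                        [random.gauss(0, 1) for _ in range(N)]))
             for _ in range(SUBGROUPS)]
    if stats[0] > 2.0:     # one pre-specified analysis
        naive_hits += 1
    if max(stats) > 2.0:   # report whichever subgroup "worked"
        flexible_hits += 1

print("pre-specified:", naive_hits / TRIALS)    # close to the nominal 0.05
print("best-of-five:", flexible_hits / TRIALS)  # several times higher
```

The pre-specified analysis triggers at about the advertised 5 percent rate; picking the best of five analyses after the fact roughly quadruples the false-positive rate, with no fraud required anywhere.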

One possible hindrance for people with a general interest in the topic is that Manzi explicitly ties these ideas to a political viewpoint (some may simply see who’s quoted blurbing the book and conclude it’s not for them). He views a free market as enabling many people to experiment, since we don’t know in advance what works best (and gives his own history with “software as a service” as an example of something that emerged over time from particular circumstances rather than being thought up by some brilliant thinker). But since we don’t know in advance, he’s not advocating any sort of purist libertarianism, but rather a mild preference for it paired with a Burkean conservative openness to other policies deemed useful. After I stopped believing in objective/natural rights I took meta-libertarianism to an extreme of radical decentralism, and rather than make a principled argument against that (perhaps because he’s not facing such an argument for it) he merely says that voters won’t be “Libertarian New Men” willing to accommodate such radical compromises with shared moral values, and are likely to insist on some degree of national uniformity. Most of the book isn’t about his political philosophy, though, and he tries to list every possible factor limiting a proposed decentralist/experimentalist agenda in a way resembling the hedging in Crime and Human Nature (which I found disappointingly excessive and more like a literature review after the relatively clear thesis of The Bell Curve).

Before he gets to that proposed agenda, there’s a lot of material on why we’d want it in the first place. He acknowledges the usefulness of non-experimental research (like Gelman’s “exploratory” data analysis) given that experiments can be expensive, and recommends that such cheaper options be used first, but thinks that certain matters are too complex for us to have much confidence in our understanding of them. Most people have heard the chaos-theory example of a “butterfly flapping its wings in China”, but it wasn’t until reading this book that I learned the meteorologist Edward Lorenz came up with it to describe the effect of a minute alteration to the initial conditions of a weather model, which produced very different long-run forecasts. Manzi has elsewhere stated he’s skeptical of most plans to address climate change, since the time scales being discussed make predictions very uncertain, and that greater economic growth will give us more options to use our wealth down the line. It’s a stance that might cost him an audience among people who perceive themselves as “pro-science”. I don’t think Steve Levitt is so high-status these days that many would be miffed at the shots Manzi takes at his paper with Donohue on abortion (which actually compares the effect to that of the proverbial butterfly). People suspicious of Manzi’s political stance (which is more oriented toward economic than social conservatism) may be less inclined to listen when he discusses how Larry Bartels’ analysis of the effect of Republican presidencies on inequality is vulnerable to slightly different assumptions, but I hope fair-minded readers get the same general idea.
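Lorenz’s sensitivity to minute perturbations is easy to reproduce with a toy chaotic system, the logistic map (my own illustration, not Lorenz’s actual weather model; the starting value and parameter are arbitrary):

```python
# Iterate the logistic map from two starting points that differ
# only in the 12th decimal place.
def trajectory(x0, steps=60, r=3.9):  # r=3.9 puts the map in its chaotic regime
    xs, x = [], x0
    for _ in range(steps):
        x = r * x * (1 - x)
        xs.append(x)
    return xs

a = trajectory(0.2)
b = trajectory(0.2 + 1e-12)
early = abs(a[4] - b[4])                                # still indistinguishable
late = max(abs(x - y) for x, y in zip(a[40:], b[40:]))  # no longer related
print(early, late)
```

The two runs track each other for dozens of iterations before diverging until one says nothing about the other, which is the sense in which long-run forecasts of such systems become uncertain.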
Reinhart & Rogoff’s paper on public debt & economic growth is notorious among left-of-center econ types, but excessive attention gets paid to the Excel coding error (a similar coding error marred Levitt & Donohue’s original abortion paper), when the larger effects actually come from different assumptions about which data to include and how they should be analyzed. And of course the biggest problem is that it could only hope to show correlation rather than causation, and there are obvious reasons to expect poor growth to cause higher debt. Establishing causality rather than mere correlation is one of the biggest benefits of controlled experimentation, so this could have been a good example for Manzi to highlight.
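That reverse-causation worry can be made concrete with a toy simulation (my own sketch; every number in it is invented for illustration, not an estimate): if low growth causes high debt, a researcher with only observational data will find a strong negative debt–growth correlation even when debt has no effect on growth at all.

```python
import random

random.seed(1)
# Fake economies in which debt has ZERO causal effect on growth,
# but low growth mechanically pushes the debt ratio up
# (bigger deficits, smaller GDP denominator).
debts, growths = [], []
for _ in range(500):
    growth = random.gauss(2.0, 1.5)              # percent per year
    debt = 60 - 8 * growth + random.gauss(0, 5)  # reverse causation only
    debts.append(debt)
    growths.append(growth)

n = len(debts)
md, mg = sum(debts) / n, sum(growths) / n
cov = sum((d - md) * (g - mg) for d, g in zip(debts, growths)) / n
sd = (sum((d - md) ** 2 for d in debts) / n) ** 0.5
sg = (sum((g - mg) ** 2 for g in growths) / n) ** 0.5
corr = cov / (sd * sg)
print(corr)  # strongly negative, despite no debt -> growth causation
```

A randomized experiment breaks exactly this ambiguity, since the “treatment” is assigned by the experimenter rather than by the outcome.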

Politics is notoriously the realm of the irrational & negligent (consisting entirely of your political opponents, of course), who make reckless assertions rather than hedging their bets and collecting evidence. But Manzi notes that there was something of a “golden era” in the 60s & 70s when a number of policy experiments were conducted. The best remembered now might be the one involving a universal basic income/negative income tax, which was found to decrease the supply of labor by much more than its proponents expected. That many experiments didn’t show a way to improve the status quo may be part of why experiments fell out of favor (although if one specific person gets blamed, it’s Reagan, for cutting the budgets available for such experiments). The last gasp of such policy experimentation came during the era of welfare reform, when randomized experimentation became a condition for the waivers given to states. This ended in 1996 (after the conclusive finding that mandatory work requirements were the only factor that seemed to get people off welfare and back into the workforce), when reform granted states broad discretion without the need for waivers. It was relatively easy to conduct experiments on welfare & criminology* compared to, say, education, because of “the greater political power of teachers and parents than of indigents and criminals”. The Department of Education does have a small “What Works Clearinghouse”, but educators tend to ignore the results of experiments (on highlighting, Direct Instruction, or anything else). The recent growth in magnet & charter schools has resulted in more randomized trials, because many allot slots by lottery (magnets don’t show the same positive results as charters, so Manzi concludes collective bargaining agreements are responsible for the difference).
From the experiments on the difficult subject of human behavior that have been conducted, Manzi concludes that changing skills/consciousness is harder than altering incentives or environment, but his meta-conclusion from these studies is that there haven’t been nearly enough of them. His comparison to the huge number of A/B tests conducted every year by Capital One seems a bit unfair, though he argues that organizations that conduct more RFTs develop ways to do them more efficiently & cheaply.
*The only concept in criminology that Manzi found passed every trial (of which there were only two) was “nuisance abatement”, aka getting property owners to fix the proverbial broken windows.

His grand plan is to re-institutionalize the welfare-reform era’s pairing of waivers with randomized experimentation for a wider array of policies. He wants an agency analogous to the FDA “to develop, promulgate, and enforce standards for designing and interpreting social-policy randomized experiments”. It would also be analogous to the CBO, though, in that it would be tasked with “scoring” experiments for adherence to scientific standards rather than forbidding them (I was disappointed that Manzi didn’t specifically mention pre-registration as one of the factors that should be required). Some other comparisons are the Office of Technology Assessment and the Department of Justice’s National Institute of Justice. I could go on to discuss some of his more specific policy proposals for education, immigration and safety net/risk pooling/redistribution, but the meta point is more important. He’s not radical enough to propose full-scale futarchy, but it doesn’t seem too much to hope that opponents of any contentious policy shift start demanding experiments beforehand, if only to slow things down. An arms race could then develop in which both sides make such demands, which would at least result in fewer half-baked ideas becoming countrywide uncontrolled experiments, even if it doesn’t lead to an ideal of adversarial collaboration and more rigorous disagreement.