On Thursday, March 12, 2020, I was interviewed by a reporter at a newspaper in a mid-sized California city*. The topic was described to me as “a story about the panic associated with the virus — people clearing out stores of toilet paper, bottled water, etc.” I am not a topic expert on COVID-19, but was being interviewed by virtue of my expertise on collective social behavior (I did spend three years as a postdoctoral fellow at Johns Hopkins Med School in the department of Emergency Medicine, where I managed to pick up at least some epidemiology, and I regularly teach modeling contagion).

I was sent a number of questions in advance to prepare, though our conversation was over the phone. When it became apparent that I didn’t think people were overreacting, the reporter indicated that the story would probably be killed rather than reshaped to reflect more accurate information. I urged this person to flip the story and encourage people to adequately respond, but the more we talked, the less likely that seemed. Since the newspaper won’t cover it, I’m posting my brief answers to their questions here. I realize that a lot of people who follow my research will not need this information, but if anyone benefits from it, it seems worth it.

What is driving this behavior?

I think people are responding to a potential threat. There is a lot of uncertainty about what is going on. They are getting some conflicting information, but one common theme is: this is bad, and it is going to get worse before it gets better. From the epidemiological work I’ve managed to survey on COVID-19, I’d say this is accurate. My primary concern is not overreaction on the part of a few stockpilers; it’s underreaction on the part of both individuals and institutions who are not taking this seriously. This is shaping up to be the worst pandemic since the 1918 Spanish flu, and could potentially be much worse in terms of both lives lost and economic impact. The US needs more measures to ensure social distancing, more and free access to testing and treatment, and better information put out to the public.

Is the panic worse than the threat?

Not by a long shot. This is very serious. We are maybe a couple of weeks behind Italy, and we are showing patterns similar to those in Italy and other countries when you control for when the first outbreaks were reported. In fact, the US likely has a lot more infected individuals than the numbers suggest, because of the lack of sufficient testing. In the next few weeks this is likely to get very bad, very quickly. As of today, Italy has over 17,000 confirmed cases with more than 1,200 deaths.

Keep in mind that we are talking about exponential increase. Some of the data from other countries suggest that, at least in the first few weeks, you can approximate the spread as a daily increase of 33%. If you start with just three cases and the number of infected increases by a third every day, in 5 days you have 12 cases. No big deal. In 10 days, you have 52 cases. In 3 weeks, you have over 1,000 cases, and in a month you’ve got 20,000 cases. Two more weeks after that, and you’ve exceeded a million cases. Now obviously there are limits to how high this number can go because of the way populations interact, but the early numbers at least are in line with how the virus has increased in countries that saw earlier outbreaks than us. We had a big lead to prepare for this, and we’ve wasted it.
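
To make the arithmetic concrete, here is a minimal Python sketch of the compounding described above. The function name and the choice of days to print are mine, purely for illustration; the only inputs taken from the text are the three starting cases and the 33% daily increase. At exactly 33% growth, the "month" and "million" figures land at about day 31 and day 45.

```python
def projected_cases(initial_cases, daily_growth, days):
    """Number of cases after `days` days of compound daily growth."""
    return initial_cases * (1 + daily_growth) ** days


# Three starting cases, growing by a third each day (figures from the text).
for days in (5, 10, 21, 31, 45):
    cases = projected_cases(initial_cases=3, daily_growth=0.33, days=days)
    print(f"day {days:2d}: about {cases:,.0f} cases")
```

At this rate the count doubles roughly every two and a half days, which is why the early numbers look harmless and the later ones do not.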

Why do some people engage in this other behavior while others don’t?

I don’t know. Some people are probably more reactionary, and others are more stubborn. No one likes to have to change or cancel their plans, or be otherwise inconvenienced. Social networks and identities interact with practical and economic concerns.

How does the response to the virus compare to previous public threats (AIDS, 9-11, etc)?

We probably haven’t had a real public threat like this in 100 years. 9/11 was a vicious attack on one place at one time by a group of people who were able to exploit a weakness in a system. Look, I’m a native New Yorker and I lived in NYC in the aftermath of 9/11. The reaction from most people was a mix of solidarity – we’re in this together – and, you know, racism. Most importantly, the threat of it happening again was very small. People seem to dramatically overestimate the threat of terrorism. Even before this pandemic, most people were much more likely to die of disease than of terrorism.

AIDS is caused by a virus, so it’s perhaps more similar to now. During the AIDS crisis, the response was also completely inadequate, both by the public and by the government, and this was largely because it initially was mostly restricted to marginalized populations like gay men. But HIV is relatively difficult to transmit – you can’t catch AIDS from someone if they cough or sneeze on you or something you touch. You can catch Coronavirus that way.

Something else to remember is that the incubation period for something like flu is usually a couple of days. There’s not that much time between when you catch it and when you feel sick. With COVID-19, the time between when you catch it – and can infect others – and when you start showing symptoms appears to be up to two weeks! I’ve heard people say things like “There are no cases in my town yet” or “there aren’t many cases here.” Two things about this. First, there could be many people infected who aren’t yet symptomatic, who can still infect others. Second, the extent to which the US has failed to adequately test people for COVID-19 cannot be overstated. We need to be testing much, much more. I am confident that if we had an accurate picture of how the infection is spreading here, we’d have justification to be more worried.

Any advice on how to find the balance between panic and not doing anything?

Don’t panic, but do worry. The best thing you as an individual can do is to avoid crowds, public transportation and airports, and wash your hands well and often. The Coronavirus binds to soap, so soap works better than hand sanitizer. The best thing that society can do is to encourage social distancing. This doesn’t just limit the number of infections. It can dramatically reduce the speed of infection, so that fewer people are sick at any given time. This is really important if we want to avoid overloading our hospitals and healthcare systems, and to provide more time for new tests and treatments to be developed. It’s completely insane that any large gatherings are happening at all right now.

As for stockpiling – other countries have effectively shut down commerce as the virus has reached peak infection rates. I think it is entirely rational to cache a few weeks’ worth of food and other supplies. The CDC recommends a month’s worth of supplies.

Stay informed. Social media like Twitter is especially good for this. Follow experts, not pundits or politicians (but hold the politicians accountable).


*The name of the paper isn’t important. No one is perfect, and I don’t think this is worth calling someone out on.

COVID-19: Modeling the Flattening of the Curve

It appears that we are in the middle of a global pandemic. COVID-19, caused by the SARS-CoV-2 virus (a form of Coronavirus), is spreading rapidly throughout the world. Some estimates suggest that more than half the world’s population will become infected over the next two years or so. This is serious: COVID-19 appears to be both more contagious and more deadly than most influenza strains, and it appears to be particularly dangerous for older people. While it’s important to note that these estimates are based on emerging and incomplete data, epidemiology is a fairly well developed science with strong methods for calculating these sorts of figures.

So, what can we do? Wash your hands. Cough and sneeze into your elbow. Stay home if you have symptoms. Cancel travel plans and avoid crowds and large gatherings.

I’ve noticed that a point of confusion sometimes arises here. If COVID-19 is already a pandemic, if its spread can’t be stopped, then what’s the point? Aren’t we all going to get infected anyway? Might as well get it over with, right? Especially if you’re young and not in a high-risk category, you might not see the point. In my Twitter feed, the point has been hammered repeatedly: flatten the curve. This figure by Esther Kim and Carl Bergstrom illustrates the meaning.


The left curve is the spread of the disease without interventions. People don’t wash their hands and don’t avoid travel and large gatherings. The disease spreads quickly. So many people are infected at once that hospitals and other healthcare systems are completely overwhelmed. This escalates infection and leads to more deaths among the infected, as well as among those who need care for unrelated illnesses and medical conditions. The right curve is the spread of the disease with interventions. A similar number of people still get infected, but the spread is sufficiently slowed that the number of people infected at a given time is much smaller, hopefully within the capacity of our healthcare systems. Flattening the curve has two major benefits. First, hospitals and other providers are able to handle the number of infected individuals at any given time. And second, there is more time for additional treatments to be developed and tested. Both of these things save lives. We need to flatten the curve.

I think that most people understand the benefits of behaviors like washing your hands and covering your mouth when you sneeze. After all, these are behaviors that help prevent you from getting sick or infecting others. But from talking to some people, it’s not always clear to everyone exactly how collective behaviors like avoiding crowds or canceling travel translate to slowing the spread of disease at the population level. In fact, these things are critically important. I wrote up this post in an attempt to illustrate why.

I teach a class at UC Merced called “Modeling Social Behavior.” And because of that class, I had at the ready a simple model of disease spread. It belongs to a family of models called “SIR models,” where the letters stand for Susceptible (meaning that you are uninfected but susceptible to the disease), Infected (meaning that you are both infected and contagious), and Recovered (meaning that you have either recovered from the illness and are neither susceptible nor contagious, or that you have otherwise been removed from the population). In this model, people (sometimes called “agents”) are situated in and move around on an abstract two-dimensional space. Anytime a susceptible individual is sufficiently close to an infected individual, they become infected with some probability (the transmission rate). An infected individual then recovers with a probability dictated by the disease’s recovery rate. The population looks like this. There are 500 agents. The white ones are susceptible, the red ones are infected, and the grey ones are recovered.


The model is not specific to COVID-19, and ignores aspects like incubation periods. I purposefully did not try to calibrate the transmission or recovery rates to COVID-19. You should think about this as a generic model of disease transmission. The model parameter I want to focus on is one related to the agents’ mobility. At each tick of the model’s clock, each agent chooses a random direction to face and takes a step, which allows the agents to mix with each other. In other words, agent movement is just a random walk through space. By controlling the size of the step agents take on each movement, we can control the extent to which they move through social space, and thus the extent of social mixing. Here’s an example with two agents, both of which started in the exact center of the space and took 100 steps, with a line drawn between each location at each tick of the clock. The green agent takes small steps (of size 0.5 units – the entire grid is 51 x 51 units square). The orange agent takes large steps (of size 3 units). As you can see, the orange agent has covered substantially more ground than the green agent in the same amount of time. A population of orange agents, who widely explore their social space, will encounter far more unique individuals in a given time than a comparable population of green agents, who move narrowly and tend to interact with the same individuals over and over.
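
The movement rule is simple enough to sketch in a few lines of Python. This is not the NetLogo code, just an illustration under stated assumptions: the function names are hypothetical, the world is assumed to wrap around at its edges, and "ground covered" is measured as the number of distinct unit cells an agent passes through, which is my own stand-in for what the figure shows visually.

```python
import math
import random

WORLD = 51.0  # side length of the square world, from the text


def random_walk(step_size, n_steps=100, seed=0):
    """Positions of one agent that starts at the center and, at each tick,
    faces a uniformly random direction and moves `step_size` units.
    Assumption: the world wraps around at its edges."""
    rng = random.Random(seed)
    x = y = WORLD / 2
    path = [(x, y)]
    for _ in range(n_steps):
        heading = rng.uniform(0, 2 * math.pi)
        x = (x + step_size * math.cos(heading)) % WORLD
        y = (y + step_size * math.sin(heading)) % WORLD
        path.append((x, y))
    return path


def cells_visited(path):
    """Number of distinct unit cells the path touches (illustrative measure)."""
    return len({(int(x), int(y)) for x, y in path})


narrow = random_walk(step_size=0.5)  # the "green" agent
wide = random_walk(step_size=3.0)    # the "orange" agent
print("cells visited, narrow walker:", cells_visited(narrow))
print("cells visited, wide walker:  ", cells_visited(wide))
```

Changing `seed` gives a different pair of walks; the gap between the two walkers is the quantity that matters for how many distinct individuals an agent is likely to meet.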


Whether agents move widely or narrowly in space can dramatically affect the dynamics of disease transmission. I started with a population of 500 uninfected agents and infected three of them at random. I then tracked the number of agents who were infected over time. Here are the results from simulations with narrowly (green) and widely (orange) mobile agents, defined as above as agents with a step size of either 0.5 or 3.0 units, respectively.


These look an awful lot like the “flatten the curve” figure at the top of this post! When agents move widely, the disease spreads very fast, and lots of them are infected at the same time. For this simulation, the maximum infection rate for widely mobile agents was 78%. For the exact same disease, with the same transmission rate and recovery rate, restricting agents to narrow mobility not only stretched out the time course of the epidemic, it also substantially reduced the maximum infection rate to only 25.6% — less than a third of what it was under widely mobile agents.
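
For readers who would rather experiment in Python than NetLogo, here is a rough sketch of the same kind of simulation. It is not the NetLogo model and will not reproduce the numbers above. The population size, number of initial infections, world size and step sizes come from the text; everything else is an assumption made for illustration, including the transmission and recovery probabilities, the rule that "sufficiently close" means sharing a unit cell, the wrapping world, and all function and parameter names.

```python
import math
import random
from collections import defaultdict

WORLD = 51.0  # side length of the square world, from the text


def simulate(step_size, n_agents=500, n_initial_infected=3,
             transmission=0.5, recovery=0.02, max_ticks=2000, seed=0):
    """Run one epidemic; return (infected count per tick, final states).
    `transmission`, `recovery`, and the same-cell contact rule are assumed
    values chosen for illustration, not the original model's settings."""
    rng = random.Random(seed)
    pos = [(rng.uniform(0, WORLD), rng.uniform(0, WORLD)) for _ in range(n_agents)]
    state = ["S"] * n_agents
    for i in rng.sample(range(n_agents), n_initial_infected):
        state[i] = "I"

    infected_over_time = []
    for _ in range(max_ticks):
        n_infected = state.count("I")
        infected_over_time.append(n_infected)
        if n_infected == 0:
            break
        # Movement: every agent takes one random-walk step on a wrapping world.
        moved = []
        for x, y in pos:
            heading = rng.uniform(0, 2 * math.pi)
            moved.append(((x + step_size * math.cos(heading)) % WORLD,
                          (y + step_size * math.sin(heading)) % WORLD))
        pos = moved
        # Count infected agents in each unit cell.
        infected_here = defaultdict(int)
        for (x, y), s in zip(pos, state):
            if s == "I":
                infected_here[(int(x), int(y))] += 1
        # Transmission and recovery, applied to a copy so updates are synchronous.
        new_state = list(state)
        for i, ((x, y), s) in enumerate(zip(pos, state)):
            if s == "S":
                k = infected_here.get((int(x), int(y)), 0)
                # Each infected cell-mate is an independent chance of infection.
                if k and rng.random() < 1 - (1 - transmission) ** k:
                    new_state[i] = "I"
            elif s == "I" and rng.random() < recovery:
                new_state[i] = "R"
        state = new_state
    return infected_over_time, state


for label, step in (("narrow", 0.5), ("wide", 3.0)):
    curve, final = simulate(step_size=step)
    print(f"{label}: peak infected = {max(curve) / 500:.1%}, "
          f"never infected = {final.count('S') / 500:.1%}")
```

Looping `simulate` over a range of `step_size` values and several seeds, and recording the peak and the fraction still susceptible at the end, gives the same kind of summary as the step-size comparison discussed below.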

Here, I’ve plotted the maximum infection rates for a range of step sizes, running 30 simulations for each value. These are shown in red (the line connects the means for each step size). I’ve also plotted the proportion of individuals who never become infected over the course of the epidemic (in blue). Note that while this number is positive for very small step sizes (perhaps representing rapid and effective quarantines), it quickly goes to zero, meaning that everyone eventually becomes infected. But this figure also shows that even if everyone becomes infected eventually, reducing mobility and population mixing—by limiting travel and avoiding large gatherings—can dramatically reduce the number of people who are infected at any given time.


The code to run the model and explore it on your own is available here. The model was made with NetLogo, which you can download for free here. You can also upload the model to NetLogoWeb and launch it in your browser window without installing NetLogo on your computer.

UPDATE: You can now access a version of the model that you can run directly in your browser here.


On Preprints (and Journals)

The good folks at the Academic Life Histories blog asked if I wouldn’t mind contributing some thoughts about preprints.

I’ve been writing scientific papers since 2010, not counting the physics paper I landed on as an undergraduate in 1999. For the last three years, I’ve put almost every paper I’ve written on a preprint server before submitting it to a journal. In certain corners of academia, this fact warrants an explanation. Some want to argue it’s a bad idea, others may be curious, and others may be fully on board but just want to hear another perspective. This is my perspective. Caveat: Some of my characterizations about the process of doing science, or of the peer review experience, may not ring true for some readers. So it goes.

I think preprints are great. A big part of why preprints are great is because they aren’t journal articles. As such, I’m going to start out by talking about the problems with journals and with peer review, and then swing back around to talk about how preprints help us solve some of these problems. I also think journals are still valuable and I don’t want them to go away, and so I view preprints as a valuable complement rather than as a replacement. Here we go. 

Read the rest here.

Bad science evolves

Richard McElreath and I wrote a paper about how incentives to publish can create conditions for the cultural evolution of low-quality research methods. It’s called The Natural Selection of Bad Science (coming soon to an open access journal near you), and it’s already gotten a few write-ups, for which I’m grateful. I mention this because the Society for Personality and Social Psychology (SPSP)’s Character and Context blog asked me to write a post about the paper, which I did. Check it out.

Bad Science Evolves. Stopping It Means Changing Institutional Selection Pressures.

A Theoretical Lens for the Reproducibility Project

Recently, the Open Science Collaboration, a team of over 250 scientists organized by the Center for Open Science, published the results of their Reproducibility Project: Psychology, in which 100 highly visible social psychology studies were replicated. The headline result is that almost two-thirds of the replications failed to find “statistically significant” results. By the standards of the field’s traditional criteria, this means that most of the published studies failed to replicate. The study has been making waves all over the place, and rightly so. This paper represents a tremendous amount of work that inarguably improves what we know and how we think about psychological research, perhaps all scientific research.

Knowing exactly what to make of all this is tricky, however. A number of media outlets cry “Most psychological research is wrong! It’s all bunk!” This is overblown, but it raises the question: what does it all mean? Several excellent scientists have already made valuable contributions to this discussion (notably Michael Frank, Daniel Lakens, Alexander Etz, Lisa Feldman Barrett). Here I add my own.

I am not an experimental psychologist – even though I started my graduate school career doing psychophysics, then animal behavior. I work primarily as a modeler. Last year, Richard McElreath and I developed a mathematical model of scientific discovery. Our goal was to tackle several questions related to replication, publication bias, and the evidential value of scientific results, given that (a) many (perhaps most) novel hypotheses are false, (b) some false positives are inevitable, and (c) some results are more likely to be published than others (there are, of course, other assumptions, but these are the most relevant ones in the context of this post). It is a happy coincidence that our paper was published the day before the Reproducibility Project paper. More so because our model provides a theoretical lens through which to view their results.

Our model focused, in part, on the probability that a hypothesis is true, given a series of positive and negative findings – that is, given some number of successful or unsuccessful replications. I won’t go into detail regarding our model construction or analysis, though I hope that you will read the paper. Rather, I want to share a few thoughts about doing science that came from viewing the Reproducibility Project results through the lens of our mathematical model.

1. We shouldn’t be too surprised that many findings fail to replicate, but we can still do better.

Coming up with testable hypotheses is hard. This point has been made repeatedly over the last decade – if novel hypotheses tend to be wrong, then many results will be false positives, which are (thankfully) likely to fail to replicate. There are two things we can do to improve the situation here.

First, we can try to lower the rate of false positives. Many have suggested pre-registration of hypotheses. On the other hand, exploratory analyses are vital to scientific discovery. A compelling compromise is that researchers should make it crystal clear whether their results followed from an exploration of existing data or came from a test specifically designed to test their a priori hypothesis, in which case pre-registration is desirable. More epistemological weight should be placed on findings of the latter kind. In general, experimental and statistical methods that decrease false positives are a good thing.

Second, we can try to increase the a priori probability that the hypotheses we test are true. As a theorist, it is perhaps unsurprising that my recommendation is: better theory. Specifically, I think psychology should more fully embrace formal modeling, so that its theories are much more precisely specified. There will be some growing pains, but an added benefit of this will be that empirical findings that fit coherent theories will have a long shelf life. As Gerd Gigerenzer has opined, data without theory are like a baby without a parent: their life expectancy is short.

All that said, we shouldn’t take the results of the Reproducibility Project as a dismissal of psychology as a field with poor theory and lots of false positives (although this may be more true in some subfields than in others). False positives can occur under the best of conditions, as can false negatives. For this reason…

2. We shouldn’t put too much stock in any one result.

Science is an imperfect process. A true hypothesis may fail to yield a positive result, and a false hypothesis may appear true given some experimental data. As such, in most cases results should be interpreted probabilistically – the probability that some hypothesis is true given the data. When replication is common, those data will include the results of multiple studies. This would be a very good thing.

Using our model, we analyzed a pessimistic but perhaps not unrealistic scenario in which only one in a thousand tested hypotheses were true, power was 0.6, and the false positive rate was 0.1. A base rate of one in a thousand may seem overly low, but keep in mind that this includes each and every hypothesis tested in an exploratory data analysis, that is, every possible association between variables. In that light, a low probability that any one of those associations will really exist may not seem quite as outlandish. Under these conditions, the vast majority of initial positive findings are expected to be false positives. We found that in order to have a greater than 50% confidence that the hypothesis is true, it would need to be successfully replicated three more times. Even if we increase the base rate 100-fold, so that one in ten hypotheses are true, no result that hasn’t been successfully replicated at least once can be trusted with over 50% confidence.
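
The flavor of this calculation can be seen with a plain application of Bayes’ rule. The sketch below is a deliberate simplification and not the model from the paper: it assumes every study is independent, that all of them come out positive, and that there is no publication bias. The function name is hypothetical; the base rates, power, and false positive rate are the ones quoted above.

```python
def posterior_true(base_rate, power, false_positive_rate, n_positive):
    """P(hypothesis is true | n_positive independent positive results).
    Simplifying assumptions: independent studies, all positive, no
    publication bias."""
    evidence_if_true = base_rate * power ** n_positive
    evidence_if_false = (1 - base_rate) * false_positive_rate ** n_positive
    return evidence_if_true / (evidence_if_true + evidence_if_false)


for base_rate in (0.001, 0.1):
    for n in (1, 2, 3, 4):
        p = posterior_true(base_rate, power=0.6, false_positive_rate=0.1,
                           n_positive=n)
        print(f"base rate {base_rate}: {n} positive result(s) -> "
              f"P(true) = {p:.3f}")
```

Under these simplified assumptions, a base rate of one in a thousand needs four positive results in total (the original finding plus three replications) before the posterior passes 50%, and a base rate of one in ten needs two.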

If many replications are needed to establish confidence, then perhaps we shouldn’t cry foul over a single failure to replicate. In some areas of research, most initial results should be viewed with at least some skepticism. This means that the rewards for any novel result, no matter how astonishing, should be moderate. Even more so given the fact that highly surprising results are more likely to be wrong.

3. Replication efforts are valuable even when they are imperfect.

One of the great things about the Reproducibility Project is the extent to which it involved the authors of the original studies being replicated. This is important, because replication efforts have been attacked as a sort of vigilantism, or as the work of dilettantes who lack the expertise or nuance to perform a precise replication. This argument is not without merit. An extreme version holds that failures to replicate are wholly uninformative. This argument is without merit. Our analysis shows quite clearly that the replication efforts are informative even when replications have substantially less power than the initial studies. Power need only be high enough so that true hypotheses are, on average, more likely to yield positive results than negative results. That said, it is a sad truth that this criterion will not always be met.

4. Publishing null results comes with some caveats, but we should almost always publish replication efforts.

Among the forces working against replication efforts is the fact that null results and replications are sometimes difficult to publish. A recent analysis of the “file drawer” effect showed that most null results weren’t published because the authors never bothered to submit them. Our analysis highlights the critical importance of replication in assessing the truth or falsehood of a hypothesis. Several replications may be needed to establish confidence, and that requires that scientists be made aware of efforts to replicate previous findings. All replications should be published. Correspondingly, outlets for publishing those replications are needed, as are incentives to young scientists for authoring them.

On the other hand, it is not clear that publishing absolutely every result is a good thing. If most novel hypotheses are wrong, then most novel results will be correct rejections of those hypotheses. In this case, publishing every result would fill our journals with these true negatives, making it difficult to find the positive results. Even worse would be if substantial replication efforts were devoted to confirming the falsehood of those negative results. Admittedly, this scenario is unlikely – the allure of positive results is just too strong. Even so, our analysis indicates that calls to publish every result come with caveats. A possible solution is the establishment of a repository for very brief reports indicating the failure of experimental tests to yield positive results. Such a repository would be easily searchable, avoid clogging up journals, and require minimal effort on the part of busy scientists with ticking tenure clocks.


I have kept this discussion qualitative, and have purposely avoided mathematical or statistical details in order to maximize generality and accessibility. There are lots of important points to be made regarding methodology, replication, and publication bias that I have sidestepped. Hopefully it has been useful nevertheless.

Interactional Complexity and Human Societies

We are interested in understanding various aspects of human societies. Since the structural and functional behavior of human societies undoubtedly qualifies as a complex system, it is useful to discuss certain terminology and philosophical concepts related to the organization of complex systems. William Wimsatt’s (1974) notion of interactional complexity will be particularly useful but is not widely appreciated, and so I will go into some detail to clarify this concept.

Decompositions and descriptive complexity

Stuart Kauffman (1971) presciently noted that, when describing a complex system, different descriptions of the system and resulting articulations of parts, or decompositions, might be varyingly useful depending on the purpose of the analysis, and that these descriptions might be non-isomorphic. That is, the delineations of the constituent parts may not coincide between different decompositions.

Wimsatt’s major insight was to note that relationships between the different decompositions of a system could be used to denote their intrinsic complexity. As an example, he compared a chunk of granite with the fruit fly Drosophila melanogaster (see Fig. 1 in Wimsatt, 1974). The chunk of granite can be described via a decomposition into parts grouped by (for example) chemical composition, thermal conductivity, electrical conductivity, density, or tensile strength. Although these decompositions are not completely isomorphic, some of the boundaries between parts are shared between each description (e.g., a section with a specific density will also have a specific tensile strength and chemical composition relative to the neighboring parts). The fruit fly, meanwhile, can also be described by decomposition into parts based on (for example) anatomical organs, cell types, developmental gradients, biochemical reactions (i.e., the local presence of reaction types), or physiological systems (as described by cybernetic flow diagrams). In contrast to the granite chunk, the boundaries between the parts of the various decompositions are not spatially coincident, and indeed, the last two items on the list are not even clearly describable in a coherent spatial manner. Wimsatt introduced the term descriptive complexity to indicate the degree to which the spatial boundaries of various descriptive decompositions coincide. A fruit fly is thus more descriptively complex than a chunk of granite.

Interactional complexity

A system can often be described in terms of subsystems, each of which has a specific set of parts. We can constrain this description by specifying that, for the parts within these subsystems, the causal relations with other parts within the subsystem should be much stronger than the causal relations with parts from other subsystems. Indeed, this constraint helps delineate each subsystem from the others, and might be seen as the degree to which a valid prediction of the system behavior could be assessed by only considering the behavior of each subsystem, ignoring interactions between them. Remember, however, that there may be many useful decompositions of the system into subsystems, each with its own set of constituent parts.

We say that a system under these constraints is interactionally simple if there are only weak causal relationships between the parts of a subsystem in one decomposition and the parts of a different subsystem in a different decompositional description, and interactionally complex to the degree to which those causal relations are strong. Put more bluntly, a system has a high degree of interactional complexity if an investigator must consider the system from more than one theoretical perspective (i.e., more than one decomposition) in order to make useful predictions. Driving the point home, Wimsatt writes, “If the system is descriptively complex and is also interactionally complex for more than a very small number of interactions, the investigator is forced to analyze the relations of parts for virtually all parts in the different decompositions, and probably even to construct connections between the different perspectives at the theoretical level.” (1974, p. 74). Forty years after Wimsatt’s paper first appeared, this idea may no longer be revelatory, but I maintain that it is still underappreciated.

Human societies are interactionally complex

It seems obvious that human societies are descriptively complex. We can describe societies at the level of individuals, in terms of nuclear families, kin groups, subcultures, and social classes. We can also include infrastructure and transportation, livestock and farming, religious rituals and linguistic traditions. This is all on top of the descriptive complexity of an individual human, which I believe we can agree is at least as great as that of a fruit fly.

Importantly, human societies, and the human groups that comprise societies, are also interactionally complex. Perspectives include genetic, neurological, cognitive, familial, cultural, and ecological. To at least some extent, we can’t ignore any of them.


  • Kauffman, S. A. (1971). Articulation of parts explanation in biology and the rational search for them. In: PSA 1970, ed. R. C. Buck & R. S. Cohen, pp. 257–72. Philosophy of Science Association.
  • Wimsatt, W. C. (1974). Complexity and organization. In: PSA 1972, ed. K. Schaffner & R. S. Cohen, pp. 67–86. Philosophy of Science Association.