Thinking about thoughts, fourth downs, and the nature of evidence

When it happened, I knew the Belichick story would be big, but I think few could have anticipated the shape or dimension of the conversation. Some of this I credit to the rise of new media: The immediate reaction to the call on NBC and ESPN was: Bad, awful, stupid call. But there was an undercurrent chorus of, “Hey, wait a minute. It actually kind of made sense.” I’d like to count myself as part of that chorus, but clearly the guy who quite nearly turned the entire debate on its head was my friend and New York Times co-blogger Brian Burke, whose post on Belichick’s call was cited everywhere from ESPN apparatchik Adam Schefter’s Twitter feed to a piece by the excellent (and decidedly mainstream) Joe Posnanski on SI.com. (I’d like to think I helped, as I linked to Brian’s bit within about a half hour after the game, and my tweet of his piece was one of the most retweeted things I’ve ever sent.)

Credit where it is due, but the interesting thing is what happened after that: A mess. Some people ossified in their views: Trent Dilfer tried to back up his bombastic criticism of Belichick, though he had more passion than arguments. Peter King said the call “smacked of I’m-smarter-than-they-are hubris,” and compared Belichick to Grady Little. In the process, King messed up his math, but that was really beside the point for him. The call just didn’t feel right.

Although some stats junkies went the other way and proclaimed that it would have been affirmatively stupid for Belichick to have punted, most people, when faced with the compelling statistical evidence that the odds were roughly in Belichick’s favor (or at least so close as to be even with all the late game variables at play), were left in a fit of consternation. And this is why I think the decision has struck a national chord. It gets to the core of how people see themselves versus how they actually make decisions.
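To make the shape of that statistical argument concrete, here is a minimal sketch of the comparison in Python. Every number in it is a made-up placeholder, not a figure from Burke’s post or anyone else’s analysis, and the function names are hypothetical. It also assumes, for simplicity, that a converted fourth down effectively ends the game and that the opponent wins only by scoring on its ensuing drive.

```python
def win_prob_go_for_it(p_convert, p_opp_scores_short_field):
    """Win probability when going for it on fourth down.

    Assumption: converting wins the game outright; failing hands the
    opponent a short field, and we still win if they do not score.
    """
    return p_convert + (1 - p_convert) * (1 - p_opp_scores_short_field)


def win_prob_punt(p_opp_scores_long_field):
    """Win probability when punting: we win if the opponent fails to score."""
    return 1 - p_opp_scores_long_field


def break_even_conversion(p_opp_scores_short_field, p_opp_scores_long_field):
    """Conversion rate at which going for it and punting are equally good.

    Derived by setting the two win probabilities equal and solving
    for p_convert: p = 1 - (long-field rate / short-field rate).
    """
    return 1 - p_opp_scores_long_field / p_opp_scores_short_field


if __name__ == "__main__":
    # Placeholder inputs chosen only for illustration (not real data).
    p_convert = 0.60  # chance of converting the fourth down
    p_short = 0.50    # chance the opponent scores from a short field
    p_long = 0.30     # chance the opponent scores after a punt

    print(f"go for it:  {win_prob_go_for_it(p_convert, p_short):.2f}")
    print(f"punt:       {win_prob_punt(p_long):.2f}")
    print(f"break-even: {break_even_conversion(p_short, p_long):.2f}")
```

The useful part is the break-even line: you do not need to agree on exact inputs to argue about the call, only on whether the true conversion rate was plausibly above or below that threshold given your own estimates for the other two numbers.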

Most people fancy themselves as being driven by the evidence such that they will always follow it, but that’s not really true. As amazing and wonderful as the human brain is, it is full of inherent biases, and information, even compelling information, that does not comport with those biases is often devalued, even on a subconscious level. (One famous experiment confronts people with radios where the speaker is discussing views contrary to or similar to those already held by the listener, but the volume is set too low to be heard well. The listeners frequently turn up the volume when the speaker is saying things they already believe; they rarely turn the volume up if the speaker is discussing the contrary views.)

And so it was with the Belichick debate. It’s not that you must agree with the decision, but any reasonable person has to say, as Posnanski did, “Well, hmm, it seemed nuts at the time but I get it now, based on the evidence.” As Keynes is said to have put it, “When the facts change, I change my mind. What do you do, sir?” Yet many people still refuse to reconsider their view on the subject. Their position amounts to: “It was wrong and no degree of evidence can change my view or even make me reconsider.” Consider Colin Cowherd’s admonition on SportsNation that “stats are overrated.” (Though I agree that many stats are.) The upshot is that, despite our best views of ourselves, it is very difficult to actually say that we are rational creatures in practice. As Jonah Lehrer wrote:

The reason I bring up this analysis is to demonstrate that even defensible decisions can have wrenching emotional consequences. Belichick’s call might have been statistically correct, but it felt horribly wrong.

. . . The point is that there’s often an indefatigable gap between the rigors of cost-benefit analyses and the emotional hunches that drive our decisions. We say we want to follow the evidence, but then the evidence rubs against a bias like loss aversion, and so we make an exception. We’ll follow the evidence next time.

It’s not really fair to pick on Tony Dungy, who was an excellent football coach, because his excellence had nothing to do with any training in statistics or probability. But his comment that “you have to play the percentages and punt” is symptomatic of a wider issue, which is that when something “feels horribly wrong” we inherently want the evidence to comport with that feeling and we convince ourselves that it does. Dungy is a conservative guy. He likely would say that punting gives him plenty of chances to win; he’s a defensive coach, so he has no qualms about showing faith in his defense; and, bottom line, the idea of putting that much significance on one play just didn’t sit well with him. That’s all fine, but it has nothing to do with the percentages. Yet his brain and experience had told him that somehow the percentages supported it too, and thus Belichick’s move was the “risky gamble.”

The fourth down debate is significant (though I risk inflating its significance), because it forces you to consider how you actually tackle problems. Indeed, the entire point of probability, statistics, and science generally is to make progress in spite of, not because of or consistent with, our preconceived biases:

This does not mean that one should reject intuition and reflexive feeling. These stances often encapsulate the wisdom of evolution (e.g., aversion to sibling-sibling incest) and/or society (again, aversion to sibling-sibling incest). The totally rational life, where all acts and opinions are subject to deep and thorough criticism, is not the human life (even Karl Popper was more of a theoretical critical rationalist than an operational one judging by his private and personal actions and style of argument). But, serious problems emerge when our intuitive prejudices push themselves into the scientific domain. Natural science over the past few centuries has proven itself to be a marvel not by extension of our intuition, but contravention of that intuition resulting in an even closer fit to reality (contrast Newtonian physics with “folk physics”). Humans have always had engineering in the form of tinkering with technology. But the last two centuries of productivity growth through mechanical improvements have been based in part on the rise of science as a theoretical framework which allows for more than trial & error experimentation guided by intuition. Science allows us to stand on the shoulders of giants, no matter how bizarre or counterintuitive their theories are, because they are judged not on plausibility but predictivity.

Of course, one of the fascinating things about the brain is that it can be trained. I do not think Belichick worked out the numbers as Burke had. Yet he didn’t have to. His intuition was the kind of specialist’s ingrained intuition that came from years of thought about just such issues. Belichick, an economics major, had long thought outside of the box in terms of fourth downs, and we know he is familiar with David Romer’s research on the subject. When presented with the possibility of the fourth down, his intuition, built on three decades of thinking about fourth downs and many, many trials where his team had succeeded and won the game in such circumstances, told him the odds. This is the difference between a specialist’s intuition and a layman’s. Yet this is also the point of doing the kind of analysis that I tried to do and that Burke and others did: It trains your mind. The more you think in terms of possibilities and potential outcomes, the less you are fooled when some rare or at least relatively unlikely outcome occurs anyway. Ask any poker player.

As a counterpoint, I love the guys at Football Outsiders, but I was generally disappointed with their response to this. They are big “stats guys” in the sense that they track a lot of data and do a lot of good work to try to determine who the best teams, players, coaches, and the like are, but you don’t see a lot of probabilistic/stochastic reasoning in their work. Burke, on the other hand, is a big Nassim Taleb fan, and his reaction, like Belichick’s, was to think about the variables at play and to mentally move them around to figure out what really was the best call. Even stats guys can have faulty intuition on these issues. (Barnwell and others eventually went back and crunched some numbers and admitted that it was at least quite plausible that Belichick made the right call, which in my view is an understatement.)

And yet, the unemotional Belichick aside, humans are not machines and do not make decisions like them, nor should they. The emotional side of the brain (the side with all these crazy biases) is often our only hope for processing huge amounts of information in a limited time span, i.e. the seconds a coach has to make a fourth down call. Think of pilots, soldiers and their commanders during a firefight, lawyers being questioned by judges, doctors during surgery, or all manner of “learned” yet time-sensitive decisionmakers. And there is a human side to many stories. Even in this one, many have justified their rejection of Belichick’s call (mostly on an ex post basis, however) by asking, “What message does that send to your defense?” And maybe there is something to that: Once Manning got the ball around the thirty, with the crowd and the frenzy of the moment, the defense gave up a huge run to Joseph Addai and Manning threw a touchdown pass shortly thereafter. (On the other hand, an already depleted New England defense had been decimated by injury throughout the game, and #18 is a very good quarterback, or so I have heard.)

So we don’t want decisions made only on the measurable evidence, always and forever. But this debate has reinforced a somewhat cynical view of people that I have. There are two basic types: Those who, when confronted with evidence that challenges their instinctive or “gut” reaction, are skeptical of their gut, and those who are skeptical about the nature of evidence itself. I think over the years of writing this blog I have shown that I am clearly in the former camp. Note that this does not mean you always and forever follow the first evidence that is shown to you: Often we have “gut” feelings for a reason, and some of the best work is done when some support is shown for a proposition that feels wrong, and then people try to figure out why they feel so differently about it. In those cases, the evidence either survives or is even improved (and hopefully some minds are swayed), or the rigorous testing shows that there was some flaw in the evidence. But this view almost always leads to a healthy, respectful debate, and we all learn through the process.

On the other side are those who distrust anything not in their gut. And these people, like Tony Dungy, might have very good instincts. But the result is the dismissal of many good ideas, along with any pretense at debate. “Why is that wrong?” “It just is. I’m telling you.” The sad fact is it is easier to dismiss or ignore arguments (and people) than it is to engage with them or to justify your own views.

Now, whether or not some coach went for it on fourth down is a pretty silly thing to get worked up about. Yet I think the reason people have is that this deep divide, between the instinct sceptics and the evidence sceptics, has become exposed again. To be fair, football is a fine place to leave rationality at the door, and most people, including me, no doubt occasionally operate in the opposing camp depending on the issue. But following the evidence is a lot harder than we usually allow. And for doing that here, Belichick deserves credit. May we all be so bold.