Long before research exposed evidence that humans cause global warming, science made another sensational claim — that smoking caused lung cancer. That case has been proven beyond doubt. But there is a science story from this era that is mostly forgotten: The battle against cigarettes taught science how to prove.
Before linking cigarettes to lung cancer, science had no established method to prove that one thing caused another. The fields of epidemiology and statistics were new, and while they had some prior successes, the questions were so evident — think about mercury causing madness — that proof did not require the level of meticulousness that modern science expects. The need to establish a link between cigarettes and lung cancers — and the backlash that ensued — changed this. Epidemiology and cigarettes grew up together.
Today’s debate over global warming echoes that era. Because of politics, a post like this, intended to inform, will sway few minds. But I have spoken with skeptics who honestly want to understand, but don’t have the tools to grapple with such a large, complex field of science. And they have a point — while we talk a lot about the data, we rarely describe the path to a conclusion.
Provoked by their questions, I began to dig. And I unearthed a notion that is rarely mentioned in the global warming debate: Science actually has a method for establishing that one thing causes another. Scientists don’t have to vote on the issue; the 97% consensus of climate scientists who believe that humans cause warming is telling, but it is only one part of a broader process. And for those who want to honestly weigh their skepticism in the context of the evidence, there is a way. Here’s the story.
In the 1950s, Bradford Hill kept a box of cigarettes in his desk at the London School of Hygiene and Tropical Medicine. Professor Hill led the school’s Statistical Research Unit, and like most men of the establishment, he would open his box to respected visitors. This was hardly remarkable, save for one detail: Hill was lead statistician on the British Doctors’ Study. This was one of two large studies that, when published in 1954, led the American Cancer Society to declare that ‘the presently available evidence indicates an association between smoking, particularly cigarette smoking, and lung cancer’.
Between 1930 and 1940, the lung cancer rate among men tripled. Between 1940 and 1950, it doubled again. Between 1950 and 1960, it nearly doubled again. To quote the Surgeon General’s famous 1964 report linking smoking to lung cancer, “This extraordinary rise has not been recorded for cancer of any other site.” Yet in the 1950s, even with data against smoking amassing, it was still considered rude not to offer a cigarette.
There was no singular moment when scientists realized that smoking caused lung cancer. Beginning in 1912, when the first suggestion was made, scientists slowly built multiple strands of evidence, refining experiments as they learned. As the case grew in strength, each scientist, looking at the evidence before him (it was almost all men), individually concluded the causal connection was irrefutable.
Hill embarked on the Doctors’ Study because his previous research, performed with his longtime collaborator Professor Richard Doll in 1950, found a substantial correlation between cigarettes and lung cancer in a small patient population. The 1950 link was so striking, in fact, that Doll gave up his cigarettes immediately. Yet Hill himself held on to his pipe until the Doctors’ Study completed in 1954.
And while Hill and Doll and the American Cancer Society were in agreement by 1954, even in the late 1950s high-level critics remained, including the esteemed statistician Ronald Fisher, who pronounced himself “extremely skeptical of the claim that decisive evidence has been obtained.” Fisher was not a man to be taken lightly. He has been described by the biologist Richard Dawkins as “the greatest biologist since Darwin”. He provided a mathematical basis for evolutionary theory. He single-handedly created most of modern statistics and the design for the randomized controlled trial, which went on to become the primary tool of medical research. Fisher also smoked, preached libertarian political views, and was an advisor to the Tobacco Manufacturers Standing Committee. He was not happy about the idea that scientists should inject “propaganda” into an unsuspecting public. Especially because he believed the science was wrong. To illustrate Hill and Doll’s folly, Fisher tore apart their data, highlighting discrepancies between cancer rates in cigarette, cigar, and pipe smokers. He described how much of the increase in cancers could be ascribed to improved methods of detection. And he inaugurated the study of spurious correlations, showing that Doll and Hill’s methods would directly tie an increase in the import of apples to an increase in the divorce rate.
Hill would eventually be proven right. But he needed to develop better tools to show this. And by creating these tools, he would define the rules of proof in epidemiology for the next fifty years.
It is simple to show correlation. But how can one prove causation? This problem is not limited to studies of smoking — it extends through all of science. In fact, if you were to ask scientists outside epidemiology what process they use to “prove” causality, I’d wager that most would either change the subject, or stare back blankly at you. Different scientists evidently maintain different standards for proof. No one is working with a standard process. This is vaguely unsettling.
The problem certainly infects the global warming debate. “Proof” gets thrown around by different people in different ways, leaving everyone confused. Politically driven skeptics leave the term undefined to sandbag the discussion for their own purposes: it’s easier to claim “not enough is known” when you never define “known”. Yet the lack of definition hovers like a fog over anyone trying honestly to parse out the answers for themselves. It’s hard to have faith in something so ill-defined. And this brings us back to Professor Hill.
The battle against smoking was the first bare-knuckles public policy debate driven by science. So over years of defending his work, Hill had to think deeply about what constitutes ‘proof’, and how to overcome the intelligent rebuttals of the world’s Ronald Fishers. In 1965, he formally proposed a solution.
Hill recognized that there are more ways to support causation than finding that two variables track. In fact, Hill identified nine separate strands of ‘proof’, each of which makes an independent case for or against causation. The list of nine aspects (I’ll go into details below) is now called Hill’s Criteria.
You don’t need strong support from all of the strands to prove a result. But when independent strands tell the same story, with no contradictions, the case is strong. Perhaps as importantly, by using fixed criteria, we can categorize not just data we have, but identify what data are missing as well. And with all of the possible evidence in mind, we can effectively draw a conclusion using classic, human judgment.
Ronald Fisher passed away before Hill published his criteria, so he never had a chance to offer his final verdict. But the field of epidemiology has. Hill’s Criteria have effectively ended the debate over how to analyze cause, and have been used largely unchanged for the last fifty years. Fisher’s contribution was not to prove Hill wrong, but to make Hill’s arguments stronger. While Fisher’s skepticism did much damage to the public (some of whom might have stopped smoking sooner but for his efforts), the battle forced Hill to structure his thinking, to the benefit of all of science.
And while Hill’s Criteria are not commonly used outside epidemiology, they should be. The criteria take an impossibly large and complex pile of data and break them up into chunks. They make the evidence understandable. And they make the case for causality transparent — each piece of evidence is categorized, and weighed in the context of the whole. If evidence is challenged, it becomes clear just how devastating or inconsequential that challenge is. We lose any presumption that somehow a single set of data could prove the entirety of scientific understanding to be in error.
So from here, we leave the history of cigarettes and health, and dive into the weeds of global warming. What happens when we apply Hill’s criteria to the question:
Are humans, by adding CO2 to the air, causing the planet to warm?
Hill’s Criterion #1: Strength: How strong is the relationship between CO2 and temperature?
As the old saying goes, “correlation is not causation, but it’s a damn good place to start.” All other things being equal, a strong correlation is more likely to hold up as causal. The correlation between temperature and carbon dioxide concentration over the last thousand years is strong.
This does not look like a coincidence. But knowing that there is a strong correlation is not enough. We do not know if carbon dioxide causes temperature to rise; temperature causes carbon dioxide to rise; or some third, independent factor is causing both to rise. Many, many scientific papers outside climate science offer up a correlation as if it were meaning. Many, many scientific papers have been wrong as a result. To get more insight into this, we need to look deeper.
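The third possibility, a hidden factor driving both variables, is easy to see in a toy simulation. The sketch below uses made-up random numbers, not climate data, and the names `pearson` and `simulate_hidden_driver` are illustrative assumptions. Two series that never influence each other still come out strongly correlated because both depend on the same hidden driver; once that driver is subtracted out, the correlation vanishes.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_hidden_driver(n=2000, seed=0):
    """Return (driver, a, b), where a and b each depend only on the driver."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    driver = [rng.gauss(0, 1) for _ in range(n)]
    a = [d + rng.gauss(0, 0.5) for d in driver]  # a never sees b
    b = [d + rng.gauss(0, 0.5) for d in driver]  # b never sees a
    return driver, a, b


driver, a, b = simulate_hidden_driver()

# Strong correlation, even though neither series causes the other.
r_raw = pearson(a, b)

# Remove the hidden driver and only independent noise is left.
a_resid = [ai - di for ai, di in zip(a, driver)]
b_resid = [bi - di for bi, di in zip(b, driver)]
r_resid = pearson(a_resid, b_resid)

print(f"correlation of a and b:          {r_raw:.2f}")
print(f"correlation after removing driver: {r_resid:.2f}")
```

A strong correlation alone cannot tell these three stories apart, which is why the remaining criteria matter.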
Criterion #2: Consistency. Is the data consistent across multiple measurements, at multiple places and times?
I harped on consistency a lot in my last blog post. Science should never rely on a single type of measurement, because single measurements can have unexpected flaws. Multiple strands of data are needed to confirm a hypothesis.
When looked at through that lens, how does the above graph hold up? Thermometer records have only existed since the 1850s, and were only recently distributed throughout the globe. As a result, scientists have had to get creative to reconstruct the temperature record, developing proxies such as grape harvest times in Europe, or the compositions of sea shells in the ocean. A 2012 paper collated 173 different measurements, and their average accurately tracked thermometer measurements over the past century, yet extends backwards even further. An example that confirms that the globe is warming is a measure of the growing season in the US, which has on average extended by about two weeks over the last century. It doesn’t match temperature exactly (there is more to farming than temperature), yet has a recent rise that looks familiar.
Again, this data simply confirms that we are not kidding ourselves that the climate is changing, and that this change correlates with CO2. It still does not say why.
Criterion #3: Specificity. Is the change that we are seeing specific to this point in history?
Claiming that humans are causing climate change by burning fossil fuels makes a very specific kind of prediction: You should see nothing like this change at any other point in the Earth’s history. The climate has varied continually throughout the Earth’s geological past through simple patterns such as periodic changes in the Earth’s distance from the sun. By drilling miles deep through the Antarctic frost and measuring the nuclear composition of the ice, scientists can infer the average temperatures during the time the ice was deposited. By measuring air trapped inside the ice, they can also infer carbon dioxide concentrations. The resulting graph looks like this:
You can see a natural cycle of ice ages due to variations in the planet’s orbit. If you believe the graph, sometimes CO2 spiked before the warming, and sometimes the warming started before CO2. In fact, both may be true: There is a feedback loop, and a warming climate releases CO2 from the oceans; the increased CO2 in the air in turn warms the climate more. As one climatologist joked, arguing that one of the two has primacy is like arguing that chickens can’t create eggs, because we have proven conclusively that eggs create chickens.
But that blue line at the end sure is interesting. While there is no question that climate varies naturally, that giant spike of CO2 stands out. It does look specific to today. There is also one other useful piece of information you can glean from this graph. That last increase in the temperature (the red line) is not human-caused warming — it’s the end of the last ice age, the one that allowed humans to cross the Bering Strait from Russia to North America. The x-axis of the graph stretches 400,000 years, so each of these vertical-looking jumps in temperature represents a change that occurs over about 10,000 years, at rates of about 1°C every 1,000 years.
How does that compare with today? Since global temperature records started being kept in 1880, temperatures have risen just under 1°C, with most of that increase happening since 1970.
Measured over the decades since 1970, that’s at least 10X faster than a typical ice age warming. Whatever is going on, it seems eerily specific to the time the world was industrializing. We have identified a cause that is unprecedented in the Earth’s history, and we see a result that is similarly unprecedented. It’s a highly suggestive relationship.
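The rate comparison is simple arithmetic, and it is worth seeing how much it depends on the window you pick. In the sketch below, only the ice age rate (about 1°C per 1,000 years) and the start years come from the text. The end year, the exact total rise, and the portion since 1970 are illustrative assumptions standing in for "just under 1°C" and "most of that increase".

```python
# From the text: ice age warming runs at about 1 °C per 1,000 years.
ICE_AGE_RATE = 1.0 / 1000  # °C per year

# Illustrative assumptions, NOT figures from the post:
END_YEAR = 2015           # assumed end of the temperature record
TOTAL_RISE_C = 0.9        # assumed value for "just under 1 °C" since 1880
RISE_SINCE_1970_C = 0.6   # assumed value for "most of that increase"


def rate_per_year(rise_c, start_year, end_year):
    """Average warming rate in °C per year over a window."""
    return rise_c / (end_year - start_year)


whole_record = rate_per_year(TOTAL_RISE_C, 1880, END_YEAR)
recent = rate_per_year(RISE_SINCE_1970_C, 1970, END_YEAR)

ratio_whole = whole_record / ICE_AGE_RATE
ratio_recent = recent / ICE_AGE_RATE

print(f"1880 to {END_YEAR}: {ratio_whole:.1f}x the ice age rate")
print(f"1970 to {END_YEAR}: {ratio_recent:.1f}x the ice age rate")
```

Under these assumed inputs, the average over the whole record is several times the ice age rate, and the stretch since 1970 is more than ten times it. Different plausible inputs shift the numbers, but not the order of magnitude.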
Criterion #4: Temporality. Which came first in modern times, the CO2 or the warming?
This one should be pretty easy. The hypothesis that human-created CO2 causes climate change yields a simple prediction: emissions should come before warming. But the data looks like this:
What gives? We already know from other data that industrialization caused our rise in CO2, independent of the Earth’s climate. That is accounting, not science. Given that pre-existing knowledge, we don’t have to worry about getting causality backwards. But that doesn’t eliminate the possibility that some unexpected coincidence is causing temperature to rise with CO2, so it’s still helpful to know if one came first.
In this particular case, the simple chart is inconclusive. There was a move to higher temperatures just as industrialization began, and that temperature rise preceded emissions. However, once the burning of fossil fuels for transportation and energy really took off after WWII, the effects of humans became more pronounced. From then, emissions precede warming.
Our natural climate is not perfectly stable on its own, so it’s not surprising that when we were emitting very little CO2, there was a natural uptick or downtick that swamped out the effects of man-made change. Climate scientists point out that once emissions started their exponential climb in the 1950s, emissions clearly precede warming.
Skeptics, meanwhile, point to this graph as something that should make the whole edifice of climate science crumble.
But as I try to highlight in this blog and everywhere else, no single graph will make or break a theory. We use Hill’s checklist to enforce discipline — to make sure we are looking at the problem from all directions, and to highlight places where we should look harder. If a single bullet point doesn’t deliver unequivocal support, that’s ok. Reality is sometimes complicated. As long as it doesn’t unequivocally contradict, the hypothesis should survive.
This bullet point flags a place where we need to look harder. To understand a complex system, you need to build more complex models, and I’ll come back to this again below. But meanwhile, this discussion leads directly to the next important criterion:
Criterion #5: Dose-response. Does the temperature increase scale with CO2 increase?
Smoke more cigarettes, and you are more likely to get lung cancer. This simple relationship, in which an increased dose yields an increased response, is a hallmark of a causal connection. It’s easy to imagine one experiment going awry. It’s much harder to imagine a series of experiments going awry in a well-defined, orderly pattern.
The link between CO2 and temperature has feedback loops — an increase in temperature will raise atmospheric CO2 levels as the gas moves from ocean to air. So the historical correlations that I have shown above aren’t really relevant here — we know the climate is not so simple. To deconvolute the two, we have to look at data taken in modern times, when we know the CO2 rise has been driven by the burning of fossil fuels.
Now, the modern temperature increase does correlate with CO2, but it’s just one data set. One data set is suggestive, not convincing.
But scientists are nothing if not resourceful. Below is a measurement of the amount of infrared energy reflected (more technically, absorbed and re-emitted) back from the atmosphere to Earth, the “Greenhouse Effect”. This measurement isolates wavelengths where CO2 is the sole contributor to the reflection. And lo, the amount of reflected energy not only tracks the long-term trend in CO2, but it also tracks the seasons, as atmospheric CO2 decreases in the spring as plants grow, and increases in the fall when they go dormant. It clearly shows a dose-response relationship.
Pretty damn impressive. That’s an awfully complex relationship to be a coincidence.
Very similar measurements (with less pretty graphs) have been made for outbound radiation as well — we can measure the amount of energy radiated from the Earth using satellites, and find that it has gone down since the 1970s, when the first satellite measurements were made.
Adding CO2 leads to more energy staying on the planet. And that retained energy manifests as heat.
Criterion #6: Plausibility. Does the causal relationship make physical sense?
The idea that the Earth’s atmosphere functions as a sort of insulating blanket was first proposed by French mathematician Joseph Fourier in the 1820s, while he was developing a formal theory of heat flow (one that is still taught to engineering students today). He calculated that, given the Earth’s distance from the sun, the planet should be colder than it actually is. To solve this dilemma, Fourier postulated that the atmosphere traps heat just as a glass wall of a greenhouse does.
In 1859, British physicist John Tyndall teased out the degree to which each atmospheric gas should contribute to warming, showing that CO2 indeed participated. And in 1896, the Swedish chemist Svante Arrhenius published a rough calculation that doubling the amount of CO2 in the Earth’s atmosphere would increase its temperature by 5–6°C.
Back in the 1890s, human emissions were so small that it would have taken several hundred years to reach this threshold, so his calculation was seen more as a parlor trick than a call to action.
But then came industrialization.
In 1938, the engineer and amateur meteorologist Guy Callendar dug into CO2 records from the 1800s to the (then) present day, and found that atmospheric CO2 had increased by 10%. This was much faster than Arrhenius had anticipated, because industrialization consumed increasing amounts of fossil fuels each year. Based on these measurements, Callendar estimated that warming from humans was already under way.
Of course, none of these scientists would be remembered save that their early guesses, based on insufficient data and absurdly immature models of the climate, turned out to be roughly in line with modern assessments. Callendar was in fact wrong in his estimate that warming in the 1930s was being caused by CO2; modern models find that warming to be primarily from natural causes. History remembers both the good and the lucky.
But if we are trying to assess whether human-induced warming is plausible, then the answer is clearly yes. It has been for generations.
Criterion #7: Coherence. Do the data fit in with current theory and knowledge?
This is where the much-discussed scientific consensus comes in: Of climate science papers that take a position on the issue, 97% support the concept of anthropogenic (human-induced) global warming.
And to be clear, this is not a consensus generated by a monolithic group of nerds. It includes over 10,000 scientists from an astonishing range of sub-disciplines, from computer modeling to atmospheric spectroscopy to paleobiology. They hail from seventy-four different countries. They represent people who get their funding from different sources, publish in different journals, fraternize with different cliques, and generally have nothing in common with each other besides their desire to understand the climate.
Even when you break down the published literature by subfield, over 97% of the literature of each subfield supports the hypothesis of man-made climate change.
To wit: A search of the literature of over 12,000 papers containing the phrases “global climate change” or “global warming” shows that only 77 papers reject the hypothesis of human-induced climate change.
That’s a coherent story.
Criterion #8: Experiment. Can we alter, prevent, or improve the situation with an intervention?
The metaphor with epidemiology breaks down slightly here, since it is not possible to take a handful of Earth-like planets and test an intervention on them. We have one Earth, one climate, and one fate. We cannot yet intervene to engineer a new, calmer climate.
However, nature has provided us with a recent, natural experiment in climate engineering: The eruption of Mount Pinatubo in 1991. The largest eruption of the last century, the volcano shot rocks over 40 kilometers high, and spewed out about 17 megatons of sulfur dioxide, which rapidly reacted with water in the atmosphere to form aerosols of sulfuric acid. Within weeks, the aerosol plume had spread over the globe, and within a year formed a uniform layer around the atmosphere.
Aerosols have the opposite effect of carbon dioxide: instead of heating the planet by trapping radiation, they reflect the radiation away, cooling temperatures. In the two years after the eruption, global average temperatures dipped by roughly half a degree Celsius.
Seventeen megatons of material may sound like a lot, but it represents about the volume of Boeing’s Everett Factory, where they assemble 787s. A small amount of material, dispersed evenly through the atmosphere, can have global impact.
Fortunately for us, aerosols disperse quickly, allowing climate to revert to normal in just a few years. With carbon dioxide, by contrast, normalcy will not return for centuries, or even millions of years if species that capture carbon dioxide are rendered extinct by the changes. That last scary bit has happened before.
Criterion #9: Analogy. Is there an analogous, better-understood system that makes the CO2 climate hypothesis plausible?
The idea behind this criterion is that an explanation is more likely to be valid if there is another system that behaves similarly.
The most accessible example is a greenhouse. Another example might be that Venus is hotter than Mercury, even though Mercury is closer to the sun.
Pick your favorite.
Where did the climate models go?
It’s somewhat disconcerting to have a conversation about global warming that doesn’t involve climate models — they are what gets most of the press. Yet we have seen above that you don’t need models to make an effective case that humans are causing climate change.
Models are nonetheless profoundly important, both for understanding our past and for predicting our future. So let’s return to this question we asked above in Temporality: How do we know that the rise in temperature at the end of the 19th century was natural, while subsequent rises are man-made?
To address this, scientists run “experiments” on computers, adjusting their models to consider only natural changes (with no human contribution) to see if they can get a nature-only model to match the data. They fail. When scientists consider only human additions to the climate, with no natural forcings, those models fail too. Only when they combine human and natural contributions to climate do the models fit the data.
The models — and there are a lot of them — all vary slightly from each other. The average model may overestimate warming by a few degrees, or underestimate it. But together, the models are unequivocal in their predictions. The climate will continue to warm. Much faster than it ever has before. With enormous consequences.
Why I like Hill’s criteria.
A magical thing about structure is that it gives you no place to hide. When placed in Hill’s criteria, the strong points and weak points of the argument leap out. You know exactly what data you’d like to gather, if you had the chance. (Go find another planet to test our hypothesis on, for one.) If there are holes in the plot of your story, the truth is laid bare for all to see. If you believe in alternatives to human-caused global warming, test them in this structure. See if the story holds.
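For readers who think in code, the checklist can be sketched as a small data structure. This is only an illustration. The verdict labels are my own shorthand for how each criterion came out in this post, not anything Hill defined, and names such as `Strand`, `summarize`, and `survives` are made up for the example. The rule in `survives` is the one used above: a hypothesis stays standing as long as no strand unequivocally contradicts it.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    SUPPORTS = "supports"
    INCONCLUSIVE = "inconclusive"
    CONTRADICTS = "contradicts"


@dataclass(frozen=True)
class Strand:
    criterion: str
    verdict: Verdict
    note: str


# Verdicts are one reading of this post's discussion (an assumption).
CLIMATE_CASE = [
    Strand("Strength", Verdict.SUPPORTS, "strong CO2-temperature correlation"),
    Strand("Consistency", Verdict.SUPPORTS, "independent proxies agree"),
    Strand("Specificity", Verdict.SUPPORTS, "CO2 spike and warming rate stand out"),
    Strand("Temporality", Verdict.INCONCLUSIVE, "early warming preceded emissions"),
    Strand("Dose-response", Verdict.SUPPORTS, "re-emitted infrared tracks CO2"),
    Strand("Plausibility", Verdict.SUPPORTS, "greenhouse physics dates to Fourier"),
    Strand("Coherence", Verdict.SUPPORTS, "fits the published literature"),
    Strand("Experiment", Verdict.INCONCLUSIVE, "only a natural experiment exists"),
    Strand("Analogy", Verdict.SUPPORTS, "greenhouses, Venus"),
]


def summarize(strands):
    """Group criterion names by verdict so the weak spots are visible."""
    groups = {verdict: [] for verdict in Verdict}
    for strand in strands:
        groups[strand.verdict].append(strand.criterion)
    return groups


def survives(strands):
    """True unless some strand unequivocally contradicts the hypothesis."""
    return not any(s.verdict is Verdict.CONTRADICTS for s in strands)


for verdict, names in summarize(CLIMATE_CASE).items():
    print(f"{verdict.value}: {', '.join(names) or 'none'}")
print("hypothesis survives:", survives(CLIMATE_CASE))
```

The sketch deliberately returns categories rather than a score. To test an alternative explanation, fill in the same nine strands for it and see where the entries land.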
The fact that we rely on stories to judge the legitimacy of an idea may strike some as lacking scientific rigor. So be it. I would love to put a probability behind the declaration that humans are causing climate change. But the world is too complex for us to reduce inquiry to a single number. Part of being a good scientist is to understand your limits.
All scientific work is incomplete, and at risk of being toppled by tomorrow’s discoveries. That does not give us leave from acting today.
The evidence supporting man-made global warming creates one of the strongest science stories I have ever seen.
And this process is how I know.