Archive for November, 2013

Temperature data, Internally Consistent Beliefs, and Data bias

November 25, 2013

This is a long overdue post. First, a little background.

There has been a long standing controversy in the climate debate about temperature data from the near surface observing stations and the temperatures inferred from satellite data. According to most climate models, when the surface temperature rises, the temperatures in the atmosphere should on average rise slightly more: this is apparently a consequence of the lapse rate following the moist adiabat. This is a reasonable prediction, in that it appears to be grounded well in physical theory. But satellite data have long shown less warming than one would expect from any significant degree of amplification. This led to quite a bit of wrangling, and adjusting of data, until the present, where the existence of the discrepancy between data and models depends on the dataset chosen, and the datasets vary in trend because of arguably subjective methodological differences. My personal preference is for the UAH data, which, if accurate, suggests the discrepancy is present, assuming the surface data are accurate. I have numerous reasons for preferring the UAH product, chief among them several publications by John Christy identifying specific sources of bias in other datasets, and independent work confirming some of them. (I will note that I have continued to prefer it even as RSS has been cooling globally relative to it in recent years, unlike, sadly, some other skeptics.) I would note that some have the attitude that it would be nice to show UAH is wrong. At any rate, I’m using it for this analysis, and as we shall see, I think an interesting argument in favor of UAH can be made on the basis of the analysis I will do.

Okay, so, to begin with: The United States (specifically, the lower 48 states) probably has the densest temperature observing network anywhere on Earth-as such, analysis of US data has a lot to work with and has a good chance to catch and correct biases. I think it is reasonable to expect, a priori, that the US has minimal bias in estimates of surface temperature trends compared to other places in the world. You can get the USHCN data for CONUS averages in absolute monthly temperatures here. Note that one must convert from Fahrenheit to Celsius for comparison with UAH satellite data (available for various regions in monthly anomalies here, including over the CONUS). But I decided I wanted to test whether satellite data can show any sign of possible non-climatic bias in the US data. This is tricky because temperatures at the surface and in the atmosphere are expected to vary differently, and by an unknown factor. Rather than modeling this factor, I attempted to estimate it empirically. I have done something like this before for global means. I get the following results:

Based on linear regression of anomalies against anomalies, a fluctuation of 1 degree in lower troposphere temperatures over the US is generally associated with a fluctuation of ~1.27 degrees in surface temperatures. Regression of detrended anomalies leads to a coefficient of ~1.28. By doing 12 month running averages on anomalies, I get ~1.22 and by doing the same on detrended anomalies, I get about 1.26. Using the highest and lowest values found for the coefficients, I “predict” the surface anomalies on the basis of the UAH data. Comparing those predictions with the observed anomalies, I can estimate the bias in the surface temperature data over the US, assuming the UAH data is correct and that the processes which determine lapse rate variations are independent of timescale. As it turns out, these estimates result in a (very slight) cooling bias in the trend. At most, USHCN appears to run about .02 degrees per decade too cool relative to variance adjusted UAH; at least, about .01. These results are very favorable for USHCN and indicate that there are unlikely to be serious homogeneity problems present in the data given the assumptions we are making. Here is a plot of the two estimates of the bias:


Now, a lot of skeptics might not like this result. If you don’t like this result, I ask you to please hold off on complaining that you don’t like the result just yet. This is just the US, where we have a lot of high quality data: the absence of any warming bias there in the last 30 years doesn’t mean that there is no bias anywhere else. In fact, looking at the global data will be interesting.
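Before turning to the global data, here is a rough sketch of the US calculation described above, for anyone who wants to try it themselves. It is only an illustration under stated assumptions: the function names are mine, the input series are synthetic stand-ins rather than real USHCN or UAH values, and the sign convention (observed surface minus scaled satellite, so a positive trend means the surface record warms too much) is an assumption.

```python
import numpy as np

def ols_slope(x, y):
    """Least-squares slope of y regressed on x (intercept included)."""
    return np.polyfit(np.asarray(x, dtype=float), np.asarray(y, dtype=float), 1)[0]

def detrend(y):
    """Remove the least-squares linear trend from a series."""
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y))
    slope, intercept = np.polyfit(t, y, 1)
    return y - (slope * t + intercept)

def running_mean(y, window=12):
    """Running average over `window` months (only fully covered windows)."""
    return np.convolve(y, np.ones(window) / window, mode="valid")

def scaling_coefficients(satellite, surface):
    """Four estimates of how surface anomalies scale with satellite anomalies."""
    return {
        "raw": ols_slope(satellite, surface),
        "detrended": ols_slope(detrend(satellite), detrend(surface)),
        "smoothed": ols_slope(running_mean(satellite), running_mean(surface)),
        "smoothed_detrended": ols_slope(
            running_mean(detrend(satellite)), running_mean(detrend(surface))
        ),
    }

def bias_trend_per_decade(satellite, surface, coefficient):
    """Trend of (observed surface - coefficient * satellite), per decade."""
    difference = np.asarray(surface, dtype=float) - coefficient * np.asarray(satellite, dtype=float)
    months = np.arange(len(difference))
    return ols_slope(months, difference) * 120.0  # 120 months per decade

# Synthetic stand-in data (NOT real observations): 35 years of monthly anomalies.
rng = np.random.default_rng(0)
months = np.arange(420)
satellite = rng.normal(0.0, 0.5, months.size) + 0.001 * months
surface = 1.25 * satellite + 0.0002 * months  # deliberately built-in spurious trend

coefficients = scaling_coefficients(satellite, surface)
low, high = min(coefficients.values()), max(coefficients.values())
bias_range = [bias_trend_per_decade(satellite, surface, c) for c in (low, high)]
```

Using the lowest and highest coefficient brackets the bias estimate, which mirrors the "at most / at least" range quoted for USHCN.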

To examine this issue globally, it was necessary to choose which surface temperature dataset would be most comparable to the UAH product. Thinking about it, because GISS uses a large spatial smooth that allows “coverage” of areas with sparse data, it is more comparable to UAH, which has full coverage over 85N-85S. But GISS extrapolates beyond that range. Using GISS also lets me sidestep the recent questions about HADCRUT4 surrounding a paper attempting to correct its coverage bias and finding an underestimate of warming. I am looking into doing a similar analysis of the Cowtan and Way data, which might be interesting. Therefore, I downloaded GISS 1200 km smoothed data from KNMI, masked to the satellite spatial range. Doing the same kind of analysis of this data as I did with the USHCN data, I found coefficients of ~.79, ~.57, ~.89, and ~.61, respectively. Under the same assumptions as in the USHCN bias calculation, this leads to a range of estimates for the bias in GISS data of between ~.04 and ~.08 degrees per decade too much warming. The same analysis that indicates USHCN has a very small cooling trend bias indicates GISS has a large warming bias. And a plot of the differences suggests, to me, that the larger estimate of the bias may be more accurate: specifically, for the smaller estimate, there is a large dip in the bias close to the 1998 El Niño, suggesting that there are climatic effects in the “bias” estimate that do not appear to be present in the higher estimate of the bias:


Note the presence of ENSO artifacts in the (blue) low estimate of warming bias in GISS. Note also that the differences, relative to the magnitude of the trends, are not negligible as they were over the US. These results can be seen in these plots of the estimated surface anomalies and the official surface anomalies, with the estimates being those that lead to the red differences above, both smoothed with a 12 month moving average filter (also shown are their linear trends before this is done):


The red curve is GISS, the blue is what I believe is a best estimate UAH based “estimated GISS” without non-climatic biases. The warming trend is cut in half. I repeat: this analysis suggests that half of the surface temperature warming since 1979 does not reflect an addition of actual heat to the climate system; the real trend is lower.

Now, for interpreting these two results. Let’s suppose you agree with the following proposition: that long term lapse rate variations are governed by the same processes as short term ones, and should be the same in proportion. My understanding is that theory and models both suggest this should be so.

Given that, here are some sets of internally consistent beliefs you can hold about what this analysis shows:

If you like USHCN, you should like UAH; if you like UAH, you should like USHCN. If you believe UAH validates the surface temperature adjustments in the US, you have to admit that it invalidates them globally. Any correction to the UAH data to bring it into better agreement with models and GISS would destroy its agreement over the US.

Or you can believe neither dataset is accurate.

Now, many skeptics would like to think USHCN has a large warming bias. Well okay, you can believe that if you reject its agreement with UAH as a complete coincidence. That would be a consistent (if a little unreasonable) set of beliefs.

Many of the alarmed would like to think that the USHCN is accurate and that UAH is wrong. This belief is inconsistent. Many of them would also like to think that the USHCN data are accurate and the global near surface temperature data are equally accurate. This belief is consistent as long as it entails a belief that the UAH data coincidentally agrees with the USHCN data but is otherwise completely randomly wrong. But the belief also appears to require that satellite analyses with more warming are more accurate. This is inconsistent: Do the same analysis as above for yourself with RSS, and I believe you will find the agreement with USHCN is terrible. This actually provides an interesting argument why UAH is probably better: it agrees well with the best surface temperature data.

Your only alternative to these positions is to reject the idea that short term lapse rate variations are governed by the same processes as long term ones. But, over the US, where long term surface changes would presumably lead to significant changes in long term boundary layer/lower troposphere coupling, there is no evidence for such a difference: the lapse rate variations are basically exactly proportionate on all observed timescales…as long as UAH is correct and USHCN is correct. That seems reasonable since, again, their agreeing so well by chance seems unlikely if they aren’t both correct. Different processes may be at work elsewhere and just not in the US, for unknown reasons: the temperature trends at the surface could still be real. But that would entail mechanisms not currently included in present climate models, involving boundary layer dynamics, and those trends would not accurately reflect a gain of heat from greenhouse warming.

So while everyone else is focused on whether or not the “pause” can be eliminated by extrapolating data over areas where we have no surface observations, using satellite data, I am interested in the question instead, how much temperature trend over the area of satellite observations is actually a reflection of accumulating heat? The answer is a lot less than the amount that has been measured. I would have to check, but I am pretty sure this would, even with polar extrapolation, significantly lengthen the “pause” and increase disagreement with models. That the trend in surface temperatures related to heat accumulation is so small, suggests drastically reduced climate sensitivity relative to all studies that use the surface data. This includes several recent papers giving “lukewarm” sensitivity estimates.

Well, anyway, food for thought.

Guess That Graph!

November 22, 2013

So here’s something fun, and I promise you I am not just doing this to incite a big argument and attract traffic because things are so dull here. I don’t intend to make a habit of it, and I am not telling you yet what I’m doing.

Let’s play a game. It’s called Guess That Graph! I’m going to display a graph, unlabeled and uncaptioned (yeah, I know, not that different from what I usually do 😉 )  and I won’t say what it depicts (okay one hint: it’s not a climate graph). Let’s see if people can guess what it actually shows. I’m going somewhere with this, I promise, but I don’t know that I am going to want to do this sort of thing on this blog in the future. Hm, maybe another blog…Anyway:


So. Guess that graph!

Can you isolate a volcanic temperature signal in the temperature data?

November 7, 2013

So I have left you all in suspense as to what my big project is. This is it.

The answer appears to be yes, with difficulty. Using data from here, I identified the points at which several spikes in aerosol optical depth (AOD) occurred globally since 1850. Specifically, I picked the seven largest spikes that did not begin before the decay of a previous eruption had apparently ended. This is what the time variations of those eruptions’ optical depths look like relative to one another:


Figure 1: AOD profiles of seven eruptions.

The red curve is the “average” eruption profile. The start dates were cross checked when possible against dates of known eruptions, but the date before the first sudden jump in AOD was chosen even if it preceded the eruption, as long as it did not do so by more than twelve months. For example, the start date for one of the above curves of December 1882 very roughly corresponds to Krakatoa-that is, the AOD increases suddenly in January of 1883. This is the beginning of the spike associated with Krakatoa…except that doesn’t quite work. There appears to have been some build-up from some other eruption or eruptions before that, since Krakatoa did not itself erupt until August of 1883. I chose to leave the start date as the month before the increase relative to the background began rather than the date of the eruption. In another case, there was a spike beginning after September of 1855, which I can’t seem to identify the associated eruption for. Sato et al. identify the spike with Cotopaxi, for what it’s worth. So these are my chosen “start” (baseline) dates:

September 1855 (Cotopaxi, others in 1856?)

December 1882 (Krakatoa, possibly some smaller eruptions earlier?)

December 1901 (Santa María, again this is months before the eruption proper, but I have chosen when the increase in Optical depth began.)

May 1912 (Novarupta, or Katmai-which seem to refer to the same volcano but different parts of it. The timing here is perfect, with the eruption happening in June, directly associated with when the spike in optical depth begins. From this point onward, I figure the dates are more reliable)

March 1963 (Agung-there was some increase in AOD from 1960 to 1961 but it essentially leveled off. I selected this date on the basis that after it the AOD jumps rapidly, and it happens to perfectly align with the timing of the eruption)

March 1982 (El Chicón, perfect timing again.)

May 1991 (Pinatubo, perfect timing again.)

Now, given these start dates, the next problem to solve becomes: how to isolate the temperature signal? The temperature record (I used HADCRUT4) contains long term trends in an apparent pattern, which need to be removed first to isolate the short term effects of volcanic eruptions. I have previously worked on techniques to smooth short term variations out of data and identify long term signals. The technique goes like this:

First: How long is the time series? Let’s say it is n months long. At the time I originally did this that was 1958 months for the temperature data.

Second: Is n odd or even? If odd, subtract one and divide by two; if even, subtract two and divide by two. Call this number m. You will need it later, so write it down.

Third: Take a three point centered average of the time series, such that you create a time series n-2 months long that starts at the second month and ends on the n-1 month. For those “missing” months, take an average like this: (1st month+1st month+2nd month)/3 and (nth month+nth month+n-1th month)/3 for the first and last month of the first smoothed timeseries, respectively.

Fourth: Repeat the third step m times (including the first time) treating the k-1th result as the original timeseries in the kth repetition.

EDIT: To clarify the above point, you repeat step three m times, that is, until k = m. So if you are doing this in Excel, with each smoothing column acting on the previous (that is, the kth column acting on the k-1th column) you should have m total columns (not counting the original data column or a column for time).
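For those not working in Excel, the four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the original spreadsheet: the function names are hypothetical, and the endpoint handling follows my reading of the third step (the first and last values are counted twice in their own averages, so the series keeps its length n).

```python
import numpy as np

def passes_for_length(n):
    """m from the second step: (n - 1) / 2 if n is odd, (n - 2) / 2 if even."""
    return (n - 1) // 2 if n % 2 == 1 else (n - 2) // 2

def three_point_pass(series):
    """One centered three-point average, with the endpoints repeated."""
    series = np.asarray(series, dtype=float)
    # Padding with the first and last values gives (x1 + x1 + x2) / 3 at the
    # start and (xn + xn + x(n-1)) / 3 at the end, as in the third step.
    padded = np.concatenate(([series[0]], series, [series[-1]]))
    return (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0

def iterated_smooth(series, passes=None):
    """Apply the three-point average m times, each pass acting on the last."""
    smoothed = np.asarray(series, dtype=float)
    if passes is None:
        passes = passes_for_length(len(smoothed))
    for _ in range(passes):
        smoothed = three_point_pass(smoothed)
    return smoothed
```

For a 1958 month series this gives m = 978 passes. Calling `iterated_smooth` again on its own output repeats the whole procedure, which is one way to get a still smoother long term signal.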

This ends up looking like this:


Figure 2: HADCRUT4 and smoothed variations, in K.

I was not satisfied with this as having removed the dips from volcanoes-the dips from Pinatubo and El Chicón, for example, appear to still be present. So I then repeated the smoothing process 9 more times, until the long term signal looked like this:


Figure 3: Further smoothing of HADCRUT4

The original smooth is shown for comparison. Subtracting this long term signal from the monthly data leaves the short term “noise”:


Figure 4: Short term variability of temperature, in K

Obviously, various global “weather” is present here: ENSO etc. The data is noisy. But if we take segments of the data beginning at the dates we picked for the beginning of spikes in AOD, we get something like this:


Figure 5: Temperature profiles after 7 volcanic eruptions, and the average profile.
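The segment-and-average step behind Figure 5 can be sketched as follows. This is only an illustration under assumptions: it supposes the short term series is a monthly array whose first element is January 1850, it uses the 110 month segment length mentioned later in the post, and the helper names are mine.

```python
import numpy as np

# Baseline ("start") dates listed above, as (year, month) pairs.
START_DATES = [(1855, 9), (1882, 12), (1901, 12), (1912, 5),
               (1963, 3), (1982, 3), (1991, 5)]
SEGMENT_MONTHS = 110  # segment length used for the average profile

def month_index(year, month, first_year=1850, first_month=1):
    """Position of a calendar month in a monthly series (assumed to start Jan 1850)."""
    return (year - first_year) * 12 + (month - first_month)

def eruption_segments(residual, start_dates=START_DATES, length=SEGMENT_MONTHS):
    """Stack one segment per eruption, each starting at its baseline month."""
    residual = np.asarray(residual, dtype=float)
    rows = []
    for year, month in start_dates:
        start = month_index(year, month)
        if start < 0 or start + length > len(residual):
            raise ValueError(f"segment for {year}-{month:02d} falls outside the series")
        rows.append(residual[start:start + length])
    return np.vstack(rows)

def composite(residual, start_dates=START_DATES, length=SEGMENT_MONTHS):
    """Average the segments month by month to get the mean post-eruption profile."""
    return eruption_segments(residual, start_dates, length).mean(axis=0)
```

Averaging across the seven rows is what suppresses the unrelated "weather" noise: anything not synchronized with the eruption dates tends to cancel.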

The red line is the average temperature evolution after the volcanic eruptions. Obviously the noise in the data makes it difficult to see the signal, but averaging seems to help. Here is a plot of just the average response:


Figure 6: Average temperature profile after a volcanic eruption.

We begin to see that, not long after a volcanic eruption, the Earth’s surface temperature begins to dip somewhat, but not all that much-an average increase of about .08 in AOD seems to be associated with an average drop in temperature of less than .15 K. We can quantify this a little better, though, but we need to determine the exact date of the minimum and estimate the dip at that point in a way that is not just the remaining noise at monthly timescales. So I use the above smoothing method on the average volcanic temperature profile (n=110 months, m=54). The result looks like this:


Figure 7: Same as 6, but with smooth variations.

Note I am using this not to estimate the magnitude of the temperature dip just yet-since the smoothing probably attenuates the magnitude of temperature swings-but to identify the date of the temperature minimum. It appears that the minimum temperature occurs at about 25 months. Using the same smoothing technique on the AOD profile, so that any bias in the date will be present in both datasets-and because there are actually two different dates of maximum in the average AOD profile-I identify the maximum in the smoothed AOD:


Figure 8: The average and smoothed AOD profile following an eruption.

The maximum occurs 16 months after the start of the AOD spike. This amounts to a “lag” of nine months, which is only slightly longer than others seem to have found.
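Reading the lag off the two smoothed profiles amounts to comparing the position of the AOD peak with the position of the temperature minimum. A minimal sketch, assuming both profiles are arrays indexed in months from the baseline date (the function name is hypothetical):

```python
import numpy as np

def response_lag(smoothed_aod, smoothed_temperature):
    """Return (month of AOD peak, month of temperature minimum, lag in months)."""
    peak = int(np.argmax(smoothed_aod))
    trough = int(np.argmin(smoothed_temperature))
    return peak, trough, trough - peak

# Synthetic shapes only, chosen to peak at month 16 and bottom out at month 25.
t = np.arange(110)
demo_aod = np.exp(-((t - 16) / 10.0) ** 2)
demo_temperature = -np.exp(-((t - 25) / 12.0) ** 2)
peak, trough, lag = response_lag(demo_aod, demo_temperature)
```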

Now as for estimating the impact: First, I want to restore the variance loss to the temperature data. I do this by removing the means from both the raw average temperature profile and the smoothed profile, and then doing a regression where the smoothed profile is the predictor of the raw temperature profile. This yields a coefficient of approximately 1.43. I multiply the smoothed profile by that coefficient:


Figure 9: 7, shifted to mean of zero.

I then subtract the initial value of the smoothed series from both:


Figure 10: Same as 9, shifted down to start at zero.
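The variance restoration and re-baselining used for Figures 9 and 10 can be sketched like this. It is an illustration under assumptions: the regression is done through the origin on the demeaned series (equivalent to an ordinary regression slope once both means are removed), and the function name is hypothetical.

```python
import numpy as np

def restore_variance(raw_profile, smoothed_profile):
    """Rescale the smoothed profile so it has the amplitude of the raw one.

    Returns (coefficient, rescaled smooth starting at zero, raw shifted by
    the same amount).
    """
    raw = np.asarray(raw_profile, dtype=float)
    smooth = np.asarray(smoothed_profile, dtype=float)
    raw = raw - raw.mean()            # remove the means from both series
    smooth = smooth - smooth.mean()
    # Slope of raw regressed on smooth (the smooth is the predictor).
    coefficient = np.dot(smooth, raw) / np.dot(smooth, smooth)
    rescaled = coefficient * smooth
    offset = rescaled[0]              # initial value of the rescaled smooth
    return coefficient, rescaled - offset, raw - offset
```

With the profiles in the post, the coefficient comes out at approximately 1.43; the minimum of the rescaled, zero-based smooth is then the estimate of the temperature dip.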

The minimum temperature dip is about -.11 K. The change in AOD from start to maximum is about .08. So I can estimate, linearly, using a 9 month lag and the coefficient of about -1.38, the temperature impact of volcanoes. That looks like this:


Figure 11: Linearly estimated temperature responses to volcanic eruptions since 1850, in K.
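The linear estimate in Figure 11 is just the AOD series delayed by nine months and multiplied by the ratio of the dip to the AOD change, -.11 / .08, or about -1.38. A minimal sketch follows; the function name is mine, and padding the first nine months with the first AOD value is an assumption made only so the output has the same length as the input.

```python
import numpy as np

DIP_K = -0.11       # minimum temperature dip, in K
AOD_CHANGE = 0.08   # change in AOD from start to maximum
COEFFICIENT = DIP_K / AOD_CHANGE   # about -1.38 K per unit AOD
LAG_MONTHS = 9

def volcanic_temperature(aod, coefficient=COEFFICIENT, lag=LAG_MONTHS):
    """Estimated temperature response: coefficient times AOD from `lag` months earlier."""
    aod = np.asarray(aod, dtype=float)
    # Assumption: before `lag` months of data exist, reuse the first AOD value.
    shifted = np.concatenate((np.full(lag, aod[0]), aod[:len(aod) - lag]))
    return coefficient * shifted
```

Subtracting this estimate from the temperature series gives a "volcanoes linearly removed" series.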

Note that the largest volcanic eruption dip, from Krakatoa, is less than -.23 K. No wonder the Wikipedia page says “citation needed” to the outlandish claim that temperature dipped by as much as 1.2 K! There is just no such evidence in the temperature data-no such large reduction in temperature could even possibly have occurred.

To see what this looks like I subtracted the above from the temperature data and then smoothed both the original and the “volcanoes linearly removed” series with 12 month running averages:


Figure 12: HADCRUT4 with volcanoes removed, compared to without, Annually smoothed.

And smoothed data:


Figure 13: 12, with described smoothing method instead of annual.

As can be seen from the above, this technique, simply linearly estimating the impact of volcanic eruptions, removes much of the signal: you can clearly see how well it removed Pinatubo from the data. But can we go further? Can we estimate the climate’s sensitivity from this data? Hm, maybe! Here is my first shot:

First, recall Equation 7 here. If we have a forcing function and a temperature function, and we can take the derivative of the temperature function, then we can use the temperature function and its derivative as predictor variables in a multiple regression analysis to attempt to predict the forcing function. Their coefficients will be an estimate of the inverse of the sensitivity and the time constant divided by the sensitivity, respectively. It will be easier to work with the smoothed temperature response profile than the temperatures themselves, since its derivative is easier to estimate. But the forcing function is a little trickier: to make it directly comparable to the smoothed temperatures, it also has to be smoothed, but start at zero. So the first thing I did was take the smoothed data from Figure 8 and fit an exponential decay from the initial value to the value the smooth takes in the 61st month. I then subtracted that from the smoothed data. Let’s call that series S-E (smoothed minus exponential). The next thing I did was take the raw AOD profile from Figure 8 and subtract the initial value (that is, I made it start at zero); let’s call that R-B (raw minus baseline). I then calculated the maximum value of R-B and of S-E and multiplied S-E by the factor that would make those values equal. Let’s call that MC-S-E (magnitude corrected S-E). I also subtracted the baseline value from the smoothed data from Figure 8 and called the result S-B. I then took the first 28 months of MC-S-E and the 29th month onward of S-B, and combined them. The final smoothed eruption AOD profile, compared to R-B, looks like this:


Figure 14: Smooth AOD profile after an eruption.
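The construction of the profile in Figure 14 is easier to follow as code. The sketch below is one reading of the description, not the original calculation: in particular, it takes the "exponential decay" to be the exponential curve passing through the first value of the smooth and its value in the 61st month, and it assumes both of those values are positive. The function and variable names are hypothetical.

```python
import numpy as np

def smooth_forcing_profile(raw, smooth, decay_month=61, splice_months=28):
    """Combine MC-S-E (first 28 months) with S-B (29th month onward)."""
    raw = np.asarray(raw, dtype=float)
    smooth = np.asarray(smooth, dtype=float)
    t = np.arange(len(smooth))
    end = decay_month - 1  # the 61st month is index 60

    # Exponential through smooth[0] and smooth[end] (assumed interpretation).
    rate = np.log(smooth[end] / smooth[0]) / end
    exponential = smooth[0] * np.exp(rate * t)

    s_minus_e = smooth - exponential        # S-E: starts at exactly zero
    r_minus_b = raw - raw[0]                # R-B: raw minus baseline
    factor = r_minus_b.max() / s_minus_e.max()
    mc_s_minus_e = factor * s_minus_e       # MC-S-E: peak matched to R-B
    s_minus_b = smooth - smooth[0]          # S-B: smooth minus baseline

    return np.concatenate((mc_s_minus_e[:splice_months], s_minus_b[splice_months:]))
```

The splice keeps the sharp, correctly sized rise from MC-S-E and the slow tail from S-B, at the cost of a possible small step where the two pieces meet.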

The next step is to convert AOD units into radiative forcing. To get this I take the GISS forcing data, take annual average values of AOD, and regress those values against the GISS Stratospheric Aerosol forcing. The coefficient I get is about -23.45, meaning a change of positive one tenth of a unit of AOD leads to a forcing of about -2.3 W/m^2 (compare that to 3.7 W/m^2 for a doubling of CO2). Clearly, large but transient forcings come about as a result of volcanic eruptions (on average peaking at a little under -2 W/m^2).
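A small sketch of the conversion, assuming the regression treats forcing as the dependent variable and includes an intercept (the post does not say either way). The helper names are hypothetical; only the -23.45 coefficient comes from the text.

```python
import numpy as np

AOD_TO_FORCING = -23.45  # W/m^2 per unit AOD, the coefficient quoted above

def annual_means(monthly):
    """Average a monthly series into calendar-year means (whole years only)."""
    monthly = np.asarray(monthly, dtype=float)
    years = len(monthly) // 12
    return monthly[:years * 12].reshape(years, 12).mean(axis=1)

def forcing_per_unit_aod(annual_aod, annual_forcing):
    """Slope of annual forcing regressed on annual AOD."""
    slope, _intercept = np.polyfit(annual_aod, annual_forcing, 1)
    return slope

def aod_to_forcing(aod, coefficient=AOD_TO_FORCING):
    """Convert AOD (or an AOD change) into forcing in W/m^2."""
    return coefficient * np.asarray(aod, dtype=float)
```

For example, `aod_to_forcing(0.1)` gives about -2.3 W/m^2, the figure quoted above.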

Calculating the derivative of the temperature (dT/dt) is trickier. I’ve taken the first differences, made a second copy of them shifted back one month, averaged the two, and placed zero values at the beginning and end of the dT/dt series.
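Averaging each first difference with the one before it is the same thing as a centered difference, (T[i+1] - T[i-1]) / 2, in K per month. A minimal sketch (the function name is mine):

```python
import numpy as np

def centered_derivative(temperature):
    """dT/dt from averaged neighboring first differences, zero at both ends."""
    temperature = np.asarray(temperature, dtype=float)
    first_diff = np.diff(temperature)             # T[i+1] - T[i]
    dTdt = np.zeros_like(temperature)             # zeros stay at the ends
    dTdt[1:-1] = 0.5 * (first_diff[1:] + first_diff[:-1])
    return dTdt
```

For instance, `centered_derivative([0, 1, 4, 9, 16, 25])` returns `[0, 2, 4, 6, 8, 0]`.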

Finally, doing a multiple regression of T and dT/dt onto F(t), I get the following for coefficients: inverse lambda of about 7.4, 95% confidence interval from about 5.54 to 9.26, which corresponds to a sensitivity of .500 K for a doubling of CO2, with a range from about .399 to .667 K for a doubling of CO2, and tau times inverse lambda of about 28.52 from 14.59 to 42.46, corresponding to 3.85 months, ranging from 2.63 to 4.85 months, although the fit is pretty poor due to high noise level (adjusted R squared of about .409). This provides evidence that the sensitivity is pretty small, and is in the range of my estimates from feedback fluxes, and the Faint Young Sun.
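The regression and the conversion of its coefficients can be sketched as below. The model being fitted is F(t) = (1/lambda) T + (tau/lambda) dT/dt, which is how the coefficients are described above. Fitting without an intercept is my assumption (both series are built to start at zero), the function names are hypothetical, and the sketch does not reproduce the confidence intervals.

```python
import numpy as np

F2X = 3.7  # W/m^2 for a doubling of CO2, as used above

def fit_energy_balance(temperature, dTdt, forcing):
    """Least-squares fit of forcing on T and dT/dt (no intercept, by assumption).

    Returns (inverse lambda in W/m^2/K, tau times inverse lambda).
    """
    predictors = np.column_stack((temperature, dTdt))
    coefficients, *_ = np.linalg.lstsq(predictors, np.asarray(forcing, dtype=float), rcond=None)
    return float(coefficients[0]), float(coefficients[1])

def summarize(inverse_lambda, tau_over_lambda, f2x=F2X):
    """Convert the two coefficients into (K per CO2 doubling, time constant in months)."""
    sensitivity = f2x / inverse_lambda
    tau_months = tau_over_lambda / inverse_lambda
    return sensitivity, tau_months
```

Plugging in the central estimates, `summarize(7.4, 28.52)` gives .5 K per doubling and about 3.85 months.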

EDIT: I wondered if maybe I would get a different result if I focused on the first 25 months-that is, I essentially forced the model to focus on the initial temperature drop and ignore the recovery period. It perhaps makes sense to do this because of the weird wave pattern that may be an artifact of…something. Anyway, fitting only the first 25 months results in a much better match of the initial temperature dip and large errors after that. The original 110 months is just the time between El Chicón and Pinatubo; there was nothing magically significant about trying to fit nine years after an eruption. Getting the initial dip right could be seen as more important. So only doing the multiple regression on the first 25 months, I get even lower sensitivities and longer response times (best fit of .369 K for a doubling of CO2 and 13.67 months). The fit for those 25 months is better (adjusted R squared of about .991) but the fit over the whole dataset with those parameters is terrible. It’s interesting that getting a good fit to the initial dip required a lower sensitivity-lower than I personally believe makes sense, especially given other analyses I have done. It indicates to me that there is, realistically, not going to be a good explanation for this data with sensitivities that require even slight positive feedback.