Seeing as this is a blog about testing hypotheses, I thought I’d discuss a somewhat esoteric problem in doing such tests: namely, how to characterize one’s expectations, or “null” hypotheses. The problem can be thought of this way: suppose I want to test whether there is a “trend” in some data that is large enough that I can place high confidence in its actually being there, rather than being just a fluctuation due to random chance. The problem is how to characterize the probability distribution of trends that arise by chance. The simplest assumption is the distribution that would arise from pure “white noise”. Basically, each data point in a white noise time series has no dependence on the points before (or after) it, although this is hardly the only property by which white noise is usually characterized. The thing about white noise is that it does not tend to display “trends” very often. When you use white noise as your null hypothesis, rather than asking “how likely is it that this trend arose by chance?” you are really asking “how likely is it that this trend could arise from white noise?” Now, if the random behavior in the system you are looking at is expected a priori to behave like white noise, this is not wrong. But if you really expect the system to behave in a manner that is not like white noise, in particular if successive values are positively correlated, you are using a null hypothesis that will end up being too easy to reject.
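To make the idea of a "distribution of trends due to chance" concrete, here is a minimal sketch of a white noise null in Python. It is an illustration only: the Gaussian draws, the function names `trend_slopes_white_noise` and `trend_z_score`, and all parameter values are assumptions, not anything specified above. It generates many independent series, fits a straight line to each, and collects the slopes.

```python
import numpy as np

def trend_slopes_white_noise(n_years, mean, std, n_sims=5000, seed=0):
    """Least-squares slopes fitted to independent Gaussian white noise series.

    Assumption: the noise is Gaussian; only its mean and standard
    deviation are matched to the data being tested.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n_years)
    # One simulated series per row.
    series = rng.normal(mean, std, size=(n_sims, n_years))
    # np.polyfit fits every column of a 2-D y at once, so transpose.
    # Row 0 of the result holds the slopes, row 1 the intercepts.
    return np.polyfit(t, series.T, 1)[0]

def trend_z_score(observed_slope, null_slopes):
    """Distance of an observed slope from the null mean, in null standard deviations."""
    return (observed_slope - null_slopes.mean()) / null_slopes.std(ddof=1)

# Hypothetical inputs, chosen only to exercise the functions.
null_slopes = trend_slopes_white_noise(n_years=100, mean=30.0, std=2.0)
```

Calling `trend_z_score(slope, null_slopes)` with the slope fitted to a real series then answers the narrower question of how unusual that slope is relative to white noise.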
For instance, by generating random (white noise) numbers and giving them the same mean and standard deviation as NCDC’s US national annual precipitation record, I estimate that the apparent trend in US precipitation lies very nearly three standard deviations from the mean of my white noise trend distribution. In other words, the trend that appears to have taken place differs from zero with high significance, assuming that precipitation variability acts like white noise. But is the probability of the trend in US precipitation occurring by chance really on the order of 0.1%? I doubt it. In reality the precipitation data has a lag-one correlation with itself of r ≈ 0.210, versus r ≈ -0.002 for the simulated data. In other words, there is some (albeit very little) dependence of each data point in the actual precipitation record on precipitation in the previous year. The white noise synthetic data has no built-in autocorrelation (of course), and indeed its very weak r is actually negative(!). This implies that the synthetic data has a slightly lower tendency to exhibit trend behavior than reality does. So I don’t think the actual significance of the US precipitation trend is really three sigma.
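One way to get a feel for how much a year-to-year correlation of about 0.21 matters is to swap the white noise for noise with that much memory. The sketch below uses a first-order autoregressive, AR(1), process for this. That choice is an assumption made here for illustration, not something the precipitation data has been shown to follow, and the helper names and the unit standard deviation are likewise hypothetical.

```python
import numpy as np

def lag1_autocorr(x):
    """Correlation between a series and itself shifted by one step."""
    x = np.asarray(x, dtype=float)
    return np.corrcoef(x[:-1], x[1:])[0, 1]

def ar1_series(n_years, r, std, n_sims, rng):
    """Zero-mean AR(1) series with lag-1 autocorrelation r and overall spread std.

    With r = 0 this reduces to Gaussian white noise.
    """
    # Scale the innovations so every year has standard deviation std.
    innov_std = std * np.sqrt(1.0 - r**2)
    x = np.empty((n_sims, n_years))
    x[:, 0] = rng.normal(0.0, std, n_sims)
    for k in range(1, n_years):
        x[:, k] = r * x[:, k - 1] + rng.normal(0.0, innov_std, n_sims)
    return x

def slope_spread(series):
    """Standard deviation of the least-squares slopes, one slope per row."""
    t = np.arange(series.shape[1])
    return np.polyfit(t, series.T, 1)[0].std(ddof=1)

rng = np.random.default_rng(0)
white = ar1_series(100, 0.0, 1.0, 5000, rng)   # white noise null
red = ar1_series(100, 0.21, 1.0, 5000, rng)    # null with r = 0.21 (assumed AR(1))
ratio = slope_spread(red) / slope_spread(white)
```

Here `ratio` is how much wider the chance-trend distribution becomes once the memory is included. For long series the standard approximation for AR(1) noise is sqrt((1 + r) / (1 - r)), which is about 1.24 for r = 0.21. If AR(1) were an adequate description, a trend sitting at three sigma against white noise would sit closer to 2.4 sigma against this null.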