Why Combining Stations Creates Too Much Error
In the video I made (top right), I got into the problems that arise when one station is used to fill in the gaps for another station. This post expands on that to expose the problem in more detail.
To know whether one location’s data can be used to fill in the gaps of another, the first thing you have to do is check how the temps differ between the two on the days where both have records.
To see what happens I picked two close stations, Belleville, Ontario (4442) and Ottawa (4333), matched the days together, and subtracted Belleville’s TMax, TMean and TMin from Ottawa’s for the summer months (May to Sept), so a negative difference means Belleville was warmer. Belleville is just southwest of Ottawa, about a two-hour drive.
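For anyone who wants to repeat the matching step, here is a minimal sketch of how it could be done. It is not the code used for this post. The function name, the dict-of-dates layout and the sample values are all made up for illustration, and it assumes the difference is taken as Ottawa minus Belleville.

```python
from datetime import date

SUMMER_MONTHS = {5, 6, 7, 8, 9}  # May to Sept, the months used in this post


def paired_differences(ottawa, belleville, months=SUMMER_MONTHS):
    """Return {day: ottawa_temp - belleville_temp} for days present in both records.

    Both inputs are dicts mapping a datetime.date to a temperature in C
    (a hypothetical layout). Days missing from either station, or outside
    the chosen months, are skipped rather than filled.
    """
    diffs = {}
    for day in sorted(ottawa.keys() & belleville.keys()):
        if day.month not in months:
            continue
        a, b = ottawa[day], belleville[day]
        if a is None or b is None:
            continue
        diffs[day] = a - b
    return diffs


# Illustrative values only, NOT real station readings.
ottawa_tmax = {
    date(1974, 7, 1): 27.0,
    date(1974, 7, 2): 24.5,   # no Belleville record that day
    date(1974, 12, 1): -3.0,  # matched, but outside the summer months
}
belleville_tmax = {
    date(1974, 7, 1): 25.5,
    date(1974, 7, 3): 26.0,   # no Ottawa record that day
    date(1974, 12, 1): -1.0,
}
example_diffs = paired_differences(ottawa_tmax, belleville_tmax)
print(example_diffs)
```

Only the one summer day that exists in both records survives the matching, which is the point: the comparison can only be made where both stations actually reported.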
The difference in the TMax between these two locations is:
Ottawa was as much as 11.7C warmer than Belleville on the same day. Belleville was as much as 15C warmer on a different day, with the average difference of -0.18C and a standard deviation of 2.55C.
The difference in the TMean between these two locations is:
Ottawa was as much as 8.1C warmer than Belleville on the same day. Belleville was as much as 11.7C warmer on a different day, with the average difference of -0.63C and a standard deviation of 2.20C.
The difference in the TMin between these two locations is:
Ottawa was as much as 12.8C warmer than Belleville on the same day. Belleville was as much as 13.9C warmer on a different day, with the average difference of -1.09C and a standard deviation of 2.20C.
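The four numbers quoted for each measure (largest difference each way, average and standard deviation) can be computed along these lines. This is a sketch under assumptions: the function name and sample values are made up, the differences are Ottawa minus Belleville, and it uses the sample standard deviation, which may or may not match the formula behind the figures above.

```python
import statistics


def summarize(differences):
    """Summarize a sequence of same-day differences (Ottawa minus Belleville, in C)."""
    values = list(differences)
    return {
        # Largest positive difference: the day Ottawa was warmest relative to Belleville.
        "max_ottawa_warmer": max(values),
        # Largest negative difference, reported as a positive number.
        "max_belleville_warmer": -min(values),
        "mean": statistics.fmean(values),
        # Sample standard deviation (n - 1 in the denominator); an assumption.
        "stdev": statistics.stdev(values),
    }


# Illustrative values only, NOT real station readings.
sample_diffs = [1.5, -2.0, 0.5, -4.0]
summary = summarize(sample_diffs)
print(summary)
```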
Using just one year, 1974 (it has a large swing in the difference), this is what the summer TMax looks like. Even part of the season shows asymmetry in the difference.
This is the same year for TMin:
Notice that Belleville has warmer nights than Ottawa. This is because Belleville is on Lake Ontario and Ottawa isn’t. The lake buffers the temps for Belleville, keeping its nights warmer than Ottawa’s.
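So if you wanted to use Ottawa to fill in Belleville’s missing data, the difference on any given day can be more than 10C, with a one standard deviation range (roughly 68% of days, if the differences are close to normally distributed) of +/-2.2C and a two standard deviation range (roughly 95%) of +/-4.4C. But it gets more complicated than that. It appears the difference between the two locations is changing over time.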
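The percentages attached to those ranges only hold if the differences are roughly normally distributed, so it is worth counting how many days actually fall inside each range. A minimal sketch, with made-up function names and sample values, using the sample standard deviation as an assumption:

```python
import statistics
from statistics import NormalDist


def coverage(values, k):
    """Fraction of values lying within k sample standard deviations of their mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    inside = sum(1 for v in values if abs(v - mean) <= k * sd)
    return inside / len(values)


def normal_coverage(k):
    """Fraction a normal distribution puts within k standard deviations of the mean."""
    nd = NormalDist()
    return nd.cdf(k) - nd.cdf(-k)


# Illustrative differences only, NOT real station readings.
sample = [-4.0, -1.0, 0.0, 1.0, 4.0]
for k in (1, 2):
    print(k, round(coverage(sample, k), 3), round(normal_coverage(k), 3))
```

If the counted fractions come out well away from the normal-distribution figures, the +/- ranges understate or overstate how far off a filled-in value could be.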
This is the difference in the same-day TMax between the two locations, per year. The top red line is the most Ottawa was hotter than Belleville in that year, the middle black line is the average difference in TMax, and the blue line is the most Belleville was hotter than Ottawa. Notice that over time the amount by which Belleville’s TMax exceeds Ottawa’s is dropping, so there is less of a difference between them.
This is the same graph for the difference in TMin, the summer nighttime temps. Notice that over time the two locations are showing less of a difference in temps between them.
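The three lines in these per-year graphs (largest positive difference, average, largest negative difference) could be produced with something like the following. It is a sketch only: the function name, the input layout and the sample values are made up, and the differences are assumed to be Ottawa minus Belleville.

```python
import statistics
from collections import defaultdict
from datetime import date


def yearly_extremes(diffs):
    """Group same-day differences by year.

    diffs maps a datetime.date to (Ottawa minus Belleville) in C.
    Returns {year: (max_diff, mean_diff, min_diff)}, i.e. the red,
    black and blue lines respectively.
    """
    by_year = defaultdict(list)
    for day, d in diffs.items():
        by_year[day.year].append(d)
    return {
        year: (max(vals), statistics.fmean(vals), min(vals))
        for year, vals in sorted(by_year.items())
    }


# Illustrative values only, NOT real station readings.
sample_diffs = {
    date(1974, 7, 1): 1.5,
    date(1974, 7, 2): -2.5,
    date(1975, 7, 1): 0.5,
    date(1975, 7, 2): -1.5,
}
per_year = yearly_extremes(sample_diffs)
print(per_year)
```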
This raises the question: how does one merge records from different locations to fill gaps when the size of the difference changes with time? You can’t, not without introducing a high degree of error. That error would have to be calculated from the standard deviations of the matching records, and it would have to change as the years progress.
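To put a number on that error, here is a rough sketch of what a filled-in value would have to carry with it. This is not a recommended method, just an illustration: the function name and values are made up, the differences are assumed to be Ottawa minus Belleville, and the offset and spread are taken only from the matching records of the same year.

```python
import statistics


def fill_from_neighbour(ottawa_value, matched_diffs):
    """Estimate a missing Belleville value from Ottawa's reading on that day.

    matched_diffs are that year's same-day differences (Ottawa minus Belleville)
    on the days both stations reported. Returns (estimate, spread), where spread
    is the sample standard deviation of those differences, i.e. the one standard
    deviation uncertainty the estimate inherits.
    """
    mean_diff = statistics.fmean(matched_diffs)
    spread = statistics.stdev(matched_diffs)
    # difference = Ottawa - Belleville, so Belleville = Ottawa - difference.
    return ottawa_value - mean_diff, spread


# Illustrative values only, NOT real station readings.
diffs_that_year = [1.0, -1.0, 3.0, -3.0]
estimate, spread = fill_from_neighbour(25.0, diffs_that_year)
print(f"{estimate:.1f}C +/- {spread:.2f}C")
```

Because both the offset and the spread have to be recomputed for every year, each filled-in value ends up with its own error bar, and that error bar is several degrees wide.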
Bottom line: using station data from different locations to fill in gaps is nothing more than guesswork at best, and at worst it is creating data ex nihilo.