Recently, for work, I had to add hourly weather information to driving data from 2013 and 2014. I had done this easily for 2009 through the end of 2013 using the Government of Canada Climate website, but at some point in 2014 the website changed how data is fetched. This change broke my old R script, and I had to work out how to download hourly data from the new site.
Gathering Data
The big change on the redesigned site is that the download data button is no longer a direct link. Viewing the page source (F12 in Chrome) and doing some digging reveals the variables and the address to query in order to download the file.
Digging through the source and testing shows that the query needs five parameters: format, year, month, stationid, and timeframe. Of these, timeframe is set to 1 (hourly), format to CSV, stationid to the weather station, and year and month to each month in the required range. For the data gathered here, format and timeframe are therefore fixed values.
Again from the source, the base URL is: http://climate.weather.gc.ca/climateData/bulkdata_e.html?
in R:
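A minimal sketch of assembling the query, assuming the parameters are passed as an ordinary query string. The function name build_url and the exact spellings stationID, Year, and Month are my assumptions; check them against the page source before relying on them. The station id 1234 is a placeholder, not a real station.

```r
# Base URL taken from the page source, as given above.
base_url <- "http://climate.weather.gc.ca/climateData/bulkdata_e.html?"

# Build the query URL for one station and one month.
# format and timeframe are fixed: CSV output, hourly data (timeframe = 1).
# ASSUMPTION: parameter spellings (stationID, Year, Month) may differ on the site.
build_url <- function(station_id, year, month) {
  paste0(base_url,
         "format=csv",
         "&stationID=", station_id,
         "&Year=", year,
         "&Month=", month,
         "&timeframe=1")
}

# Placeholder station id, January 2013
build_url(1234, 2013, 1)
```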
To download the files, the download.file function is used. It requires a URL and an output destination, so combining the above into a function:
It is important to note that if a request falls outside the range of available data, or the station has no data for that period, no error is produced. Instead, the site returns a file containing the header information, dates, and times, but no weather values.
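A sketch of what such a function might look like, one file per station and month. The names download_hourly and hourly_dest, the file naming scheme, and the query parameter spellings are assumptions made for illustration.

```r
# Destination path for one station/month file.
# ASSUMPTION: the naming scheme is arbitrary; the month is zero-padded so that
# an alphabetical listing of one station's files is also chronological.
hourly_dest <- function(station_id, year, month, out_dir = ".") {
  file.path(out_dir, sprintf("%s_%d_%02d.csv", station_id, year, month))
}

# Download one month of hourly data for one station and return the file path.
# ASSUMPTION: parameter spellings (stationID, Year, Month) may differ on the site.
download_hourly <- function(station_id, year, month, out_dir = ".") {
  url <- paste0("http://climate.weather.gc.ca/climateData/bulkdata_e.html?",
                "format=csv",
                "&stationID=", station_id,
                "&Year=", year,
                "&Month=", month,
                "&timeframe=1")
  dest <- hourly_dest(station_id, year, month, out_dir)
  download.file(url, destfile = dest, quiet = TRUE)
  invisible(dest)
}

# Usage (needs network access; 1234 is a placeholder station id):
# for (m in 1:12) download_hourly(1234, 2013, m, out_dir = "weather")
```

Returning the path invisibly makes it easy to collect the names of the saved files in a loop without printing them.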
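Because no error is raised, it may be worth checking each file after it is read. A minimal sketch, assuming the file is already in a data frame and that you know which columns hold only date and time information. The function name has_weather_data and the column names in the example are hypothetical.

```r
# Return TRUE if any column other than the date/time columns
# contains at least one non-missing, non-empty value.
has_weather_data <- function(df, id_cols) {
  weather <- df[, setdiff(names(df), id_cols), drop = FALSE]
  if (ncol(weather) == 0) return(FALSE)
  filled <- vapply(
    weather,
    function(x) any(!is.na(x) & as.character(x) != ""),
    logical(1)
  )
  any(filled)
}

# Example with made-up values: dates present, weather column empty
empty_month <- data.frame(
  Date.Time = c("2013-01-01 00:00", "2013-01-01 01:00"),
  Temp      = c(NA, NA),
  stringsAsFactors = FALSE
)
has_weather_data(empty_month, id_cols = "Date.Time")
```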
Reading in Data
Now that the files have been saved, reading them in is easy:
Two things to note. First, the first 17 days of January 2013 have no data. This can be corrected by using the old station id for the airport. Second, list.files does not return files in natural sort order. This can be corrected by ordering the data as below.
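A minimal sketch of reading every saved file into one data frame. The function name read_hourly_files is an assumption, and so is the idea that each file starts with a fixed number of station metadata lines before the column headers. Open one file to count those lines and pass the count as skip.

```r
# Read all CSV files in a directory and stack them into one data frame.
# skip: number of metadata lines before the column header row
#       (ASSUMPTION: the same in every file; check a file to find the value).
read_hourly_files <- function(dir, skip, pattern = "\\.csv$") {
  files <- list.files(dir, pattern = pattern, full.names = TRUE)
  tables <- lapply(files, read.csv, skip = skip, stringsAsFactors = FALSE)
  do.call(rbind, tables)
}

# Usage (directory name and skip value are placeholders):
# weather <- read_hourly_files("weather", skip = 16)
```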
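One way to do the ordering is to sort the combined data frame on its timestamp rather than relying on file order. This is a sketch; the function name order_by_datetime, the column name Date.Time, and the timestamp format are assumptions to adjust to the actual files.

```r
# Sort a data frame chronologically by a date-time column.
# ASSUMPTION: timestamps look like "2013-01-01 00:00" and live in `col`.
order_by_datetime <- function(df, col = "Date.Time",
                              format = "%Y-%m-%d %H:%M") {
  # Parse in UTC so daylight saving gaps cannot produce NA timestamps
  stamps <- as.POSIXct(df[[col]], format = format, tz = "UTC")
  df[order(stamps), , drop = FALSE]
}

# Example with made-up rows in the order an alphabetical file listing might give
jumbled <- data.frame(
  Date.Time = c("2013-10-01 00:00", "2013-02-01 00:00", "2013-01-01 05:00"),
  Temp      = c(1, 2, 3),
  stringsAsFactors = FALSE
)
sorted <- order_by_datetime(jumbled)
```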
Conclusion
Hopefully this provides a way to gather past climate data from the Government of Canada’s website. In the future, the function may be modified to handle more than one station id at a time. For now, I used an outer loop over station ids to gather the information I needed.
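The outer loop can be sketched as a table of requests, one row per station and month. The function name build_requests and the station ids 1001 and 1002 are placeholders, not real stations.

```r
# One row per (station, year, month) combination to request.
# Months vary fastest, then years, then stations.
build_requests <- function(station_ids, years, months = 1:12) {
  grid <- expand.grid(month = months, year = years, station_id = station_ids)
  grid[, c("station_id", "year", "month")]
}

requests <- build_requests(c(1001, 1002), 2013:2014)

# Each row can then be passed to whatever download function is in use, e.g.:
# for (i in seq_len(nrow(requests))) {
#   r <- requests[i, ]
#   # download one file for r$station_id, r$year, r$month
# }
```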
Remember to check your data to make sure it is what you originally intended!