In this project, we will attempt to forecast daily high temperatures in Knoxville, TN using multi-seasonal time series (msts) objects and the TBATS model from the forecast library.

First we’ll read in recent temperature data for Knoxville, TN. The data was obtained for free from the National Centers for Environmental Information website. We’ll use observations from 1/1/2010 to 5/30/2017. For simplicity we’ll focus on daily maximum temperatures from a single weather station. We’ll also use lubridate to turn the DATE column into a genuine date column.

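library(tidyverse)   # read_csv, dplyr verbs, ggplot2
library(lubridate)   # ymd, days

# Keep the daily highs from one station and parse DATE into a real Date
Temperatures <- read_csv("983224.csv") %>%
  filter(STATION_NAME == "KNOXVILLE MCGHEE TYSON AIRPORT TN US") %>%
  select(DATE, TMAX) %>% mutate(DATE = ymd(DATE)) %>%
  filter(DATE >= ymd("2010-01-01"), DATE < ymd("2017-05-31"))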
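
Before modeling, it can be worth checking that the daily series has no missing days, since a time series object assumes evenly spaced observations. The following is a minimal sketch in base R; the dates in it are made up for illustration and are not taken from the Knoxville file.

```r
# Hypothetical example dates with one day missing (2010-01-04)
dates <- as.Date(c("2010-01-01", "2010-01-02", "2010-01-03", "2010-01-05"))

# Every calendar day between the first and last observation
expected <- seq(min(dates), max(dates), by = "day")

# Days that should be present but are not
missing_days <- expected[!expected %in% dates]
missing_days
```

With the real data, the same idea would apply to Temperatures$DATE. An empty result would mean the series is complete.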

Next we’ll plot the full seven and a half years of temperature data. The yearly cycle is clearly visible.

Temperatures %>% ggplot(aes(x=DATE, y=TMAX)) + geom_line() + labs(y = "daily high temperature")


TBATS

Now we’ll use TBATS to construct our model. We give it two seasonal periods, 7 and 365.25 days, so it can account for weekly as well as annual seasonality.

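library(forecast)   # msts, tbats, forecast

# Daily series with weekly (7) and annual (365.25) seasonal periods
TempTS <- msts(Temperatures$TMAX, seasonal.periods=c(7,365.25), start = 2010)
fit <- tbats(TempTS)
fc <- forecast(fit)   # no horizon given, so the default is used
plot(fc)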

The forecast plot shows that the model definitely captured the annual seasonality.
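
To build some intuition for how a yearly cycle can be captured, here is a rough sketch that fits a single sine/cosine pair at the 365.25-day period with ordinary least squares. This is plain harmonic regression rather than TBATS itself, and it runs on synthetic data, so every number in it (mean of 70, amplitude of 18, noise level of 5) is an assumption chosen for illustration.

```r
# Synthetic daily series: a 365.25-day cycle plus noise (made-up data, not the Knoxville highs)
set.seed(1)
n <- 2000
t <- seq_len(n)
y <- 70 + 18 * sin(2 * pi * t / 365.25) + rnorm(n, sd = 5)

# One sine/cosine pair at the annual period
annual_sin <- sin(2 * pi * t / 365.25)
annual_cos <- cos(2 * pi * t / 365.25)
harmonic_fit <- lm(y ~ annual_sin + annual_cos)

# Amplitude of the fitted cycle, recovered from the two coefficients
amplitude <- sqrt(sum(coef(harmonic_fit)[c("annual_sin", "annual_cos")]^2))
amplitude
```

The recovered amplitude should land close to the 18 used to generate the data. TBATS also represents seasonality with trigonometric terms, but inside a more elaborate state space model, so this is only meant as intuition.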

Now we’ll print the predictions for the next ten days.

library(knitr)   # kable

# Point forecasts as a one-column data frame
Predictions <- as.data.frame(fc$mean)
names(Predictions) <- "TMAX"

# The first forecast is for the day after the last observation (2017-05-30)
Predictions <- Predictions %>%
  mutate(DATE = ymd(20170530) + days(row_number())) %>%
  select(DATE, TMAX)

Predictions %>% head(10) %>% kable()

DATE        TMAX
2017-05-31  83.84814
2017-06-01  84.14353
2017-06-02  84.56931
2017-06-03  85.26205
2017-06-04  86.08152
2017-06-05  86.69087
2017-06-06  86.85399
2017-06-07  86.63751
2017-06-08  86.37396
2017-06-09  86.41873
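
The DATE column is simply the last observed day plus the row number of each forecast. As a small base R sketch of the same date arithmetic, without lubridate (the names last_observed and forecast_dates are only illustrative):

```r
# Last day in the training data
last_observed <- as.Date("2017-05-30")

# Forecast step k corresponds to k days after the last observation
forecast_dates <- last_observed + seq_len(10)
forecast_dates
```

The first and last of these ten dates should match the first and last rows of the table above.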

The observed highs were 81 on 5/31, 81 on 6/1, and 84 on 6/2, so each of the first three forecasts landed within about 3 degrees of the actual value. At least these results are semi-reasonable.
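
To put a number on that comparison, here is a short sketch that computes the forecast errors for those three days, using the predictions from the table and the observed highs quoted above. Three days is far too few to judge the model, so we treat this as a sanity check only.

```r
# Forecasts from the table and the observed highs quoted in the text
predicted <- c(83.84814, 84.14353, 84.56931)
observed  <- c(81, 81, 84)

errors <- predicted - observed   # positive values mean the forecast ran warm
mae <- mean(abs(errors))         # mean absolute error over the three days

errors
mae
```

All three errors are positive, so the model ran slightly warm on each of these days.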