# Weather normalization

According to the Portfolio Manager Technical Reference: Climate and Weather, “weather normalized energy is the energy your building would have used under average conditions”. This is a causal problem rather than a prediction problem, because the “would have” implies actively setting the weather conditions to the climate normal. In essence, we try to estimate

$E[\text{energy} \mid \text{set weather} = w]$.

The following is a causal graph. Many variables affect the energy consumption of a building, but none of them causally affects weather (neglecting global climate change), so there are no confounding variables to adjust for. In that case $E[\text{energy} \mid \text{set weather} = w]$ coincides with the ordinary conditional expectation $E[\text{energy} \mid \text{weather} = w]$, which makes it viable to estimate the causal effect of weather using prediction techniques.

The causal graph when thinking of weather as a whole
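The no-confounder argument can be checked on synthetic data. The sketch below is only an illustration: the data-generating process, the coefficients, the `occupancy` variable, and the climate normal of 65 are all made-up assumptions, and weather is reduced to a single number. Because the other driver of energy is generated independently of weather, a plain regression of energy on weather should recover the energy the building would have used at the climate normal.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


random.seed(0)
n = 20000
TRUE_SLOPE = 2.0        # hypothetical energy units per degree
CLIMATE_NORMAL = 65.0   # hypothetical "average conditions"

# Weather and occupancy are drawn independently: nothing causes weather.
weather = [random.gauss(70, 10) for _ in range(n)]
occupancy = [random.gauss(100, 15) for _ in range(n)]
energy = [
    500 + TRUE_SLOPE * w + 3 * o + random.gauss(0, 20)
    for w, o in zip(weather, occupancy)
]

# Pure prediction: regress energy on weather, ignoring occupancy.
intercept, slope = fit_line(weather, energy)
normalized = intercept + slope * CLIMATE_NORMAL

# Ground truth under the intervention "set weather = CLIMATE_NORMAL",
# averaging over occupancy (mean 100 by construction).
true_normalized = 500 + TRUE_SLOPE * CLIMATE_NORMAL + 3 * 100

print(f"estimated slope:      {slope:.3f}")
print(f"estimated normalized: {normalized:.1f}")
print(f"true normalized:      {true_normalized:.1f}")
```

Omitting `occupancy` only adds noise here; it does not bias the estimate, because occupancy is independent of weather in this simulation.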

However, “weather” itself is complicated. Temperature is usually thought of as the most representative weather variable, but other factors, such as solar radiation, cloud cover, and wind speed, can affect both temperature and building energy consumption. From this point of view, if we want to know the causal effect of “temperature” (or, sometimes, degree days) on building energy consumption, then solar radiation, cloud cover, wind, etc. become confounding factors (see the causal graph below).

The causal graph when thinking of weather as just temperature
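A small simulation illustrates why this matters. Everything in the sketch below is a made-up assumption: a single confounder named `solar`, linear effects, and the specific coefficients. Solar radiation raises both temperature and energy use, so regressing energy on temperature alone overstates the temperature effect, while adjusting for solar (here by regressing both variables on solar first and then regressing the residuals on each other) recovers it.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residuals(xs, ys):
    """Part of ys that a linear fit on xs does not explain."""
    intercept, slope = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


random.seed(1)
n = 20000
TRUE_TEMP_EFFECT = 2.0  # hypothetical direct effect of temperature on energy

# Solar radiation is a common cause of temperature and of energy use.
solar = [random.gauss(0, 1) for _ in range(n)]
temp = [70 + 5 * s + random.gauss(0, 5) for s in solar]
energy = [
    500 + TRUE_TEMP_EFFECT * t + 30 * s + random.gauss(0, 10)
    for t, s in zip(temp, solar)
]

# Naive: temperature stands in for all of weather, solar is ignored.
_, naive_slope = fit_line(temp, energy)

# Adjusted: remove the part of temperature and energy explained by solar,
# then regress the energy residuals on the temperature residuals.
temp_resid = residuals(solar, temp)
energy_resid = residuals(solar, energy)
_, adjusted_slope = fit_line(temp_resid, energy_resid)

print(f"naive slope:    {naive_slope:.2f}")
print(f"adjusted slope: {adjusted_slope:.2f}")
print(f"true effect:    {TRUE_TEMP_EFFECT:.2f}")
```

With these coefficients the naive slope lands near 5 rather than 2, since it absorbs the solar effect that travels along with temperature.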

From the above reasoning, the following question arises: is it viable to use the usual prediction techniques for model and variable selection when constructing the appropriate “weather” variable? That is, can they be used to select which weather variables to include in $\vec{\text{weather}}$ when we estimate $E[\text{energy} \mid \text{weather}]$?

Now if we step back and look at the original goal of weather normalization, which is comparing the “performance” (efficiency?) of energy consumption over time, it seems to call for some metric describing efficiency. The causal effect of temperature on energy consumption might be one such metric. Using it, maybe the “base load” (non-weather-dependent consumption) could be evaluated as $E[\text{energy} \mid \text{temperature} = 75\,^{\circ}\text{F}]$.

After discussing this with a stats professor, I got the idea that I need to look into doubly robust estimators to estimate the quantity $E[\text{energy} \mid \text{weather} = w]$, where $w$ is a set of possible weather variables.
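
As a pointer for that reading, one common doubly robust estimator is the augmented inverse probability weighting (AIPW) estimator. The sketch below assumes, purely for illustration, that weather is discretized into a finite set of levels, that $X_i$ collects whatever confounders need adjusting for, that $\hat{m}(w, x)$ is a fitted outcome model for $E[\text{energy} \mid \text{weather} = w, X = x]$, and that $\hat{\pi}(w \mid x)$ is a fitted model for $P(\text{weather} = w \mid X = x)$. None of these symbols come from the discussion above.

$\hat{\mu}(w) = \frac{1}{n} \sum_{i=1}^{n} \left[ \hat{m}(w, X_i) + \frac{\mathbb{1}\{\text{weather}_i = w\}}{\hat{\pi}(w \mid X_i)} \left( \text{energy}_i - \hat{m}(w, X_i) \right) \right]$

The estimate is consistent if either the outcome model or the weather model is correctly specified, which is the sense in which it is “doubly” robust. Whether this carries over cleanly to continuous, multivariate weather is something still to be checked.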