State Space Models

All state space models are written and estimated in the R programming language. The models are available here with instructions and R procedures for manipulating the models here here.

Friday, October 19, 2012

Peak Oil Forecast and Global Warming

Peak Oil is the point where the rate of petroleum extraction starts declining because the resource is being exhausted.  US domestic Peak Oil production was reached in 1970. World oil production may have peaked in 2011, but it is too early to establish that as fact. In this post, I will forecast World oil production using the WL20 model to see whether the model thinks the World system has reached Peak Oil.

If you've "peaked" at the graphic above you probably can determine that the answer will be "Yes"! However, there's much more at stake here than the simple conclusion, however controversial, that we have reached Peak Oil.

After seeing my Global Warming forecast (here), one of my readers wondered whether anyone had combined Peak Oil models and Global Warming models. He reasoned, correctly, that the  WL20 model was capable of exploring the link between Peak Oil and Global Warming. This post will explore that relationship.

The underlying theoretical model linking Peak Oil and Global Warming is pretty simple: (Oil Production) -> (CO2 Emissions) -> (Global Warming). You might disagree with this linear causal model, but assume for the moment that it is correct. Then, anything that reduces oil production, like Peak Oil, will reduce Global Warming. My Global Warming forecast (here) shows Global temperature peaking sometime between 2040 and 2060. My Peak Oil Forecast above shows that oil production has reached its peak and is likely to collapse entirely around 2040. The result would seem to confirm the simple theoretical model, but how are these two forecasts related within the WL20 model?

The WL20 model model is a state-space model with three state variables (these state variables were not imposed on the model a priori but were the result of the statistical analysis): the first state variable measures overall growth in the World system; the second state variable measures declining biodiversity; and, the third state variable measures increasing resource constraints in the commodity markets related to the Ecological Footprint. The three state variables are interrelated: increasing biodiversity is related to declining global temperature while  increased resource extraction through commodity markets and overall economic growth are related to positive increases in global temperature.

The early peak in oil production is just one of a number of negative feedback loops within the model. The negative feedback loops limit overall growth in the World system around 2040 (see the WL20 state-variable forecast here). Global temperature takes a few more decades to peak after that, but it is really the end of overall growth, not just Peak Oil, that eventually limits global temperature growth, at least in the WL20 model.

There seem to be very few studies that have pursued the link between Peak Oil and Global Warming possibly because there are many alternative, high-carbon sources of energy (tar sands and synfuel from coal being two examples) that could be substituted for oil. Others, such as Amory Lovins (here) have argued that "Efficiency is cheaper than fuel" and will, for economic reasons, eventually limit emissions along with cheaper green energy. The WL20 model is capable of making projections of economic efficiency, a topic I will have to return to in a future post. The difficulty with the "substitution" argument is the important extent to which the entire World system is built on the oil economy. Even though we switched from a coal- to an oil-based economy in the 20th Century, it's not clear that the next energy conversion will be that easy given the larger scale of the present World system.

`
The prediction of Peak Oil was initially made by M. King Hubbert, a Shell geoscientist who died in 1989. The Hubbert curve or Hubbert peak for the World system is displayed above (from this source). His forecast, based on logistic curve modeling, predicted that the peak in World oil production would occur in the year 2000. It serves as a warning that no forecasting model can really see into the future. The models are simply attempts to explore the future implications of the data and models available when the forecast was made.

Thursday, October 18, 2012

Using the WL20 Package

The WL20 (W stands for World and L20 stands for Late 20th Century) package is an aggregate, state-space model of the World system. It contains 14 variables thought to be important to measuring the state of the World system, to include population growth, urbanization, commodity markets, carbon emissions, global temperature, biodiversity, globalization and the Ecological Footprint.

Unlike other World models, such as the DICE model, the Wonderland Model, or the System Dynamics World Models, WL20 was not derived from purely theoretical considerations. Rather, it was based on a 1979 paper by R. E. Kalman titled A system-theoretic critique of dynamic economic models. In the paper, Kalman argues that, until complex social systems are better understood, it is important to work from

data -> model

rather than from first-principle, theoretical considerations. In this approach, the primary emphasis is the selection of the data to describe the system followed by the use of state-space models to describe the dynamic behavior of the system. There is some theory used to make the selection of data that I have already discussed (here) and will discuss more fully in a future post. This post will concentrate on the computing details of the model.

The WL20 state-space model was estimated using the dse package (here) and the matlab package (here) in the public domain (it's free) R statistical language (here). Documentation, procedures and the WL20 model are available here.  They should be downloaded to a directory on your computer with the absolute path described by (see below). The remainder of this post describes how to use the WL20 model using the R statistical language.


When you start R on your computer, the R console window will display the version number, copyright information and platform. On my computer:

R version 2.14.2 (2012-02-29)
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: i386-apple-darwin9.8.0/i386 (32-bit)

A number of  commands that you might find useful will also be displayed



R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

To use the WL20 package,  the dse and matlab packages must also be installed (use the "Package Installer" menu choice and be sure to check the box marked "Install Dependencies" after you get the list of "CRAN (binaries)", you can then select matlab and dse and click "Install Selected"). You can then run R and enter the commands below in the R console window (the is the R prompt). 


W <- "a character string describing the working directory location"

> setwd(W)
> source("LibraryLoad.R")
> load(file="ws_procedures")

> load(file="WL20v3_model")
> WL20.est

The first two commands assign and set the working directory to the file folder where you have downloaded the WL20 package. For example, on my computer, is set to:

W <- "/Users/georgepa/Desktop/R/Papers/WL20 v3/"

The next command will source the dse and matlab packages (if you get error messages, these packages have not been installed properly). The two load commands load the ws (world system) procedures and the model (if you get error messages here, you have probably not specified properly). The final command displays the state-space model matrices for the WL20 model. These matrices are defined in the dse documentation that you can display with the following command:

help(SS)

Data for the WL20 model were gathered from a number of sources, but mostly from the Earth Policy Institute's Data Highlights (here). The complete list of the data sources is available in the documentation (Appendix F, here). You can look at the first few lines of the data set using the following commands:

> head(WL20.data)
          N   OIL  QA GWP P.Wheat. P.Oil.  TEMP   CO2 Carbon
[1,] 2555.0 10.42 631 7.4     1.89   1.71 13.83 311.3   1630
[2,] 2597.5 11.73 655 7.9     2.03   1.71 13.98 311.7   1767
[3,] 2641.2 12.34 680 8.3     1.93   1.92 14.04 312.2   1795
[4,] 2686.2 13.15 705 8.6     1.89   2.01 14.12 312.7   1841
[5,] 2732.5 13.74 730 8.9     1.98   2.11 13.91 313.2   1865
[6,] 2780.0 15.41 759 9.4     1.81   2.11 13.92 313.7   2043
     TotalFootprint    Earths WorldGlobal LivingPlanet    URBAN
[1,]       5.292910 0.4597521    23.37150     1.000000 705.4807
[2,]       5.493554 0.4797747    23.94942     1.002878 730.5622
[3,]       5.694199 0.4997972    24.52735     1.005581 756.4242
[4,]       5.894843 0.5198197    25.10527     1.007935 783.1143
[5,]       6.095488 0.5398423    25.68320     1.009766 810.6524
[6,]       6.296133 0.5598648    26.26112     1.010900 839.0286

The data set runs from 1950 to 2008, which you can verify with the following commands:

start(WL20.data)
[1] 1950    1
> end(WL20.data)
[1] 2008    1

There are commands in the dse package that you can use to evaluate this model. For example:

roots(WL20.est)
[1] 1.0000000+0.000000i 0.9982313+0.027957i 0.9982313-0.027957i
[4] 0.9140820+0.000000i
attr(,"class")
[1] "roots"

displays the eigenvalues of the system matrix. The first eigenvalue (1.0000000+0.000000i) is unity because the model includes a constant. The other eigenvalues are all less than unity and have imaginary parts indicating that the model is stable but cyclical.

You can also check the model fit using:

tfplot(WL20.est)

The results are displayed above. The dotted line displays the fitted values (one step ahead predictions) and the solid lines display the actual state variables.

The WL20 model has three state variables labelled W1, W2 and W3 in the graphic above. The state variables were constructed using Principal Components Analysis (PCA). The measurement model from the PCA is displayed with the following command:

> measurementModel(WL20.index)
               N        OIL          QA         GWP  P.Wheat.
[1,]  0.28446020  0.2711239  0.28207253  0.28243781 0.2362789
[2,] -0.01609311  0.2410390  0.09853503 -0.14131911 0.4113772
[3,] -0.11117409 -0.2412009 -0.08825703  0.02134276 0.4610974
         P.Oil.       TEMP         CO2      Carbon TotalFootprint
[1,] 0.24295530  0.2514421  0.28373276  0.28328608      0.2752374
[2,] 0.06595202 -0.2352280 -0.09832577  0.09252815      0.1816798
[3,] 0.71181845  0.1901141 -0.02009544 -0.09553557     -0.2925719
         Earths WorldGlobal LivingPlanet       URBAN
[1,]  0.2787654  0.28189899  -0.18387870  0.28400812
[2,]  0.1484351 -0.09256627   0.77050207 -0.08681005
[3,] -0.2311710 -0.09431697  -0.05760175 -0.07067596

The rows of this matrix display the weights used to calculate W1, W2 and W3. The weights were constructed from standard scores of the input data. The weight attached to each indicator variable for each state variable are displayed in the columns. By construction, the state variables are independent and uncorrelated.

The pattern of the weights indicated that W1 measures overall secular growth in the World system (this can also be seen from the time plot of W1 above). The second measure is most heavily weighted on the Living Planet Index (0.77050207)wheat prices (0.4113772)oil production (0.2410390) and the Ecological Footprint (number of Earths used by the World System, 0.1484351)The W2 state variable thus measures biodiversity. Finally, the W3 state variable is weighted most heavily on oil prices (0.71181845), wheat prices (0.4610974), Total Ecological Footprint (-0.2925719), and oil production (-0.2412009). The W3 state variable thus measures the effect of commodity markets and the Ecological Footprint.

> WL20.index$measurement$output$fraction.variance
 [1] 0.8740847 0.9413934 0.9725388 0.9864508 0.9949011 0.9980068
 [7] 0.9987371 0.9993400 0.9997902 0.9999405 0.9999802 0.9999923
[13] 0.9999986 1.0000000

Taken together, the World system state variables explain 97% of the variation in the underlying indicators.

One important use of the WL20 model is to forecast future states of the World system. The following commands create and display a 100 year forecast of the state variables:

> f <- forecast(WL20.est,horizon=100)
> tfplot(f)


The results of the forecast, displayed above, indicate that growth in the World system is expected to peak between 2040 and 2050 (W1)  associated with continuing losses of biodiversity (W2) and increasing pressure on commodity markets (W3). You can obtain the entire forecast to use as input to there times series models with the following command:

> fx <- merge.forecast(f)

The WL20 forecast and the time series are already available to you as data files named WL20.f and WL20.fx, respectively. 

I have already used forecasts from the WL20 model to predict global climate change and other indicators (here and here). In future posts, I will make the forecasting models models available as state-space models that can also be loaded into your R workspace and run on your machine.

TECHNICAL NOTE: In the Kalman article (here), emphasis is placed on the importance of observability and controllability in the selection of state space models. For the WL20, these two features can be checked with the following commands:

> observability(WL20.est)
[1] 2.5510918 2.2200285 1.8375940 0.6061376
> reachability(WL20.est)
Singular values of reachability matrix for noise:  2.424642 2.20612 1.628193 1.202821e-16 

The commands display the singular values of certain state-space matrices (use help for more information). If all the singular values are greater than zero, the model is both observable and controllable. The WL20 model is completely observable but might not be totally controllable (the 1.202821e-16 principal value is effectively zero).