State Space Models

All state space models are written and estimated in the R programming language. The models are available here with instructions and R procedures for manipulating the models here here.

Thursday, October 18, 2012

Using the WL20 Package

The WL20 (W stands for World and L20 stands for Late 20th Century) package is an aggregate, state-space model of the World system. It contains 14 variables thought to be important to measuring the state of the World system, to include population growth, urbanization, commodity markets, carbon emissions, global temperature, biodiversity, globalization and the Ecological Footprint.

Unlike other World models, such as the DICE model, the Wonderland Model, or the System Dynamics World Models, WL20 was not derived from purely theoretical considerations. Rather, it was based on a 1979 paper by R. E. Kalman titled A system-theoretic critique of dynamic economic models. In the paper, Kalman argues that, until complex social systems are better understood, it is important to work from

data -> model

rather than from first-principle, theoretical considerations. In this approach, the primary emphasis is the selection of the data to describe the system followed by the use of state-space models to describe the dynamic behavior of the system. There is some theory used to make the selection of data that I have already discussed (here) and will discuss more fully in a future post. This post will concentrate on the computing details of the model.

The WL20 state-space model was estimated using the dse package (here) and the matlab package (here) in the public domain (it's free) R statistical language (here). Documentation, procedures and the WL20 model are available here.  They should be downloaded to a directory on your computer with the absolute path described by (see below). The remainder of this post describes how to use the WL20 model using the R statistical language.


When you start R on your computer, the R console window will display the version number, copyright information and platform. On my computer:

R version 2.14.2 (2012-02-29)
Copyright (C) 2012 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: i386-apple-darwin9.8.0/i386 (32-bit)

A number of  commands that you might find useful will also be displayed



R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

To use the WL20 package,  the dse and matlab packages must also be installed (use the "Package Installer" menu choice and be sure to check the box marked "Install Dependencies" after you get the list of "CRAN (binaries)", you can then select matlab and dse and click "Install Selected"). You can then run R and enter the commands below in the R console window (the is the R prompt). 


W <- "a character string describing the working directory location"

> setwd(W)
> source("LibraryLoad.R")
> load(file="ws_procedures")

> load(file="WL20v3_model")
> WL20.est

The first two commands assign and set the working directory to the file folder where you have downloaded the WL20 package. For example, on my computer, is set to:

W <- "/Users/georgepa/Desktop/R/Papers/WL20 v3/"

The next command will source the dse and matlab packages (if you get error messages, these packages have not been installed properly). The two load commands load the ws (world system) procedures and the model (if you get error messages here, you have probably not specified properly). The final command displays the state-space model matrices for the WL20 model. These matrices are defined in the dse documentation that you can display with the following command:

help(SS)

Data for the WL20 model were gathered from a number of sources, but mostly from the Earth Policy Institute's Data Highlights (here). The complete list of the data sources is available in the documentation (Appendix F, here). You can look at the first few lines of the data set using the following commands:

> head(WL20.data)
          N   OIL  QA GWP P.Wheat. P.Oil.  TEMP   CO2 Carbon
[1,] 2555.0 10.42 631 7.4     1.89   1.71 13.83 311.3   1630
[2,] 2597.5 11.73 655 7.9     2.03   1.71 13.98 311.7   1767
[3,] 2641.2 12.34 680 8.3     1.93   1.92 14.04 312.2   1795
[4,] 2686.2 13.15 705 8.6     1.89   2.01 14.12 312.7   1841
[5,] 2732.5 13.74 730 8.9     1.98   2.11 13.91 313.2   1865
[6,] 2780.0 15.41 759 9.4     1.81   2.11 13.92 313.7   2043
     TotalFootprint    Earths WorldGlobal LivingPlanet    URBAN
[1,]       5.292910 0.4597521    23.37150     1.000000 705.4807
[2,]       5.493554 0.4797747    23.94942     1.002878 730.5622
[3,]       5.694199 0.4997972    24.52735     1.005581 756.4242
[4,]       5.894843 0.5198197    25.10527     1.007935 783.1143
[5,]       6.095488 0.5398423    25.68320     1.009766 810.6524
[6,]       6.296133 0.5598648    26.26112     1.010900 839.0286

The data set runs from 1950 to 2008, which you can verify with the following commands:

start(WL20.data)
[1] 1950    1
> end(WL20.data)
[1] 2008    1

There are commands in the dse package that you can use to evaluate this model. For example:

roots(WL20.est)
[1] 1.0000000+0.000000i 0.9982313+0.027957i 0.9982313-0.027957i
[4] 0.9140820+0.000000i
attr(,"class")
[1] "roots"

displays the eigenvalues of the system matrix. The first eigenvalue (1.0000000+0.000000i) is unity because the model includes a constant. The other eigenvalues are all less than unity and have imaginary parts indicating that the model is stable but cyclical.

You can also check the model fit using:

tfplot(WL20.est)

The results are displayed above. The dotted line displays the fitted values (one step ahead predictions) and the solid lines display the actual state variables.

The WL20 model has three state variables labelled W1, W2 and W3 in the graphic above. The state variables were constructed using Principal Components Analysis (PCA). The measurement model from the PCA is displayed with the following command:

> measurementModel(WL20.index)
               N        OIL          QA         GWP  P.Wheat.
[1,]  0.28446020  0.2711239  0.28207253  0.28243781 0.2362789
[2,] -0.01609311  0.2410390  0.09853503 -0.14131911 0.4113772
[3,] -0.11117409 -0.2412009 -0.08825703  0.02134276 0.4610974
         P.Oil.       TEMP         CO2      Carbon TotalFootprint
[1,] 0.24295530  0.2514421  0.28373276  0.28328608      0.2752374
[2,] 0.06595202 -0.2352280 -0.09832577  0.09252815      0.1816798
[3,] 0.71181845  0.1901141 -0.02009544 -0.09553557     -0.2925719
         Earths WorldGlobal LivingPlanet       URBAN
[1,]  0.2787654  0.28189899  -0.18387870  0.28400812
[2,]  0.1484351 -0.09256627   0.77050207 -0.08681005
[3,] -0.2311710 -0.09431697  -0.05760175 -0.07067596

The rows of this matrix display the weights used to calculate W1, W2 and W3. The weights were constructed from standard scores of the input data. The weight attached to each indicator variable for each state variable are displayed in the columns. By construction, the state variables are independent and uncorrelated.

The pattern of the weights indicated that W1 measures overall secular growth in the World system (this can also be seen from the time plot of W1 above). The second measure is most heavily weighted on the Living Planet Index (0.77050207)wheat prices (0.4113772)oil production (0.2410390) and the Ecological Footprint (number of Earths used by the World System, 0.1484351)The W2 state variable thus measures biodiversity. Finally, the W3 state variable is weighted most heavily on oil prices (0.71181845), wheat prices (0.4610974), Total Ecological Footprint (-0.2925719), and oil production (-0.2412009). The W3 state variable thus measures the effect of commodity markets and the Ecological Footprint.

> WL20.index$measurement$output$fraction.variance
 [1] 0.8740847 0.9413934 0.9725388 0.9864508 0.9949011 0.9980068
 [7] 0.9987371 0.9993400 0.9997902 0.9999405 0.9999802 0.9999923
[13] 0.9999986 1.0000000

Taken together, the World system state variables explain 97% of the variation in the underlying indicators.

One important use of the WL20 model is to forecast future states of the World system. The following commands create and display a 100 year forecast of the state variables:

> f <- forecast(WL20.est,horizon=100)
> tfplot(f)


The results of the forecast, displayed above, indicate that growth in the World system is expected to peak between 2040 and 2050 (W1)  associated with continuing losses of biodiversity (W2) and increasing pressure on commodity markets (W3). You can obtain the entire forecast to use as input to there times series models with the following command:

> fx <- merge.forecast(f)

The WL20 forecast and the time series are already available to you as data files named WL20.f and WL20.fx, respectively. 

I have already used forecasts from the WL20 model to predict global climate change and other indicators (here and here). In future posts, I will make the forecasting models models available as state-space models that can also be loaded into your R workspace and run on your machine.

TECHNICAL NOTE: In the Kalman article (here), emphasis is placed on the importance of observability and controllability in the selection of state space models. For the WL20, these two features can be checked with the following commands:

> observability(WL20.est)
[1] 2.5510918 2.2200285 1.8375940 0.6061376
> reachability(WL20.est)
Singular values of reachability matrix for noise:  2.424642 2.20612 1.628193 1.202821e-16 

The commands display the singular values of certain state-space matrices (use help for more information). If all the singular values are greater than zero, the model is both observable and controllable. The WL20 model is completely observable but might not be totally controllable (the 1.202821e-16 principal value is effectively zero). 

No comments:

Post a Comment