The Bellman Equation is a fundamental concept in dynamic programming and optimal control theory. It provides a recursive decomposition of the value function, which represents the maximum value achievable from a given state. Compared with the Lagrangian method, the "old-fashioned" way, it has some very useful properties:
- You don't need to calculate the entire path period by period.
- You can solve the problem backwards, which is very useful in many cases.
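As a sketch of how "solving backwards" works in practice, here is a minimal finite-horizon cake-eating example (the problem and all parameters are illustrative, not from these notes): starting from the last period, where the agent eats everything, each earlier period's value function is computed from the next period's.

```python
import numpy as np

# Finite-horizon cake-eating, solved backwards:
#   V_t(x) = max_{0 <= c <= x} log(c) + beta * V_{t+1}(x - c)
# In the last period the agent eats everything: V_T(x) = log(x).
beta = 0.95
T = 10
grid = np.linspace(1e-6, 1.0, 200)    # grid over remaining cake

V = np.log(grid)                      # terminal value: eat all
policy = [grid.copy()]                # period-T consumption rule: c = x
for t in range(T - 1, 0, -1):         # backward induction
    Vnew = np.empty_like(grid)
    cpol = np.empty_like(grid)
    for i, x in enumerate(grid):
        c = np.linspace(1e-6, x, 100)          # candidate consumption
        cont = np.interp(x - c, grid, V)       # interpolate V_{t+1}
        vals = np.log(c) + beta * cont
        j = np.argmax(vals)
        Vnew[i] = vals[j]
        cpol[i] = c[j]
    V = Vnew
    policy.append(cpol)
# After the loop, V and policy[-1] are the period-1 value and decision rule.
```

Note that we never iterate forward over a candidate path; each period's rule is computed in one sweep from the next period's value function.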
In general form,
$$V(x) = \max_{c}\; u(c) + \beta V(x') \quad \text{subject to} \quad x' = g(x, c).$$
Application: Consumption-Saving Problem
Assume that income is deterministic and constant, $y_t = y$. Then
$$V(a) = \max_{c}\; u(c) + \beta V(a') \quad \text{subject to} \quad a' = (1+r)a + y - c.$$
The interesting part here is that the prime in $a'$ doesn't denote a derivative, but the next period's value of the variable.
Stochastic Dynamic Programming
Recall the sequence formulation of the household problem:
$$\max_{\{c_t\}_{t=0}^{\infty}}\; \mathbb{E}_0 \sum_{t=0}^{\infty} \beta^t u(c_t) \quad \text{subject to} \quad a_{t+1} = (1+r)a_t + y_t - c_t.$$
Thus we could have the recursive formulation, the Bellman equation:
$$V(a, y) = \max_{c,\, a'}\; u(c) + \beta\, \mathbb{E}\left[V(a', y')\right]$$
subject to $c + a' = (1+r)a + y$ and $a' \ge 0$.
The policy functions here are $c = g_c(a, y)$ and $a' = g_a(a, y)$.
What is a policy function? Here the policy isn't government policy, but the decision rule that tells us how much to consume/save given the current state (the wealth level).
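To make this concrete, here is a tiny sketch using the classic cake-eating problem, where the optimal decision rule with log utility is known in closed form, $c(x) = (1-\beta)x$ (an illustrative textbook example, not from these notes):

```python
import numpy as np

# A policy function is a mapping from the state (wealth / cake left)
# to a decision (consumption). For cake-eating with log utility,
# the optimal rule is c(x) = (1 - beta) * x.
beta = 0.95
policy = lambda x: (1 - beta) * x    # decision rule: state -> action

# Follow the rule forward from an initial state:
x = 10.0
path = []
for t in range(5):
    cons = policy(x)     # decide consumption given current wealth x
    path.append((x, cons))
    x = x - cons         # next period's state
```

Given the rule, the whole path of states and actions follows mechanically; the policy function is the entire solution of the problem.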
We could use a cash-on-hand method to simplify it: we let $x = (1+r)a + y$; then $x$ is a state variable which denotes your wealth at the beginning of the period.
The cash-on-hand strategy is a kind of trick that makes the computer programming easier, because we don't need to track both $a$ and $y$, but only $x$.
Then, one nice thing is that we could use $a' = x - c$ to denote savings, thus $x' = (1+r)a' + y'$.
One important point: we have to assume $y$ is i.i.d. Only if it is i.i.d. does the previous value of $y$ provide no information about $y'$, so we don't need to track it as a separate state variable. In this setting, we only care about $x$!
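A minimal simulation sketch of the cash-on-hand recursion (the three income values and the consumption rule $c = 0.1x$ are arbitrary placeholders, not an optimal policy): the loop only ever tracks $x$, never $a$ and $y$ separately.

```python
import numpy as np

# Cash-on-hand recursion with i.i.d. income:
#   x' = R * a' + y',  where a' = x - c.
rng = np.random.default_rng(0)
R = 1.03
T = 1000
y = rng.choice([0.8, 1.0, 1.2], size=T)   # i.i.d. income draws

x = 1.0                                    # initial cash on hand
for t in range(T - 1):
    c = 0.1 * x                            # placeholder decision rule
    a_next = x - c                         # savings a' = x - c
    x = R * a_next + y[t + 1]              # x' = R a' + y'
```

Because $y$ is i.i.d., the income draw enters the recursion only through $x'$; no separate income state is needed.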
Stochastic Bellman Equation
Thus our Bellman equation becomes:
$$V(x) = \max_{c,\, a'}\; u(c) + \beta\, \mathbb{E}_{y'}\left[V(x')\right] \quad \text{subject to} \quad c + a' = x, \quad x' = (1+r)a' + y'.$$
We could then use the Lagrangian multiplier method to solve it:
$$\mathcal{L} = u(c) + \beta\, \mathbb{E}\left[V(x')\right] + \lambda\,(x - c - a'), \qquad x' = (1+r)a' + y'.$$
Taking the FOC, we get:
$$u'(c) = \lambda, \qquad \beta(1+r)\, \mathbb{E}\left[V'(x')\right] = \lambda \quad\Longrightarrow\quad u'(c) = \beta(1+r)\, \mathbb{E}\left[V'(x')\right].$$
From the Envelope Theorem, we have:
$$V'(x) = \lambda = u'(c).$$
In the next period, it becomes:
$$V'(x') = u'(c').$$
Thus we could use this to substitute back into the FOC:
$$u'(c) = \beta(1+r)\, \mathbb{E}\left[u'(c')\right].$$
So the Euler equation is easy to get from here! Because $R = 1+r$, it can also be written as $u'(c) = \beta R\, \mathbb{E}\left[u'(c')\right]$.
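The envelope step and the resulting Euler equation can be checked numerically in a case with a known closed form: the cake-eating problem with log utility (an illustrative example; here $R = 1$, so the Euler equation reduces to $u'(c_t) = \beta\, u'(c_{t+1})$).

```python
import numpy as np

beta = 0.95
# Closed-form solution of cake-eating with log utility (textbook result):
#   V(x) = A + log(x)/(1-beta),  c(x) = (1-beta) x,  x' = x - c = beta x
A = (np.log(1 - beta) + beta * np.log(beta) / (1 - beta)) / (1 - beta)
V = lambda x: A + np.log(x) / (1 - beta)
c = lambda x: (1 - beta) * x
u_prime = lambda cc: 1 / cc            # log utility: u'(c) = 1/c

x = 2.0
h = 1e-6
V_prime = (V(x + h) - V(x - h)) / (2 * h)   # numerical derivative of V

# Envelope theorem: V'(x) = u'(c(x))
print(V_prime, u_prime(c(x)))

# Euler equation (R = 1 here): u'(c_t) = beta * u'(c_{t+1})
x_next = x - c(x)
print(u_prime(c(x)), beta * u_prime(c(x_next)))
```

Both pairs of numbers agree, confirming the two steps of the derivation in this special case.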
Markov Process: How does it work?
An important property of a Markov process is the memoryless property, which means that the future state depends only on the current state, not on the entire history.
Some settings:
- $n$: finite number of income realizations $y_1, \dots, y_n$
- $P_{ij} = \Pr(y_{t+1} = y_j \mid y_t = y_i)$: transition probabilities
For example, $n = 3$. The transition matrix is
$$P = \begin{pmatrix} P_{11} & P_{12} & P_{13} \\ P_{21} & P_{22} & P_{23} \\ P_{31} & P_{32} & P_{33} \end{pmatrix}.$$
To fully understand it, let us look at the three-state case more concretely.
Since it is a three-state setting, we also have three income realizations $y_1$, $y_2$, $y_3$.
There are two ways to think about the distribution $\pi_i$ here. One is that $\pi_i$ is the probability that you have income $y_i$ in period $t$. The other way is to think of three different types of households: type 1 always has income $y_1$, type 2 always has income $y_2$, and type 3 always has income $y_3$; the $\pi_i$ you get is then the proportion of each type in the economy.
How to read this matrix?
- Step 1: Read the rows. Each row represents your current income state.
- Step 2: Read the columns. Each column represents your next period income state.
- Step 3: Read the value. Each value represents the probability of moving from the current income state to the next period income state.
For example, $P_{11}$ means: if your current income is $y_1$, then the probability that your next period income is still $y_1$ is $P_{11}$.
Thus each row of $P$ sums to one: $\sum_j P_{ij} = 1$.
A typical income process also shows persistence if you look at the diagonal values of $P$, which are relatively large.
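Since the original numerical matrix is not reproduced here, the following sketch uses a made-up persistent transition matrix (all numbers are illustrative) to show how such a $P$ drives simulated income dynamics:

```python
import numpy as np

# Illustrative 3-state transition matrix with large diagonal entries,
# i.e. persistent income (made-up numbers):
P = np.array([[0.80, 0.15, 0.05],
              [0.10, 0.80, 0.10],
              [0.05, 0.15, 0.80]])

rng = np.random.default_rng(1)
T = 50_000
state = np.empty(T, dtype=int)
state[0] = 0
for t in range(1, T):
    # draw next state from the row of the current state
    state[t] = rng.choice(3, p=P[state[t - 1]])

# empirical probability of staying in state 1, given state 1 today;
# this should be close to P[1, 1]
stay = np.mean(state[1:][state[:-1] == 1] == 1)
print(stay)
```

Reading off the simulated frequencies row by row recovers the transition matrix, which is exactly the "read rows, then columns" interpretation above.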
The stationary distribution
The stationary distribution tells us: in the long run, what fraction of time do you spend in each income state?
It solves:
$$\pi = \pi P.$$
Written out for our 3-state example:
$$\pi_j = \sum_{i=1}^{3} \pi_i P_{ij}, \qquad j = 1, 2, 3.$$
This gives three equations:
$$\begin{aligned}
\pi_1 &= \pi_1 P_{11} + \pi_2 P_{21} + \pi_3 P_{31} \\
\pi_2 &= \pi_1 P_{12} + \pi_2 P_{22} + \pi_3 P_{32} \\
\pi_3 &= \pi_1 P_{13} + \pi_2 P_{23} + \pi_3 P_{33}
\end{aligned}$$
There is also a constraint:
$$\pi_1 + \pi_2 + \pi_3 = 1.$$
There are four equations with three unknowns (one of the first three is redundant), so we can solve for the exact values of $\pi_1, \pi_2, \pi_3$.
The stationary distribution tells us the long-run fraction of time the process spends in each state. For example, if $\pi_1 = 0.5$, it means that in the long run the process will be in state $y_1$ half of the time. Or: 50% of households would have income $y_1$ in the long run.
How does it connect with the Bellman Equation?
With stochastic (Markov) income, the Bellman equation becomes:
$$V(a, y_i) = \max_{c,\, a'}\; u(c) + \beta \sum_{j=1}^{n} P_{ij}\, V(a', y_j) \quad \text{subject to} \quad c + a' = (1+r)a + y_i,\; a' \ge 0.$$
The Euler equation is:
$$u'(c) = \beta(1+r) \sum_{j=1}^{n} P_{ij}\, u'(c').$$
Finding the Stationary Distribution
A short Python example can be used to find the stationary distribution of a Markov process.
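Here is one way to do it, following the system of equations above: stack $(P^\top - I)\pi = 0$ with the adding-up constraint and solve by least squares (the matrix $P$ is an illustrative example, not from these notes).

```python
import numpy as np

# Stationary distribution of a Markov chain: solve pi = pi P
# together with sum(pi) = 1. P below is an illustrative example.
P = np.array([[0.80, 0.15, 0.05],
              [0.10, 0.80, 0.10],
              [0.05, 0.15, 0.80]])

n = P.shape[0]
# (P' - I) pi = 0 gives n equations; appending a row of ones
# adds the normalization sum(pi) = 1 (n+1 equations, n unknowns).
A = np.vstack([P.T - np.eye(n), np.ones(n)])
b = np.zeros(n + 1)
b[-1] = 1.0
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

print(pi)            # stationary distribution
print(pi @ P - pi)   # should be numerically zero
```

Because one of the $\pi = \pi P$ equations is redundant, the stacked system is consistent and the least-squares solution is exact up to floating-point error.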
Code Example
```python
# Value Function Iteration with IID Income
# Greg Kaplan 2017
# Translated by Tom Sweeney Dec 2020

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import fsolve
from discrete_normal import discrete_normal

# PARAMETERS

## preferences
risk_aver = 2
beta = 0.95

## returns
r = 0.03
R = 1+r

## income risk: discretized N(mu,sigma^2)
mu_y = 1
sd_y = 0.2
ny = 5

## asset grids
na = 500
amax = 20
borrow_lim = 0
agrid_par = 1  # 1 for linear, 0 for L-shaped

## computation
max_iter = 1000
tol_iter = 1.0e-6
Nsim = 50000
Tsim = 500

# OPTIONS
Display = 1
DoSimulate = 1
MakePlots = 1

# DRAW RANDOM NUMBERS
np.random.seed(2020)
yrand = np.random.rand(Nsim,Tsim)

# SET UP GRIDS

## assets
agrid = np.linspace(0,1,na).reshape(na,1)
agrid = agrid**(1/agrid_par)
agrid = borrow_lim + (amax-borrow_lim)*agrid

## income: discretize normal distribution
width = fsolve(lambda x: discrete_normal(ny,mu_y,sd_y,x)[0],2)
temp, ygrid, ydist = discrete_normal(ny,mu_y,sd_y,width)
ycumdist = np.cumsum(ydist)

# UTILITY FUNCTION
if risk_aver==1:
    u = lambda c: np.log(c)
else:
    u = lambda c: (c**(1-risk_aver)-1)/(1-risk_aver)

# INITIALIZE VALUE FUNCTION
Vguess = np.zeros((na,ny))
for iy in range(0,ny):
    Vguess[:,iy] = u(r*agrid[0]+ygrid[iy])/(1-beta)
### Vguess = np.ones((na,ny))

# ITERATE ON VALUE FUNCTION
V = Vguess.copy()
Vdiff = 1
Iter = 0

while Iter <= max_iter and Vdiff > tol_iter:
    Iter = Iter + 1
    Vlast = V.copy()
    V = np.zeros((na,ny))
    sav = np.zeros((na,ny))
    savind = np.zeros((na,ny), dtype=int)
    con = np.zeros((na,ny))

    ## loop over assets
    for ia in range(0,na):

        ## loop over income
        for iy in range(0,ny):
            cash = R*agrid[ia] + ygrid[iy]
            Vchoice = u(np.maximum(cash-agrid,1.0e-10)) + beta*(Vlast @ ydist)
            V[ia,iy] = np.max(Vchoice)
            savind[ia,iy] = np.argmax(Vchoice)
            sav[ia,iy] = agrid[savind[ia,iy]]
            con[ia,iy] = cash - sav[ia,iy]

    Vdiff = np.max(abs(V-Vlast))
    if Display >= 1:
        print('Iteration no. ' + str(Iter), ' max val fn diff is ' + str(Vdiff))

# SIMULATE
if DoSimulate == 1:
    yindsim = np.zeros((Nsim,Tsim), dtype=int)
    aindsim = np.zeros((Nsim,Tsim), dtype=int)

    ## initial assets
    aindsim[:,0] = 0

    ## loop over time periods
    for it in range(0,Tsim):
        if Display >= 1 and (it+1)%100 == 0:
            print(' Simulating, time period ' + str(it+1))

        ### income realization: simulations are vectorized across individuals,
        ### following the original MATLAB; in other languages we would loop
        ### over individuals
        yindsim[yrand[:,it]<=ycumdist[0],it] = 0
        for iy in range(1,ny):
            yindsim[np.logical_and(yrand[:,it]>ycumdist[iy-1], yrand[:,it]<=ycumdist[iy]),it] = iy

        ## asset choice
        if it < Tsim-1:
            for iy in range(0,ny):
                aindsim[yindsim[:,it]==iy,it+1] = savind[aindsim[yindsim[:,it]==iy,it],iy]

    ## assign actual asset and income values
    asim = agrid[aindsim]
    ysim = ygrid[yindsim]

# MAKE PLOTS
if MakePlots==1:

    ## consumption policy function
    plt.plot(agrid,con[:,0],'b-',label='Lowest income state')
    plt.plot(agrid,con[:,ny-1],'r-',label='Highest income state')
    plt.grid()
    plt.xlim((0,amax))
    ### plt.title('Consumption Policy Function')
    plt.title('Consumption')
    plt.legend()
    plt.show()

    ## savings policy function
    plt.plot(agrid,sav[:,0]-agrid[:,0],'b-')
    plt.plot(agrid,sav[:,ny-1]-agrid[:,0],'r-')
    plt.plot(agrid,np.zeros((na,1)),'k',linewidth=0.5)
    plt.grid()
    plt.xlim((0,amax))
    ### plt.title('Savings Policy Function (a\'-a)')
    plt.title('Savings')
    plt.show()

    ## nice zoom
    xlimits = (0,1)
    xlimind = np.ones(na, dtype=bool)
    if np.min(agrid) < xlimits[0]:
        xlimind = np.logical_and(xlimind,(agrid[:,0]>=np.max(agrid[agrid<xlimits[0]])))
    elif np.min(agrid) > xlimits[1]:
        xlimind = 0
    if np.max(agrid) > xlimits[1]:
        xlimind = np.logical_and(xlimind,(agrid[:,0]<=np.min(agrid[agrid>xlimits[1]])))
    elif np.max(agrid) < xlimits[0]:
        xlimind = 0

    ## consumption policy function: zoomed in
    plt.plot(agrid[xlimind],con[xlimind,0],'b-o',linewidth=2)
    plt.plot(agrid[xlimind],con[xlimind,ny-1],'r-o',linewidth=2)
    plt.grid()
    plt.xlim(xlimits)
    plt.title('Consumption: Zoomed')
    plt.show()

    ## savings policy function: zoomed in
    plt.plot(agrid[xlimind],sav[xlimind,0]-agrid[xlimind,0],'b-o',linewidth=2)
    plt.plot(agrid[xlimind],sav[xlimind,ny-1]-agrid[xlimind,0],'r-o',linewidth=2)
    plt.plot(agrid,np.zeros((na,1)),'k',linewidth=0.5)
    plt.grid()
    plt.xlim(xlimits)
    plt.title('Savings: Zoomed (a\'-a)')
    plt.show()

    ## income distribution
    plt.hist(ysim[:,Tsim-1],len(ygrid),facecolor=(0,0.5,0.5),edgecolor='blue')
    plt.ylabel('')
    plt.title('Income distribution')
    plt.show()

    ## asset distribution
    plt.hist(asim[:,Tsim-1],40,facecolor=(.7,.7,.7),edgecolor='black')
    plt.ylabel('')
    plt.title('Asset distribution')
    plt.show()

    ## convergence check
    plt.plot(range(0,Tsim),np.mean(asim,0),'k-',linewidth=1.5)
    plt.xlabel('Time Period')
    plt.title('Mean Asset Convergence')
    plt.show()

    ## asset distribution statistics
    aysim = asim[:,Tsim-1]/np.mean(ysim[:,Tsim-1])
    print('Mean assets: ' + str(np.mean(aysim)))
    print('Fraction borrowing constrained: ' + str(np.sum(aysim==borrow_lim)/Nsim * 100) + '%')
    print('10th Percentile: ' + str(np.quantile(aysim,.1)))
    print('50th Percentile: ' + str(np.quantile(aysim,.5)))
    print('90th Percentile: ' + str(np.quantile(aysim,.9)))
    print('99th Percentile: ' + str(np.quantile(aysim,.99)))
```

Different Methods
to be done.
To-do List
- Practice in Growth Model
- What is a policy variable?
- How to understand ?